About this Event

"Learning Lifetime Disease Liability Reveals and Removes Genetic Confounding in Electronic Health Records"

 

Abstract:

Electronic health records (EHRs) have become the cornerstone of population-scale genetic studies, but factors including patterns of healthcare use shape which diagnoses are recorded and how, leading to confounding effects in genetic associations with EHR codes. In this study, we propose EDGAR, a deep learning framework that recovers lifetime disease liability from EHRs by aligning diagnostic codes with clinically validated measures and disease labels in a set of individuals prioritized through active learning. EDGAR yields representations that better capture disease-specific effects in genome-wide association studies (GWAS). It also enables us to isolate a genetic factor that captures systemic biases in EHR codes, which distorts cross-disease correlations and drives spurious links with behavioral and socio-economic traits. We find that this factor generalizes across EHRs, and its identification in one EHR enables its removal from existing GWAS in another. Overall, our work presents a promising direction for improving the specificity of EHR-based GWAS.

Date: Tuesday, May 12, 2026
Time: 9 - 10am ET
Zoom Registration Link
Zoom Meeting ID: 961 8387 9377

Zoom meeting info will be sent after registration.
 

 

Speaker:

 

Na Cai, DPhil
Assistant Professor
Department of Biosystems Science and Engineering
Basel Research Centre for Child Health
na.cai@bsse.ethz.ch
X: @caina89
BlueSky: @caina89.bsky.social


Dr Na Cai is a statistical geneticist whose research focuses primarily on the genetic underpinnings of psychiatric disorders. During her PhD, Na performed the first genetic study to identify significant and replicated genetic associations with Major Depressive Disorder, and she has since worked on critically assessing and maximizing the use of all types of available data (clinical cohorts, volunteer-based biobanks, self-reports through consumer genomic companies, and electronic health records) specifically for genetic research. Her goal is to identify the genetic effects that are specific to the (potentially heterogeneous) pathologic pathways in psychiatric disorders, and to use them to identify genes and cell types that are potentially relevant for the development of targeted treatments and interventions.


Faculty Host

Chelsea Lowther, PhD
Assistant Professor
Institute for Genomic Health

 
