This proposal describes a new, collaborative approach to improving the diagnosis of primary immunodeficiency diseases (PIDs). PIDs are individually rare, monogenic disorders that lead to severe infections, autoimmunity, and inflammation. Their prevalence is ~1:10,000, and approximately half of patients have antibody deficiency as their main immunological phenotype. Most doctors are unaware of these diseases, and many patients go years without a diagnosis, costing the health care system tens of thousands of dollars per patient per year and unnecessarily increasing morbidity and mortality. There is a tremendous, untapped opportunity to advance the diagnosis of patients with PIDs. We propose to use new machine-learning approaches to identify patients with PIDs algorithmically from their electronic health records (EHR). To accomplish our goals, we have built a coalition of computational genomics groups at UCLA, UCSF, and Vanderbilt (Computational team) and clinical immunology groups at the five University of California medical centers (Los Angeles, San Francisco, Irvine, San Diego, and Davis) (Immunology team).

In Aim 1, we will identify patients with rare immune diseases by phenotype risk scoring. We will speed the identification of these patients by surveilling the EHR with a phenotype risk scoring approach, building upon recently published work in Science. We will apply this approach to the UCLA, UCSF, and Vanderbilt clinical data repositories to identify potential cases. We will improve risk scoring by considering gender, age, and race/ethnicity. We will classify patients by whether they have an infection phenotype or an immune dysregulation phenotype. Subsequently, we will expand to the larger UC Health-wide Data Warehouse (UCHWDW), which comprises more than 15 million patients across all UC medical centers.

In Aim 2, we will identify the genetic immune diseases of these newly found subjects. We will follow the state-of-the-art approach employed by the UCLA and Vanderbilt Undiagnosed Disease Network (UDN) sites. We will start by sequencing all known antibody deficiency patients across the Immunology team sites while collaboratively pre-reviewing cases identified in Aim 1 on monthly video calls. For selected subjects, we will perform whole genome and RNA sequencing. Clinical and research laboratory testing will bring closure to the diagnostic odyssey for these subjects. The overall impact of this work will be to accelerate the diagnosis and cure of PIDs. This project will also serve as a demonstration of how immunology sites can work together, sharing electronic medical records and genomic data, to advance care.
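
The phenotype risk scoring idea in Aim 1 can be illustrated with a minimal sketch. It is not the project's implementation. It assumes one common formulation in which each phenotype feature of a disease is weighted by the log of its inverse prevalence in the cohort, and a patient's score is the sum of the weights of the features found in that patient's record. The feature names, patient identifiers, and the functions `phenotype_weights` and `risk_score` are hypothetical, and the adjustment for gender, age, and race/ethnicity described above is not modeled here.

```python
import math
from typing import Dict, Iterable, Set


def phenotype_weights(
    cohort: Dict[str, Set[str]], features: Iterable[str]
) -> Dict[str, float]:
    """Weight each feature by log(1 / prevalence) within the cohort.

    Assumption: rarer phenotypes are more informative, so they get
    larger weights. Features never observed in the cohort are skipped.
    """
    n_patients = len(cohort)
    weights: Dict[str, float] = {}
    for feature in features:
        n_with = sum(1 for codes in cohort.values() if feature in codes)
        if n_with > 0:
            weights[feature] = math.log(n_patients / n_with)
    return weights


def risk_score(codes: Set[str], weights: Dict[str, float]) -> float:
    """Sum the weights of the disease features present in one record."""
    return sum(w for feature, w in weights.items() if feature in codes)


# Hypothetical toy cohort: patient id -> phenotypes found in the EHR.
cohort = {
    "p1": {"recurrent_pneumonia", "bronchiectasis"},
    "p2": {"recurrent_pneumonia"},
    "p3": set(),
    "p4": {"hypertension"},
}

# Hypothetical feature list for an antibody deficiency phenotype.
features = ["recurrent_pneumonia", "bronchiectasis", "chronic_diarrhea"]

weights = phenotype_weights(cohort, features)
scores = {pid: risk_score(codes, weights) for pid, codes in cohort.items()}

# Rank patients from highest to lowest score for clinical pre-review.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy cohort, the patient with both a common and a rare feature ranks first, which mirrors how high-scoring records would be surfaced for review by the Immunology team.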

Public Health Relevance

A major challenge for patients with immune diseases is the delay in diagnosis that leads to morbidity and mortality. This proposal combines teams from the five medical centers of the University of California and Vanderbilt to apply advanced computational, machine-learning approaches to identify patients with genetic immunodeficiency diseases. This approach will dramatically improve the care of otherwise-neglected patients with rare diseases.

National Institutes of Health (NIH)
National Institute of Allergy and Infectious Diseases (NIAID)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Voulgaropoulou, Frosso
University of California Los Angeles
Schools of Medicine
Los Angeles
United States