We aim to maximize the discovery of risk genes for structural birth defects through an integrated analysis of genome sequencing data from multiple diseases. Structural birth defects, such as congenital heart disease (CHD), congenital diaphragmatic hernia (CDH), and craniofacial malformations, are present in 3% of live births. Whole exome sequencing (WES) studies have discovered a few dozen new candidate risk genes. However, the total number of risk genes with large genetic effect is estimated to be on the order of hundreds, and the vast majority of them are yet to be discovered. Previous studies by us and others indicate that de novo variants with large genetic effect are often pleiotropic and may cause developmental defects in multiple organs, and that gene haploinsufficiency is one of the main molecular mechanisms by which these variants cause disease. Based on this model, we propose to jointly analyze the whole genome sequencing (WGS) data from multiple birth defects generated by the Gabriella Miller Kids First Program (Kids First), together with available WES data for birth defects and WGS/WES data for neurodevelopmental disorders, to maximize the power to identify candidate risk genes, and to prioritize candidate genes and variants by gene dosage sensitivity predicted from epigenomic data and mutation intolerance. We will also exploit the genetic connection between cancer and developmental disorders to improve interpretation of germline variants in birth defects based on somatic mutation patterns in cancer. To this end, we propose two specific aims:
Specific Aim 1. Joint genetic analysis of structural birth defects and neurodevelopmental disorders to maximize gene discovery.
Specific Aim 2. Prioritization of variants and genes based on haploinsufficiency and cancer somatic mutation patterns.
The proposed study will maximize the genetic discovery potential of the Kids First WGS data sets for birth defects and improve our understanding of the pleiotropic effects and tissue specificity of risk genes and variants. The analytical approaches developed in this study will be applicable to genetic data of birth defects and developmental disorders from future Kids First cohorts and other programs.

Public Health Relevance

We propose to analyze whole genome sequencing data from the Kids First cohorts using new computational approaches that maximize risk gene discovery and improve interpretation of candidate variants. Our research will enhance the utility of data generated by the Kids First program and improve genetic diagnosis of birth defects and pediatric cancer.

Agency
National Institutes of Health (NIH)
Institute
National Heart, Lung, and Blood Institute (NHLBI)
Type
Small Research Grants (R03)
Project #
1R03HL138352-01
Application #
9374910
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Leerkes, Maarten R
Project Start
2017-07-01
Project End
2019-06-30
Budget Start
2017-07-01
Budget End
2018-06-30
Support Year
1
Fiscal Year
2017
Total Cost
Indirect Cost
Name
Columbia University (N.Y.)
Department
Biochemistry
Type
Schools of Medicine
DUNS #
621889815
City
New York
State
NY
Country
United States
Zip Code
10032