We aim to maximize the discovery of new risk genes and to elucidate the genetic architecture of structural birth defects. To achieve this, we propose cross-disease genetic analysis of both protein-coding and noncoding variants, together with integration of gene expression data to prioritize candidate risk genes. A better understanding of the genetic basis of structural birth defects will lead to new insights into human developmental biology and will provide targets for medical intervention and treatment. Recent large-scale genome and exome sequencing studies of birth defects have identified new risk genes, especially through the analysis of de novo variants in protein-coding regions. However, we are still far from a complete understanding of the genetic causes of birth defects. An estimated 400-800 risk genes of large effect size exist for birth defects such as congenital heart disease, and the vast majority of these genes are unknown. This is primarily due to a lack of statistical power. While increasing sample size is essential and is part of the core deliverables of the Gabriella Miller Kids First (GMKF) programs, we also need to develop and apply new analytical methods that improve power and maximize the utility of the available genetic data by using other types of data and biological knowledge. In addition, most prior studies have focused the analysis of rare genetic variation on small variants in coding regions or on large copy number variants (CNVs). The data and methods to interrogate the contribution of rare noncoding variants are rudimentary, limiting our understanding of the genetic architecture of these diseases. In this study, we propose two aims to address these questions by leveraging GMKF cross-disease whole genome sequencing (WGS) data sets:
Specific Aim 1. Elucidate genetic architecture by cross-disease analysis of rare coding and noncoding variants.
Specific Aim 2. Integrate gene expression with genome sequencing data to improve discovery and biological interpretation of risk genes of structural birth defects.
The proposed study will maximize the genetic discovery potential of the GMKF WGS data sets for birth defects and improve our understanding of the pleiotropic effects and tissue specificity of risk genes and variants. The analytical approaches developed in this study will be applicable to genetic data on birth defects and developmental disorders from future GMKF cohorts and other programs.

Public Health Relevance

We propose to analyze whole genome sequencing data from the Gabriella Miller Kids First cohorts using new computational approaches that maximize risk gene discovery and improve the interpretation of candidate genes. Our research will enhance the utility of data generated by the Kids First program and improve our understanding of the genetics of birth defects.

National Institutes of Health (NIH)
National Heart, Lung, and Blood Institute (NHLBI)
Small Research Grants (R03)
Study Section
Therapeutic Approaches to Genetic Diseases Study Section (TAG)
Program Officer
Li, Huiqing
Columbia University (N.Y.)
Schools of Medicine
New York
United States