Orofacial clefts (OFCs) account for a significant fraction of human birth defects in all populations (prevalence ranges from 1/500 to 1/2500 live births) and represent a major public health challenge. Individuals born with OFCs require surgical, nutritional, dental, speech, medical and behavioral interventions, which impose a substantial economic and personal burden. There is convincing evidence that non-syndromic OFCs are complex disorders with a multifactorial etiology that includes genetic risk factors, environmental exposures, and their interactions. So far, ~10 genome-wide association studies (GWAS) have been conducted for non-syndromic cleft lip with or without cleft palate (NSCL/P), and >15 genomic loci have been reported with compelling statistical support, including genes such as IRF6, PAX7, and ABCA4 and the 8q24 locus. In addition, next-generation sequencing (NGS) and exome array studies have provided additional genotyping depth, enabling the detection of rare variants associated with OFCs. However, gaps remain in how to interpret these variants and how to identify novel variants from the large volume of data, and there are high expectations for new methods and models for "second-analysis" of the genome-wide data. We propose two complementary aims to carry out deep analysis and second-analysis of genome-wide data for OFCs.
In Aim 1, we propose a deep learning method to build in silico models that predict the effects of genetic variants in the context of rich craniofacial epigenomic features. Fine-mapping sequence patterns at high resolution will reveal ad hoc motifs, and variants that disrupt these motifs will provide mechanistic insights into OFCs.
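The idea of scoring a variant by how much it changes a sequence model's prediction can be illustrated with a minimal sketch. The proposal does not specify a model architecture, so everything here is an assumption made for illustration: the function names (`one_hot`, `motif_score`, `variant_effect`, `toy_predict`) are hypothetical, and a single hand-written motif filter stands in for a trained deep learning model.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix."""
    idx = [BASES.index(b) for b in seq.upper()]
    out = np.zeros((len(seq), 4))
    out[np.arange(len(seq)), idx] = 1.0
    return out

def motif_score(x, pwm):
    """Stand-in 'model' (assumption): best match of one motif filter along the sequence."""
    width = pwm.shape[0]
    scores = [np.sum(x[i:i + width] * pwm) for i in range(x.shape[0] - width + 1)]
    return max(scores)

def variant_effect(ref_seq, pos, alt_base, predict):
    """Score a single-nucleotide variant as predict(alt) - predict(ref)."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return predict(one_hot(alt_seq)) - predict(one_hot(ref_seq))

# Hypothetical 4-bp filter that rewards the toy motif "TGAC" (+1 match, -1 mismatch).
TOY_PWM = np.full((4, 4), -1.0)
for i, b in enumerate("TGAC"):
    TOY_PWM[i, BASES.index(b)] = 1.0

def toy_predict(x):
    """Hypothetical predictor; a trained network would take its place."""
    return motif_score(x, TOY_PWM)

# A G>C change inside the motif lowers the score; a change outside leaves it unchanged.
disrupting = variant_effect("AATGACAA", 3, "C", toy_predict)
neutral = variant_effect("AATGACAA", 0, "C", toy_predict)
```

In a real setting, `predict` would be a model trained on craniofacial epigenomic tracks, and the difference between alternate and reference predictions would be computed per feature rather than for one toy motif.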
In Aim 2, we shift our focus to the gene level and propose a network-assisted method to discover meaningful combinations of genes that act at spatial and temporal points critical to orofacial development. We target all forms of OFCs, with particular interest in NSCL/P. To support the success of this proposal, we have assembled a multidisciplinary team and local computational infrastructure equipped with GPUs for the implementation of both aims.
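As an illustration of what a network-assisted gene search might look like, the sketch below grows a gene module greedily over an interaction network. It is a generic technique chosen for illustration, not the method of this proposal: the scoring rule (sum of gene-level z-scores divided by the square root of module size), the function names, the gene labels `g1` to `g5`, and all scores are hypothetical assumptions. Spatial or temporal specificity could be introduced by using stage- or tissue-specific scores, which is likewise an assumption.

```python
import math

def module_score(genes, z):
    """Combined score of a gene set: sum of z-scores over sqrt(set size)."""
    return sum(z[g] for g in genes) / math.sqrt(len(genes))

def grow_module(seed, network, z):
    """Greedily add the neighboring gene that most improves the module score."""
    module = {seed}
    while True:
        neighbors = {n for g in module for n in network.get(g, ())} - module
        best, best_score = None, module_score(module, z)
        for n in sorted(neighbors):  # sorted for a deterministic scan order
            s = module_score(module | {n}, z)
            if s > best_score:
                best, best_score = n, s
        if best is None:  # no neighbor improves the score, so stop
            return module
        module.add(best)

# Hypothetical toy network (adjacency lists) and made-up gene-level z-scores.
NETWORK = {
    "g1": ["g2", "g3"],
    "g2": ["g1", "g4"],
    "g3": ["g1"],
    "g4": ["g2", "g5"],
    "g5": ["g4"],
}
Z = {"g1": 2.0, "g2": 1.8, "g3": -0.5, "g4": 1.5, "g5": 0.1}

module = grow_module("g1", NETWORK, Z)
```

Starting from `g1`, the search adds `g2` and then `g4`, and stops because neither `g3` nor `g5` raises the combined score.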
Our aims do not overlap; they are integrated and strongly focused on our fundamental question of interest: how genetic variants function to cause OFCs. Successful completion of this proposal will lead to a deeper understanding of the genetic components of OFCs.

Public Health Relevance

Orofacial clefts (OFCs) account for a significant fraction of human birth defects in all populations and represent a major public health challenge. Substantial progress has been made in identifying genes and variants for OFCs through recent genome-wide association studies and next-generation sequencing projects. In this proposal, we aim to develop computational methods to comprehensively fine-map and interpret the effects of these genetic variants, which will eventually lead to an understanding of the mechanisms of, and new therapies for, OFCs.

National Institutes of Health (NIH)
National Institute of Dental & Craniofacial Research (NIDCR)
Small Research Grants (R03)
Study Section
Special Emphasis Panel (ZDE1)
Program Officer
Wang, Lu
University of Texas Health Science Center Houston
Sch Allied Health Professions
United States