Dental caries (i.e., dental cavities or tooth decay) and orofacial clefts impose a tremendous public health burden. In addition to the costs of treatment, these conditions greatly affect the quality of life of patients and their families. The etiology of caries and clefts has been studied extensively, and both environmental and genetic risk factors have been implicated. Despite some success in the study of genetic factors of caries and oral clefts, a large portion of the heritability is still missing (i.e., the identified genetic variants explain only a small proportion of the estimated heritability). One important reason for this missing heritability is the lack of powerful statistical methods that use the available data efficiently. In addition, translating the knowledge gained from scientific studies into clinical practice has been recognized as an essential step toward future personalized healthcare, but even less has been done in this area. This lack of research on risk prediction for dental caries and orofacial clefts needs to be addressed urgently, given the critical role of the mouth and teeth in daily life. The long-term goals are to improve understanding of the mechanisms leading to dental and craniofacial disorders and to use the scientific findings to aid clinical practice. The objectives of this application are to develop new statistical methods that facilitate the identification of novel genetic variants contributing to dental caries and orofacial clefts and the prediction of the risk of these conditions using genome-wide data, with the following aims: (1) To develop a novel family-based statistical method to identify the joint effects of multiple genetic markers on multiple phenotypes with cross-sectional or longitudinal observations, and to apply it to dental caries GWAS data. Dental disorders usually show strong familial aggregation. 
Complex diseases are often multifaceted, and the joint analysis of their correlated phenotypes can increase power in risk gene discovery. To our knowledge, no such methods are currently available.
This aim intends to fill this gap. (2) To develop and validate high-dimensional, unified Bayesian models to identify genetic variants underlying dental caries and orofacial clefts and to predict the risk of these conditions in family and unrelated samples. We will develop a flexible, multiple-marker-based, hierarchical modeling framework that can handle different types of traits (e.g., discrete and quantitative traits) and analyze both family and unrelated samples, for both disease-variant identification and risk prediction. This will be the first work to estimate and test both group effects and the effects of individual variants for family data, and it will provide the first high-dimensional, family-based risk-prediction models for orofacial clefts and dental caries. We will analyze multiple GWAS datasets obtained from dbGaP and our collaborators. The successful completion of the Aims will lead to powerful statistical models, useful software, and new discoveries on the etiology of dental caries and orofacial clefts. These expected results will expand our understanding of, and ultimately enhance our ability to decipher, the genetic basis of dental caries and orofacial clefts, and will help us to prevent, diagnose, and treat them efficiently.

Public Health Relevance

The proposed research is relevant to public health because dental caries (also known as tooth decay) and orofacial clefts create a tremendous burden for patients, their families, and society as a whole. We propose to develop powerful statistical methods to facilitate the identification of novel genetic variants contributing to dental caries and orofacial clefting and the prediction of the risk of these conditions using genome-wide genetic data. The project is closely related to the mission of NIDCR and NIH because its successful completion will lead to a better understanding of the etiology of dental caries and orofacial clefts, to efficient intervention and prevention, and eventually to personalized prevention and treatment strategies.

National Institutes of Health (NIH)
National Institute of Dental & Craniofacial Research (NIDCR)
Small Research Grants (R03)
Study Section
Special Emphasis Panel (ZDE1-GZ (16))
Program Officer
Harris, Emily L
Indiana University Bloomington
United States
Zhang, Xinyan; Pei, Yu-Fang; Zhang, Lei et al. (2018) Negative Binomial Mixed Models for Analyzing Longitudinal Microbiome Data. Front Microbiol 9:1683
Tang, Zaixiang; Shen, Yueping; Li, Yan et al. (2018) Group spike-and-slab lasso generalized linear models for disease prediction and associated genes detection by incorporating pathway information. Bioinformatics 34:901-910
Zhang, Xinyan; Li, Bingzong; Han, Huiying et al. (2018) Predicting multi-level drug response with gene expression profile in multiple myeloma using hierarchical ordinal regression. BMC Cancer 18:551
Tang, Zaixiang; Shen, Yueping; Zhang, Xinyan et al. (2017) The spike-and-slab lasso Cox model for survival prediction and associated genes detection. Bioinformatics 33:2799-2807
Zhang, Xinyan; Li, Yan; Akinyemiju, Tomi et al. (2017) Pathway-Structured Predictive Model for Cancer Survival Prediction: A Two-Stage Approach. Genetics 205:89-100