Identification of rare genetic variants that predispose individuals to complex diseases -- such as obesity, heart disease, and type 2 diabetes (T2D) -- is an important step toward understanding disease etiology, which in turn has the potential to lead to breakthroughs in diagnosis, prevention, and treatment. Recent large-scale sequencing studies have started to identify rare disease-susceptibility variants, and further discoveries will be facilitated by more efficient designs and more powerful statistical methods that integrate all available data. When multiple studies investigate the same disease or trait, the power to identify rare disease-susceptibility variants is greatly improved by integrating them via meta-analysis. Additionally, we can increase sample size, and hence power, by using sequenced samples from studies of other diseases as controls. Finally, analysis power can be improved by incorporating functional information on rare variants, collected from various experiments, into our association tests. Our proposal offers several critical methodological improvements for all three strategies, which will increase power significantly. Specifically, we will develop 1) robust meta-analysis methods for rare-variant association tests for binary traits; 2) methods to use external samples as control samples to increase power while controlling for a possible batch effect; and 3) an integrative analysis approach for testing non-coding regions by incorporating functional annotations. The proposed methods will be evaluated through extensive simulation studies and applications to multiple real datasets. In addition, we will continue to develop, distribute, and support open-source software packages for the proposed methods and update and support our current software.

Public Health Relevance

Complex diseases such as obesity, heart disease, and type 2 diabetes (T2D) are major public health concerns. The proposed research will develop advanced computational and statistical methods to improve power to identify rare variants of disease susceptibility. The power gain from these methods will be translated into gains in our understanding of human disease etiology and eventual improvements in human health.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG008773-05
Application #: 9916780
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Sofia, Heidi J
Project Start: 2016-05-17
Project End: 2021-04-30
Budget Start: 2020-05-01
Budget End: 2021-04-30
Support Year: 5
Fiscal Year: 2020
Total Cost:
Indirect Cost:

Name: University of Michigan Ann Arbor
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 073133571
City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109
Dutta, Diptavo; Scott, Laura; Boehnke, Michael et al. (2018) Multi-SKAT: General framework to test for rare-variant association with multiple phenotypes. Genet Epidemiol :
Zhou, Wei; Nielsen, Jonas B; Fritsche, Lars G et al. (2018) Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet 50:1335-1341
Dey, Rounak; Schmidt, Ellen M; Abecasis, Goncalo R et al. (2017) A Fast and Accurate Algorithm to Test for Binary Phenotypes and Its Application to PheWAS. Am J Hum Genet 101:37-49
Lee, Seunggeun; Kim, Sehee; Fuchsberger, Christian (2017) Improving power for rare-variant tests by integrating external controls. Genet Epidemiol 41:610-619
He, Zihuai; Lee, Seunggeun; Zhang, Min et al. (2017) Rare-variant association tests in longitudinal studies, with an application to the Multi-Ethnic Study of Atherosclerosis (MESA). Genet Epidemiol 41:801-810
Lee, Seunggeun; Sun, Wei; Wright, Fred A et al. (2017) An improved and explicit surrogate variable analysis procedure by coefficient adjustment. Biometrika 2:303-316