Identifying rare genetic variants that predispose individuals to complex diseases -- such as obesity, heart disease, and type 2 diabetes (T2D) -- is an important step toward understanding disease etiology, which in turn has the potential to lead to breakthroughs in diagnosis, prevention, and treatment. Recent large-scale sequencing studies have started to identify rare disease-susceptibility variants, and further discoveries will be facilitated by more efficient designs and more powerful statistical methods that integrate all available data. When multiple studies investigate the same disease or trait, the power to identify rare disease-susceptibility variants is greatly improved by integrating them via meta-analysis. Additionally, we can increase sample size, and hence power, by using sequenced samples from studies of other diseases as controls. Finally, power can be improved by incorporating functional information on rare variants, collected from various experiments, into our association tests. Our proposal offers several critical methodological improvements for all three strategies, which will increase power significantly. Specifically, we will develop 1) robust meta-analysis methods for rare-variant association tests for binary traits; 2) methods to use external samples as controls to increase power while controlling for possible batch effects; and 3) an integrative analysis approach for testing non-coding regions by incorporating functional annotations. The proposed methods will be evaluated through extensive simulation studies and applications to multiple real datasets. In addition, we will develop, distribute, and support open-source software packages for the proposed methods, and we will continue to update and support our current software.
Complex diseases such as obesity, heart disease, and type 2 diabetes (T2D) are major public health concerns. The proposed research will develop advanced computational and statistical methods to improve the power to identify rare disease-susceptibility variants. The power gained from these methods will translate into a better understanding of human disease etiology and eventual improvements in human health.