, including type II diabetes-related glycemic traits, coronary artery disease, and age-related macular degeneration. Most of the rare variants involved in complex human diseases have moderate effects, which makes it necessary to analyze large samples. Technological advances such as the use of social media and consumer-directed genetics have greatly empowered researchers to quickly recruit study participants with interesting phenotypes. Thanks to the decreasing cost of sequencing and microarray genotyping, there is an unprecedented opportunity to assess the impact of rare variants in this ever-growing reservoir of sequenced or genotyped samples, and to understand the genetic architecture of rare variants. Meta-analyses have been a powerful tool for aggregating genotype-phenotype association information from multiple cohorts. Compared to methods that require pooling individual-level data, meta-analyses better protect study participant privacy, are more robust against heterogeneity between studies, and offer equal power for detecting associations. In many settings where sharing individual-level information is impossible, meta-analysis is the only potential solution. In the sequencing age, meta-analyses warrant additional development in order to accommodate the greatly increased scale of the datasets, enable more accurate assessment of statistical significance for low-frequency variants, and allow for more robust association analyses. In Aim 1, we will develop novel methods to enable more accurate meta-analyses of low-frequency variants by extending ideas from the Firth and Bartlett corrections. In Aim 2, we will develop more scalable methods to meta-analyze low-frequency variants, borrowing strength from large reference panels such as that of the Haplotype Reference Consortium. In Aim 3, we will develop methods to accommodate heterogeneity in sequence data and enable more robust meta-analyses.
In Aim 4, we will develop methods that enable the global assessment of genetic architectures, allowing more accurate enrichment analyses and tissue-specific analyses. For all methods arising from this proposal, we will provide software implementing them, continuing our strong track record in this direction (Aim 5). To achieve our research goals, we have assembled a strong research team consisting not only of method developers but also of geneticists leading large, high-profile studies. Methods and tools from this proposal will be applied to some of the largest datasets in the world for studying nicotine and alcohol dependence, lipid levels, heart disease, and macular degeneration. PUBLIC HEALTH RELEVANCE: Rare variants have established functional roles in a variety of complex human diseases of significant public health relevance, including type II diabetes, age-related macular degeneration, and coronary artery disease. With the decreasing cost of sequencing, there is an unprecedented opportunity to understand the genetic architectures of complex traits in ever-increasing reservoirs of sequenced or genotyped samples. Our proposed methodology will greatly facilitate the meta-analysis of sequence data and address the challenges of modeling the genetic architectures of complex traits.