Title: Models and Methods for Population Genomics
Abstract: Understanding genome-wide genetic variation and its role in health-related complex traits in humans is one of the most important goals of modern biomedical research. There continues to be a substantial need for new statistical models and methods that can be applied in these studies, particularly as study designs become more ambitious and sample sizes increase. The overall goal of the proposed research is to develop statistical methods and software useful in understanding population genomics studies that involve genome-wide genotyping, many simultaneously measured traits, and very large sample sizes. Our focus is on flexible modeling that adapts to systematic variation and robustly models data encountered in these modern studies.
The specific aims involve (1) developing tests of association immune to arbitrary population structure that work for general distributions of traits, many simultaneous traits, or extremely large sample sizes; (2) introducing new models and estimates of kinship and FST in generalized settings, which will lead to improved quantitative genetic modeling of complex traits; (3) introducing new estimation and testing frameworks for population structure that show superior performance to existing approaches; (4) developing and distributing software; and (5) analyzing important data sets to discover new biology and validate our methods and software.
Understanding genome-wide patterns of genetic variation among individuals and how this relates to complex diseases is one of the primary goals of modern medical research. The proposed research will contribute to this goal by tackling a number of open problems in such a way that a coherent statistical framework and set of methodologies will emerge that can be applied to data sets of genome-wide genetic variation to produce a clearer picture of the genetic basis of human disease.
Gopalan, Prem; Hao, Wei; Blei, David M et al. (2016) Scaling probabilistic models of genetic variation to millions of humans. Nat Genet 48:1587-1590
Hao, Wei; Song, Minsun; Storey, John D (2016) Probabilistic models of genetic variation in structured populations applied to global human studies. Bioinformatics 32:713-21
Song, Minsun; Hao, Wei; Storey, John D (2015) Testing for genetic associations in arbitrarily structured populations. Nat Genet 47:550-4
Chung, Neo Christopher; Storey, John D (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics 31:545-54