One of the central goals of modern human genetics is to understand why complex genetic diseases are as prevalent as they are, and why genetic risk is distributed among individuals and across the genome in the way that it is. Over the past decade, genome-wide association studies (GWAS) have generated a deluge of information about the mutations that underlie variation in susceptibility to complex disease. These findings show that for many diseases, variation in susceptibility arises from many hundreds or even thousands of variants, many of which segregate at appreciable frequencies in the population but have vanishingly small penetrance. Yet we still lack a good understanding of why there is so much genetic variation affecting susceptibility to diseases that often involve a severe fitness cost, and of what shapes this genetic variation (e.g., the distribution of variant frequencies and effect sizes). Despite the basic and practical importance of these questions, there has been surprisingly little work aimed at answering them, and specifically at understanding how population genetic processes give rise to the genetic basis of disease susceptibility being uncovered by GWAS. The goal of the proposed research is to fill this gap.
The first aim is to develop models describing how the genetic architecture and the population prevalence of complex disease result from an interplay between internal biological forces, such as the mutation rate and the distribution of mutational effects on the disease and on other traits, and external population-level forces, such as natural selection, population size changes, or variation in diet and lifestyle.
The second aim is to develop a likelihood-based statistical framework for inferring the parameters corresponding to these factors from the results of GWAS, and to apply the inference to data for at least 10 complex diseases in order to learn about the processes and parameters that shape their genetic architecture and determine their prevalence. An open-access and well-documented software package implementing the statistical inference will be made freely available to the research community. The proposed models and statistical inferences will be the first to address these questions based on a principled biological model of disease, and are expected to substantially advance our understanding of the processes that shape complex disease susceptibility in humans.

Public Health Relevance

An enormous amount of effort has been expended over the last decade in identifying genetic variants associated with complex genetic disease, yet a comprehensive understanding of why disease prevalence and the distribution of genetic risk among individuals and across the genome take the form they do is still lacking. I will develop mathematical models of complex disease evolution in human populations and statistical techniques to determine what we can learn from presently available data about how these diseases have evolved. The results will be useful as a guide for designing future studies of disease, and will be informative about the nature of disease more broadly.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Postdoctoral Individual National Research Service Award (F32)
Project #
1F32GM126787-01
Application #
9470361
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Ravichandran, Veerasamy
Project Start
2018-05-01
Project End
2020-04-30
Budget Start
2018-05-01
Budget End
2019-04-30
Support Year
1
Fiscal Year
2018
Total Cost
Indirect Cost
Name
Columbia University (N.Y.)
Department
Biology
Type
Graduate Schools
DUNS #
049179401
City
New York
State
NY
Country
United States
Zip Code
10027