In the last decade there has been major progress toward identifying the genetic bases of complex diseases and developing polygenic predictors for individuals who are at increased risk. Polygenic prediction models are now approaching the point of clinical relevance for several important diseases. However, because most of the polygenic risk is due to extremely large numbers of small-effect variants, it is difficult to construct maximally efficient prediction models even using very large GWAS samples. At present, the largest samples are available for individuals of European ancestry. Prediction models developed in these samples usually do not port well to other groups, although the precise reasons for this limited portability are not yet fully understood. In this project we will (1) measure the specific importance of the different factors that contribute to the limited portability across groups; (2) implement and evaluate new statistical methods for computing polygenic predictors using joint inference across populations and using functional information as priors; and (3) implement and evaluate new statistical methods for combining genetic information with other types of clinical data for prospective prediction in clinical settings. In summary, our project will provide a framework of efficient statistical methods for polygenic prediction within and across populations.
In the last decade there has been huge progress in identifying the genetic bases of complex diseases and in developing polygenic predictors for individuals who are at increased risk. However, these polygenic predictors do not extract all the signal that is available in the genetic data, and they also lack portability across populations. In this project we seek to develop improved statistical methods for producing more accurate and more portable polygenic prediction models.