The intellectual merit of this project lies in the development of population genetic theory and statistical genomic approaches for modeling the demographic history of admixed human populations, including minority populations in the U.S. The project has three major aims:

AIM 1: To develop improved population genetic models of admixed human populations. Genomes of admixed individuals can be modeled as mosaics of chromosomal tracts that originate from a finite number of ancestors and populations. Current models of admixture are limited by simplifying assumptions regarding the amount, duration, or direction of the admixture process and fail to accurately predict patterns of genetic variation observed in real data. To redress this issue, the investigators propose a family of higher-order Markovian models and coalescent approaches that provide much greater flexibility.

AIM 2: To develop statistical inference tools for estimating historical admixture. This project will improve upon existing methods for robustly inferring the population of origin of chromosomal segments in admixed genomes. Given the distribution of "admixture tracts", one then wishes to model the recent demographic history of the population. The investigators will use the results of Aims 1 and 2 along with a growing database in the investigator's lab on genomic variation in the Americas to provide detailed models of admixture that can empower medical and association studies in the Americas.

AIM 3: To develop a Linkage/Admixture Map of Hispanic/Latino populations. Recently, it was shown that motifs associated with population-specific alleles at the PRDM9 locus account for a significant fraction of recombination rate variation among populations. The investigators propose analyzing data from more than 20,000 genetically and ethnically diverse U.S. participants across at least 2.5 million markers to identify population-specific recombination hotspots and their relationship with ancestry. Specifically, the investigators aim to identify Native American-specific recombination hotspots and produce the highest-resolution recombination map of the human genome to date.

Genome-Wide Association Studies (GWAS) have dramatically increased the scientific community's understanding of the genetic basis of complex disease by identifying thousands of genetic variants associated with chronic diseases including Type 1 and Type 2 diabetes, heart disease, hypertension, and many cancers. A key limitation of existing studies is that they have focused largely on participants of European descent, and some of these genotype/phenotype associations do not readily translate from one ethnic group to another. Furthermore, since the next generation of studies will focus largely on querying rare genetic variants (i.e., <5% frequency) for association, this "transferability problem" is likely to get worse and risks perpetuating or even widening existing health disparities among ethnic groups in the U.S. Broadening representation in medical genomics studies is a key mechanism for redressing these biases. Critical to enabling trans- and multi-ethnic medical genetic studies is a rigorous understanding of how 500 years of admixture among diverse European, African, and indigenous American source populations have shaped the genomes of African American, Hispanic/Latino, and Native American populations today. A key outcome of this research will be user-friendly software that implements a comprehensive, probabilistic, and flexible family of methods for inferring genetic ancestry. The investigators will train undergraduate, high school, community college, graduate, and post-doctoral students, including many from underrepresented minority groups. Project results will be disseminated to the public through the San Jose Tech Museum and through participation in conferences aimed at minority groups, including SACNAS and the National Urban League. Partnerships with key industry leaders will allow the investigators to reach potentially millions of participants.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1201234
Program Officer: Nandini Kannan
Project Start:
Project End:
Budget Start: 2012-08-01
Budget End: 2017-07-31
Support Year:
Fiscal Year: 2012
Total Cost: $1,590,784
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305