Studies of human genetic variation over the last decade have revealed that mixture events between highly diverged population groups (archaic admixture), such as between Neandertals and non-Africans, have been a common occurrence and are likely to have had a major impact on human phenotypes. For example, analyses of individual loci, such as the MHC locus, have documented the phenotypic impact of archaic admixture. However, a lack of adequate analytical tools has hindered a systematic understanding of this impact. This K99/R00 proposal aims to develop and validate statistical methods to infer the genetic structure arising from archaic admixture and to leverage this structure to identify genetic variants introduced by archaic admixture that influence phenotypes. Insights from the application of these methods will not only produce a more complete understanding of the genetic factors underlying complex phenotypes, such as common diseases, but will also ensure that currently under-served minority populations, many of whom descend from admixture events or from ancestral groups distinct from those of Europeans, can be studied just as effectively as populations of European descent and can benefit from the discoveries of genomic medicine. The first goal of this proposal is to extend and validate our current statistical model for accurate inference of local ancestry under archaic admixture. The proposed model aims to integrate a large number of patterns of genetic variation using the statistical framework of Conditional Random Fields (CRF). An important first application of this model is the inference of Neandertal local ancestry in non-African populations. The inferred Neandertal ancestry will be leveraged for the second goal: to associate Neandertal variants with specific phenotypes.
This goal will be pursued by analyzing a custom array designed to capture Neandertal-derived variants and by extending the CRF to infer Neandertal ancestry from SNP genotyping arrays rather than from next-generation sequencing. A complementary approach, which uses a novel diffusion process-based statistical test to study the action of natural selection on Neandertal variants, will also be explored. Finally, the CRF will be generalized to handle multiple ancestral populations, as well as the case where no reference genomes are available for the ancestral populations. It will be tested and validated on an important example of each case: Denisovan admixture into Melanesian populations, and sub-Saharan African populations that show evidence of unknown archaic ancestry. All of the methods and results from this research will be made publicly available.

Public Health Relevance

Although mixture events between highly diverged population groups (archaic admixture) are now known to have been common throughout human history and are likely to have had a major impact on human phenotypes, this impact has not been systematically understood due to the lack of powerful analytical tools. The proposed research will develop and validate statistical methods to understand the phenotypic impact of archaic admixture. Insights from the application of these methods will not only produce a more complete understanding of genetic variants that modulate complex phenotypes, such as common diseases, but will also ensure that currently under-served minority populations, many of whom descend from admixture events or from ancestral groups distinct from those of Europeans, can be studied just as effectively as populations of European descent and can benefit from the discoveries of genomic medicine.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Transition Award (R00)
Project #: 5R00GM111744-04
Application #: 9210099
Study Section: Special Emphasis Panel (NSS)
Program Officer: Janes, Daniel E
Project Start: 2014-09-15
Project End: 2018-12-31
Budget Start: 2017-01-01
Budget End: 2017-12-31
Support Year: 4
Fiscal Year: 2017
Total Cost: $224,096
Indirect Cost: $78,579

Name: University of California Los Angeles
Department: Biostatistics & Other Math Sci
Type: Schools of Engineering
DUNS #: 092530369
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095

Johnson, Ruth; Shi, Huwenbo; Pasaniuc, Bogdan et al. (2018) A unifying framework for joint trait analysis under a non-infinitesimal model. Bioinformatics 34:i195-i201
Wu, Yue; Sankararaman, Sriram (2018) A scalable estimator of SNP heritability for biobank-scale data. Bioinformatics 34:i187-i194
Hormozdiari, Farhad; Zhu, Anthony; Kichaev, Gleb et al. (2017) Widespread Allelic Heterogeneity in Complex Traits. Am J Hum Genet 100:789-802
Jégou, B; Sankararaman, S; Rolland, A D et al. (2017) Meiotic Genes Are Enriched in Regions of Reduced Archaic Ancestry. Mol Biol Evol 34:1974-1980
Mallick, Swapan; Li, Heng; Lipson, Mark et al. (2016) The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538:201-206
Hormozdiari, Farhad; van de Bunt, Martijn; Segrè, Ayellet V et al. (2016) Colocalization of GWAS and eQTL Signals Detects Target Genes. Am J Hum Genet 99:1245-1260
Sankararaman, Sriram; Mallick, Swapan; Patterson, Nick et al. (2016) The Combined Landscape of Denisovan and Neanderthal Ancestry in Present-Day Humans. Curr Biol 26:1241-1247
Moorjani, Priya; Sankararaman, Sriram; Fu, Qiaomei et al. (2016) A genetic method for dating ancient genomes provides a direct estimate of human generation interval in the last 45,000 years. Proc Natl Acad Sci U S A 113:5652-5657