All humans are admixtures of various historical source populations. This admixture has occurred across a range of time-scales, from recent admixture such as the intercontinental admixture in African Americans or Hispanics, to ancient admixture such as the admixture with Neanderthals that occurred when modern humans migrated out of Africa around 50,000 years ago. Local ancestry is the population-of-origin of an individual's chromosomes at each point in the genome. Local ancestry is essential for many applications, including admixture mapping and inferring demographic history. Local ancestry is not directly observed, but must be inferred from an individual's genotype data. Existing methods for inferring local ancestry are inadequate for untangling complex admixtures in human populations. They struggle when reference data for the source populations are limited or poorly matched, and they are unable to handle genetically similar ancestral populations, divergent admixture times, or more than a few ancestral populations. We propose to develop new methods and computational tools to address these gaps. Our methods will utilize a state-of-the-art haplotype frequency model and new computational methods to greatly improve the accuracy and computational efficiency of local ancestry inference. Our methods will overcome current limitations by flexible modelling of admixture times, by enabling local ancestry inference when reference panels are from closely related populations, and by creating new reference panels from admixed data when no existing reference population is well matched to the ancestral population. Our methods will be implemented in user-friendly, computationally efficient, open-source software that scales to analysis of very large samples of sequenced individuals. We will call fine-scale local ancestry in sequence data from diverse African populations.
We will use local ancestry calls to detect past migration within Africa, to detect post-admixture selection, and to perform admixture mapping for a broad spectrum of traits including heart, lung, and blood traits, and infectious disease status.
Human populations are all inter-related, and each person has ancestry from a variety of ancestral populations. This project will develop new statistical methods to infer an individual's ancestry at each point in the genome. The inferred ancestry will be useful for learning about the histories of human populations, and for finding the regions of the genome that are involved in causing disease.