This action funds an NSF Postdoctoral Research Fellowship in Biology for FY 2011, Broadening Participation. The fellowship supports a research and training plan in a host laboratory for the Fellow, who also presents a plan to broaden participation in biology. The title of the research and training plan for this fellowship to Michael DeGiorgio is "Using mathematical models to study the spatial distribution of human genetic variation." The host institution for this research is the University of California, Berkeley, and the sponsoring scientist is Dr. Rasmus Nielsen.
The research promises a better understanding of the processes that led to the current distribution of genetic variation among human populations. The research plan includes testing different hypotheses of modern human origins by developing and analyzing models of human genetic history and applying statistical techniques to next-generation sequence data from human populations. Further, the Fellow is deriving mathematical formulas to describe the distribution of genetic variation under different models of human evolutionary history. Findings from this study add to knowledge of the demographic processes that shaped the spatial distribution of human genetic variation, enabling investigators to distinguish signals of adaptation from those of demography.
The training objectives include learning to make population genetic inferences from low-coverage next-generation sequencing data and gaining new perspectives for interpreting and understanding evolution. The new computational tools being developed are being made available to the public, and results are being published with open access.
Broader impacts include providing research experience, through the Berkeley Biology Scholars Program, to undergraduate students from underrepresented groups early in their scientific training, exposing them to population genomic research, computation, and the ethical implications of the research.
The research goal of this fellowship was to study the evolutionary processes that shape human genetic diversity both across the genome and across the globe. To gain insight into these processes, we took an integrative approach involving mathematical theory and data analysis.
First, we performed a theoretical study to examine how genetic diversity across a geographic landscape is influenced by a past expansion of a population across the range of the landscape. In particular, we sought to address a debate in the field concerning the direction of greatest variation across a geographic landscape after a range expansion. Some believe that the direction is perpendicular to the expansion, while others believe that it is parallel. We used computer simulations to show that both directions can be correct, and that the observed direction may depend on how populations were sampled across the geographic landscape. These findings have important implications for using the direction of greatest variation to identify past range expansions, showing that caution should be taken when using this technique to make inferences about the demographic history of populations.
Next, we investigated the genetic basis of living at high altitudes in Ethiopians. High-altitude environments typically have low oxygen levels, and individuals living in these environments may experience oxygen deprivation, a condition called hypoxia. We performed a genome-wide scan to uncover genes that may have evolved to enable Ethiopians to live in low-oxygen environments. The top candidate gene identified by our scan was BHLHE41, which is known to play a major role in the hypoxia pathway. However, this gene was identified only once we placed Ethiopians within a broader context by considering their potential genetic mixture with non-Africans. Thus, this study is a textbook example of how demographic relationships can be used to facilitate the identification of genes involved in adaptation. Moreover, the candidate gene identified by our study can now be followed up by molecular analyses to determine the mechanisms by which this gene has evolved to enable Ethiopians to live in such an extreme environment.
Finally, we developed a new statistical method to search genomic data for signatures of balancing selection, an evolutionary phenomenon that maintains genetic diversity in a population or species. Our new technique is the first of its type and accounts for spatial variation around a genomic site experiencing balancing selection. Comparison of our method to several others showed that it is the most powerful developed thus far. Additionally, application of our method to human genomic data revealed a novel candidate gene called FANK1, which is involved in a number of important biological processes. Recently, we implemented our method in open-source software, which will be made freely available to the public.
NSF funding for these and other related studies has resulted in the publication of nine peer-reviewed scientific articles in top journals in the field. Additionally, it has enabled me to achieve a second major goal of this fellowship, which was to perform outreach activities to increase the participation of individuals from underrepresented groups in the sciences. First, in collaboration with other fellows at the University of California, Berkeley, I taught an Advanced Biology class at Berkeley High School about DNA forensics twice for two years. Second, through the Berkeley Biology Scholars Program (a program for underrepresented students in the sciences), I led a computational lab about HIV evolution for freshman students twice for two years. Finally, over the past two years, I have served as an advisory board member and faculty instructor for the Summer Internship for Native Americans in Genomics workshop, which seeks to increase the participation of Native Americans in genomics. This is a week-long workshop that recruits experts from across the nation to teach about genomics and its ethical, legal, and social issues, particularly with respect to indigenous communities. Participation in the workshop is open to Native Americans of all educational backgrounds, ranging from those with no college education to Ph.D. graduates, enabling current topics in genomics to reach a broad Native American audience.