This action funds an NSF Postdoctoral Research Fellowship for FY 2010. The fellowship supports a research and training plan entitled "Understanding the genetic basis of evolution: a statistical and empirical exploration of evolutionary genomics" for Brittny Calsbeek. The host institution for this research is North Carolina State University, and the sponsoring scientist is Trudy Mackay.
Evolutionary biology is entering an exciting new phase in which genomics techniques are revealing the molecular processes driving the evolution of quantitative traits. A major challenge is developing computational methods to analyze the increasingly complex datasets these techniques produce. This project provides a statistical framework for using tensor decomposition to analyze evolutionary genomics datasets. These methods are being coded in the MATLAB programming language, and the resulting software is being shared on the host laboratory's website. In addition, the statistical methods developed for analyzing genomics datasets are being applied to the lab's existing, large-scale evolutionary genomics dataset on fruit flies (Drosophila melanogaster). The statistical methods developed as part of this research provide a biologically relevant framework for exploring the genetic basis of quantitative trait evolution.
This fellowship will prepare the fellow for a future career as an independent researcher by providing 1) experience with cutting-edge statistical techniques for collecting and analyzing genomics data, 2) further training in computer programming and software development, and 3) training in the most current laboratory techniques for collecting evolutionary genomics data using D. melanogaster as a study system. The analysis of the Drosophila dataset will broadly impact the scientific community by addressing gaps in our understanding of the genetic processes underlying evolution. Closing these gaps is critical if we hope to accurately predict the response of natural populations to selective pressures such as those associated with global climate change.
Our ability to respond to the current global challenges in evolutionary biology, such as predicting a population’s ability to respond to climate change, depends on understanding the molecular genetic details of trait evolution. Until recently, these details could only be approximated using classic genetic theory. New, high-throughput genomic techniques are providing an unprecedented look at the molecular variation responsible for the evolution of populations over time. As a postdoctoral fellow in the lab of Dr. Trudy Mackay at North Carolina State University, I developed computational and statistical models to analyze the large datasets produced by these new high-throughput genomic techniques. The Mackay lab developed a living library of sequenced Drosophila (fruit fly) lines that can be studied by the scientific community to identify the genes and genetic networks underlying complex traits. This dataset is exceptional in that it provides one of the first opportunities to study variation in, and interactions between, gene sequences and gene expression across multiple traits. During my fellowship I developed computational methods for analyzing genome-wide variation in gene expression using tiling arrays. This work included developing segmentation methods for tiling array data that allow for traditional regression analyses of traits on gene expression units. These methods allowed us to detect new regions of the genome that contribute to trait variation. I also developed methods for detecting single feature polymorphisms (variable areas of the genome) in genome-wide expression datasets and for statistically correcting the expression signal errors that these variations cause. These methods have allowed us to explore variation in gene expression that leads to variation in more than twelve different traits, including aggressive behavior, sleep, and longevity.