Gene expression is the primary mechanism by which information encoded in the genome is converted into developmental, morphological, and physiological phenotypes. Gene expression is also an important source of evolutionary change within and between species, and aberrant gene expression has been implicated in the pathogenesis of numerous diseases. Thus, understanding the amount, structure, and patterns of gene expression variation is of fundamental importance to biomedical research and evolutionary biology. Recent studies in model organisms and humans have unambiguously shown that regulatory variation is both common and pervasive. However, many fundamental questions remain about how gene expression variation is distributed within and between human populations. To this end, the goals of this proposal are to develop a novel and quantitatively rigorous statistical framework for characterizing gene expression variation in structured populations and to apply these methods to gene expression data from geographically diverse human populations. More specifically, in Specific Aim 1, we will develop new statistical models and methods of analysis for characterizing gene expression variation in structured populations, which will facilitate a deeper understanding of expression variation.
In Specific Aim 2, we will apply these new analysis tools to publicly available gene expression data collected from the HapMap individuals. Furthermore, we will perform allele-specific quantitative PCR on 30 differentially expressed genes to assess the contribution of cis-regulatory variation to gene expression variation. Finally, in Specific Aim 3, we will perform detailed evolutionary analyses of 10 genes that show evidence of cis-regulatory variation and are differentially expressed between populations, by resequencing their promoter and regulatory regions in 90 humans and 7 non-human primates. Relevance: One of the most difficult challenges confronting human genetics is to find genes that contribute to common complex diseases such as diabetes, cancer, and hypertension. Research that increases our understanding of gene expression variation will facilitate disease gene mapping studies.