More than 98% of the human genome is non-coding DNA, some of which encodes cis-regulatory sequences that control gene expression. These cis-regulatory sequences function by binding transcription factors (TFs). Mutations in these regulatory elements, and in the TFs that bind them, give rise to adaptive phenotypic variation as well as disease. Despite their importance, many questions remain about how these regulatory sequences function and evolve. To date, studies of cis-regulatory evolution have focused primarily on how orthologous non-coding sequences either diverge in function or retain their function despite DNA sequence divergence. Much less has been learned about how new cis-regulatory sequences evolve. Here, I propose to investigate two enhancers that independently evolved similar activity in order to identify the similarities and differences between the sets of TFs regulating these elements. This work will provide insight into not only the flexibility of cis-regulatory sequences, but also the flexibility of the trans-regulatory networks with which they interact. To explore how enhancer activity arises from non-coding sequences, I will compare the sets of TFs regulating a pair of enhancers with convergent activities. The same set of TFs may recognize distinct sequences, creating convergent phenotypes. Alternatively, different sets of TFs may recognize the distinct sequences in a manner that produces convergent phenotypes. A dual reporter gene system will be used to identify the sets of TFs regulating two enhancers driving convergent expression. Specifically, I will examine the well-characterized enhancers of the gene yellow in Drosophila melanogaster and D. willistoni to study how independently evolved enhancers drive similar expression patterns. Finally, I will determine whether these TFs directly bind either of these regulatory regions, testing the molecular mechanisms underlying these expression patterns.
Through these efforts, I will gain insights into the diversity of pathways that produce identical phenotypes, the origin of new cis-regulatory sequences, and the syntax of cis-regulatory DNA. These insights will inform our understanding of phenotypes, including diseases such as cancer, that arise from mutations affecting cis-regulatory sequences.
Genetic changes affecting non-coding sequences lead to a range of phenotypes, from disease to adaptive traits. The goal of this project is to determine how non-coding sequences with no sequence similarity can drive identical patterns of gene expression. Using genetic and biochemical approaches, the sets of proteins that drive gene expression from dissimilar sequences will be determined and compared. Insights from this work will reveal the flexibility or rigidity of gene regulatory networks, which will increase understanding of the effects of non-coding mutations in human diseases, including cancers.