The genome of an organism encodes not only genes and their RNA and protein products, but also the integrated programs that define when, where, and to what extent different genes are activated or silenced. At the DNA level, gene regulatory signals are encoded by regulatory elements that comprise clustered recognition sites for DNA-binding proteins. However, the locations and functions of the vast majority of Arabidopsis regulatory sequences are currently obscure. In this project, two novel high-throughput epigenomic technologies, Digital DNaseI Mapping and Digital Genomic Footprinting, will be applied to map and characterize regulatory DNA across the A. thaliana genome at nucleotide resolution. These technologies can map the locations of regulatory DNA sequences and delineate the specific sites of regulatory factor binding within those regions. Because gene regulatory programs vary widely both between cell types and within a cell type during differentiation, the project will encompass multiple developmental stages and tissues of a reference strain. As sessile organisms, plants integrate many cues into appropriate developmental and stress responses, most of which rely on major reprogramming of gene regulation. Regulatory DNA involved in such responses will therefore be mapped by studying standard stress conditions. At the population level, most phenotypic variation is likely to derive from non-coding genetic variation. By systematically extending maps of regulatory DNA across both diverse A. thaliana accessions and related species, the project will expose relationships between genotypic variation and gene regulatory programs on a genome-wide scale. The resulting data will provide unprecedented insight into endogenous and environmentally responsive plant regulatory programs, and will significantly accelerate the identification of functional non-coding variation underlying relevant phenotypic variation.

Broader impacts. This project has the potential to fundamentally change the landscape of gene regulation research in A. thaliana and in plants generally, both in the study of basic mechanisms and in the dissection of diverse quantitatively varying phenotypes. The availability of comprehensive, high-resolution regulatory DNA maps for A. thaliana stages, tissues, treatments, accessions, and evolutionarily related species will immediately bring A. thaliana to the forefront of regulatory genomics, and will help attract dynamic new investigators to the field. The comprehensive annotation of A. thaliana regulatory regions and transcription factor binding sites targeted under this project will be of use to the entire plant biology community, and will yield significant data resources that strengthen experimental approaches to determining gene function. The project will foster the advancement of plant regulatory genomics through rapid dissemination of data to the public domain via genomic databases, together with analytical tools that help diverse investigators use the data. The project will also include a significant educational component aimed at training next-generation leaders in plant regulatory genomics, and at recruiting and training talented undergraduate and graduate scientists from diverse backgrounds.

Project Report

Plants are critical for ecosystem stability and human survival, and face the challenge of adapting to a warmer climate. Despite the central role of plants in human health and the substantial genetic resources available for them, studies of plant gene regulation have surprisingly lagged behind those of animals, particularly humans. To address this impasse, we developed and applied a plant-optimized version of a technology called genomic DNaseI footprinting to study the major model plant, Arabidopsis thaliana. Genomic DNaseI footprinting maps the ‘dark matter’ of the genome: those segments that do not directly encode functional molecules (e.g., proteins), but instead orchestrate when and where functional molecules are produced. Funding for this project led to a detailed map of the general features of the A. thaliana regulatory DNA landscape, enabled predictions of how the proteins that bind to this DNA (called transcription factors, or TFs) interact in a network, and provided the first glimpse of how regulatory DNA dynamically responds to two of the major environmental stressors a plant faces: light and heat. We anticipate that our results will accelerate analysis of traits of interest to plant breeders and facilitate the transfer of knowledge from model plants to crops. For example, our data could be used to pinpoint trait-associated variation in regulatory DNA among A. thaliana accessions using classical quantitative trait locus (QTL) studies, or to guide the selection of specific DNA elements to be targeted through genome engineering in order to disrupt TF networks more precisely. In addition, our results constitute a reference against which other closely related plant species may be compared, allowing new insights into how the immense diversity in form and function among plant species arose during evolution.
Collectively, our results will influence future studies of gene regulation in many organisms, and will be of particularly high relevance to the many researchers trying to develop robust plant breeds to meet current and future food security needs. We have already shared our data and genetic resources with the scientific community (details can be found at our project website: www.plantregulome.org). In addition, a detailed report of our findings is currently under review for publication. A number of undergraduate, graduate, and post-graduate trainees have been mentored during the course of this project. The results and overall goals of the work have been incorporated into classes taught by the PI and co-PIs, and have served as a launching point for numerous outreach activities for K-12 students, their teachers, and the general public.

Agency: National Science Foundation (NSF)
Institute: Division of Molecular and Cellular Biosciences (MCB)
Application #: 0929046
Program Officer: Karen C. Cone
Project Start:
Project End:
Budget Start: 2009-12-01
Budget End: 2013-11-30
Support Year:
Fiscal Year: 2009
Total Cost: $2,011,482
Indirect Cost:
Name: University of Washington
Department:
Type:
DUNS #:
City: Seattle
State: WA
Country: United States
Zip Code: 98195