PI: Doreen Ware (Cold Spring Harbor Laboratory)
Co-PI: Pankaj Jaiswal (Oregon State University)
Senior Personnel: Paul Kersey and Helen Parkinson (European Molecular Biology Laboratory-European Bioinformatics Institute), Lincoln Stein (Ontario Institute for Cancer Research) and Crispin Taylor (American Society of Plant Biologists)
Key Collaborators: Peter D'Eustachio (New York University School of Medicine), Joshua Stein (Cold Spring Harbor Laboratory) and Palitha Dharmawardhana (Oregon State University)
Research in plant biology is reaching a tipping point at which the ability to examine entire systems, increasing integration of information, and ever more sophisticated analyses of experimental data are poised to enable sweeping advances in knowledge and insight. Meanwhile, progress in understanding plant genomes and novel information technologies are transforming the pace, scale, and strategies used to conduct plant science research, allowing, for example, much richer appreciation for the connections among genes, environments, and plant form and function. These advances in plant science promise to revolutionize crop breeding and help to keep pace with future population growth, environmental pressures, and energy needs. In the past decade, the Gramene (www.gramene.org) database has served the research community as a portal to information about gene structure and function for multiple plant species. Tools offered through the Gramene portal allow researchers to develop novel hypotheses on the basis of the most recent information and analyses, using a standardized framework. Yet, there are many remaining opportunities to more fully leverage functional information available from diverse plant species. In this project, Gramene will substantially expand the number of plant genomes incorporated into the portal, add new capabilities for studying gene expression, pathways, and networks, and fundamentally improve the database architecture and user interfaces to facilitate sophisticated systems-level analyses. At least twenty reference genomes, annotated by the scientists who are studying them and including crop plant, model organism, and other species, will be incorporated into the expanded Gramene portal. Rice, maize, and Arabidopsis will be emphasized through the addition of publicly available data from expression, epigenomic, and genome-wide association studies (GWAS).
A single internationally coordinated resource for these data types will be maintained through collaboration with the Ensembl Plants project (plants.ensembl.org). To better capture information about the function of plant genes, Gramene will adopt two powerful tools: the Reactome (www.reactome.org) platform, which can be used to represent curated pathways and to perform pathway-based analyses; and the ATLAS (www.ebi.ac.uk/gxa/) platform, which can be used to display and analyze expression profiles. These interfaces, along with Ensembl, BioMart, and a GWAS viewer, will form an integrated analysis system that is built on structured metadata and implemented through a high-capacity data warehouse and an advanced search engine.
To engage and nurture early-career scientists, the project will provide research training opportunities for postdoctoral associates and students at all levels. Training on the use of Gramene tools will be delivered through presentations and workshops held at internationally attended conferences; broader audiences will be reached through a series of webinars. Another major objective is to achieve broad interoperability among community resources. Standards development will be accomplished via outreach to collaborative data partners and scientists identified through several NSF-supported Research Coordination Networks (RCNs). Establishing common protocols, vocabularies, and semantics will have the powerful effect of bringing scientists together in research and educational fields, both domestically and abroad. A further objective is to engage biologists who are best qualified in their field to contribute to genome annotations on the basis of experimental evidence. Gramene will hold annual jamborees to train scientists in the use of standard curation tools that produce consistent definitions of data types and metadata. Finally, the ubiquitous publication of peer-reviewed journals online creates new opportunities to better link genome databases with the scientific literature. Gramene will collaborate with the American Society of Plant Biologists (ASPB), publisher of two high-impact journals, to prototype ways to couple collection of structured experimental data to the manuscript acceptance and publication process. As a whole, this project will contribute to long-term progress in merging information sciences with natural sciences that will ultimately lead to positive impacts on technology and society.