Co-PIs: Eric Lyons and Mark Beilstein (University of Arizona)

Long non-coding RNAs (lncRNAs) are an emerging class of molecules gaining attention for their roles in various biological processes. lncRNAs are defined by the fact that they do not code for proteins and are therefore not messenger RNAs (mRNAs). In addition, they do not fit into other well-defined categories of small silencing RNAs, such as small interfering RNAs (siRNAs) and microRNAs (miRNAs). Despite the importance of lncRNAs in development, epigenetic modification, and stress responses, there is still much to be learned about their structure, protein interactions, and functions, especially in model and crop plant species. This project will address this significant gap using a combination of genomic, evolutionary, and bioinformatics approaches. It is anticipated that the data, web-accessible genome analytical tools, and data management systems developed by the project will provide novel insights into plant gene expression regulation by lncRNAs, and provide important new findings and resources for studies focused on the improvement of numerous crop and genetic model plants.

With regard to outreach and training, this project will provide interdisciplinary research training in RNA biology, computational science, and evolutionary biology for students and postdoctoral associates. In addition, the project will develop an interdisciplinary course entitled "Applied Concepts in RNA Biology" that will leverage large-scale computing and datasets to understand various aspects of the role of RNA in biological systems. This project-based course will teach the fundamentals of RNA biology, next-generation sequencing techniques, distributed and high-performance computing, data-intensive science, and collaborative research techniques that will be used in student-driven research projects. The course will be taught simultaneously at the University of Pennsylvania and the University of Arizona over two-way audio/video conferencing, with lecture topics alternating between the two sites.
All project outcomes will be made readily accessible to the broader research community through a project website (https://genomevolution.org/wiki/index.php/EPIC-CoGe_Tutorial), the iPlant Collaborative, and long-term repositories such as GenBank and the Sequence Read Archive (SRA).

This project is uniquely positioned to provide insights into the structure and function of lncRNAs, and their interaction with specific epigenomic regulatory modifications in the genome. The specific goals of the project are to define a subset of lncRNAs that are important for proper gene regulation in both normal development and stress response. Specifically, the project will identify and functionally characterize those lncRNAs that are (1) nuclear, (2) highly structured, (3) stress responsive, (4) protein bound, and (5) evolutionarily conserved in genetic models (Eutrema salsugineum and Arabidopsis thaliana) and in crop species (Camelina sativa, Brassica rapa, Zea mays, and Sorghum bicolor), with an emphasis on their roles in stress adaptation. Finally, the project will expand EPIC-CoGe, a central repository for plant epigenomics data across all species, adding advanced data integration, visualization, and analysis tools so that functional genomics data can be combined to provide new insight into genome-wide epigenomic interactions.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 1444490
Program Officer: Peter McCartney
Project Start:
Project End:
Budget Start: 2015-08-01
Budget End: 2020-07-31
Support Year:
Fiscal Year: 2014
Total Cost: $2,593,502
Indirect Cost:

Name: University of Pennsylvania
Department:
Type:
DUNS #:
City: Philadelphia
State: PA
Country: United States
Zip Code: 19104