RNA-protein binding is critical to gene regulation, controlling fundamental processes including RNA splicing, translation, localization and stability. RNA-protein interactions play a role in a wide variety of diseases including muscular dystrophy, fragile X syndrome, mental retardation, Prader-Willi syndrome, retinitis pigmentosa, spinal muscular atrophy, and cancer. Advances in understanding the underlying mechanisms of RNA-protein interaction would have great value for improving human health. Because most studies of gene regulation have focused on DNA-protein interactions, the mechanisms by which RNA-binding proteins (RBPs) act remain enigmatic. However, new high-throughput measurements will soon yield vast amounts of RNA-protein interaction data. These data could dramatically clarify RNA regulation, but they also demand novel analytical approaches. To meet this challenge, we propose novel computational methods to determine the sequences and structures critical to RNA-protein binding, based on hundreds of CLIP-seq and SELEX datasets now being generated for the human ENCODE project. Current methods for modeling RNA-protein interactions have low predictive power, which we hypothesize is due mainly to two issues: (a) they ignore combinatorial binding of multiple elements within each RNA to the protein, and (b) they account for structure only superficially. We will solve these problems by developing innovative methods that use complementary CLIP-seq and SELEX data to: determine the different classes of RNA elements binding each RBP; learn the combinatorial logic among classes; and learn the sequences and structures that define each class. We will additionally validate our methods for at least two RBPs using RNA-protein gel shift experiments. We expect that this exploratory study will yield powerful, experimentally validated software tools to determine combinatorial and structural aspects of RNA-protein binding from high-throughput sequencing data.

Public Health Relevance

Gene regulation at the RNA level is central to many human diseases. However, our understanding of RNA regulation, especially the mechanisms by which proteins bind to RNAs, remains rudimentary, even though large datasets on these mechanisms are now being generated. Our combination of computational and experimental approaches will help decipher mechanisms from these datasets and provide new insights into how human disease can be ameliorated by targeting RNA-level processes.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21HG007554-02
Application #: 8793793
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Gilchrist, Daniel A
Project Start: 2014-01-25
Project End: 2015-12-31
Budget Start: 2015-01-01
Budget End: 2015-12-31
Support Year: 2
Fiscal Year: 2015
Total Cost: $276,413
Indirect Cost: $130,163

Name: Jackson Laboratory
DUNS #: 042140483
City: Bar Harbor
State: ME
Country: United States
Zip Code: 04609

Gu, Tongjun; Gatti, Daniel M; Srivastava, Anuj et al. (2016) Genetic Architectures of Quantitative Variation in RNA Editing Pathways. Genetics 202:787-98
Menghi, Francesca; Inaki, Koichiro; Woo, XingYi et al. (2016) The tandem duplicator phenotype as a distinct genomic configuration in cancer. Proc Natl Acad Sci U S A 113:E2373-82
Ishimura, Ryuta; Nagy, Gabor; Dotu, Ivan et al. (2016) Activation of GCN2 kinase by ribosome stalling links translation elongation with translation initiation. Elife 5: