Informatics platform for mammalian gene regulation at isoform-level

Davuluri, Ramana

Abstract

In recent years, the notion that "one gene makes one protein that functions in one signaling pathway" in mammalian cells has been shown to be overly simplistic. Recent evidence suggests that more than 50% of human genes produce multiple protein isoforms through alternative splicing and alternative usage of transcription initiation and/or termination. Notably, the disruption of many of these genes is implicated in cancer and several neuropsychiatric disorders. For the majority of human genes, the resulting protein isoforms are functionally different and can participate in different signaling pathways. However, nearly a decade after the completion of the human genome draft sequence, we still treat the "gene" as the basic functional unit in a cell. We argue that the isoform-level gene products, "transcript variants" and "protein isoforms", are the basic functional units in a mammalian cell and that, accordingly, the informatics resources for managing and analyzing gene regulation data in mammalian cells should adopt "gene isoform centric" rather than "gene centric" approaches. We propose to build an informatics platform for understanding gene regulation at the isoform level by developing statistically rigorous bioinformatics resources for processing Next-Generation Sequencing (NGS) data. Recently, computational approaches that combine seemingly disparate experimental data have been successful in developing concise gene regulation models and transcriptional modules. We plan to extend these methodologies to integrate multiple high-throughput data sets currently generated across different laboratories, including ours at Wistar, into computational models that predict different transcriptional isoforms of mammalian genes and protein-DNA interactions at the isoform level.
We will apply innovative statistical modeling approaches that combine state-of-the-art meta-classification algorithms, such as Naïve Bayes Tree, Bagging and LogitBoost, with Random Forest feature selection to classify different types of target promoters with good classification accuracy and reduced instability, in order to predict gene promoters and infer protein-DNA interactions from ChIP-seq data. The computational models and the derived information will be integrated into a novel database, which will serve as an in silico platform for transcriptional regulation studies. This will be completed by pursuing the following aims: (1) develop statistically rigorous novel algorithms and bioinformatics pipelines to identify the orthologous promoters, corresponding transcript variants and protein isoforms that are conserved between human and mouse; (2) develop novel algorithms and informatics pipelines for integrative analysis of NGS datasets to estimate the activity and expression of both known and novel promoters and their transcript variants in various tissues, developmental stages, and disease conditions; and (3) develop a web-accessible database for integrating the information generated. The novel bioinformatics methods developed by this project will support in silico discovery and research, accelerating the linkage of phenotypic and genomic information at the gene-isoform level.
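
As a rough illustration of the bagging idea named in the abstract, the following Python sketch trains decision stumps on bootstrap resamples of a toy data set and combines them by majority vote. It is not the project's method: the two features, their values, and every function name here are hypothetical, and counting which feature each stump splits on is only a crude stand-in for Random Forest feature selection.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split classifier: returns (feature, threshold, left_label, right_label)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                # Keep the first split that reaches the lowest training error.
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]


def predict_stump(stump, row):
    f, t, left, right = stump
    return left if row[f] <= t else right


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


def feature_usage(models):
    """Count how often each feature index was chosen as the split feature."""
    return Counter(m[0] for m in models)


# Toy data (hypothetical): feature 0 determines the label, feature 1 is noise.
X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [1 if row[0] > 0.5 else 0 for row in X]

models = fit_bagging(X, y)
print(predict_bagging(models, [0.9, 0.1]))
print(predict_bagging(models, [0.1, 0.9]))
print(feature_usage(models))
```

Averaging many models trained on resampled data is what gives bagging its reduced variance, which matches the abstract's stated goal of "reduced instability"; a real analysis would use richer base learners and actual promoter features derived from sequencing data.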

Public Health Relevance

The disruption of numerous human genes and their isoforms driven by alternative splicing and alternative transcription is implicated in cancer and several neuropsychiatric disorders, including Parkinson's disease, schizophrenia, bipolar disorder and autism. The development of bioinformatics methods and user-friendly software in this study will provide useful tools to better understand gene regulatory mechanisms in mammalian cells and, more importantly, how dysregulation of these mechanisms leads to a variety of diseases.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM011297-03
Application #: 8658144
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2013-05-02
Project End: 2016-04-30
Budget Start: 2014-05-01
Budget End: 2015-04-30
Support Year: 3
Fiscal Year: 2014
Total Cost: $337,196
Indirect Cost: $118,946

Institution

Name: Northwestern University at Chicago
Department: Public Health & Prev Medicine
Type: Schools of Medicine
DUNS #: 005436803

City: Chicago
State: IL
Country: United States
Zip Code: 60611

Related projects


NIH 2020 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-level	Davuluri, Ramana V. / Northwestern University at Chicago
NIH 2020 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-level	Davuluri, Ramana V. / State University New York Stony Brook
NIH 2019 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-level	Davuluri, Ramana V. / Northwestern University at Chicago
NIH 2018 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-level	Davuluri, Ramana V. / Northwestern University at Chicago
NIH 2017 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-level	Davuluri, Ramana V. / Northwestern University at Chicago
NIH 2015 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-level	Davuluri, Ramana V. / Northwestern University at Chicago	$337,196
NIH 2014 R01 LM	Informatics platform for mammalian gene regulation at isoform-level	Davuluri, Ramana V. / Northwestern University at Chicago	$337,196
NIH 2013 R01 LM	Informatics platform for mammalian gene regulation at isoform-level	Davuluri, Ramana V. / Wistar Institute	$387,000
NIH 2013 R01 LM	Informatics Platform for Mammalian Gene Regulation at Isoform-Level	Davuluri, Ramana V. / Northwestern University Chicago	$152,050

Publications

Calvert, Andrea E; Chalastanis, Alexandra; Wu, Yongfei et al. (2017) Cancer-Associated IDH1 Promotes Growth and Resistance to Targeted Therapies in the Absence of Mutation. Cell Rep 19:1858-1873

Liu, Xianpeng; Zhao, Bo; Sun, Limin et al. (2017) Orthogonal ubiquitin transfer identifies ubiquitination substrates under differential control by the two ubiquitin activating enzymes. Nat Commun 8:14286

Shilpi, Arunima; Bi, Yingtao; Jung, Segun et al. (2017) Identification of Genetic and Epigenetic Variants Associated with Breast Cancer Prognosis by Integrative Bioinformatics Analysis. Cancer Inform 16:1-13

Dapas, Matthew; Kandpal, Manoj; Bi, Yingtao et al. (2017) Comparative evaluation of isoform-level gene expression estimation algorithms for RNA-seq and exon-array platforms. Brief Bioinform 18:260-269

Vannini, Ivan; Wise, Petra M; Challagundla, Kishore B et al. (2017) Transcribed ultraconserved region 339 promotes carcinogenesis by modulating tumor suppressor microRNAs. Nat Commun 8:1801

Malchenko, Sergey; Sredni, Simone Treiger; Bi, Yingtao et al. (2017) Stabilization of HIF-1α and HIF-2α, up-regulation of MYCC and accumulation of stabilized p53 constitute hallmarks of CNS-PNET animal model. PLoS One 12:e0173106

Van Roosbroeck, Katrien; Fanini, Francesca; Setoyama, Tetsuro et al. (2017) Combining Anti-Mir-155 with Chemotherapy for the Treatment of Lung Cancers. Clin Cancer Res 23:2891-2904

Macyszyn, Luke; Akbari, Hamed; Pisapia, Jared M et al. (2016) Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques. Neuro Oncol 18:417-25

Bell, Jonathan B; Eckerdt, Frank D; Alley, Kristen et al. (2016) MNK Inhibition Disrupts Mesenchymal Glioma Stem Cells and Prolongs Survival in a Mouse Model of Glioblastoma. Mol Cancer Res 14:984-993

Jin, Hong-Jian; Jung, Segun; DebRoy, Auditi R et al. (2016) Identification and validation of regulatory SNPs that modulate transcription factor chromatin binding and gene expression in prostate cancer. Oncotarget 7:54616-54626

Showing the most recent 10 out of 20 publications
