Landscape of transcription in human and mouse

Gingeras, Thomas

Abstract

The overall goal of this project is to generate fine-structure RNA maps in human and mouse (C57BL/6NJ) tissues and primary cell lines using a variety of high-throughput sequencing platforms, and to evaluate the biological importance of novel transcripts by determining whether evidence of their translated products can be identified. From each sample analyzed, we propose to isolate long (>200 nucleotides) and short (<200 nucleotides) RNA in biological duplicate. Illumina-based maps for these samples will initially be generated using (1) RNA sequencing (RNA-seq) of ribosomal RNA (rRNA)-depleted long total RNA, (2) RNA-seq of tobacco acid pyrophosphatase (TAP) pre-treated short RNA, and (3) paired-end Cap Analysis of Gene Expression (PE-CAGE) of total RNA. Additionally, for a subset of primary cell lines we will generate the above libraries from nuclear and cytoplasmic subcellular fractions. Long RNA-seq data will be distilled into functional elements consisting of splice junctions, polyadenylation sites, and de novo genes and transcripts. The short RNA data will be distilled into contigs representing the 5' ends of short RNAs up to the read length. PE-CAGE data will be analyzed to form clusters representing the 5' ends of transcripts linked to a tag internal to the transcript body. Importantly, each element will be assessed for reproducibility using a nonparametric Irreproducible Detection Rate (npIDR) script. Collectively, these data will allow for the detection of novel transcribed regions and provide supporting information on the location of promoter regions and the subcellular residence of transcripts. In aggregate, these data will be used to generate models of both noncoding and protein-coding transcripts and to distinguish isoforms at complex loci, as needed to obtain a comprehensive view of mammalian transcriptomes.
For a subset of these samples we will simultaneously collect the genome sequence of the human donors to provide a reference against which the RNA data will be mapped and from which information on allele-specific expression and RNA editing will be derived. Unannotated transcript models will be tested using long-range (PacBio/454) sequencing. Lastly, proteogenomic analysis will be performed and the results compared against the unannotated transcripts.

Public Health Relevance

The data proposed herein are foundational to basic, clinical and applied research. In the spirit of transparency and with a policy of rapid release, the scientific and health care communities can make immediate use of these findings and will benefit from improved human and mouse genome annotations and broadly sampled expression data.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 3U54HG007004-03S1
Application #: 8922259
Study Section: Special Emphasis Panel (ZHG1-HGR-M (M1))
Program Officer: Feingold, Elise A

Project Start: 2012-09-21
Project End: 2016-07-31
Budget Start: 2014-08-01
Budget End: 2015-07-31
Support Year: 3
Fiscal Year: 2014
Total Cost: $81,540
Indirect Cost: $6,040

Institution

Name: Cold Spring Harbor Laboratory
DUNS #: 065968786

City: Cold Spring Harbor
State: NY
Country: United States
Zip Code: 11724

Related projects


NIH 2017 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$837,865
NIH 2016 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$897,204
NIH 2015 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$2,469,258
NIH 2014 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$2,079,879
NIH 2014 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$113,245
NIH 2014 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$81,540
NIH 2013 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$2,026,822
NIH 2012 U54 HG	Landscape of transcription in human and mouse	Gingeras, Thomas Raymond / Cold Spring Harbor Laboratory	$2,124,096

Publications

Ballouz, Sara; Dobin, Alexander; Gingeras, Thomas R et al. (2018) The fractured landscape of RNA-seq alignment: the default in our STARs. Nucleic Acids Res 46:5125-5138

Rodríguez-Martín, Bernardo; Palumbo, Emilio; Marco-Sola, Santiago et al. (2017) ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data. BMC Genomics 18:7

Batut, Philippe J; Gingeras, Thomas R (2017) Conserved noncoding transcription and core promoter regulatory code in early Drosophila development. Elife 6:

Breschi, Alessandra; Gingeras, Thomas R; Guigó, Roderic (2017) Comparative transcriptomics in human and mouse. Nat Rev Genet 18:425-440

Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia et al. (2017) High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing. Nat Genet 49:1731-1740

Hon, Chung-Chau; Ramilowski, Jordan A; Harshbarger, Jayson et al. (2017) An atlas of human long non-coding RNAs with accurate 5' ends. Nature 543:199-204

Breschi, Alessandra; Djebali, Sarah; Gillis, Jesse et al. (2016) Gene-specific patterns of expression variation across organs and species. Genome Biol 17:151

Dobin, Alexander; Gingeras, Thomas R (2016) Optimizing RNA-Seq Mapping with STAR. Methods Mol Biol 1415:245-62

Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Santoyo-Lopez, Javier et al. (2016) Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq). Nat Commun 7:12339

Pervouchine, Dmitri D; Djebali, Sarah; Breschi, Alessandra et al. (2015) Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression. Nat Commun 6:5903

Showing the most recent 10 out of 23 publications
