The purpose of this project is to determine the role that repetitive elements (REs) play in the biological outcome of environmental exposures. While it is known that the expression of REs changes in response to environmental agents, the mechanisms by which REs affect the biology of cells and organisms have not been explored in depth. We are specifically interested in the extent to which REs alter the expression of adjacent genes through the formation of fusion transcripts (FTs). We chose RNA-seq to study this problem and have developed a robust analytical pipeline to detect FTs. We analyzed a data set associated with cocaine exposure, selected because our collaborator, Dr. Eric Nestler, previously demonstrated that the expression of REs is altered in the brains of mice treated with cocaine. We also reasoned that identifying cocaine-responsive FTs could provide a link between FTs and environmental exposures. Over the past couple of years we finalized and published our analytical strategy to identify FTs genome-wide using RNA-seq data. Over this past year we have focused on establishing whether FTs have biological functions that may affect organismal physiology. To this end we have concentrated on two genes that express FTs: Pgc1 and Arhgef10. Pgc1, or peroxisome proliferator-activated receptor gamma coactivator 1, is a master metabolic regulator that was identified back in 2004 as a co-transcriptional activator of the mitochondrial biogenesis program. Arhgef10 is a Rho guanine nucleotide exchange factor that ultimately regulates the actin cytoskeleton in a way that can influence cellular morphology, migration, and cytokinesis. We cloned the canonical and FT isoforms of both genes (with and without a C-terminal tag) with the goal of producing infectious viral particles to express each isoform in cultured cells and in the brains of mice.
For Arhgef10, our initial analysis (RNA-seq and quantitative PCR) showed that cocaine exposure increases the levels of two FT isoforms of the gene. The first FT arises from the activation of an LTR in the 5′ flanking region that splices to the first coding exon of the gene. The net result would be an increase in the expression of a wild-type protein. The second FT arises from the activation of an LTR within the second intron that splices to the second coding exon of the gene. The net result would be a transcript truncated at its 5′ end, which would produce a protein truncated at its amino terminus. Viral particles expressing each of the FTs were delivered stereotaxically into the brains of male mice, and the mice were then tested for cocaine reward behavior. The FT expressing the wild-type protein had a remarkably strong ability to blunt cocaine reward behavior. Additional experiments over the next fiscal year will focus on following up on these findings.
For Pgc1, our analytical strategy identified two repeat-containing isoforms in the mouse brain. The first involves a simple sequence repeat (SSR), about 500 kb upstream of the canonical promoter, that splices to exon 2 of the gene. The second involves the same SSR splicing to a SINE (short interspersed nuclear element) about 250 kb downstream of it, which in turn splices to exon 2. Analysis of various publicly available RNA-seq data sets revealed that both of these new FT isoforms are brain specific. Moreover, within the brain, the SSR-SINE-exon 2 isoform seems to be the only isoform of Pgc1 expressed in neurons, while the SSR-exon 2 isoform is expressed only in oligodendrocytes. We also analyzed publicly available ribosome profiling data sets and found evidence that the SSR-containing isoforms are actively translated in the brain.
Additional support that these new FTs make proteins comes from our work in which the cloned SSR-SINE-exon 2 isoform containing a myc tag was found to give rise to a protein of the expected size, as judged by Western blot. We recently generated mice that specifically lack the Pgc1 FTs in the brain by deleting either the SSR or the SINE using CRISPR/Cas9 technology. While the SSR deletion should ablate both brain isoforms, the SINE deletion should abolish only the neuronal form. Because neither of these isoforms is expressed in other tissues, we expect to be able to dissect the role of these FTs in the brain. We do not yet know whether FT-derived proteins, which we predict will have different N-termini, will have the same biological function as the proteins derived from the canonical isoforms of Pgc1. Data in the literature propose that the N-terminal region of PGC1 regulates the choice of downstream targets, so we are testing whether the proteins encoded by the FTs have different downstream targets. Prediction of amino acid content based on the FT mRNA sequences indicates that the SSR-SINE-exon 2 isoform loses all 16 amino acids encoded by exon 1 and replaces them with 6 encoded by the SINE. The SSR-exon 2 isoform also loses the exon 1-encoded amino acids but would gain 29 residues encoded by the SSR itself. These possibilities will first be tested in our cell culture model overexpressing these proteins and then confirmed in the animal models. We are preparing antibodies against the amino termini of the different protein forms and will test these antibodies over the next fiscal year. Most of the effort over the past year has involved expanding the colonies for four different CRISPR/Cas9-generated alleles of Pgc1 and phenotyping the mutant mice. Two of these alleles are deletions of different sizes that involve the SSR, and the other two are deletions of different sizes that remove portions of the SINE.
The allele for which we have the most phenotyping data is a 4 bp deletion just downstream of the predicted ATG in the SINE. Detailed analysis of the brains of the mutant mice revealed no obvious pathological defects. We performed the water maze, open field, and rotarod behavioral tests on mice homozygous for the 4 bp deletion and found a dramatic effect only in the rotarod test. This prompted us to evaluate gene expression profiles in the cerebellum of the mutant mice using microarrays, which revealed substantial up-regulation of a number of genes associated with the synthesis and metabolism of serotonin. Additional experiments over the next fiscal year will focus on further analyzing the gene expression alterations that occur within the cerebellum of the mutant mice.

Support Year: 4
Fiscal Year: 2017
Name: U.S. National Inst of Environ Hlth Scis
Wang, Tianyuan; Santos, Janine H; Feng, Jian et al. (2016) A Novel Analytical Strategy to Identify Fusion Transcripts between Repetitive Elements and Protein Coding-Exons Using RNA-Seq. PLoS One 11:e0159028
Carlin, Danielle J; Rider, Cynthia V; Woychik, Rick et al. (2013) Unraveling the health effects of environmental mixtures: an NIEHS priority. Environ Health Perspect 121:A6-8