While variation in gene expression clearly affects the phenotypes of organisms, there is a fundamental gap in understanding the role of post-transcriptional processes. Because post-transcriptional processes are critical for the regulation of protein production, addressing this knowledge gap will facilitate better models of the relationship between genotype and phenotype. Upstream open reading frames (uORFs) are regulatory elements found in most human genes, and some disease-linked mutations appear to alter the presence of uORFs. Recently, hundreds of uORFs initiating with non-AUG start codons have been identified in many organisms. Despite the vast number of these elements, the functions and evolution of AUG and non-AUG uORFs remain elusive. The long-term goal of this project is to determine the functions and impact of genetic variation in post-transcriptional cis-regulatory elements. The objective of this proposal is to determine how new uORFs evolve and regulate translation. Our central hypothesis is that AUG and non-AUG uORFs have different regulatory roles, leading to different evolutionary trajectories. This hypothesis is based on our preliminary data, in which we identified hundreds of AUG and non-AUG uORFs with different genomic properties in Saccharomyces yeasts. We will test our central hypothesis by pursuing the following specific aims: 1) Investigate the evolution of AUG and non-AUG uORFs; 2) Determine the functions of AUG and non-AUG uORFs; and 3) Identify the roles of RNA-binding proteins in uORF-mediated regulation. In the first aim, we will identify active uORFs in diverse strains and species of yeasts grown under four conditions to test the hypothesis that AUG and non-AUG uORFs differ in evolutionary tempo and mode.
The second aim will determine the gene-regulatory functions of hundreds of uORFs using FACS-uORF, a novel dual-fluorescence reporter system we developed. We will use these data to generate predictive models of uORF function.
Aim 3 will identify genome-wide roles of RNA-binding proteins in regulating uORFs and integrate this knowledge into our predictive models. This approach is innovative in that it combines exquisite systems biology tools with an excellent model genus to investigate the evolution of post-transcriptional gene regulation. The proposed research is significant because it is expected to fundamentally advance the fields of genomics, evolutionary biology, and systems biology by deciphering the evolutionary and functional properties of uORFs. Because translation mechanisms are highly conserved, the knowledge generated by the proposed work is expected to improve models of the relationship between non-coding genetic variation and phenotype in other eukaryotes, including humans.

Public Health Relevance

The proposed research is relevant to public health because sequence variants in upstream open reading frames (uORFs) are associated with many human genetic disorders. Identifying how uORF sequence variation affects uORF function is expected to ultimately improve our understanding of the impact of such mutations. As such, the proposed project is relevant to the part of the NIH mission that relates to fostering innovative discoveries to improve public health.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Krasnewich, Donna M
Carnegie-Mellon University
Schools of Arts and Sciences
United States
Wang, Hao; Kingsford, Carl; McManus, C Joel (2018) Using the Ribodeblur pipeline to recover A-sites from yeast ribosome profiling data. Methods 137:67-70
Spealman, Pieter; Naik, Armaghan W; May, Gemma E et al. (2018) Conserved non-AUG uORFs revealed by a novel regression analysis of ribosome profiling data. Genome Res 28:214-222