This application describes a novel approach for the discovery of non-annotated short open reading frame-encoded peptides and small proteins (SEPs), an understudied class of peptides encoded in the human genome. Application of this approach to a human leukemia cell line revealed 32 novel human SEPs, the largest number ever reported. Since SEPs are produced from short open reading frames (sORFs) in the genome, this discovery also represents the characterization of 32 new human genes. Analysis of the SEP-producing sORFs revealed a number of interesting features of mammalian genes, such as the existence of polycistronic genes, the use of non-ATG start codons to produce protein, and the finding that some 'non-coding RNAs' have been misannotated because they actually encode peptides. Likewise, some of the SEPs have features typically found in proteins, such as the ability to localize to specific subcellular compartments and to take part in protein-protein interactions, which indicates that they may serve functional roles in the cell. One of these newly discovered SEPs, for instance, participates in a specific protein-protein interaction with a known regulator of cancer cell proliferation, suggesting a potential function for this SEP in cell growth. The discovery of these SEPs is significant because it indicates that the genome and proteome are larger than previously anticipated and demonstrates the need for additional investigation of these unique human genes. The goals of this application are to discover and characterize SEPs and to explore their biology, including any role in disease.

Public Health Relevance

This application details the analysis of a leukemia cell line using a novel approach that led to the discovery of a new group of human genes that encode peptides. This is a significant finding because it indicates that the human genome and proteome are larger than previously appreciated and may contain non-annotated genes that have important functions. In this application we endeavor to discover, validate, and functionally characterize these novel human genes, including their roles in disease.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 7R01GM102491-03
Application #: 8892643
Study Section: Enabling Bioanalytical and Imaging Technologies Study Section (EBIT)
Program Officer: Edmonds, Charles G
Project Start: 2012-09-21
Project End: 2016-08-31
Budget Start: 2014-07-01
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2014
Total Cost: $42,998
Indirect Cost: $20,834
Name: Salk Institute for Biological Studies
Department:
Type:
DUNS #: 078731668
City: La Jolla
State: CA
Country: United States
Zip Code: 92037
Pauli, Andrea; Norris, Megan L; Valen, Eivind et al. (2014) Toddler: an embryonic signal that promotes cell movement via Apelin receptors. Science 343:1248636
Slavoff, Sarah A; Heo, Jinho; Budnik, Bogdan A et al. (2014) A human short open reading frame (sORF)-encoded polypeptide that stimulates DNA end joining. J Biol Chem 289:10950-7
Ma, Jiao; Ward, Carl C; Jungreis, Irwin et al. (2014) Discovery of human sORF-encoded polypeptides (SEPs) in cell lines and tissue. J Proteome Res 13:1757-65
Schwaid, Adam G; Shannon, D Alexander; Ma, Jiao et al. (2013) Chemoproteomic discovery of cysteine-containing human short open reading frames. J Am Chem Soc 135:16750-3
Slavoff, Sarah A; Mitchell, Andrew J; Schwaid, Adam G et al. (2013) Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nat Chem Biol 9:59-64