The development of next-generation DNA sequencing methods for quickly acquiring genome and gene expression information has transformed biology. The basis of "next-gen" DNA sequencing is the acquisition of large numbers of short reads (typically 35-500 nucleotides) in parallel. Currently available single-molecule next-gen sequencing platforms monitor the sequencing of single DNA molecules using fluorescence microscopy, allowing for approximately a billion sequencing reads per run. Unfortunately, no method of similar scale and throughput exists to identify and quantify specific proteins in complex mixtures, which represents a critical bottleneck in many biochemical, molecular diagnostic, and biomarker discovery assays. What is urgently needed is a massively parallel method, akin to next-gen DNA sequencing, for identifying and quantifying individual peptides or proteins in a sample. I propose a single-molecule peptide sequencing strategy to achieve exactly this goal. In principle, it will allow billions of distinct peptides to be sequenced in parallel (or at least sequenced sufficiently to provide informative sequence patterns), thereby identifying the proteins composing the sample and digitally quantifying them by direct counting of peptides. This transformative approach should enable the quantitative, massively parallel sequencing of proteins. Success of the proposed research will create a technology sufficiently mature for real-world protein sequencing problems. Such an approach would have broad applications across biology and medicine, and could be as fundamental for proteins as, for example, PCR is for nucleic acid research.
Potential applications include profiling protein expression in normal body niches or in disease, metaproteomics, profiling circulating serum antibodies, searching for and quantifying protein post-translational modifications, and, of particular interest, identifying biomarkers relevant to cancer and infectious disease.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
NIH Director’s Pioneer Award (NDPA) (DP1)
Study Section
Special Emphasis Panel (ZGM1-NDPA-A (01))
Program Officer
Sheeley, Douglas
University of Texas Austin
Schools of Arts and Sciences
United States
Robotham, Scott A; Horton, Andrew P; Cannon, Joe R et al. (2016) UVnovo: A de Novo Sequencing Algorithm Using Single Series of Fragment Ions via Chromophore Tagging and 351 nm Ultraviolet Photodissociation Mass Spectrometry. Anal Chem 88:3990-7
Laurent, Jon M; Young, Jonathan H; Kachroo, Aashiq H et al. (2016) Efforts to make and apply humanized yeast. Brief Funct Genomics 15:155-63
Liebeskind, Benjamin J; McWhite, Claire D; Marcotte, Edward M (2016) Towards Consensus Gene Ages. Genome Biol Evol 8:1812-23
Teperek, Marta; Simeone, Angela; Gaggioli, Vincent et al. (2016) Sperm is epigenetically programmed to regulate gene transcription in embryos. Genome Res 26:1034-46
Phanse, Sadhna; Wan, Cuihong; Borgeson, Blake et al. (2016) Proteome-wide dataset supporting the study of ancient metazoan macromolecular complexes. Data Brief 6:715-21
Kim, Eiru; Hwang, Sohyun; Kim, Hyojin et al. (2016) MouseNet v2: a database of gene networks for studying the laboratory mouse and eight other model vertebrates. Nucleic Acids Res 44:D848-54
Toriyama, Michinori; Lee, Chanjae; Taylor, S Paige et al. (2016) The ciliopathy-associated CPLANE proteins direct basal body recruitment of intraflagellar transport machinery. Nat Genet 48:648-56
Young, Jonathan H; Peyton, Michael; Seok Kim, Hyun et al. (2016) Computational discovery of pathway-level genetic vulnerabilities in non-small-cell lung cancer. Bioinformatics 32:1373-9
Wan, Cuihong; Borgeson, Blake; Phanse, Sadhna et al. (2015) Panorama of ancient metazoan macromolecular complexes. Nature 525:339-44
Swaminathan, Jagannath; Boulgakov, Alexander A; Marcotte, Edward M (2015) A theoretical justification for single molecule peptide sequencing. PLoS Comput Biol 11:e1004080

Showing the most recent 10 out of 28 publications