The development of next-generation DNA sequencing methods for quickly acquiring genome and gene expression information has transformed biology. The basis of next-gen DNA sequencing is the acquisition of large numbers of short reads (typically 35 to 500 nucleotides) in parallel. Currently available single-molecule next-gen sequencing platforms monitor the sequencing of single DNA molecules using fluorescence microscopy, allowing for approximately a billion sequencing reads per run. Unfortunately, no method of similar scale and throughput exists to identify and quantify specific proteins in complex mixtures, a gap that represents a critical bottleneck in many biochemical, molecular diagnostic, and biomarker discovery assays. What is urgently needed is a massively parallel method, akin to next-gen DNA sequencing, for identifying and quantifying individual peptides or proteins in a sample. I propose a single-molecule peptide sequencing strategy that will achieve exactly this goal. This strategy will in principle allow billions of distinct peptides to be sequenced in parallel (or at least sequenced sufficiently to provide informative sequence patterns), thereby identifying the proteins composing the sample and digitally quantifying them by direct counting of peptides. This transformative approach should enable the quantitative, massively parallel sequencing of proteins. Success of the proposed research will create a technology ready for real-world protein sequencing problems. Such an approach would have broad applications across biology and medicine, and could be as fundamental for proteins as PCR is for nucleic acid research. Potential applications include profiling of protein expression in normal body niches or in disease, metaproteomics, profiling of circulating serum antibodies, the search for and quantification of protein post-translational modifications, and, of particular interest, identification of biomarkers relevant to cancer and infectious diseases.
In short, single-molecule sequencing could potentially solve many of the problems currently limiting the field of proteomics and biomarker discovery, promising improvements of more than 5 orders of magnitude in sensitivity and throughput and offering a path forward that can reasonably be expected to mirror the phenomenal growth and impact of next-gen DNA sequencing.

Public Health Relevance

While nucleic acid mutations underlie nearly all cancers, these changes are most readily embodied by proteins and often expressed in bodily compartments (e.g., saliva, blood, urine) accessible without invasive procedures such as biopsies. Thus, an approach capable of sensitive identification and quantitative profiling of protein abundances in these compartments would significantly impact the search for and application of protein biomarkers in the diagnosis, characterization, and monitoring of most, if not all, cancers. I propose such a method, suitable for high-throughput identification and quantification of individual peptides or proteins within a sample; this technology will be directly applicable to cancer diagnosis, characterization, and protein cancer biomarker discovery, and, beyond biomarker discovery, will have broad applications across biology and medicine.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: NIH Director’s Pioneer Award (NDPA) (DP1)
Project #: 5DP1GM106408-03
Application #: 8710285
Study Section: Special Emphasis Panel (ZGM1-NDPA-A (01))
Program Officer: Sheeley, Douglas
Project Start: 2012-09-30
Project End: 2017-07-31
Budget Start: 2014-08-01
Budget End: 2015-07-31
Support Year: 3
Fiscal Year: 2014
Total Cost: $772,500
Indirect Cost: $272,500

Name: University of Texas Austin
Department: Chemistry
Type: Schools of Arts and Sciences
DUNS #: 170230239
City: Austin
State: TX
Country: United States
Zip Code: 78712

Publications

Hwang, Sohyun; Kim, Chan Yeong; Yang, Sunmo et al. (2018) HumanNet v2: human gene networks for disease research. Nucleic Acids Res :
Gibeaux, Romain; Acker, Rachael; Kitaoka, Maiko et al. (2018) Paternal chromosome loss and metabolic crisis contribute to hybrid inviability in Xenopus. Nature 553:337-341
Sun, Xiaolong; Boulgakov, Alexander A; Smith, Leilani N et al. (2018) Photography Coupled with Self-Propagating Chemical Cascades: Differentiation and Quantitation of G- and V-Nerve Agent Mimics via Chromaticity. ACS Cent Sci 4:854-861
Swaminathan, Jagannath; Boulgakov, Alexander A; Hernandez, Erik T et al. (2018) Highly parallel single-molecule identification of proteins in zeptomole-scale mixtures. Nat Biotechnol :
Akhmetov, Azat; Laurent, Jon M; Gollihar, Jimmy et al. (2018) Single-step Precision Genome Editing in Yeast Using CRISPR-Cas9. Bio Protoc 8:
Teufel, Ashley I; Johnson, Mackenzie M; Laurent, Jon M et al. (2018) The many nuanced evolutionary consequences of duplicated genes. Mol Biol Evol :
Akhmetov, Azat; Ellington, Andrew D; Marcotte, Edward M (2018) A highly parallel strategy for storage of digital information in living cells. BMC Biotechnol 18:64
Verbeke, Eric J; Mallam, Anna L; Drew, Kevin et al. (2018) Classification of Single Particles from Human Cell Extract Reveals Distinct Structures. Cell Rep 24:259-268.e3
Hernandez, Erik T; Rogelio Escamilla, P; Kwon, Sang-Yop et al. (2018) 2,2'-Bipyridine and hydrazide containing peptides for cyclization and complex quaternary structural control. New J Chem 42:8577-8582
Kuboniwa, Masae; Houser, John R; Hendrickson, Erik L et al. (2017) Metabolic crosstalk regulates Porphyromonas gingivalis colonization and virulence during oral polymicrobial infection. Nat Microbiol 2:1493-1499

Showing the most recent 10 out of 43 publications