Proteins are the workhorses of cells and organisms. At present, no technologies are available for the routine proteome-scale sequencing and quantification of this physiologically important class of molecules. In this project, we propose to develop a nanopore technology for direct de novo single-molecule protein sequencing. Analogous to nanopore DNA sequencing, the sequences of protein molecules are determined by electronically measuring how the residues along the linear polypeptide alter the ionic current flow while the fully denatured molecules are electrophoretically translocated one at a time through the nanopores. To realize the technology, we need to overcome three major challenges: (1) engineering nanopores with the dimensions (~1 nm diameter and ~0.5 nm thickness) required to distinguish the 20 amino acid residues (~0.38 nm spacing) along the linear polypeptide chain; (2) a method for controlled unidirectional translocation of unfolded polypeptides through the nanopores; (3) algorithms for decoding sequences from current blockage profiles. Through several years of conceptual, theoretical and experimental work, we have found potential solutions to these challenges. First, we have invented a new hybrid solid-state/protein/cyclopeptide nanopore architecture that will enable the engineering of nanopores capable of distinguishing the 20 different amino acid residues for de novo protein sequencing. Second, we have demonstrated a strategy to impart uniform charge density along the polypeptide chain to enable unidirectional translocation of proteins through nanopores. Third, we have developed a strategy that has enabled us to model and compute the current blockages of all 20^7 (= 1.28 × 10^9) heptamer combinations of the 20 amino acids, and thus the current blockage profile of any protein. We have also developed the algorithms to decode amino acid sequences from the computed current blockage profiles.
Excitingly, with these recent breakthroughs, we have been able to demonstrate the theoretical feasibility of de novo nanopore protein sequencing. We have shown that 12 amino acid residues can be sequenced with >90% consensus accuracy, 2 residues with >85% accuracy, and the remaining 6 residues decoded as 3 pairs with >90% accuracy. In this project, we propose to implement these innovative approaches, aiming to lay the foundation for the experimental realization of nanopore protein sequencing. The ability to sequence and enumerate proteins will enable routine proteome-scale identification and digital quantification of proteins with the ultimate single-molecule and single-cell sensitivity. If successful, such a disruptive technology will find broad applications ranging from basic research and drug target identification to precision clinical diagnosis of human diseases, and will have the potential to transform many aspects of biomedical research, personalized healthcare and medical practice.
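As a rough illustration of the decoding problem described in the abstract, the sketch below checks the heptamer count (20^7) and decodes a sequence from a noise-free blockage profile. It is not the authors' model or algorithm: the current level for each k-mer is a random placeholder, the decoder is a standard Viterbi search swapped in for illustration, and the function names (`kmer_count`, `build_toy_model`, `simulate_profile`, `decode`) are hypothetical. A 4-letter alphabet with k = 3 keeps the toy lookup table small.

```python
import itertools
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues


def kmer_count(alphabet_size: int, k: int) -> int:
    """Number of distinct k-mers over an alphabet of the given size."""
    return alphabet_size ** k


def build_toy_model(alphabet: str, k: int, seed: int = 0) -> dict:
    """Assign each k-mer a random current level (placeholder, not a physical model)."""
    rng = random.Random(seed)
    return {"".join(p): rng.uniform(0.0, 1.0)
            for p in itertools.product(alphabet, repeat=k)}


def simulate_profile(seq: str, model: dict, k: int) -> list:
    """Noise-free current level for each k-mer window as the chain moves one residue."""
    return [model[seq[i:i + k]] for i in range(len(seq) - k + 1)]


def decode(profile: list, model: dict, alphabet: str, k: int) -> str:
    """Viterbi search for the k-mer path with the smallest squared error."""
    states = list(model)
    cost = {s: (profile[0] - model[s]) ** 2 for s in states}
    back = []
    for level in profile[1:]:
        new_cost, pointers = {}, {}
        for t in states:
            # A predecessor of t ends with the first k-1 residues of t.
            prev = min((a + t[:-1] for a in alphabet), key=cost.__getitem__)
            new_cost[t] = cost[prev] + (level - model[t]) ** 2
            pointers[t] = prev
        cost = new_cost
        back.append(pointers)
    # Trace back from the cheapest final state.
    path = [min(cost, key=cost.__getitem__)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    # First k-mer in full, then the last residue of each following k-mer.
    return path[0] + "".join(s[-1] for s in path[1:])


toy_alphabet = "ACDE"
toy_model = build_toy_model(toy_alphabet, 3)
toy_profile = simulate_profile("ACDEEDCAACE", toy_model, 3)
print(kmer_count(len(AMINO_ACIDS), 7))
print(decode(toy_profile, toy_model, toy_alphabet, 3))
```

With measurement noise or k-mers that share similar current levels, the minimum-cost path is no longer guaranteed to match the true sequence, which is consistent with the abstract reporting some residues as decodable only in pairs.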

Public Health Relevance

We propose to develop a technology for single-molecule nanopore protein sequencing. The technology will enable the rapid and low-cost proteome-scale sequencing and precise digital quantification of proteins with the ultimate single-molecule and single-cell sensitivity. Our technology will have the potential to transform proteomic and biomedical research, drug development and molecular diagnosis of human diseases.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Instrumentation and Systems Development Study Section (ISD)
Program Officer
Smith, Ward
University of California, San Diego
Engineering (All Types)
Schools of Arts and Sciences
La Jolla
United States