The rapid advance of nanotechnology has generated much excitement in the scientific and engineering communities. Its application to biology has created the new field of single-molecule biology: scientists can now investigate biological processes on a molecule-by-molecule basis, opening the door to many problems that were inaccessible just a few decades ago. This new frontier also raises many statistical challenges. New statistical inference tools and new stochastic models are urgently needed, both because single-molecule experiments are inherently stochastic and because many classical models, derived from oversimplified assumptions, are no longer valid at the single-molecule level. The current proposal focuses on these statistical challenges. The proposed research consists of three projects: (1) Using stochastic networks to model enzymatic reaction kinetics. The goal is to provide models that are not only biologically meaningful but also capable of explaining recent single-molecule experimental discoveries that contradict the classical Michaelis-Menten model. (2) Using the kernel method to infer biochemical properties from doubly stochastic Poisson process data, in particular photon arrival data from single-molecule experiments. The goal is to develop nonparametric inference tools that recover the dependence structure, such as the autocorrelation function, from doubly stochastic Poisson data. (3) Using bootstrap moment estimates to infer the helical diffusion of DNA-binding proteins. The goal is to elucidate how DNA-binding proteins interact with DNA and to estimate the associated energy landscape.
The proposed research aims to provide essential statistical models and inference tools for studying biological processes at the single-molecule level, and to significantly advance our understanding of how important biological processes, such as enzymatic reactions and protein-DNA interactions, actually occur in our cells. The single-molecule approach to biology presents many opportunities for interdisciplinary research, calling for collective efforts from mathematical, biological, and physical scientists. The proposed research seeks to meet a high academic standard and to reach out to the broader scientific community to foster collaboration and cross-fertilization across this interdisciplinary field.

Public Health Relevance

The proposed research is relevant to human health because both enzymatic reactions and protein-DNA interactions play fundamental roles in the healthy function of our cells. For example, deficiency of beta-galactosidase, an enzyme studied in this proposal, can result in galactosialidosis or Morquio B syndrome; malfunction of hOgg1, a DNA-repair protein studied in this proposal, can result in harmful genetic mutations.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Special Emphasis Panel (ZGM1-CBCB-5 (BM))
Program Officer: Brazhnik, Paul
Harvard University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Qian, Hong; Kou, S C (2014) Statistics and Related Topics in Single-Molecule Biophysics. Annu Rev Stat Appl 1:465-492
Hua, Xia; Kou, S C (2011) Convergence of the Equi-Energy Sampler and Its Application to the Ising Model. Stat Sin 21:1687-1711
Zhang, Tingting; Kou, S C (2010) Nonparametric Inference of Doubly Stochastic Poisson Process Data via the Kernel Method. Ann Appl Stat 4:1913-1941