Novel genomics technologies, such as gene editing and single-cell sequencing, have greatly advanced our ability to define the structure and function of human cells at unprecedented resolution and scale. Traditional "bulk" genomic measurements are often insufficient because they can mask sample heterogeneity and differences in cell-state dynamics. CRISPR-based gene-editing technology enables single-cell lineage tracking using evolvable barcodes. Building on this technology, we will address mathematical challenges arising from single-cell barcoding and decoding by developing online optimization and single-cell time series analysis methods. This line of research will generate analytical tools for barcoded single-cell data, with provable theoretical guarantees, that are both sharable and deployable for defining the genetic basis of cellular lineage and gene expression states. Application of these tools will help uncover the complex biology behind cell evolution and interaction in health and disease. The research includes projects suitable for student participation and training at various levels, as well as open-source software development.

Current cellular barcoding tools are limited in scalability, and the single-cell transition data made possible by barcoding technology present new analytical challenges. To significantly improve the capacity of cell barcoding, we will develop learning-theoretic optimization methods for efficiently finding the best barcoding design over a large combinatorial design space. Further, we will develop state embedding methods that identify mathematical abstractions of cell states from gene-expression paths, as supported by the barcoded trajectories, and that reveal critical cell dynamics and genetic regulation. In particular, we will use kernelized low-rank approximation and convex polytope approximation schemes to estimate metastable cell clusters and gene-expression landmarks. Finally, we will validate the dynamics decoding tools using data from real cancer model experiments. The research findings will open up new potential for biologically driven mathematical methods. They will have implications for stem-cell and developmental biology, neuroscience, immunology, and other areas of biological investigation.
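To make the idea of estimating metastable clusters from trajectory data concrete, the following is a minimal sketch and not the project's method. It assumes cell states have already been discretized into a small number of integer labels, and it swaps the kernelized low-rank approximation named above for a plain truncated SVD of an empirical transition matrix. The toy chain, the function names (`empirical_transition_matrix`, `low_rank_embedding`, `kmeans`), and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def empirical_transition_matrix(paths, n_states):
    """Count observed one-step transitions and normalize each row."""
    counts = np.zeros((n_states, n_states))
    for path in paths:
        for a, b in zip(path[:-1], path[1:]):
            counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid dividing by zero for unvisited states
    return counts / row_sums


def low_rank_embedding(P, rank):
    """Truncated SVD: return a rank-dimensional state embedding and the
    corresponding low-rank approximation of P."""
    U, s, Vt = np.linalg.svd(P)
    embedding = U[:, :rank] * s[:rank]
    return embedding, embedding @ Vt[:rank]


def kmeans(X, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


# Hypothetical toy chain: 6 states in two metastable blocks, {0,1,2} and {3,4,5}.
# A step stays within the current block with probability 0.96.
block = np.array([0, 0, 0, 1, 1, 1])
same_block = block[:, None] == block[None, :]
P_true = np.where(same_block, 0.32, 0.04 / 3)

# Simulate one long trajectory (a stand-in for barcoded single-cell paths).
rng = np.random.default_rng(0)
path = [0]
for _ in range(5000):
    path.append(rng.choice(6, p=P_true[path[-1]]))

P_hat = empirical_transition_matrix([path], n_states=6)
embedding, P_low = low_rank_embedding(P_hat, rank=2)
labels = kmeans(embedding, k=2)
print(labels)
```

In this toy setting, states in the same block have nearly identical rows in the estimated transition matrix, so their low-dimensional embeddings sit close together and a simple clustering step separates the two blocks. A kernelized variant would replace the discrete count matrix with kernel features of continuous gene-expression states.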

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1953686
Program Officer: Yong Zeng
Project Start:
Project End:
Budget Start: 2020-08-01
Budget End: 2023-07-31
Support Year:
Fiscal Year: 2019
Total Cost: $420,000
Indirect Cost:
Name: Princeton University
Department:
Type:
DUNS #:
City: Princeton
State: NJ
Country: United States
Zip Code: 08544