Biological tissues are composed of structurally organized cell types that play distinct and cooperative functional roles in phenotypes. Recent spatial transcriptomics technologies enable spatially resolved RNA profiling of single cells, capturing cell identities and locations, to help understand how cells are organized and how they function. The project will develop new machine learning methods for mining RNA profiles collected from single cells together with their spatial locations. The research community will benefit from a collection of tools for analyzing spatial and single-cell genomic data to study the molecular characteristics of cellular structures in tissue. The new methods will be applied to the study of spatial cell heterogeneity in ovarian cancer and of circadian rhythms in Brassica rapa. The two applications will improve understanding of the cellular structure and pathology of ovarian tissues and of the association between cell-specific circadian gene expression patterns and crop improvement traits. Graduate and undergraduate students from underrepresented groups will be mentored in conducting research. A summer camp for K-12 students will promote early career interest in big data, genomics, and plant science.

The project will develop models to jointly analyze spatial RNA and single-cell RNA profiles. The models will consider spatial structures among cells to provide interpretations of cellular mechanisms at three levels: the micro-environment of surrounding cells, macro-structures among multiple tissue regions, and spatiotemporal structures of tissue over circadian rhythms. The proposed research will lead to a class of new computational methods that integrate single-cell gene expression with spatial and temporal structures, connecting single-cell molecular profiling to the tissue micro-environment and to the dynamics of spatial regions in tissue. Aim 1 will develop tensor-based learning methods and graph-based neural networks to integrate spatial transcriptomics data with cell images, cell spatial locations, and molecular networks for gene expression imputation, spatial gene module detection, spatial clustering to discover cell types, and co-clustering of spatial locations and genes. Aim 2 will develop a multitask tensor decomposition method to integrate the spatial arrangement of multiple tissue bisection regions to discover variations in cell diversity and the trajectory of cell proliferation in large tissue (or an organ). The method will be applied to identify cell heterogeneity and the spatial origin of single cells in ovarian cancer tissue. Aim 3 will design a multitask joint tensor-matrix factorization method, regularized by a circadian function to capture different periodic patterns, for studying the dynamic characteristics of spatial gene expression in a tissue sample. The proposed method will be used to detect spatial variations in the circadian clock across Brassica rapa leaf cross-sections. The results and tools will be made available through

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

National Science Foundation (NSF)
Division of Biological Infrastructure (DBI)
Standard Grant (Standard)
Program Officer
Jean Gao
University of Minnesota Twin Cities
United States