Understanding how genes' activities are controlled is crucial for elucidating the basic operating rules of biology and the molecular mechanisms of diseases. Recent innovations in single-cell genomic technologies have opened the door to analyzing a variety of functional genomic features in individual cells. These technologies enable scientists to systematically discover unknown cell subpopulations in complex tissue and disease samples, and allow them to reconstruct a sample's gene regulatory landscape at an unprecedented cellular resolution. Despite these promising developments, many challenges remain and must be overcome before one can fully decode gene regulation at single-cell resolution. In particular, current technologies cannot accurately measure the activity of each individual cis-regulatory element (CRE) in a single cell. They also cannot measure all functional genomic data types in the same cell. Moreover, the prevalent technical biases and noise in single-cell genomic data make computational analysis non-trivial. With the rapid growth of data, the lack of computational tools for data analysis has become a rate-limiting factor for effective applications of single-cell genomic technologies. The objective of this proposal is to develop computational and statistical methods and software tools for mapping and analyzing the gene regulatory landscape using single-cell genomic data.
Our Aim 1 addresses the challenge of accurately measuring CRE activities in single cells using single-cell regulome data. The regulome, defined as the activities of all cis-regulatory elements in a genome, contains crucial information for understanding gene regulation. The state-of-the-art technologies for mapping the regulome in a single cell produce sparse data that cannot accurately measure the activities of individual CREs. We will develop a new computational framework to allow more accurate analysis of individual CREs' activities in single cells using sparse data.
Our Aim 2 addresses the challenge of collecting multiple functional genomic data types in the same cell. We will develop a method that uses single-cell RNA sequencing (scRNA-seq), the most widely used single-cell functional genomic technology, to predict cells' regulatory landscape. Since most scRNA-seq datasets do not have accompanying single-cell data for other -omics data types, our method will also significantly expand the utility and increase the value of scRNA-seq experiments.
Our Aim 3 addresses the challenge of integrating different data types generated by different single-cell genomic technologies from different cells. We will develop a method to align single-cell RNA-seq and single-cell regulome data to generate an integrated map of transcriptome and regulome. Upon completion of this proposal, we will deliver our methods through open-source software tools. These tools will be widely useful for analyzing and integrating single-cell regulome and transcriptome data. By addressing several major challenges in single-cell genomics, our new methods and tools will help unleash the full potential of single-cell genomic technologies for studying gene regulation. As such, they can have a major impact on advancing our understanding of both basic biology and human diseases.

Public Health Relevance

Understanding how genes' activities are controlled at single-cell resolution is crucial for studying human diseases. This proposal will develop a coordinated set of computational and statistical methods and software tools for mapping and analyzing gene regulatory programs using single-cell genomic data. These methods and tools will allow scientists to more accurately and comprehensively reconstruct the gene regulatory landscape of individual cells in complex tissue and disease samples, and they can have a major impact on advancing our understanding of both basic biology and human diseases.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 1R01HG010889-01
Application #: 9649896
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Gilchrist, Daniel A
Project Start: 2019-08-22
Project End: 2023-06-30
Budget Start: 2019-08-22
Budget End: 2020-06-30
Support Year: 1
Fiscal Year: 2019
Total Cost:
Indirect Cost:

Name: Johns Hopkins University
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 001910777
City: Baltimore
State: MD
Country: United States
Zip Code: 21205