Reconstructing regulatory networks from time series single cell data

New technological advances are enabling researchers to profile the gene expression of single cells. These experiments, termed single cell RNA-Seq, open the door to several important applications. These include the ability to elucidate the networks and pathways controlling cellular differentiation and to understand the sequence of regulatory events that lead to, and control, cell fate decisions. Such models and networks provide critical information for investigators attempting to derive specific types of differentiated human cells, which in turn opens the door to applications ranging from disease modeling to the use of regenerative cells for potential reconstitution of damaged cells or tissues. However, the analysis of single cell RNA-Seq data, and specifically of the time series single cell data required for such developmental studies, raises several new challenges. Determining which cells should be combined to construct developmental models is challenging. Cells at each time point usually come from a mixture of cell types, each of which may be a progenitor of one, or several, specific lineages. To reconstruct the networks controlling cell differentiation we first need to determine a 'time series' by linking single cells within and between time points, and then use these assignments to reconstruct the networks and pathways that drive cell fate decisions. A specific example of a differentiation process we intend to study is abnormal lung development, which often arises due to genetic perturbations and can lead to congenital or neonatal lung diseases. Our preliminary results indicate that single cell RNA-Seq data has great potential to illuminate the complex gene regulatory networks that control normal development of several different types of cells in the lung and to aid in identifying regulatory mechanisms that may go awry during abnormal development, leading to disease.
Given these initial findings, in this project we will develop and test computational methods, based on probabilistic graphical models, for the analysis and modeling of time series single cell RNA-Seq data. Our methods would allow the determination of the different types of cells at each time point, the relationships between cells across time points, and the reconstruction of regulatory networks that control the differentiation process. The reconstructed networks would also allow us to identify key genes and factors controlling the differentiation process and would lead to testable hypotheses about the proteins regulating key events. We will apply the methods we develop to study and model normal and diseased lung development by performing new single cell experiments on human induced pluripotent stem (iPS) cells.

Public Health Relevance

Development of approaches for analyzing single cell RNA-Seq time series data is important for understanding the complex regulatory networks, including transcription factors and signaling pathways, that control embryonic development of tissue lineages and organs. Understanding this sequence of developmental milestones is a critical step in our ability to engineer methods for deriving differentiated human cell types in vitro from patient-derived pluripotent stem cells. Being able to derive these differentiated cells from any patient of any age provides unprecedented opportunities to model disease, design drug therapies, and prepare regenerative cells for potential reconstitution of damaged cells or tissues. In this proposal we will apply our computational methods to improve the differentiation of normal and patient-specific induced stem cells into lung epithelial progenitors, allowing us to better understand a congenital/developmental pediatric lung disease that arises from heterozygous mutations in NKX2-1, a master transcriptional regulator of lung development.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Resat, Haluk
Carnegie-Mellon University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States