Regulatory networks, which control which genes are expressed and when, are critical to the maintenance of cell states and the transitions between them. In mammalian systems, such networks are established by a complex interplay among thousands of regulatory proteins (such as transcription factors, chromatin remodelers, and signaling proteins), histone post-translational modifications, and the three-dimensional organization of the genome. Hence, identifying genome-scale regulatory networks and their changes remains a computational and experimental challenge, especially for rare and novel cell types. Through recent consortium efforts, we now have rich datasets measuring multiple components of the regulation machinery in model cell lines. These data enable the creation of a more complete regulatory network for these cell lines. Can we use this information to identify networks in new cell types where only a few components of the regulation machinery (e.g., the transcriptome) can be measured? Can we leverage more complete regulatory networks to predict new cell types and to predict the effect of network perturbations on cellular state? To tackle these questions, in this proposal we will develop innovative network reconstruction methods to identify regulatory networks in novel and rare cell types by leveraging their relationships to well-studied cell types, as well as to each other. Our methods will use the framework of non-stationary graphical models to represent cell type-specific regulatory networks and will use multi-task learning to incorporate shared information between cell types in a lineage. Methods in Aim 1 will infer modular gene regulatory networks for each cell type and additionally refine an existing incomplete or uncertain lineage structure. Methods in Aim 2 will identify cell type-specific directed dependencies among chromatin state and transcription factors, and how they impact target gene expression through proximal and long-range regulation.
Our methods will be applied to two cell-fate specification problems: cellular reprogramming and multi-cell lineage forward differentiation. In cellular reprogramming, regulators and subnetworks hindering reprogramming efficiency will be predicted and tested using genetic perturbation experiments. In forward differentiation, regulatory network changes that drive alternate lineages will be identified and tested. Successful completion of this project will provide two software tools that will enable researchers to (i) accurately identify regulatory networks and their changes between different cell states in complex cell lineages, (ii) examine interactions among multiple levels of regulation and their impact on cell type-specific gene expression, and (iii) efficiently identify the most upstream regulatory genes and subnetworks that change cellular states. These tools will be made available and will be broadly applicable to diverse dynamic biological processes in development and disease.

Public Health Relevance

Many human diseases are due to aberrant activity of genes in specific cell types. Transcriptional regulatory networks specify which genes must be expressed in which cell types and play a central role in determining the type and function of cells. Therefore, the ability to identify these networks in each cell type, along with the network changes that lead to changes in cell type, is important for a better understanding of human health and disease. Computational methods that can identify regulatory networks and their dynamics across related mammalian cell types offer a powerful approach for creating better models of human disease and for regenerative medicine.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Krasnewich, Donna M
University of Wisconsin-Madison
Biostatistics & Other Math Sci
Schools of Medicine
United States