The long-term goal of this research is to reveal the key regulators that determine the usually ordered development of an animal from undifferentiated pluripotent cells to the specialized cells that carry out all of the functions in our body. The coordinated expression of the genome underlies these processes and is orchestrated by networks of interacting genes that we are only beginning to unveil. Cell circuitry is complex, but the discovery of the Yamanaka factors demonstrates that even fewer than a handful of transcription factors can exert profound changes on cell and tissue fates. Thus, the combinations of genes needed to unlock cell determinants seem tantalizingly parsimonious. Large-scale projects are underway to catalog the genomic, epigenomic, and functional genomic landscapes of many different cells in multiple organisms. As high-throughput techniques such as DNA and RNA sequencing mature, demand is increasing for integrative approaches that use such data to elucidate the rules underlying the intrinsic, adaptive, and programmed phenotypic changes that cells undergo. Our starting point will be to extend the pathway integrative framework developed over the past several years for the interpretation of cancer genomics datasets for The Cancer Genome Atlas project. Extensions to the input pathways used, and advances in the model to enrich the formal representation, will be developed so that a breadth of datasets in human and model organisms can be analyzed. The approach will culminate in combining machine-learning classification with probabilistic graphical models. The classifiers will identify predictive pathway features for cell state distinctions in a large database. Genetic manipulations among these features can then be proposed, in any combination, as formal interventions on the graphical model of the resulting classifiers, a major advantage of this work.
The pathway models will be applied to the prediction of factors that can confer differentiation and de-differentiation cues to human cortical neurons. Computationally predicted gene perturbations in this system will be tested in living cells. Identifying critical modulators of the cell fate decisions underlying the conversion of stem cells to neural progenitors to mature neural cell types will advance our understanding of neural development. These same regulators may also play an important role in glioma, a disease in which the tumor cells appear to be in a neural progenitor-like state. Taken together, the proposed theoretical and applied informatics approaches will contribute powerful tools for interpreting and predicting both routine and aberrant cellular responses. Researchers will be able to query the complex networks using computer algorithms as high-fidelity surrogates. In the not-so-distant future, our hope is to advance our understanding of normal differentiation and of how the regulation of these programs breaks down in disease processes like cancer, informing diagnostic, prognostic, and therapeutic strategies.

Public Health Relevance

This project aims to extend machine-learning and probabilistic graphical modeling approaches developed in the field of cancer genomics to the analysis of a broad range of human and model organism datasets. Novel methods for proposing genetic perturbations using a formal computational analysis will be developed and tested for their ability to suggest pluripotency and lineage-committing factors in a neural progenitor differentiation assay. The methods developed will contribute significant theoretical advances and reveal common mechanisms of stem cell and tumor biology, shedding light on new treatment options for cancer.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Modeling and Analysis of Biological Systems Study Section (MABS)
Program Officer
Lyster, Peter
University of California Santa Cruz
Engineering (All Types)
Biomed Engr/Col Engr/Engr Sta
Santa Cruz
United States
