Over the past decade, the landscape of cancer research has changed with the explosion of publicly available and investigator-generated datasets, and with the rapidly growing number of sophisticated computational methods and tools to integrate and analyze them. The research community continues to face challenges as it seeks to harness this wealth of data and analysis tools to move the cancer research agenda forward. The entire cancer research community needs a way to easily collaborate on, document, capture, and share its work, from conception through analysis to publication. Moreover, cancer biologists may have difficulty choosing the right tools and using them correctly, which effectively puts this powerful capability out of their direct reach. The goal of this U24 proposal is to use the GenePattern computational genomics platform, which has served the cancer community since 2004, as the foundation for a new electronic notebook environment to meet these needs. Through these efforts we will support a diverse community of users at the forefront of cancer research who seek to better understand the underlying mechanisms of disease, translate improved methods for patient diagnosis and prognosis to the clinic, and identify new drug targets.
Aim 1. Develop a GenePattern electronic notebook for collaborative in silico research. Leveraging a novel blend of GenePattern, Google Drive/Docs, and the IPython platform, we will develop an environment for creating and deploying electronic notebooks to support the entirety of ongoing collaborative studies, including running analyses, presenting results, recording comments and interpretation of results, and capturing the reproducible computational workflow.
Aim 2. Create a collection of GenePattern notebooks for cancer research. We will formulate and deploy dynamic GenePattern notebooks embodying complete analysis studies based on driving cancer projects. These notebooks will guide investigators through the relevant considerations at each analysis step, toward the choices that best support their research goals.
Aim 3. Add GenePattern modules to address cancer complexity. We will add new modules as required for the notebook collection in Aim 2, including new information-theoretic approaches to identifying biomarkers, clustering, classification, and dimension reduction.
Aim 4. Provide training and GenePattern Notebook support for the cancer research community. We will provide a high level of support for the notebook environment; develop cancer-focused training materials featuring notebooks based on driving cancer projects; and deploy a public GenePattern server on the high-performance computing infrastructure at the Pittsburgh Supercomputing Center.

Public Health Relevance

GenePattern is a popular bioinformatics software environment that puts sophisticated computational methods within the reach of all biomedical researchers to address a variety of problems at the forefront of cancer research, including patient diagnosis and prognosis, identification of new drug targets, and understanding disease mechanisms. The work in this project will build on GenePattern's foundation to provide GenePattern Notebook, a beginning-to-end computational electronic lab notebook environment for combining analysis and text. We will also create notebooks that capture scientist-oriented cancer analysis scenarios and tasks, and share them with cancer investigators for use in their own studies.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Resource-Related Research Projects--Cooperative Agreements (U24)
Study Section: Special Emphasis Panel (ZCA1-TCRB-9 (J1))
Program Officer: Li, Jerry
University of California San Diego
Internal Medicine/Medicine
Schools of Medicine
La Jolla
United States
Archer, Tenley C; Ehrenberger, Tobias; Mundt, Filip et al. (2018) Proteomics, Post-translational Modifications, and Integrative Analyses Reveal Molecular Heterogeneity within Medulloblastoma Subgroups. Cancer Cell 34:396-410.e8
Silterra, Jacob; Gillette, Michael A; Lanaspa, Miguel et al. (2017) Transcriptional Categorization of the Etiology of Pneumonia Syndrome in Pediatric Patients in Malaria-Endemic Areas. J Infect Dis 215:312-320
Viswanathan, Vasanthi S; Ryan, Matthew J; Dhruv, Harshil D et al. (2017) Dependency of a therapy-resistant state of cancer cells on a lipid peroxidase pathway. Nature 547:453-457
Reich, Michael; Tabor, Thorin; Liefeld, Ted et al. (2017) The GenePattern Notebook Environment. Cell Syst 5:149-151.e1
Dhingra, Priyanka; Martinez-Fundichely, Alexander; Berger, Adeline et al. (2017) Identification of novel prostate cancer drivers using RegNetDriver: a framework for integration of genetic and epigenetic alterations with tissue-specific regulatory network. Genome Biol 18:141
Boulay, Gaylor; Awad, Mary E; Riggi, Nicolo et al. (2017) OTX2 Activity at Distal Regulatory Elements Shapes the Chromatin Landscape of Group 3 Medulloblastoma. Cancer Discov 7:288-301
Kim, Jong Wook; Abudayyeh, Omar O; Yeerna, Huwate et al. (2017) Decomposing Oncogenic Transcriptional Signatures to Generate Maps of Divergent Cellular States. Cell Syst 5:105-118.e9
Carlin, Daniel; Kosnicki, Kassi; Garamszegi, Sara et al. (2017) A multi-tool recipe to identify regions of protein-DNA binding and their influence on associated gene expression. F1000Res 6:784
Huang, Franklin W; Mosquera, Juan Miguel; Garofalo, Andrea et al. (2017) Exome Sequencing of African-American Prostate Cancer Reveals Loss-of-Function ERF Mutations. Cancer Discov 7:973-983
Zhu, Xiaodong; Girardo, David; Govek, Eve-Ellen et al. (2016) Role of Tet1/3 Genes and Chromatin Remodeling Genes in Cerebellar Circuit Formation. Neuron 89:100-12

Showing the most recent 10 out of 20 publications