Genome-wide expression analysis has become a routine and powerful tool in biomedical research. However, extracting the full biological insight contained in such data remains a major challenge. Knowledge-based approaches have the potential to accelerate the interpretation of results and the generation of hypotheses, which can then be experimentally validated. We have recently developed a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing not on single genes but on gene sets: groups of genes that share a common biological function, chromosomal location, or regulation. GSEA has proven successful in providing insight into a variety of disease-related studies, including studies of diabetes and cancer. We have also created an initial resource, the Molecular Signatures Database (MSigDB), consisting of approximately 1300 annotated gene sets to be used with GSEA. The GSEA software and the MSigDB database are freely distributed as user-friendly, platform-independent software tools to make the power of GSEA available to the entire research community. In the six months since their informal release, over 200 users have downloaded the tools. The goal of this grant is to enhance the GSEA software and the MSigDB database and to ensure their distribution to the research community. The two specific aims thus focus on:
Aim 1. Enhancements to GSEA and MSigDB to Better Support Users and Their Research.
Aim 2. Maintenance and User Support for GSEA and MSigDB.
We have extensive experience in software engineering, including the development and distribution of the GenePattern software, which is used by over 1300 scientists worldwide. We also have a solid history of producing successful user workshops and documentation. This experience, together with our initial GSEA user base, leaves us well positioned to carry out the aims of this proposal.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1-BST-L (51))
Program Officer: Li, Jerry
Broad Institute, Inc.
United States
Archer, Tenley C; Ehrenberger, Tobias; Mundt, Filip et al. (2018) Proteomics, Post-translational Modifications, and Integrative Analyses Reveal Molecular Heterogeneity within Medulloblastoma Subgroups. Cancer Cell 34:396-410.e8
Huang, Justin K; Carlin, Daniel E; Yu, Michael Ku et al. (2018) Systematic Evaluation of Molecular Networks for Discovery of Disease Genes. Cell Syst 6:484-495.e5
Milne, Roger L (see original citation for additional authors) (2017) Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer. Nat Genet 49:1767-1778
Huang, Franklin W; Mosquera, Juan Miguel; Garofalo, Andrea et al. (2017) Exome Sequencing of African-American Prostate Cancer Reveals Loss-of-Function ERF Mutations. Cancer Discov 7:973-983
Silterra, Jacob; Gillette, Michael A; Lanaspa, Miguel et al. (2017) Transcriptional Categorization of the Etiology of Pneumonia Syndrome in Pediatric Patients in Malaria-Endemic Areas. J Infect Dis 215:312-320
Viswanathan, Vasanthi S; Ryan, Matthew J; Dhruv, Harshil D et al. (2017) Dependency of a therapy-resistant state of cancer cells on a lipid peroxidase pathway. Nature 547:453-457
Boulay, Gaylor; Awad, Mary E; Riggi, Nicolo et al. (2017) OTX2 Activity at Distal Regulatory Elements Shapes the Chromatin Landscape of Group 3 Medulloblastoma. Cancer Discov 7:288-301
Michailidou, Kyriaki (see original citation for additional authors) (2017) Association analysis identifies 65 new breast cancer risk loci. Nature 551:92-94
Kim, Jong Wook; Abudayyeh, Omar O; Yeerna, Huwate et al. (2017) Decomposing Oncogenic Transcriptional Signatures to Generate Maps of Divergent Cellular States. Cell Syst 5:105-118.e9
Hachigian, Lea J; Carmona, Vitor; Fenster, Robert J et al. (2017) Control of Huntington's Disease-Associated Phenotypes by the Striatum-Enriched Transcription Factor Foxp2. Cell Rep 21:2688-2695

Showing the most recent 10 out of 65 publications