The genomics research community is facing a time of tremendous opportunity and challenge as it deals with the increasing amounts of data produced by ever-improving technology platforms. The vast amounts of data of diverse types (sequence, expression, epigenetic, RNAi screens, etc.) offer the prospect of integrating different views of biological states, and the promise of a deeper mechanistic understanding. However, such integration is out of reach for most biologists. The increasing need to use multiple sophisticated computational methods and tools to gain insights from the wealth of available data poses a major barrier. The number of computational tools and methods has also grown quickly over the past few years, increasing the difficulty of finding the appropriate ones to use and getting them to work easily together. As a result, many biologists rely on the few tools they may already be accustomed to and thus are unable to take advantage of more advanced approaches that could transform biomedical research. In this Program Project, we propose to tackle the challenge of integrative genomic analysis by leveraging the power of new methods for sharing data and applications that are appearing on the World Wide Web in other domains. We will develop GenomeSpace, a modular and extensible computational environment where a wide range of analytical methods and tools can interoperate and be made accessible to biologists. GenomeSpace will be a community-driven site where tool developers can easily share their methods and users can adopt and use them. Thus, while tools will continue their independent development efforts and retain their own look and feel, they will also be able to interoperate with a wealth of other methods, tools, and genomics resources. We will drive the GenomeSpace development in two ways.
Six successful and popular software packages (Genomica, GenePattern, Cytoscape, the Integrative Genomics Viewer, Galaxy, and the UCSC Genome Browser) will seed GenomeSpace with the capability for integrative genomic analysis, while guiding the development of the infrastructure to ensure the widest range of architectures and applications can participate. Three Driving Biological Projects - in cancer genomics, epigenomics, and hematopoiesis - will test the ability of GenomeSpace to perform important analyses in the context of real biological problems. Together, they will ensure we provide unprecedented accessibility of computational analysis for genomics data, thus transforming integrative genomics analysis and biomedical research.

Public Health Relevance

The GenomeSpace community Web environment will put the universe of genomic analysis tools within the reach of all biomedical researchers. Through the integrative analysis of genomic data from diverse sources and types, GenomeSpace users will be able to address a variety of problems at the forefront of biomedical research including patient diagnosis and prognosis, identification of new drug targets, and understanding biological mechanisms.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Program Projects (P01)
Project #
3P01HG005062-04S1
Application #
8720223
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Bonazzi, Vivien
Project Start
2009-09-11
Project End
2014-06-30
Budget Start
2012-07-01
Budget End
2014-06-30
Support Year
4
Fiscal Year
2013
Total Cost
$541,000
Indirect Cost
$187,959
Name
Broad Institute, Inc.
Department
Type
DUNS #
623544785
City
Cambridge
State
MA
Country
United States
Zip Code
02142
Benitez, Cecil M; Qu, Kun; Sugiyama, Takuya et al. (2014) An integrated cell purification and genomics strategy reveals multiple regulators of pancreas development. PLoS Genet 10:e1004645
Lee, Mark N; Roy, Matthew; Ong, Shao-En et al. (2013) Identification of regulators of the innate immune response to cytosolic DNA and retroviral infection by an integrative approach. Nat Immunol 14:179-85
Adiconis, Xian; Borges-Rivera, Diego; Satija, Rahul et al. (2013) Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat Methods 10:623-9
Meyer, Laurence R; Zweig, Ann S; Hinrichs, Angie S et al. (2013) The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res 41:D64-9
Kuhn, Robert M; Haussler, David; Kent, W James (2013) The UCSC genome browser and associated tools. Brief Bioinform 14:144-61
Dreszer, Timothy R; Karolchik, Donna; Zweig, Ann S et al. (2012) The UCSC Genome Browser database: extensions and updates 2011. Nucleic Acids Res 40:D918-23
Fujita, Pauline A; Rhead, Brooke; Zweig, Ann S et al. (2011) The UCSC Genome Browser database: update 2011. Nucleic Acids Res 39:D876-82
Smoot, Michael E; Ono, Keiichiro; Ruscheinski, Johannes et al. (2011) Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27:431-2