Crowd-sourcing A STAR Functional Genomic Characterization of Cancer with Open Big Data

Hadley, Dexter

Abstract

There are data from over 1.6 million open digital samples in the NCBI's Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov/geo). GEO houses high-quality experiments that the NIH has funded to measure the functional genomics of diseased and healthy individual samples. The majority of these samples interrogate cancer phenotypes and can be used to better characterize the genomics of that disease. However, GEO digital samples lack any structured biological annotations (bioannotations) because they are variably described across different experiments by free-text attributes. This proposal is about using the Search Tag Analyze Resource (STAR) as a genomics discovery platform to crowdsource the precise bioannotation of this open Big Data. We will demonstrate the utility of a well-structured GEO to better characterize cancer functional genomics and to estimate a robust molecular nosology across the disease. The robust gene signatures we define are a first step towards a more comprehensive genomic understanding of the spectrum of the disease and towards novel drug and biomarker discoveries. Therefore, successful funding and completion of this work has the potential to improve translational discoveries that greatly reduce the burden of disease on patients and thus improve the overall health and wellbeing of society.

Public Health Relevance

This proposal is about crowdsourcing a deeper molecular understanding of cancer with open functional genomics data from the NCBI's Gene Expression Omnibus (GEO). These data contain over 1.6 million digital samples across a great many diseases that can be mined for translational discovery and clinical impact. Although this Big Data is rich in content, it is difficult to interpret for molecular characteristics that can readily translate into novel drugs and biomarkers for disease. This is because samples are poorly described by unstructured free-text attributes with little biological semantics or interpretable meaning. We previously built the Search Tag Analyze Resource (STAR; stargeo.org) as an online tool for anyone to bioannotate this data uniformly across studies to characterize disease genomics. Here, we will investigate how to drive precision in the STAR bioannotation process, how to characterize cancer genomics on a massive scale, and how to compare and contrast the performance of GEO relative to other open functional genomics cancer datasets.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Exploratory/Developmental Cooperative Agreement Phase I (UH2)
Project #: 5UH2CA203792-02
Application #: 9243231
Study Section: Special Emphasis Panel (ZRG1-BST-U (50)R)
Program Officer: Miller, David J

Project Start: 2016-04-01
Project End: 2018-03-31
Budget Start: 2017-04-01
Budget End: 2018-03-31
Support Year: 2
Fiscal Year: 2017
Total Cost: $317,000
Indirect Cost: $117,000

Institution

Name: University of California San Francisco
Department: Pediatrics
Type: Schools of Medicine
DUNS #: 094878337

City: San Francisco
State: CA
Country: United States
Zip Code: 94118

Related projects
NIH 2017 UH2 CA	Crowd-sourcing A STAR Functional Genomic Characterization of Cancer with Open Big Data Hadley, Dexter D. / University of California San Francisco	$317,000
NIH 2016 UH2 CA	Crowd-sourcing A STAR Functional Genomic Characterization of Cancer with Open Big Data Hadley, Dexter D. / University of California San Francisco

Publications

Hadley, Dexter; Pan, James; El-Sayed, Osama et al. (2017) Precision annotation of digital samples in NCBI's gene expression omnibus. Sci Data 4:170125

Himmelstein, Daniel Scott; Lizee, Antoine; Hessler, Christine et al. (2017) Systematic integration of biomedical knowledge prioritizes drugs for repurposing. Elife 6:

Chen, Bin; Sirota, Marina; Fan-Minogue, Hua et al. (2015) Relating hepatocellular carcinoma tumor samples and cell lines using gene expression data in translational research. BMC Med Genomics 8 Suppl 2:S5
