Integrating Microarray and Proteomic Data by Ontology-based Annotation

Butte, Atul

Abstract

With the completion of the Human Genome Project, there is a need to translate genome-era discoveries into clinical utility. One difficulty in making bench-to-bedside translations with gene-expression and proteomic data is our current inability to relate these findings to each other and to clinical measurements. A translational researcher studying a particular biological process using microarrays or proteomics will want to gather as many relevant publicly available data sets as possible to compare findings. Translational investigators wanting to relate clinical or chemical data to multiple genomic or proteomic measurements will want to find and join related data sets. Unfortunately, finding and joining relevant data sets is particularly challenging today, as the useful annotations of these data are still represented only as unstructured free text, limiting their secondary use. A question we have sought to answer is whether prior investments in biomedical ontologies can provide leverage in determining the context of genomic data in an automated manner, thereby enabling integration of gene expression and proteomic data and the secondary use of genomic data in multiple fields of research beyond those for which the data sets were originally targeted. The three specific aims to address this question are to (1) develop tools that comprehensively map contextual annotations to the largest biomedical ontology, the Unified Medical Language System (UMLS), built and supported by the National Library of Medicine, and validate and disseminate the mappings, (2) execute a four-pronged strategy to evaluate experiment-concept mappings, and (3) apply experiment-context mappings to find and integrate data within and across microarray and proteomics repositories. To keep these tools relevant to biomedical investigators, we have included three Driving Biological Projects (DBPs), in the domains of breast cancer, organ transplantation, and T-cell biology.
To accomplish these DBPs, our tools and mappings will be used to find and join experimental data within and across microarray and proteomic repositories. Having DBPs to address will focus our development on a set of scalable tools that can access and analyze experimental data covering a large variety of diseases. Through our advisory committee of world-renowned NIH-funded investigators, we will ensure that our findings have broad applicability and are useful to a wide variety of biomedical researchers.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM009719-03
Application #: 7929664
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2008-09-30
Project End: 2012-09-29
Budget Start: 2010-09-30
Budget End: 2011-09-29
Support Year: 3
Fiscal Year: 2010
Total Cost: $277,200
Indirect Cost

Institution

Name: Stanford University
Department: Pediatrics
Type: Schools of Medicine
DUNS #: 009214214

City: Stanford
State: CA
Country: United States
Zip Code: 94305

Related projects


NIH 2011 R01 LM	Integrating Microarray and Proteomic Data by Ontology-based Annotation	Butte, Atul J. / Stanford University	$266,112
NIH 2010 R01 LM	Integrating Microarray and Proteomic Data by Ontology-based Annotation	Butte, Atul J. / Stanford University	$277,200
NIH 2009 R01 LM	Integrating Microarray and Proteomic Data by Ontology-based Annotation	Butte, Atul J. / Stanford University	$280,000
NIH 2008 R01 LM	Integrating Microarray and Proteomic Data by Ontology-based Annotation	Butte, Atul J. / Stanford University	$280,000

Publications

Kodama, Keiichi; Zhao, Zhiyuan; Toda, Kyoko et al. (2016) Expression-Based Genome-Wide Association Study Links Vitamin D-Binding Protein With Autoantigenicity in Type 1 Diabetes. Diabetes 65:1341-9

Kodama, Keiichi; Toda, Kyoko; Morinaga, Shojiroh et al. (2015) Anti-CD44 antibody treatment lowers hyperglycemia and improves insulin resistance, adipose inflammation, and hepatic steatosis in diet-induced obese mice. Diabetes 64:867-75

Corona, Erik; Chen, Rong; Sikora, Martin et al. (2013) Analysis of the genetic basis of disease in the context of worldwide human relationships and migration. PLoS Genet 9:e1003447

Hsu, Irving; Chen, Rong; Ramesh, Aditya et al. (2013) Systematic identification of DNA variants associated with ultraviolet radiation using a novel Geographic-Wide Association Study (GeoWAS). BMC Med Genet 14:62

Kodama, Keiichi; Tojjar, Damon; Yamada, Satoru et al. (2013) Ethnic differences in the relationship between insulin sensitivity and insulin response: a systematic review and meta-analysis. Diabetes Care 36:1789-96

Patel, Chirag J; Chen, Rong; Kodama, Keiichi et al. (2013) Systematic identification of interaction effects between genome- and environment-wide associations in type 2 diabetes mellitus. Hum Genet 132:495-508

Chen, Rong; Dudley, Joel T; Ruau, David et al. (2012) Quantifying multi-ethnic representation in genetic studies of high mortality diseases. AMIA Jt Summits Transl Sci Proc 2012:11-8

Patel, Chirag J; Chen, Rong; Butte, Atul J (2012) Data-driven integration of epidemiological and toxicological data to select candidate interacting genes and environmental factors in association with disease. Bioinformatics 28:i121-6

Kodama, Keiichi; Horikoshi, Momoko; Toda, Kyoko et al. (2012) Expression-based genome-wide association study links the receptor CD44 in adipose tissue with type 2 diabetes. Proc Natl Acad Sci U S A 109:7049-54

Dudley, Joel T; Chen, Rong; Sanderford, Maxwell et al. (2012) Evolutionary meta-analysis of association studies reveals ancient constraints affecting disease marker discovery. Mol Biol Evol 29:2087-94

Showing the most recent 10 out of 48 publications
