With the completion of the Human Genome Project, there is a need to translate genome-era discoveries into clinical utility. One difficulty in making bench-to-bedside translations with gene-expression and proteomic data is our current inability to relate these findings to each other and to clinical measurements. A translational researcher studying a particular biological process using microarrays or proteomics will want to gather as many relevant publicly available data sets as possible to compare findings. Translational investigators wanting to relate clinical or chemical data to multiple genomic or proteomic measurements will want to find and join related data sets. Unfortunately, finding and joining relevant data sets is particularly challenging today, because the useful annotations of these data are still represented only as unstructured free text, which limits their secondary use. A question we have sought to answer is whether prior investments in biomedical ontologies can provide leverage in determining the context of genomic data in an automated manner, thereby enabling integration of gene expression and proteomic data and the secondary use of genomic data in multiple fields of research beyond those for which the data sets were originally targeted. The three specific aims to address this question are to (1) develop tools that comprehensively map contextual annotations to the largest biomedical ontology, the Unified Medical Language System (UMLS), built and supported by the National Library of Medicine, and validate and disseminate the mappings, (2) execute a four-pronged strategy to evaluate experiment-concept mappings, and (3) apply experiment-context mappings to find and integrate data within and across microarray and proteomics repositories. To keep these tools relevant to biomedical investigators, we have included three Driving Biological Projects (DBPs), in the domains of breast cancer, organ transplantation, and T-cell biology.
To accomplish these DBPs, our tools and mappings will be used to find and join experimental data within and across microarray and proteomic repositories. Having DBPs to address will focus our development on a set of scalable tools that can access and analyze experimental data covering a large variety of diseases. Through our advisory committee of world-renowned NIH-funded investigators, we will ensure that our findings have broad applicability and are useful to a wide variety of biomedical researchers.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Stanford University
Schools of Medicine
United States
Kodama, Keiichi; Zhao, Zhiyuan; Toda, Kyoko et al. (2016) Expression-Based Genome-Wide Association Study Links Vitamin D-Binding Protein With Autoantigenicity in Type 1 Diabetes. Diabetes 65:1341-9
Kodama, Keiichi; Toda, Kyoko; Morinaga, Shojiroh et al. (2015) Anti-CD44 antibody treatment lowers hyperglycemia and improves insulin resistance, adipose inflammation, and hepatic steatosis in diet-induced obese mice. Diabetes 64:867-75
Corona, Erik; Chen, Rong; Sikora, Martin et al. (2013) Analysis of the genetic basis of disease in the context of worldwide human relationships and migration. PLoS Genet 9:e1003447
Hsu, Irving; Chen, Rong; Ramesh, Aditya et al. (2013) Systematic identification of DNA variants associated with ultraviolet radiation using a novel Geographic-Wide Association Study (GeoWAS). BMC Med Genet 14:62
Kodama, Keiichi; Tojjar, Damon; Yamada, Satoru et al. (2013) Ethnic differences in the relationship between insulin sensitivity and insulin response: a systematic review and meta-analysis. Diabetes Care 36:1789-96
Patel, Chirag J; Chen, Rong; Kodama, Keiichi et al. (2013) Systematic identification of interaction effects between genome- and environment-wide associations in type 2 diabetes mellitus. Hum Genet 132:495-508
Morgan, Alexander A; Chen, Rong; Butte, Atul Janardhan (2012) Clinical utility of sequence-based genotype compared with that derivable from genotyping arrays. J Am Med Inform Assoc 19:e21-7
Patel, Chirag J; Cullen, Mark R; Ioannidis, John P A et al. (2012) Systematic evaluation of environmental factors: persistent pollutants and nutrients correlated with serum lipid levels. Int J Epidemiol 41:828-43
Kang, H P; Yang, X; Chen, R et al. (2012) Integration of disease-specific single nucleotide polymorphisms, expression quantitative trait loci and coexpression networks reveal novel candidate genes for type 2 diabetes. Diabetologia 55:2205-13
Khatri, Purvesh; Sirota, Marina; Butte, Atul J (2012) Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput Biol 8:e1002375

Showing the most recent 10 out of 48 publications