CREATION AND APPLICATION OF A DIABETES KNOWLEDGE BASE

The applicant is an Instructor in Pediatrics at Harvard Medical School and an associate in bioinformatics and pediatric endocrinology at Children's Hospital, Boston. The applicant completed an NLM-funded fellowship in informatics and received a Master's degree in Medical Informatics from MIT. Since completing his fellowship less than two years ago, he has first-authored six publications, co-authored eight publications, senior-authored two publications, and co-authored a book on microarray analysis. The applicant plans to pursue a career in basic research in diabetes genomics and bioinformatics, with a joint appointment in an academic pediatric endocrinology department and a medical informatics program. The mentor is Dr. Isaac Kohane, director of the Children's Hospital Informatics Program, with a staff of 20, including 10 faculty, and extensive computational resources, funded through several NIH grants.

The past 10 years have produced a variety of measurement tools in molecular biology that are near-comprehensive in nature. For example, RNA expression detection microarrays can provide systematic quantitative information on the expression of over 40,000 unique RNAs within cells. Yet microarrays are just one of at least 30 large-scale measurement or experimental modalities available to investigators in molecular biology. We see scientific value in being able to integrate multiple large-scale data sets from all biological modalities to address biomedical questions that could not otherwise be answered. We recognize that the full agenda of working out the details for all possible inferential processes between all near-comprehensive modalities is too large. The goal of this project is to serve as a model automated system for gathering data related to a particular experimental characteristic and applying inferential operators to these data. For this application, we are focusing on a pragmatic subset. Specifically, we propose intersecting near-comprehensive data sets by phenotype, and intersecting lists of significant and related genes within these data sets in an automated manner. The central hypothesis for this application is that integrating large-scale data sets across measurement modalities is a synergistic process that creates new knowledge and testable hypotheses in the area of diabetes, and that inferential processes involving intersection across genes can be automated.
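The intersection step proposed in the abstract can be illustrated with a minimal sketch. It is not the project's actual system. The record layout (`modality`, `phenotype`, `genes`), the function name `intersect_by_phenotype`, and the placeholder gene identifiers are all assumptions made for illustration. The sketch groups gene lists by phenotype and keeps only genes reported by every measurement modality for that phenotype.

```python
from collections import defaultdict


def intersect_by_phenotype(datasets):
    """Return, per phenotype, the genes shared by all modalities.

    `datasets` is a list of dicts with hypothetical keys
    'modality', 'phenotype', and 'genes' (an iterable of gene IDs).
    Phenotypes covered by fewer than two modalities are skipped,
    because there is nothing to intersect across.
    """
    by_phenotype = defaultdict(dict)
    for ds in datasets:
        modality_sets = by_phenotype[ds["phenotype"]]
        # Several data sets from one modality are pooled (set union).
        modality_sets.setdefault(ds["modality"], set()).update(ds["genes"])

    shared = {}
    for phenotype, modality_sets in by_phenotype.items():
        if len(modality_sets) < 2:
            continue
        shared[phenotype] = set.intersection(*modality_sets.values())
    return shared


# Placeholder records; gene IDs are illustrative, not real findings.
datasets = [
    {"modality": "microarray", "phenotype": "type 2 diabetes",
     "genes": ["GENE_A", "GENE_B", "GENE_C"]},
    {"modality": "linkage", "phenotype": "type 2 diabetes",
     "genes": ["GENE_A", "GENE_C", "GENE_D"]},
    {"modality": "proteomics", "phenotype": "type 2 diabetes",
     "genes": ["GENE_C", "GENE_A", "GENE_E"]},
    {"modality": "microarray", "phenotype": "obesity",
     "genes": ["GENE_B"]},
]

result = intersect_by_phenotype(datasets)
for phenotype, genes in result.items():
    print(phenotype, sorted(genes))
```

Pooling data sets within a modality before intersecting across modalities is one possible design choice. A stricter variant would require a gene to appear in every individual data set.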

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Career Transition Award (K22)
Project #
5K22LM008261-04
Application #
7240579
Study Section
Special Emphasis Panel (ZLM1-HS-B (O1))
Program Officer
Ye, Jane
Project Start
2005-01-15
Project End
2008-05-31
Budget Start
2007-06-01
Budget End
2008-05-31
Support Year
4
Fiscal Year
2007
Total Cost
$153,843
Indirect Cost
Name
Stanford University
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94305
Kodama, Keiichi; Horikoshi, Momoko; Toda, Kyoko et al. (2012) Expression-based genome-wide association study links the receptor CD44 in adipose tissue with type 2 diabetes. Proc Natl Acad Sci U S A 109:7049-54
Dudley, Joel T; Chen, Rong; Sanderford, Maxwell et al. (2012) Evolutionary meta-analysis of association studies reveals ancient constraints affecting disease marker discovery. Mol Biol Evol 29:2087-94
Morgan, Alexander A; Dudley, Joel T; Deshpande, Tarangini et al. (2010) Dynamism in gene expression across multiple studies. Physiol Genomics 40:128-40
Davydov, Eugene V; Goode, David L; Sirota, Marina et al. (2010) Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput Biol 6:e1001025
Li, Li; Wadia, Persis; Chen, Rong et al. (2009) Identifying compartment-specific non-HLA targets after renal transplantation by integrating transcriptome and "antibodyome" measures. Proc Natl Acad Sci U S A 106:4148-53
Sirota, Marina; Schaub, Marc A; Batzoglou, Serafim et al. (2009) Autoimmune disease classification by inverse association with SNP alleles. PLoS Genet 5:e1000792
Liu, Yueyi I; Wise, Paul H; Butte, Atul J (2009) The "etiome": identification and clustering of human disease etiological factors. BMC Bioinformatics 10 Suppl 2:S14
Dudley, Joel T; Tibshirani, Robert; Deshpande, Tarangini et al. (2009) Disease signatures are robust across tissues and experiments. Mol Syst Biol 5:307
Chiang, A P; Butte, A J (2009) Systematic evaluation of drug-disease relationships to identify leads for novel drug uses. Clin Pharmacol Ther 86:507-10
Dudley, Joel; Butte, Atul J (2008) Enabling integrative genomic analysis of high-impact human diseases through text mining. Pac Symp Biocomput :580-91

Showing the most recent 10 out of 20 publications