Interoperation of Genome Databases and Tools

Cheung, Kei-Hoi

Abstract

This application for an NIH Mentored Quantitative Research Career Award requests support for Dr. Kei-Hoi Cheung as he embarks on a faculty career focused on genome-related bioinformatics. It presents a research career development plan in bioinformatics, bridging computer science and biology. The plan includes two partially overlapping phases: (1) a didactic phase that emphasizes training, including coursework and laboratory work in genetics and genomics to complement Dr. Cheung's doctoral training in Computer Science, and (2) a development phase that focuses on intensive development of the proposed research. Both phases will be closely supervised by a steering committee of senior scientists in biology and bioinformatics, who will serve as mentors or advisors.

The Human Genome Project and rapid advances in genomic technology (e.g., microarrays) have produced numerous local, national, and international genome databases, many of which are Web-accessible. To answer questions that arise in advanced genome research projects, researchers often need to analyze large amounts of data collected from multiple related databases. It is therefore important to explore (1) how to integrate the databases involved in a flexible and useful fashion and (2) how to perform large-scale data analyses as easily and rapidly as possible. To this end, we propose two complementary approaches.

1. The problem of data integration or interoperation is difficult because of the syntactic and semantic heterogeneities involved. To address this problem, we propose a metadata-driven approach using eXtensible Markup Language (XML), which incorporates standardized vocabulary to map heterogeneous Web-accessible data sets into a common format that facilitates interoperability.

2. To facilitate and speed up analysis of large quantities of data, we will also explore a range of computational techniques, including the use of Turbogenomics, which represents a collaboration with the high-performance computing group within the Yale Department of Computer Science. These techniques allow (i) heterogeneous software components (analysis tools) to be integrated easily and (ii) the power of parallel computing to be exploited.

We will design, develop, test, and evaluate the approach in the context of current database projects, including (1) TRIPLES, which manages data for large-scale yeast genome analysis (with Prof. Snyder), and (2) ALFRED, which stores gene frequency data on different human populations (with Prof. Kidd). We have identified a number of related external Web-accessible databases and tools that users would like to access from TRIPLES and ALFRED in an integrated fashion. We will initially develop and apply our approach to integrate these databases and tools. We will then extend our approach to other types of genomic data, such as microarray data, which both laboratories and others will soon be generating in large quantities.
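As an illustration only, the metadata-driven mapping described in the abstract might look something like the following minimal Python sketch. It is not the project's implementation. The source names (`source_a`, `source_b`), the local field names, the shared vocabulary terms (`gene_id`, `description`), and the functions `to_common_xml` and `integrate` are all hypothetical, chosen to show how per-source metadata can drive the conversion of differently structured records into one common XML format.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata: for each data source, a map from its local
# field names to terms in a shared, standardized vocabulary.
FIELD_MAPS = {
    "source_a": {"orf": "gene_id", "desc": "description"},
    "source_b": {"locus": "gene_id", "summary": "description"},
}


def to_common_xml(source, record):
    """Convert one source-specific record (a dict) into a common <record> element."""
    mapping = FIELD_MAPS[source]
    elem = ET.Element("record", {"source": source})
    for local_name, value in record.items():
        common_name = mapping.get(local_name)
        if common_name is None:
            continue  # fields with no entry in the metadata are skipped
        child = ET.SubElement(elem, common_name)
        child.text = str(value)
    return elem


def integrate(records):
    """Combine (source, record) pairs into a single <records> document."""
    root = ET.Element("records")
    for source, record in records:
        root.append(to_common_xml(source, record))
    return root


if __name__ == "__main__":
    merged = integrate([
        ("source_a", {"orf": "YAL001C", "desc": "example entry"}),
        ("source_b", {"locus": "L1", "summary": "another entry"}),
    ])
    print(ET.tostring(merged, encoding="unicode"))
```

Because the mapping lives in data rather than in code, supporting an additional source in this sketch means adding one entry to `FIELD_MAPS`, which is the kind of flexibility a metadata-driven design is meant to provide.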

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Mentored Quantitative Research Career Development Award (K25)
Project #: 5K25HG002378-03
Application #: 6649804
Study Section: Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer: Good, Peter J

Project Start: 2001-09-01
Project End: 2006-08-31
Budget Start: 2003-09-01
Budget End: 2004-08-31
Support Year: 3
Fiscal Year: 2003
Total Cost: $141,992
Indirect Cost

Institution

Name: Yale University
Department: Anesthesiology
Type: Schools of Medicine
DUNS #: 043207562

City: New Haven
State: CT
Country: United States
Zip Code: 06520

Related projects


NIH 2005 K25 HG	Interoperation of Genome Databases and Tools Cheung, Kei-Hoi / Yale University	$146,210
NIH 2004 K25 HG	Interoperation of Genome Databases and Tools Cheung, Kei-Hoi / Yale University	$144,077
NIH 2003 K25 HG	Interoperation of Genome Databases and Tools Cheung, Kei-Hoi / Yale University	$141,992
NIH 2002 K25 HG	Interoperation of Genome Databases and Tools Cheung, Kei-Hoi / Yale University	$143,037
NIH 2001 K25 HG	Interoperation of Genome Databases and Tools Cheung, Kei-Hoi / Yale University	$147,734

Publications

Crasto, Chiquito J; Marenco, Luis N; Liu, Nian et al. (2007) SenseLab: new developments in disseminating neuroscience information. Brief Bioinform 8:150-62

Lam, Hugo Y K; Marenco, Luis; Clark, Tim et al. (2007) AlzPharm: integration of neurodegeneration data using RDF. BMC Bioinformatics 8 Suppl 3:S4

Smith, Andrew K; Cheung, Kei-Hoi; Yip, Kevin Y et al. (2007) LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics. BMC Bioinformatics 8 Suppl 3:S5

Yip, Kevin Y; Qi, Peishen; Schultz, Martin et al. (2006) SemBiosphere: a semantic web approach to recommending microarray clustering services. Pac Symp Biocomput :188-99

Lam, Hugo Y K; Marenco, Luis; Shepherd, Gordon M et al. (2006) Using web ontology language to integrate heterogeneous databases in the neurosciences. AMIA Annu Symp Proc :464-8

Carriero, Nicholas; Osier, Michael V; Cheung, Kei-Hoi et al. (2005) A high productivity/low maintenance approach to high-performance computation for biomedicine: four case studies. J Am Med Inform Assoc 12:90-8

Cheung, Kei-Hoi; Yip, Kevin Y; Smith, Andrew et al. (2005) YeastHub: a semantic web use case for integrating data in the life sciences domain. Bioinformatics 21 Suppl 1:i85-96

Osier, Michael V; Zhao, Hongyu; Cheung, Kei-Hoi (2004) Handling multiple testing while interpreting microarrays with the Gene Ontology Database. BMC Bioinformatics 5:124

de Knikker, Remko; Guo, Youjun; Li, Jin-Long et al. (2004) A web services choreography scenario for interoperating bioinformatics applications. BMC Bioinformatics 5:25

Cheung, Kei-Hoi; de Knikker, Remko; Guo, Youjun et al. (2004) Biosphere: the interoperation of web services in microarray cluster analysis. Appl Bioinformatics 3:253-6

Showing the most recent 10 out of 14 publications
