The broad long-term objectives of this proposal are to collaborate with a group of biology researchers with real-world needs to develop and distribute a general-purpose system (BioMediator) that permits integration and analysis of diverse types of biologic data. BioMediator will combine information from a variety of public and private sources (e.g., experimental data) to help answer biologic questions. BioMediator builds on the foundations laid by the currently funded GeneSeek data integration system. The GeneSeek system was originally developed to query only public domain data sources (both structured and semi-structured) to assist in the curation of the GeneClinics genetic testing knowledge base.
The specific aims leading to the development of the BioMediator system are: 1) Interface to additional public domain biological data sources (e.g., pathway databases, function databases). 2) Incorporate access to private databases of experimental results (e.g., proteomics and expression array data). 3) Extend the model to include analytic tools operating across distributed biological data sources (e.g., across a set of both proteomic and expression array data). 4) Evolve the centralized BioMediator system into a model peer-to-peer data sharing and analysis system. 5) Distribute and maintain BioMediator production software as a resource for the biological community.
The health relatedness of the project is that biologists seeking to understand the molecular basis of human health and disease are struggling with large and increasing volumes of diverse data (mutation, expression array, proteomic) that need to be brought together (integrated) and analyzed in order to develop and test hypotheses about disease mechanisms and normal physiology.
The research design is to develop BioMediator by combining and leveraging recent developments in a) the domain of open source analytic tools for biologic data and b) ongoing theoretical and applied research by members of the current GeneSeek research team on both general-purpose and biologic data integration systems. The methods are: a) to use an iterative rapid prototyping software development model evaluated in a real-world test bed and b) to expand the existing GeneSeek research team (with expertise in informatics, computer science, and software development) to include biological expertise (four biologists forming a biology working group) and biostatistics expertise. The goal is to ensure that the BioMediator system 1) meets the needs of a group of end users acquiring, integrating, and analyzing diverse biologic data sets, and 2) does so in a scalable and expandable manner, drawing on the latest theoretical developments in data analysis and integration.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG002288-05
Application #
6805962
Study Section
Genome Study Section (GNM)
Program Officer
Good, Peter J
Project Start
2000-08-15
Project End
2006-08-31
Budget Start
2004-09-01
Budget End
2005-08-31
Support Year
5
Fiscal Year
2004
Total Cost
$322,326
Indirect Cost
Name
University of Washington
Department
Pediatrics
Type
Schools of Medicine
DUNS #
605799469
City
Seattle
State
WA
Country
United States
Zip Code
98195
Cadag, Eithon; Tarczy-Hornoch, Peter; Myler, Peter J (2012) Learning virulent proteins from integrated query networks. BMC Bioinformatics 13:321
Sarkar, Indra Neil; Butte, Atul J; Lussier, Yves A et al. (2011) Translational bioinformatics: linking knowledge across biological and clinical realms. J Am Med Inform Assoc 18:354-7
Shen, Terry H; Tarczy-Hornoch, Peter; Detwiler, Landon T et al. (2010) Evaluation of probabilistic and logical inference for a SNP annotation system. J Biomed Inform 43:407-18
Lacson, Ronilda; Pitzer, Erik; Hinske, Christian et al. (2009) Evaluation of a large-scale biomedical data annotation initiative. BMC Bioinformatics 10 Suppl 9:S10
Shen, Terry H; Carlson, Christopher S; Tarczy-Hornoch, Peter (2009) Evaluating the accuracy of a functional SNP annotation system. BMC Bioinformatics 10 Suppl 9:S11
Shen, Terry H; Carlson, Christopher S; Tarczy-Hornoch, Peter (2009) SNPit: a federated data integration system for the purpose of functional SNP annotation. Comput Methods Programs Biomed 95:181-9
Louie, Brenton; Tarczy-Hornoch, Peter; Higdon, Roger et al. (2008) Validating annotations for uncharacterized proteins in Shewanella oneidensis. OMICS 12:211-5
(2007) Bio*Medical Informatics and Genomic Medicine: Research and Training. J Biomed Inform 40:1-4
Cadag, Eithon; Louie, Brent; Myler, Peter J et al. (2007) Biomediator data integration and inference for functional annotation of anonymous sequences. Pac Symp Biocomput :343-54
Anderson, Nicholas R; Lee, E Sally; Brockenbrough, J Scott et al. (2007) Issues in biomedical research data management and analysis: needs and barriers. J Am Med Inform Assoc 14:478-88
