Analysis and Annotation Pipeline for Functional Genomics

Ochs, Michael

Abstract

This grant proposes the development of an extensible, scalable, automated data analysis pipeline for functional genomics data. Functional genomics, including microarrays and proteomics, is evolving quickly, with data sets increasing rapidly in size and new analysis methodologies appearing monthly. Because there are no de facto standards for addressing typical experimental questions, applying multiple analyses is desirable but rarely done because of the effort required. Furthermore, the analysis of functional genomics data is generally a multi-step process, with many possible methods in use at each step (e.g., image analysis, data normalization, statistical analysis, data mining), leading to a combinatorial explosion of effort when multiple analyses are used. The functional genomics data pipeline proposed in this application will perform multiple analyses automatically, will be easily extensible with new functions and data types, will supply adequate computational power through a distributed computing environment, and will integrate automated annotation so that analyses can be guided by biological knowledge. The system will use Enterprise JavaBeans to provide a robust server architecture, JavaServer Pages for dynamic generation of web interfaces, and object-oriented design patterns to optimize the software architecture. The system will be extensible during operation through use of the Strategy design pattern coupled to the Java reflection mechanism. Functional genomics data sets will be encapsulated within data objects that include links to the NCI caBIO objects in order to use the NCI Center for Bioinformatics data resources. In addition, annotations will be retrievable from web sites and through the Distributed Annotation System. Documentation and testing will proceed in parallel with development, and end users will be involved during design and deployment to tune the user interface.
The final system will provide dramatic improvements in researchers' abilities to fully explore their growing data sets and to interpret their experimental results in light of the larger biological knowledge bases. It will be fully supported and released to the community as open source.
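
The abstract's idea of extending the system during operation, by coupling the Strategy design pattern to Java reflection, can be illustrated with a minimal sketch. This is not the project's code: the names `PipelineSketch`, `AnalysisStrategy`, `Log2Transform`, and `MedianCenter` are hypothetical and chosen only for illustration. Each analysis step implements a common interface, and the pipeline instantiates steps from class names supplied at run time, so a new method could be added without changing the pipeline itself.

```java
import java.util.Arrays;

// Hypothetical pipeline driver; all names in this sketch are assumptions.
class PipelineSketch {
    // Load a strategy by class name through reflection.
    static AnalysisStrategy load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(AnalysisStrategy.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load strategy: " + className, e);
        }
    }

    // Apply the named strategies in order, feeding each output to the next step.
    static double[] run(double[] data, String... strategyClassNames) {
        double[] current = data;
        for (String name : strategyClassNames) {
            current = load(name).apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // In a deployed system the names would come from configuration or user input.
        double[] result = run(new double[] {1.0, 2.0, 4.0},
                Log2Transform.class.getName(), MedianCenter.class.getName());
        System.out.println(Arrays.toString(result));
    }
}

// Strategy interface: one analysis step applied to a vector of measurements.
interface AnalysisStrategy {
    double[] apply(double[] values);
}

// Illustrative strategy: base-2 logarithm of every value.
class Log2Transform implements AnalysisStrategy {
    public double[] apply(double[] values) {
        return Arrays.stream(values).map(v -> Math.log(v) / Math.log(2)).toArray();
    }
}

// Illustrative strategy: subtract the median from every value.
class MedianCenter implements AnalysisStrategy {
    public double[] apply(double[] values) {
        int n = values.length;
        if (n == 0) {
            return values.clone();
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        return Arrays.stream(values).map(v -> v - median).toArray();
    }
}
```

Because the pipeline depends only on the interface and a class name, running several alternative methods at one step reduces to looping over a list of names, which is one way the multi-analysis goal described above could be approached.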

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Exploratory/Developmental Grants (R21)
Project #: 1R21LM008309-01A1
Application #: 6867168
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2005-02-01
Project End: 2007-01-31
Budget Start: 2005-02-01
Budget End: 2006-01-31
Support Year: 1
Fiscal Year: 2005
Total Cost: $196,948

Institution

Name: Fox Chase Cancer Center
DUNS #: 073724262

City: Philadelphia
State: PA
Country: United States
Zip Code: 19111

Related projects


NIH 2006 R21 LM	Analysis and Annotation Pipeline for Functional Genomics	Ochs, Michael F. / Fox Chase Cancer Center	$230,215
NIH 2005 R21 LM	Analysis and Annotation Pipeline for Functional Genomics	Ochs, Michael F. / Fox Chase Cancer Center	$196,948

Publications

Ochs, Michael F; Casagrande, John T (2008) Information systems for cancer research. Cancer Invest 26:1060-7

Kossenkov, Andrew V; Peterson, Aidan J; Ochs, Michael F (2007) Determining transcription factor activity from microarray data using Bayesian Markov chain Monte Carlo sampling. Stud Health Technol Inform 129:1250-4

Wang, Guoli; Kossenkov, Andrew V; Ochs, Michael F (2006) LS-NMF: a modified non-negative matrix factorization algorithm utilizing uncertainty estimates. BMC Bioinformatics 7:175

Bidaut, Ghislain; Suhre, Karsten; Claverie, Jean-Michel et al. (2006) Determination of strongly overlapping signaling activity from microarray data. BMC Bioinformatics 7:99
