A molecular understanding of biological systems is crucial for the development of innovative medical advances from novel genomic technologies. The systems annotation currently available in public pathway databases is still too limited for many of its envisioned applications in medical research. Recently, our ability to identify novel genes has outstripped our capacity to analyze these genes' interactions experimentally. Thus, algorithms are being developed to infer biological networks by computational analysis of genomic data. A major factor limiting the integration of computationally derived networks in pathway databases is the lack of common quantitative methods for evaluating and comparing these networks. There is currently no established standard for comparison that is objective, statistically grounded, and scalable. A network evaluation method with these properties would spur measurable improvement in our systems biology resources and a concomitant improvement in our ability to translate genomic data into practical medical knowledge.
The objective of this application is to develop software and methods that use information in online databases to evaluate biological networks.
In Aim 1 we will develop, assess, and select statistical evaluation methods that incorporate probabilistic network models and reflect our confidence in the quality of the validation data. Our statistical approach will report the probability of seeing such confirmation data by chance in a random network. We will validate our methods in part using networks our collaborators have derived from unpublished data, avoiding bias since the source data cannot yet be represented in the online data sources we are mining.
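As a rough illustration of what "the probability of seeing such confirmation data by chance in a random network" could mean, the sketch below computes a hypergeometric tail probability. It assumes the simplest possible null model, in which a random network picks its edges uniformly from all candidate gene pairs; this is an assumption made for illustration, not the method proposed here, and it ignores the probabilistic network models and confidence weighting that Aim 1 describes. The function name and parameters are hypothetical.

```python
from math import comb


def confirmation_p_value(total_pairs, supported_pairs, network_edges, confirmed_edges):
    """Hypothetical helper: chance of at least `confirmed_edges` confirmations.

    Assumed null model (not taken from the proposal): a random network draws
    `network_edges` edges uniformly, without replacement, from `total_pairs`
    candidate gene pairs, of which `supported_pairs` have validation data.
    The number of confirmed edges is then hypergeometric.
    """
    if not 0 <= supported_pairs <= total_pairs:
        raise ValueError("supported_pairs must lie between 0 and total_pairs")
    if not 0 <= network_edges <= total_pairs:
        raise ValueError("network_edges must lie between 0 and total_pairs")

    lowest = max(confirmed_edges, 0)
    highest = min(supported_pairs, network_edges)

    # Count the edge sets with k confirmed edges, for every k >= confirmed_edges.
    favourable = sum(
        comb(supported_pairs, k) * comb(total_pairs - supported_pairs, network_edges - k)
        for k in range(lowest, highest + 1)
    )
    return favourable / comb(total_pairs, network_edges)


if __name__ == "__main__":
    # Toy numbers: 10 candidate pairs, 3 with supporting data,
    # a 4-edge network in which 2 edges are confirmed.
    print(confirmation_p_value(10, 3, 4, 2))
```

A small p-value under this toy model would indicate that the network's edges overlap the validation data more than uniformly random edges would; a realistic null model would also need to preserve properties such as node degree.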
In Aim 2 we will implement network evaluation software that incorporates text-based and experimental validation data mined from links in the Entrez Gene database. Our software will report the statistical significance of the input networks and will provide direct links from each network edge to the relevant supporting data used in the evaluation. With the completion of the proposed work, we expect to offer the community an open-source computational resource for network evaluation that is quantitative, scalable, informative, and grounded in peer-reviewed published experimental findings.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21LM009411-01A1
Application #
7314441
Study Section
Special Emphasis Panel (ZLM1-ZH-H (M3))
Program Officer
Ye, Jane
Project Start
2007-09-30
Project End
2009-09-29
Budget Start
2007-09-30
Budget End
2008-09-29
Support Year
1
Fiscal Year
2007
Total Cost
$181,052
Indirect Cost
Name
Tufts University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
073134835
City
Medford
State
MA
Country
United States
Zip Code
02155
Fox, Andrew D; Hescott, Benjamin J; Blumer, Anselm C et al. (2011) Connectedness of PPI network neighborhoods identifies regulatory hub proteins. Bioinformatics 27:1135-42
Hescott, B J; Leiserson, M D M; Cowen, L J et al. (2010) Evaluating between-pathway models with expression data. J Comput Biol 17:477-87
Fox, Andrew D; Baumgartner, William A; Johnson, Helen L et al. (2010) Mining Protein-Protein Interactions from GeneRIFs with OpenDMAP. Lect Notes Comput Sci 6004:43-52
Przytycka, Teresa M; Singh, Mona; Slonim, Donna K (2010) Toward the dynamic interactome: it's about time. Brief Bioinform 11:15-29
Fox, A; Taylor, D; Slonim, D K (2009) High throughput interaction data reveals degree conservation of hub proteins. Pac Symp Biocomput :391-402
Slonim, Donna K; Yanai, Itai (2009) Getting started in gene expression microarray analysis. PLoS Comput Biol 5:e1000543