In two previous stages of this project, both funded by the National Institute of General Medical Sciences and carried out successfully, we developed GeneWays, a completely automated system that efficiently distills information about molecular interactions from an astronomical number of full-text biomedical articles. The next logical stage of the project is to carry this system from the computational laboratory into a practical, useful, and even indispensable tool that researchers can use to solve complex problems currently posed in experimental medicine and biology. The central hypothesis of our work on GeneWays has been that our computational tools will generate biological predictions of a quality sufficiently high that the biomedical community will invest in serious experimental validation. Specifically, we propose the following.
1. We will improve significantly the precision and recall of the GeneWays system.
2. We will develop and implement a probabilistic belief-network formalism (a belief-graph relative of the Bayesian network formalism that allows us to place and update beliefs on both the vertices and the edges of the graph) for probabilistic reasoning over the large collection of facts in the GeneWays database. We will develop and implement a coordinated collection of methods for computing and updating beliefs on individual nodes and edges of the belief graph.
3. We will develop and implement a mathematical framework for incorporating pathway information into a genetic-linkage analysis formalism in such a way that each piece of pathway knowledge includes a specified degree of confidence.
4. We will process an enormous collection of texts, such as open-access biomedical journals, PubMed abstracts, and the GeneWays corpus, and thus will build a comprehensive GeneHighWays database. We will make the GeneHighWays database easily and freely accessible to academic researchers through a web interface.
We will evaluate the new version of the GeneWays system and the GeneHighWays database for the quality of data, performance of the mathematical methods, and quality of the interface.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
2R01GM061372-06
Application #
7148274
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Anderson, James J
Project Start
2000-04-01
Project End
2007-09-29
Budget Start
2006-09-30
Budget End
2007-09-29
Support Year
6
Fiscal Year
2006
Total Cost
$303,688
Indirect Cost
Name
Columbia University (N.Y.)
Department
Genetics
Type
Schools of Medicine
DUNS #
621889815
City
New York
State
NY
Country
United States
Zip Code
10032
Yao, Lixia; Li, Ying; Ghosh, Soumitra et al. (2015) Health ROI as a measure of misalignment of biomedical needs and resources. Nat Biotechnol 33:807-11
Soldatova, Larisa N; Rzhetsky, Andrey; De Grave, Kurt et al. (2013) Representation of probabilistic scientific knowledge. J Biomed Semantics 4 Suppl 1:S7
Divoli, Anna; Mendonça, Eneida A; Evans, James A et al. (2011) Conflicting biomedical assumptions for mathematical modeling: the case of cancer metastasis. PLoS Comput Biol 7:e1002132
Balkir, Atilla Soner; Foster, Ian; Rzhetsky, Andrey (2011) A Distributed Look-up Architecture for Text Mining Applications using MapReduce. Proc Int Symp High Perform Distrib Comput 2011:
Evans, James A; Rzhetsky, Andrey (2011) Advancing science through mining libraries, ontologies, and communities. J Biol Chem 286:23659-66
Yao, Lixia; Divoli, Anna; Mayzus, Ilya et al. (2011) Benchmarking ontologies: bigger or better? PLoS Comput Biol 7:e1001055
Yao, Lixia; Evans, James A; Rzhetsky, Andrey (2010) Novel opportunities for computational biology and sociology in drug discovery. Trends Biotechnol 28:161-70
Evans, James; Rzhetsky, Andrey (2010) Philosophy of science. Machine science. Science 329:399-400
Yao, Lixia; Evans, James A; Rzhetsky, Andrey (2009) Novel opportunities for computational biology and sociology in drug discovery. Trends Biotechnol 27:531-40
Iossifov, Ivan; Rodriguez-Esteban, Raul; Mayzus, Ilya et al. (2009) Looking at cerebellar malformations through text-mined interactomes of mice and humans. PLoS Comput Biol 5:e1000559

Showing the most recent 10 out of 26 publications