Building on 8 years of highly productive work in technology development that included the creation of the Colorado Richly Annotated Full Text corpus (CRAFT), we hypothesize that text mining resources and methods are approaching the level of maturity required to productively process a significant proportion of the full-text biomedical literature to create a well-represented formal knowledge base of molecular biology. We propose a detailed, integrated plan to achieve this long-standing goal. Success in this effort will make possible a transformative new way for the biomedical research community to identify, access, and integrate existing knowledge, breaking down disciplinary boundaries and other silos that have kept scientists from fully exploiting relevant prior results in their research. Our successes in the prior funding period broadened the applicability of biomedical concept identification systems to a much wider set of tasks, demonstrating the ability to target multiple community-curated ontologies in text mining and to generate scientifically significant insights from the results. The proposed work would take advantage of the resources we produced to transcend several of the limitations of previous efforts. We propose new approaches to formal knowledge representation and to characterizing relationships between textual elements and semantic content. We will design, implement, and evaluate computational systems that have the potential to transform enormous text collections into semantically rich, logic-based, standards-compliant, formal representations of biomedical knowledge with clearly identified provenance. The resulting representations will express complex assertions about a very wide range of entities, processes, qualities, and, most importantly, their specific relationships with one another.

Public Health Relevance

Project narrative (Hunter, Lawrence E.): This project will affect public health by increasing the access of physicians, researchers, and the general public to highly targeted information from published research and electronic health records.

Agency: National Institutes of Health (NIH)
Type: Research Project (R01)
Project #: 2R01LM008111-09A1
Application #: 8694375
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 9
Fiscal Year: 2014
Total Cost:
Indirect Cost:
Name: University of Colorado Denver
Department: Pharmacology
Type: Schools of Medicine
DUNS #:
City: Aurora
State: CO
Country: United States
Zip Code: 80045
Eberlein, Jens; Davenport, Bennett; Nguyen, Tom et al. (2016) Aging promotes acquisition of naive-like CD8+ memory T cell traits and enhanced functionalities. J Clin Invest 126:3942-3960
Vehlow, Corinna; Kao, David P; Bristow, Michael R et al. (2015) Visual analysis of biological data-knowledge networks. BMC Bioinformatics 16:135
Karimpour-Fard, Anis; Epperson, L Elaine; Hunter, Lawrence E (2015) A survey of computational tools for downstream analysis of proteomic and other omic datasets. Hum Genomics 9:28
Hinterberg, Michael A; Kao, David P; Bristow, Michael R et al. (2015) Peax: interactive visual analysis and exploration of complex clinical phenotype and gene expression association. Pac Symp Biocomput :419-30
Livingston, Kevin M; Bada, Michael; Baumgartner Jr, William A et al. (2015) KaBOB: ontology-based semantic integration of biomedical databases. BMC Bioinformatics 16:126
Andrew, Audra L; Card, Daren C; Ruggiero, Robert P et al. (2015) Rapid changes in gene expression direct rapid shifts in intestinal form and function in the Burmese python after feeding. Physiol Genomics 47:147-57
Funk, Christopher S; Hunter, Lawrence E; Cohen, K Bretonnel (2014) Combining heterogenous data for prediction of disease related and pharmacogenes. Pac Symp Biocomput :328-39
Cohen, K Bretonnel; Hunter, Lawrence E (2013) Chapter 16: text mining for translational bioinformatics. PLoS Comput Biol 9:e1003044
Hill, David P; Adams, Nico; Bada, Mike et al. (2013) Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics 14:513
Liu, Haibin; Hunter, Lawrence; Kešelj, Vlado et al. (2013) Approximate subgraph matching-based literature mining for biomedical events and relations. PLoS One 8:e60954

Showing the most recent 10 out of 72 publications