This research is motivated by the problem of spinal cord injuries, with the goal of producing data that can lead to pharmacological or other interventions that make recovery possible. It is hypothesized that major changes after a spinal cord injury occur at the protein level, and that a model of protein changes and protein-protein interactions will reveal previously unnoticed relationships among the participating proteins and protein pathways. Testing this hypothesis requires a computational neuroscience approach in addition to laboratory experiments. The analysis requires a considerable amount of background knowledge, and acquiring that knowledge requires natural language processing (text mining); specifically, new natural language processing techniques need to be developed for the neuroscience domain.

The research plan involves a series of proteomics experiments on rats with induced spinal cord injuries, followed by interpretation of the results using a computational systems biology approach. The quality of knowledge-based computational analysis depends critically on the breadth of the knowledge that is formally represented in the program. A major challenge is that much of the requisite knowledge is "buried" in scientific publications rather than being available in computable form in databases. Therefore, novel text mining techniques tailored to the neuroscience domain will be developed to extract such knowledge from scientific publications and convert it into a computable form. A generic computational analytical tool, known as Hanalyzer, is already available but needs to be adapted to the specific problem; the main challenge is to develop the natural language processing technology. A neuroscience-specific aspect of this challenge is that the surface forms used to refer to spinal cord regeneration are highly diverse, which makes machine-learning-based approaches susceptible to data sparsity and leaves rule-based approaches with an intractable number of keywords to account for. This challenge will be addressed with a distributional approach, using novel techniques that draw on semantic role labeling, recognition of ontological concepts, and dependency parsing to learn the surface forms that correspond to abstract or implicit concepts such as spinal cord regeneration.

This project is a collaboration involving investigators in Denver, Colorado and in Duesseldorf, Germany. A companion project is being funded by the German Ministry of Education and Research (BMBF).

Project Start:
Project End:
Budget Start: 2012-12-01
Budget End: 2017-11-30
Support Year:
Fiscal Year: 2012
Total Cost: $487,862
Indirect Cost:

Name: University of Colorado Denver
Department:
Type:
DUNS #:
City: Aurora
State: CO
Country: United States
Zip Code: 80045