Evidence Extraction Systems for the Molecular Interaction Literature

Burns, Gully

Abstract

In primary research articles, scientists make claims based on evidence from experiments, and report both the claims and the supporting evidence in the results section of papers. However, biomedical databases describe the claims made by scientists in detail, but rarely provide descriptions of any supporting evidence that a consulting scientist could use to understand why the claims are being made. Currently, the process of curating evidence into databases is manual, time-consuming and expensive; thus, evidence is recorded in papers but not generally captured in database systems. For example, the European Bioinformatics Institute's INTACT database describes in detail how different molecules biochemically interact with each other. It characterizes the underlying experiment providing the evidence of that interaction with only two hierarchical variables: a code denoting the method used to detect the molecular interaction and another code denoting the method used to detect each molecule. In fact, INTACT describes 94 different types of interaction detection method, each of which can be combined with other experimental procedures in a variety of ways to reveal different details about the interaction. This crucial information is not being captured in databases. Although experimental evidence is complex, it conforms to certain principles of experimental design: experimentally studying a phenomenon typically involves measuring well-chosen dependent variables whilst altering the values of equally well-chosen independent variables. Exploiting these principles has permitted us to devise a preliminary, robust, general-purpose representation for experimental evidence. In this project, we will use this representation to describe the methods and data pertaining to evidence underpinning the interpretive assertions about molecular interactions described by INTACT.
A key contribution of our project is that we will develop methods to extract this evidence from scientific papers automatically (A) by using image processing on a specific subtype of figure that is common in molecular biology papers and (B) by using natural language processing to read information from the text used by scientists to describe their results. We will develop these tools for the INTACT repository but package them so that they may then also be used for evidence pertaining to other areas of research in biomedicine.
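The abstract describes the evidence representation only in outline: two method codes of the kind INTACT already records, plus the dependent variables measured while independent variables are altered. As a rough illustration of what such a record might look like, the following Python sketch is offered under stated assumptions. The class names, field names, and example values (`EvidenceRecord`, `Observation`, "protein A", "band_detected", and so on) are hypothetical and are not taken from the project or from the INTACT schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# A variable value is either a category label or a number (assumption).
Value = Union[str, float]


@dataclass
class Observation:
    """One measurement: the independent-variable settings and the values measured."""
    independent: Dict[str, Value]
    dependent: Dict[str, Value]


@dataclass
class EvidenceRecord:
    """Hypothetical evidence record for one claimed molecular interaction."""
    claim: str
    # The two method descriptors mentioned in the abstract, kept as plain strings here.
    interaction_detection_method: str
    participant_detection_method: str
    # Names of the variables declared by the experimental design.
    independent_variables: List[str]
    dependent_variables: List[str]
    observations: List[Observation] = field(default_factory=list)

    def add_observation(self, independent: Dict[str, Value],
                        dependent: Dict[str, Value]) -> None:
        """Append an observation, rejecting variables the design did not declare."""
        unknown = (set(independent) - set(self.independent_variables)) | (
            set(dependent) - set(self.dependent_variables))
        if unknown:
            raise ValueError(f"undeclared variables: {sorted(unknown)}")
        self.observations.append(Observation(dict(independent), dict(dependent)))


# Illustrative values only; they do not describe a real experiment.
record = EvidenceRecord(
    claim="protein A binds protein B",
    interaction_detection_method="coimmunoprecipitation",
    participant_detection_method="western blot",
    independent_variables=["antibody"],
    dependent_variables=["band_detected"],
)
record.add_observation({"antibody": "anti-A"}, {"band_detected": "yes"})
record.add_observation({"antibody": "control IgG"}, {"band_detected": "no"})
```

Keeping the declared variable names separate from the observations lets a record state the experimental design once and then check each extracted data point against it, which is one plausible way to keep automatically extracted evidence consistent.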

Public Health Relevance

Molecular biology databases contain crucial information for the study of human disease (especially cancer), but they omit details of scientific evidence. Our work will provide detailed accounts of experimental evidence supporting claims pertaining to the study of these diseases. This additional detail may provide scientists with more powerful ways of detecting anomalies and resolving contradictory findings.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 1R01LM012592-01
Application #: 9365558
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2017-09-01
Project End: 2021-08-31
Budget Start: 2017-09-01
Budget End: 2018-08-31
Support Year: 1
Fiscal Year: 2017

Institution

Name: University of Southern California
Department: Biostatistics & Other Math Sci
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #: 072933393

City: Los Angeles
State: CA
Country: United States
Zip Code: 90033

Related projects


NIH 2020 R01 LM	Evidence Extraction Systems for the Molecular Interaction Literature Peng, Nanyun Violet / University of Southern California
NIH 2019 R01 LM	Evidence Extraction Systems for the Molecular Interaction Literature Peng, Nanyun Violet / University of Southern California
NIH 2018 R01 LM	Evidence Extraction Systems for the Molecular Interaction Literature Burns, Gully A. / University of Southern California
NIH 2017 R01 LM	Evidence Extraction Systems for the Molecular Interaction Literature Burns, Gully A. / University of Southern California

Publications

Khan, Arshad M; Grant, Alice H; Martinez, Anais et al. (2018) Mapping Molecular Datasets Back to the Brain Regions They are Extracted from: Remembering the Native Countries of Hypothalamic Expatriates and Refugees. Adv Neurobiol 21:101-193
