In research involving biological pathways, disease progression, and treatment outcomes, investigators face particular difficulties in identifying important relationships among the temporal information they collect. Currently, investigators have limited ability to discover relevant temporal patterns in large volumes of observational and experimental data because of the lack of widely available, ready-to-use tools that incorporate temporal and domain knowledge. To address these challenges, we propose a novel method for querying and abstracting temporal patterns that integrates data, ontologies, rules, and formal temporal semantics. Our approach, called Semantic Query-enhanced Web Rule Language (SQWRL), is an extension to W3C standards for the Semantic Web and to the Protégé environment, the most widely used, freely available, open-source software for specifying ontologies and knowledge bases. The proposed querying toolkit, called SQWRL-TK, will provide investigators with an integrated, scalable framework for domain-driven investigation of temporal phenomena. The goal of our efforts is to develop a set of open-source tools that flexibly integrate knowledge-based methods for the transformation, retrieval, and abstraction of temporal data.
Our specific aims are (1) to implement a scalable, reusable software architecture for the querying and abstraction of temporal patterns using ontologies and rules; (2) to develop a data-mapping tool that can map time-oriented data stored in existing relational databases into a temporal ontology suitable for knowledge-based querying; and (3) to create a query-elicitation tool that allows investigators to formulate domain-relevant temporal patterns for data extraction and to inspect the results of those queries. We will develop and evaluate these tools in ongoing collaborations with investigative teams who have well-developed data repositories in the areas of HIV drug resistance research and immune disorder trial management. We plan to undertake an iterative process of software testing and optimization to ensure accuracy and performance. We will create a website to disseminate these tools as open-source plug-ins to Protégé. The widespread use of such general methods among investigators may greatly facilitate the discovery of scientifically relevant associations and patterns hidden among time-oriented data within biomedical databases.

Public Health Relevance

From post-marketing surveillance of adverse drug events to biomarker studies of gene-expression data to outcomes analyses of healthcare interventions, investigators in many research areas share a need to extract clinically or biologically relevant patterns from time-oriented data. Currently, investigators lack tools to transform existing time-stamped data into a format useful for the abstraction and analysis of temporal patterns, and to express queries against domain-relevant temporal concepts through an easy-to-use web-based interface. The goal of our proposal is to develop an open-source software toolkit that enables investigators to make sense of growing amounts of time-oriented data and thus more rapidly establish scientific findings involving dynamic, temporal phenomena. Our toolkit uses a novel framework to retrieve, abstract, and maintain clinically and biologically relevant temporal patterns; given our ontology-based approach, the toolkit can be configured to the needs of investigators studying different research areas. We will develop and evaluate these proposed efforts in the study of drug resistance and immune disorders.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 1R01LM009607-01A2
Application #: 7654754
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Project Start: 2009-07-01
Project End: 2011-06-30
Budget Start: 2009-07-01
Budget End: 2010-06-30
Support Year: 1
Fiscal Year: 2009
Total Cost: $801,587
Indirect Cost:

Name: Stanford University
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 009214214
City: Stanford
State: CA
Country: United States
Zip Code: 94305

Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K (2011) Evaluation of semantic-based information retrieval methods in the autism phenotype domain. AMIA Annu Symp Proc 2011:569-77
Bridewell, Will; Das, Amar K (2011) Social network analysis of physician interactions: the effect of institutional boundaries on breast cancer care. AMIA Annu Symp Proc 2011:152-60
Lee, Wei-Nchih; Bridewell, Will; Das, Amar K (2011) Alignment and clustering of breast cancer patients by longitudinal treatment history. AMIA Annu Symp Proc 2011:760-7
Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K (2010) A Software Tool for Visualizing, Managing and Eliciting SWRL Rules. Lect Notes Comput Sci 6089:381-385
Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K (2010) Visualizing Logical Dependencies in SWRL Rule Bases. Lect Notes Comput Sci 6403:259-272
O'Connor, Martin J; Das, Amar (2010) Semantic reasoning with XML-based biomedical information models. Stud Health Technol Inform 160:986-90
Lee, Wei-Nchih; Tu, Samson W; Das, Amar K (2009) Extracting cancer quality indicators from electronic medical records: evaluation of an ontology-based virtual medical record approach. AMIA Annu Symp Proc 2009:349-53
Turcott, Robert G; Sagreiya, Hersh; Ashley, Euan A et al. (2009) A general framework for dose optimization. AMIA Annu Symp Proc 2009:656-60
Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K (2009) Exploration of SWRL Rule Bases through Visualization, Paraphrasing, and Categorization of Rules. Lect Notes Comput Sci 5858:246-261