A critical barrier in drug safety is the inability to use public data resources in an integrated fashion to fully understand the actions of drugs and chemical compounds on biological systems. There is a need to integrate the heterogeneous datasets pertaining to compounds, drugs, targets, genes, diseases, clinical trials, and known drug side effects, and to develop effective network data analysis techniques to identify or predict important biological relationships. The integrated and associated information can be used to support drug development, the evaluation of drug side effects, and related scientific research and assessment. The proposed work can also be applied to analyze patient payment patterns and predict patients' ability to pay upcoming bills, and to recommend healthier lifestyles by analyzing patient monitoring data. This can reduce the cost of manually searching and analyzing data, avoid the errors that manual work introduces, and allow healthcare budgets to focus on the pressing issue of delivering better healthcare to society.

The proposed technology (Data2Discovery DataHub Platform) uses semantic integration and search technologies to integrate siloed data sources related to drug safety, and it enables searches that find and interpret associations that are hard or impossible to find using other methods. The team will seek to commercialize the following tools or approaches: 1) DataHub Integration: integrating data sources related to drug safety into a graph database and connecting related entities across different datasets; 2) DataHub Browser: allowing users to browse data and entities across different datasets; and 3) DataHub Predictor: predicting semantic associations based on pre-defined path patterns and biological similarities. These tools can help domain experts generate hypotheses and help end users understand the side effects of the drugs they are taking. These technologies have the potential to revolutionize how knowledge is derived from data in domains where the important datasets are large, complex, and heterogeneous, such as healthcare, the life sciences, and business analytics.
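As a rough illustration of what predicting associations from pre-defined path patterns could look like, the sketch below walks a tiny typed graph and counts the paths that match a drug-to-side-effect pattern. It is not the DataHub Predictor itself: the node names, relation names (`binds`, `linked_to`), the pattern, and the path-count score are all hypothetical placeholders, and the biological-similarity component mentioned above is left out.

```python
from collections import defaultdict

# Hypothetical toy data: node types and (source, relation, target) edges.
# None of these entities or relations come from a real dataset.
NODE_TYPES = {
    "drug_A": "Drug",
    "drug_B": "Drug",
    "target_X": "Target",
    "target_Y": "Target",
    "effect_1": "SideEffect",
}

EDGES = [
    ("drug_A", "binds", "target_X"),
    ("drug_B", "binds", "target_X"),
    ("drug_B", "binds", "target_Y"),
    ("target_X", "linked_to", "effect_1"),
]

# Assumed pattern: Drug -binds-> Target -linked_to-> SideEffect.
DRUG_TO_SIDE_EFFECT = [("binds", "Target"), ("linked_to", "SideEffect")]


def build_adjacency(edges):
    """Index outgoing edges by source node."""
    adjacency = defaultdict(list)
    for source, relation, target in edges:
        adjacency[source].append((relation, target))
    return adjacency


def match_path_pattern(start, pattern, adjacency, node_types):
    """Return every path from `start` whose steps follow `pattern`.

    `pattern` is a list of (relation, required_node_type) steps.
    """
    paths = [[start]]
    for relation, required_type in pattern:
        extended = []
        for path in paths:
            for edge_relation, neighbor in adjacency[path[-1]]:
                if edge_relation == relation and node_types[neighbor] == required_type:
                    extended.append(path + [neighbor])
        paths = extended
    return paths


def candidate_associations(adjacency, node_types, pattern):
    """Score each (drug, end node) pair by the number of matching paths."""
    scores = {}
    for node, node_type in node_types.items():
        if node_type != "Drug":
            continue
        for path in match_path_pattern(node, pattern, adjacency, node_types):
            pair = (node, path[-1])
            scores[pair] = scores.get(pair, 0) + 1
    return scores


adjacency = build_adjacency(EDGES)
scores = candidate_associations(adjacency, NODE_TYPES, DRUG_TO_SIDE_EFFECT)
for (drug, effect), count in sorted(scores.items()):
    print(f"{drug} -> {effect}: {count} matching path(s)")
```

Counting matching paths is only one simple way to rank candidate associations; a system of the kind described would presumably combine several patterns with similarity measures.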

Agency: National Science Foundation (NSF)
Institute: Division of Industrial Innovation and Partnerships (IIP)
Type: Standard Grant (Standard)
Application #: 1505374
Program Officer: Rathindra DasGupta
Project Start:
Project End:
Budget Start: 2015-01-01
Budget End: 2015-06-30
Support Year:
Fiscal Year: 2015
Total Cost: $50,000
Indirect Cost:
Name: Indiana University
Department:
Type:
DUNS #:
City: Bloomington
State: IN
Country: United States
Zip Code: 47401