The broader impact/commercial potential of this Small Business Technology Transfer (STTR) Phase I project is to help researchers identify novel scientific hypotheses and make discoveries by harnessing the big data generated in the biological and biomedical fields. Big data have significant commercial potential but also present new challenges. First, unstructured information in scientific publications needs to be converted to a structured form so that existing scientific discoveries can be organized for downstream applications. Second, heterogeneous, large-volume biological information needs to be integrated into a data structure that is effective for knowledge discovery. For example, integrating disease-protein relationships with protein-small molecule relationships will likely help identify small-molecule drugs for treating diseases. Integrating research findings from different sources is essential for understanding the big picture of biological systems. Effective integration will open the door to important applications with commercial potential, such as drug repurposing and drug side effect prediction.

This project aims to develop three commercial products. The first product is a software tool for bio-entity relationship extraction, based on a novel, proprietary method for extracting bio-entity relationships from unstructured text. The second product is a commercial database of context-specific molecular interaction information, based on the previously developed integrated molecular interaction database (IMID). IMID integrates multiple types of bio-entity relationships and provides versatile query functions. A commercial version of IMID will be built with updated information and additional functionality. The third product is a platform for biological information integration and data mining, with applications in drug repurposing and drug side effect prediction. The platform is built on the integrated bio-entity network (IBN), a novel data structure for biological information integration and data mining that offers unique advantages over existing systems and approaches. Knowledge discovery tools based on IBN will help biologists search direct and indirect bio-entity relationships effectively. Two drug-related data mining applications will be built: drug repurposing search and drug side effect prediction. Both utilize the unique advantage of IBN in identifying dysregulated pathways and subnetworks between two biological conditions.
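
As a rough illustration of what searching direct and indirect bio-entity relationships could look like, the following minimal Python sketch stores relationship triples in an adjacency map and uses breadth-first search to find the shortest chain linking two entities. It is not the IBN data structure or the proprietary method described above, whose details are not given here. The function names (`build_network`, `shortest_path`), the relation labels, and all entity names are hypothetical placeholders chosen for illustration only.

```python
from collections import defaultdict, deque


def build_network(relations):
    """Build an undirected adjacency map from (entity_a, relation, entity_b) triples."""
    network = defaultdict(set)
    for a, _relation, b in relations:
        network[a].add(b)
        network[b].add(a)
    return network


def shortest_path(network, source, target):
    """Breadth-first search; returns the shortest chain of entities, or None."""
    if source == target:
        return [source]
    visited = {source}
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        for neighbor in sorted(network.get(path[-1], ())):
            if neighbor in visited:
                continue
            if neighbor == target:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


# Hypothetical placeholder triples, not real biological data.
relations = [
    ("disease_A", "associated_with", "protein_P1"),
    ("protein_P1", "binds", "compound_X"),
    ("protein_P2", "binds", "compound_Y"),
]

network = build_network(relations)

# Indirect relationship: disease_A reaches compound_X through protein_P1.
linked = shortest_path(network, "disease_A", "compound_X")

# No chain of relationships connects disease_A to compound_Y in this toy data.
unlinked = shortest_path(network, "disease_A", "compound_Y")
```

In this toy data, the disease and the compound share no direct edge; the link emerges only after the two relationship types are merged into one network, which is the kind of indirect relationship a drug repurposing search would look for.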

Project Start:
Project End:
Budget Start: 2014-07-01
Budget End: 2015-06-30
Support Year:
Fiscal Year: 2014
Total Cost: $225,000
Indirect Cost:
Name: Insilicom LLC
Department:
Type:
DUNS #:
City: Tallahassee
State: FL
Country: United States
Zip Code: 32312