MicroRNAs (miRNAs) play critical roles in biological processes by regulating their target genes, and they are closely associated with human cancers. Manually integrating information on miRNAs and their target genes is challenging: it is labor-intensive, error-prone, and limited by biologists' prior knowledge, because it requires exploring a very large number of heterogeneous data sources. Our objective is to develop OmniSearch, a semantic search tool that assists cancer biologists in unraveling the critical roles of miRNAs in human cancers in an automated and highly efficient manner.
AIM 1: We will develop the Ontology for MicroRNA Target (OMIT), the first miRNA domain ontology. OMIT will formally define miRNA knowledge and will provide a global metadata model (i.e., data exchange standards and common data elements) as the foundation for automated knowledge acquisition. We will work within established standards and contribute new terminology to a wide range of bio-ontology groups.
AIM 2: Building on the OMIT metadata model, we will develop OmniSearch, which comprises an automated semantic data annotation and integration tool and a user-friendly semantic search interface. The resulting knowledgebase will contain fully machine-readable data integrated from heterogeneous sources: miRNA target prediction databases, Gene Ontology, PubMed, and KEGG PATHWAY. OmniSearch will present unified knowledge, at the semantic level, that is most relevant to what cancer biologists are seeking.
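The integration idea in AIM 2 can be illustrated with a minimal sketch. It assumes that records from each source are annotated with a small shared vocabulary and stored as subject-predicate-object triples, so that a single query can traverse all sources. The predicate names (`hasPredictedTarget`, `annotatedWith`, `participatesIn`), the function names, and every identifier in the sample records are hypothetical placeholders, not OMIT terms, OmniSearch code, or real data.

```python
from collections import defaultdict

# Hypothetical records standing in for three kinds of sources.
# All identifiers are placeholders, not real miRNAs, genes, or pathways.
target_predictions = [("miR-X", "GENE_A"), ("miR-X", "GENE_B"), ("miR-Y", "GENE_B")]
gene_annotations = {"GENE_A": ["apoptotic process"], "GENE_B": ["cell proliferation"]}
pathway_membership = {"GENE_B": ["PATHWAY_1"]}


def build_triples():
    """Annotate each source record with a shared (assumed) predicate vocabulary."""
    triples = set()
    for mirna, gene in target_predictions:
        triples.add((mirna, "hasPredictedTarget", gene))
    for gene, terms in gene_annotations.items():
        for term in terms:
            triples.add((gene, "annotatedWith", term))
    for gene, pathways in pathway_membership.items():
        for pathway in pathways:
            triples.add((gene, "participatesIn", pathway))
    return triples


def index_by_subject(triples):
    """Index triples as subject -> predicate -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[subject][predicate].add(obj)
    return index


def mirna_summary(index, mirna):
    """Collect a miRNA's predicted targets with their annotations and pathways."""
    summary = {}
    for gene in sorted(index[mirna]["hasPredictedTarget"]):
        summary[gene] = {
            "annotations": sorted(index[gene]["annotatedWith"]),
            "pathways": sorted(index[gene]["participatesIn"]),
        }
    return summary


if __name__ == "__main__":
    idx = index_by_subject(build_triples())
    print(mirna_summary(idx, "miR-X"))
```

Because every source is mapped onto the same predicates before it is stored, the query function does not need to know which database a fact came from; this is the property a shared metadata model is meant to provide.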
AIM 3: Finally, we will design use cases followed by a set of evaluation queries. OmniSearch will be thoroughly and iteratively evaluated for knowledge accuracy, the efficiency of its construction methods (the reduction in human labor), and system friendliness and usability. On a regular basis and in a structured manner, we will solicit feedback from the community and incorporate domain experts' opinions to further improve the system. These feedback mechanisms will operate throughout the entire project lifetime.
This project will address critical needs recognized by the NCI ITCR Initiative: establishing data exchange standards and common data elements; sustaining efforts to promote data sharing; enhancing support for community-based, research-driven informatics technology development; and improving mechanisms to support software development. Expected deliverables include the OMIT ontology as a miRNA data exchange standard with common data elements; OmniSearch for miRNA data sharing and automated knowledge acquisition; a unified miRNA knowledgebase; and use cases and evaluation queries. OmniSearch can be used to obtain unified miRNA knowledge and to gain insights into the regulation and control of cancer disease processes. By providing a deeper understanding of miRNA functions, OmniSearch will also assist miRNA bio-curation and the design of new biological experiments. The project can therefore significantly accelerate cancer biology research. Moreover, OmniSearch is extensible by design and can be readily generalized to other biomedical areas.
miRNAs have been found to be closely associated with the development, diagnosis, and prognosis of human cancers, but miRNA knowledge acquisition remains challenging. We will develop OmniSearch, a semantic search tool, to assist cancer biologists in unraveling the critical roles of miRNAs in human cancers in an automated and highly efficient manner. We will address the significant challenges of data sharing, data integration, and effective search in miRNA/microgenomics research in oncology.