KnowEng, a Scalable Knowledge Engine for Large-Scale Genomic Data-OVERALL

Han, Jiawei; Sinha, Saurabh; Song, Jun; Weinshilboum, Richard

Abstract

The primary goal of the proposed Center of Excellence is to build a powerful and scalable Knowledge Engine for Genomics, KnowEnG. KnowEnG will transform the way biomedical researchers analyze their genome-wide data by integrating multiple analytical methods derived from the most advanced data mining and machine learning research, by using the full breadth of existing knowledge about the relationships between genes as background, and by providing an intuitive and professionally designed user interface. To achieve these goals, the project includes the following components: (1) gathering and integrating existing knowledgebases documenting connections between genes and their functions into a single Knowledge Network; (2) developing computational methods for analyzing genome-wide user datasets in the context of this pre-existing knowledge; (3) implementing these methods as scalable software components that can be deployed in a public or private cloud; (4) designing and implementing a Web-based user interface, based on the HUBZero toolkit, that enables the interactive analysis of user-supplied datasets in a graphics-driven and intuitive fashion; and (5) thoroughly testing the functionality and usefulness of the KnowEnG environment in three large-scale projects in the clinical sciences (pharmacogenomics of breast cancer), behavioral sciences (identification of gene regulatory modules underlying behavioral patterns) and drug discovery (genome-based prediction of the capacity of microorganisms to synthesize novel biologically active compounds). The KnowEnG environment will be deployed in a cloud infrastructure and made fully available to the community, as will the software developed by the Center.
The proposed Center is a collaboration between the University of Illinois (UIUC), a recognized world leader in computational science and engineering, and the Mayo Clinic, one of the leading clinical care and research organizations in the world. It will be based at the UIUC Institute for Genomic Biology, which has state-of-the-art facilities and a nationally recognized program of multidisciplinary team-based genomic research.

Public Health Relevance

Physicians and biologists are now routinely producing very large, genome-wide datasets. These data need to be analyzed in the context of an even larger corpus of publicly available data, in a manner that is approachable to non-specialist doctors and scientists. The proposed Center will leverage the latest computational techniques used to mine corporate or Internet data to enable the intuitive analysis and exploration of biomedical Big Data.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 1U54GM114838-01
Application #: 8774407
Study Section: Special Emphasis Panel (ZRG1-BST-R (52))
Program Officer: Gregurick, Susan

Project Start: 2014-09-29
Project End: 2018-04-30
Budget Start: 2014-09-29
Budget End: 2015-04-30
Support Year: 1
Fiscal Year: 2014
Total Cost: $1,623,265
Indirect Cost: $529,679

Institution

Name: University of Illinois Urbana-Champaign
Department: Genetics
Type: Organized Research Units
DUNS #: 041544081

City: Champaign
State: IL
Country: United States
Zip Code: 61820

Publications

Huang, Edward W; Wang, Sheng; Zhai, ChengXiang (2018) VisAGE: Integrating external knowledge into electronic medical record visualization. Pac Symp Biocomput 23:578-589

Zhang, Yi; Manjunath, Mohith; Zhang, Shilu et al. (2018) Integrative Genomic Analysis Predicts Causative Cis-Regulatory Mechanisms of the Breast Cancer-Associated Genetic Variant rs4415084. Cancer Res 78:1579-1591

Athreya, Arjun; Iyer, Ravishankar; Neavin, Drew et al. (2018) Augmentation of Physician Assessments with Multi-Omics Enhances Predictability of Drug Response: A Case Study of Major Depressive Disorder. IEEE Comput Intell Mag 13:20-31

Zhang, Yi; Manjunath, Mohith; Kim, Yeonsung et al. (2018) SequencEnG: an Interactive Knowledge Base of Sequencing Techniques. Bioinformatics :

Shi, Yu; Gui, Huan; Zhu, Qi et al. (2018) AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks. Proc SIAM Int Conf Data Min 2018:144-152

Baheti, Saurabh; Tang, Xiaojia; O'Brien, Daniel R et al. (2018) HGT-ID: an efficient and sensitive workflow to detect human-viral insertion sites using next-generation sequencing data. BMC Bioinformatics 19:271

Tabe-Bordbar, Shayan; Emad, Amin; Zhao, Sihai Dave et al. (2018) A closer look at cross-validation for assessing the accuracy of gene regulatory networks and models. Sci Rep 8:6620

Ho, Ming-Fen; Correia, Cristina; Ingle, James N et al. (2018) Ketamine and ketamine metabolites as novel estrogen receptor ligands: Induction of cytochrome P450 and AMPA glutamate receptor gene expression. Biochem Pharmacol 152:279-292

Adami, Guy R; Tangney, Christy C; Tang, Jessica L et al. (2018) Effects of green tea on miRNA and microbiome of oral epithelium. Sci Rep 8:5873

Xiao, Jinfeng; Blatti, Charles; Sinha, Saurabh (2018) SigMat: a classification scheme for gene signature matching. Bioinformatics 34:i547-i554

Showing the most recent 10 out of 74 publications
