The integration of gene annotations and omics (e.g., genomics, proteomics, metabolomics) data can provide important insights into noisy and incomplete biological data. Such data is also often on a scale that presents computational challenges to many traditional algorithms. This project exploits Foretell, a local search algorithm originally created to accelerate solvers on large constraint satisfaction problems. Here Foretell is used to detect complex relationships among genes in context-specific protein-protein interaction (PPI) networks, with guidance from human experts. This is a novel, and potentially transformative, approach to provide new insights into the molecular and cellular mechanisms of fundamental biological processes. This flexible, innovative project is well suited to noisy, incomplete genomic data. It uses repeated local search guided by empirical biological knowledge to explore large weighted graphs under human direction. It provides users with meaningful feedback so that they can reformulate their search for complex relationships among genes in context-specific PPI networks and devise new weight schemes to find them. Expected outcomes include a knowledge base of recurring clusters in Saccharomyces cerevisiae and the weight schemes used to detect them, a more flexible algorithm that detects and tabulates cluster features and provides meaningful feedback to the user, and a tool whose output suggests additional biological experiments. This project addresses, both in its design and its implementation, important questions in the discovery and application of computational approaches to biological networks.
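
To make the idea of repeated local search on a weighted graph concrete, the following is a minimal sketch in Python. It is not Foretell and does not reproduce the project's weight schemes. It assumes a simple objective (the total edge weight inside a fixed-size cluster), a one-gene swap move, and random restarts. The function names, the gene labels A to E, and the edge weights are all hypothetical toy values chosen for illustration.

```python
import itertools
import random


def internal_weight(weights, cluster):
    """Sum the edge weights between every pair of genes in the cluster.

    Pairs with no recorded interaction contribute 0.
    """
    return sum(
        weights.get(frozenset(pair), 0.0)
        for pair in itertools.combinations(sorted(cluster), 2)
    )


def local_search(weights, genes, size, rng):
    """Hill-climb from a random cluster by swapping one gene at a time.

    A swap is accepted only if it strictly increases the internal weight,
    so the search stops at a local optimum.
    """
    cluster = set(rng.sample(genes, size))
    score = internal_weight(weights, cluster)
    improved = True
    while improved:
        improved = False
        for inside in sorted(cluster):
            for outside in genes:
                if outside in cluster:
                    continue
                candidate = (cluster - {inside}) | {outside}
                candidate_score = internal_weight(weights, candidate)
                if candidate_score > score:
                    cluster, score = candidate, candidate_score
                    improved = True
                    break
            if improved:
                break
    return frozenset(cluster), score


def repeated_local_search(weights, size, restarts=20, seed=0):
    """Run local search from several random starts and keep the best cluster."""
    rng = random.Random(seed)
    genes = sorted({gene for edge in weights for gene in edge})
    best = (frozenset(), float("-inf"))
    for _ in range(restarts):
        result = local_search(weights, genes, size, rng)
        if result[1] > best[1]:
            best = result
    return best


# Hypothetical toy network: A, B, C interact strongly; D and E only weakly.
toy_weights = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"A", "C"}): 0.85,
    frozenset({"B", "C"}): 0.8,
    frozenset({"C", "D"}): 0.2,
    frozenset({"D", "E"}): 0.1,
}

cluster, score = repeated_local_search(toy_weights, size=3)
print(sorted(cluster), round(score, 2))
```

In this sketch, changing the values in `toy_weights` plays the role of trying a different weight scheme: a user could inspect the clusters returned under one scheme, adjust the weights, and rerun the search.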

Knowledge derived from this project will be broadly applicable and well promulgated through publication and through a web site. The resultant knowledge base will support other researchers’ detection of combinations of interacting genes and the interpretation of their results. While it advances discovery and understanding, this project will support interdisciplinary collaboration, disseminate its results broadly, and promote research by students in a predominantly female, minority-serving institution.

Project Report

This project has developed innovative computational methods to deal with noisy, incomplete, and biased genomic data. It addresses, both in its design and its implementation, important questions in the discovery and application of computational approaches to biological networks. This project helps biological researchers find complex relationships among genes in data that describes interactions between pairs of proteins. It has produced a knowledge base of clusters (sets of genes that interact), along with the weight schemes used to detect them. This project has revised a cluster-detection algorithm for flexibility and transparency. The algorithm now tabulates cluster features and provides meaningful feedback to the user, making it a tool that suggests additional biological experiments. This project has also created a case-based meta-learning algorithm to integrate multiple predictions. Knowledge derived from this project is broadly applicable and has been well promulgated through a web site and through publication in both bioinformatics and computer science. The resultant knowledge base will support other researchers’ detection of combinations of interacting genes and the interpretation of their results. While it advances discovery and understanding, this project has also supported interdisciplinary collaboration, disseminated its results broadly, and promoted research by students in a predominantly female, minority-serving institution.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1242451
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2012-09-15
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2012
Total Cost: $50,000
Indirect Cost:
Name: CUNY Hunter College
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10065