EAGER: Information and complexity in the analysis of biological data sets and networks

Galas, David

Abstract

A living system is distinguished from most of its non-living counterparts by the way it stores and transmits information. It is just this biological information that is the key to the biological functions. It is also at the heart of the conceptual basis of what we call systems biology. Much of the conceptual structure of systems biology can be built around the fundamental ideas concerning the storage transmission and use of biological information. Biological information resides, of course, in digital sequences in molecules like DNA and RNA, but also in 3-dimensional structures, chemical modifications, chemical activities, both of small molecules and enzymes, and in other components and properties of biological systems at many levels. The information depends critically on how each unit interacts with, and is related to, other components of the system. Biological information is therefore inherently context-dependent, which raises significant issues concerning its quantitative measure and representation. An important and immediate issue for the effective theoretical treatment of biological systems then is: how can context-dependent information be usefully represented and measured? This is important both to the understanding of the storage and flow of information that occurs in the functioning of biological systems and in evolution. This work involves both new ideas and the integration of new ideas. It represents new mathematical methods as well as a novel integration of approaches that are focused on the very real and practical problems of biological data analysis. The PI as developed a new conceptual approach that is novel and mathematically well-defined, exploring the relationships between graph properties and set complexity and considering new approaches to network analysis. New interaction distance measures are considered with a new way of dealing with especially large data sets, especially the maximal information coefficient, for which a general approach may be possible, certainly for a small number of variables, and possibly in the general case. The ideas will be tested on a number of diverse biological data sets, especially around gene expressions, and other variants. Current methods often fail in the face of truly complex dependencies in large data sets, and powerful new methods would be of high value. This work involves both new ideas and the integration of new ideas. It represents new mathematical methods as well as a novel integration of approaches that are focused on the very real and practical problems of biological data analysis.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1340619
Program Officer: Sylvia Spengler

Project Start
Project End
Budget Start: 2013-07-01
Budget End: 2015-06-30
Support Year
Fiscal Year: 2013
Total Cost: $299,740
Indirect Cost

EAGER: Information and complexity in the analysis of biological data sets and networks
Galas, David
Pacific Northwest Research Institute, Seattle, WA, United States

Abstract

Funding Agency

Institution

Comments

Recent in Grantomics:

Recently viewed grants:

Recently added grants:

Abstract

Funding Agency

Institution

Comments