Increasingly, many aspects of biology can be viewed as involving the processing of information. Modern information and computer science have played an important role in such major biological accomplishments as the sequencing of the human genome. On the other hand, biological ideas can inspire new concepts and methods in information science. This project is motivated by these two observations. Progress in the field of biological information processing will require interdisciplinary collaborations among computer scientists, mathematicians, physicists, chemists, and biologists. The project is built around a series of workshops that will enhance the interdisciplinary collaborations beginning to form and introduce outstanding junior people to problems and topics at the forefront of research.

Intellectual Merit

The project will be organized around a series of workshops with four themes: (1) Algorithmic Approaches to Biological Information Processing; (2) Computer Science, Engineering and Biology: Applications and Analogies; (3) Biological Circuits and Cellular Signaling; (4) Proteomics. Two of these themes represent approaches and two represent areas of application of these approaches. Under Theme 1, planned workshops are on Detecting and Processing Regularities in High Throughput Biological Data; Machine Learning Approaches for Understanding Gene Regulation; and Computational Tumor Modeling. Theme 2 workshops will cover Nanotechnology and Biology; Control, Communication, and Computing in Biology; and The Mechanism and Applications of the RNA Interference Process. For Theme 3, there will be workshops on Strategies for Reverse Engineering Biological Circuits; Cell Communication and Information Processing in Developing Tissues; Dynamics of Biological Networks; and Evolution of Gene Regulatory Networks.
Theme 4 workshops will be on Information Processing by Protein Structures in Molecular Recognition; Proteome Network Evolution; Functional Proteomics of Neurodegenerative Diseases; and Implications of Mathematical Models of Infection and Molecular Modeling of Hepatitis B Virus. We expect that the workshops, scientific papers, and books coming out of the project will help to develop the long-term focus of the field; carefully define problems and directions in computer science, mathematics, chemistry, and physics that are of specific interest to, and designed in collaboration with, biologists; and lead to new biological concepts that will influence biological and information science research in the future. In short, we expect the project to influence the study of biological information processing for years to come.

Broader Impacts

The ideas developed in this project will have an impact on a myriad of fields and create cross-disciplinary connections. A visitor program will encourage senior and junior researchers, including students, to participate in collaborative research spawned by the workshops. Each workshop will have a fund for support of graduate students and postdocs, and workshops will have a substantial educational component through talks of a tutorial/expository nature. The topic lends itself well to undergraduate research, and participating faculty will coordinate topics with an undergraduate research program (REU program) already in existence. To give the project widespread dissemination, each workshop will have a website with relevant references, problems, and copies of presentations that can make it a resource for a large community. The project should significantly influence the careers of a large number of outstanding junior researchers, and it should play an important role in the training and development of scientists who are well prepared to become leaders in the field of biological information processing.
The project is expected to have a long-term impact well beyond its four-year duration, since the workshop, visitor, and dissemination components of the project will allow the ideas developed to reach hundreds of people nationwide and worldwide.

Project Report

Many aspects of biology can be viewed as involving the processing of information. Modern information and computer science have played an important role in such major biological accomplishments as the sequencing of the human genome. Conversely, biological ideas can inspire new concepts and methods in information science. This project, motivated by these two observations and organized around a series of more than 20 workshops, aimed to enhance collaborations between biologists and computer scientists and to introduce junior people to the problems and topics of biological information processing. Project workshops had four themes.

Theme 1, Algorithmic Approaches to Biological Information Processing, concentrated on how biological organisms use "algorithms" to process information and how we use algorithmic methods to understand how organisms process information. These problems were reflected in several workshops, including one on Detecting and Processing Regularities in High Throughput Biological Data. Massive amounts of information have made it possible to study the structure and behavior of complex cellular networks using algorithmic methods. A workshop, Machine Learning Approaches for Understanding Gene Regulation, examined this idea. Understanding biological information processing can provide insight into the treatment of diseases such as cancer. There are many interconnected processes in tumorigenesis, involving tumor cell signaling and information processing. The development of computational models and algorithms that reflect these interconnected processes was the subject of a workshop on Computational Tumor Modeling.

The study of analogies between information processing in biology and information processing in computer science and engineering offers promise for understanding both. Theme 2, Computer Science, Engineering and Biology: Applications and Analogies, investigated such analogies.
A workshop on Control Theory and Dynamics in Systems Biology emphasized feedback, a central theme in analogies between biochemical regulatory networks and engineered automatic control systems arising in the aerospace, chemical, consumer electronics, and automobile industries.

Biochemical networks in the cell are responsible for processing environmental signals, inducing appropriate cellular responses, and sequencing internal events such as gene expression. Through elaborate mechanisms, they allow cells and organisms to perform their basic functions. Theme 3, Biological Circuits and Cellular Signaling, aimed to elucidate the function of biological circuits and cellular signaling. Gene regulatory networks dynamically orchestrate the level of expression for each gene by controlling whether and how vigorously that gene will be transcribed into RNA. They are at the heart of the information processing function of the individual cell and of the developmental process. The workshop on Evolution of Gene Regulatory Logic addressed these processes. Also related to this theme were the workshop on Control Theory and Dynamics in Systems Biology and the workshop on Software Development on Parameter Estimation for Boolean Models of Biological Networks.

Theme 4, Proteomics, was devoted to understanding how information is encoded in the three-dimensional structures that underlie complex protein-DNA and protein-protein network interactions, one of the fundamental challenges of biology. The workshop on Information Processing by Protein Structures in Molecular Recognition emphasized algorithms for discovering spatial patterns, for uncovering relationships among proteins preceding the emergence of folds, and for simulating the protein-protein and protein-DNA recognition process.

A wide variety of scientific findings arose from the project, many obtained by students. We studied growth laws in cancer and their implications for radiotherapy regimens.
According to one such law, the surviving tumor cell fraction could be reduced independently of the initial tumor mass, simply by increasing the number of treatments; but under a different law, there is a lower limit of the survival fraction that cannot be reduced regardless of the number of treatments. Our findings explain the so-called "tumor size effect" and re-emphasize the importance of early diagnosis, since radiotherapy may be successful provided the tumor mass at treatment onset is rather small. Our work on machine learning developed tools for computer-aided discovery of biodegradable polymers potentially useful for biological screening. Our method uses new computational models that capture both the similarities and the differences among polymer structures contained in a large library of potential chemical compounds, and it can be used to sort out promising polymers based on relevant features. One of our undergraduate (REU) students developed new machine learning methods to automate and expedite analysis of multichannel electroencephalogram (EEG) data to determine whether a patient is epileptic or is simply showing similar symptoms caused by a different illness. We obtained improved computational methods for reconstructing sibling relationships from genetic markers called DNA microsatellites. Knowledge of such relationships is useful in the conservation and management of endangered species, in assessing the heritability of adaptive traits, and in related applications. Nine REU students obtained results on such topics as modeling biochemical networks, understanding complicated folds of the DNA molecule, and using computer models to understand the onset of colorectal cancer. Many project workshop participants initiated new, often multidisciplinary collaborations. We supported graduate students from around the country in our workshops, and many found new or enhanced dissertation topics as a result.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0432013
Program Officer: Mitra Basu
Project Start:
Project End:
Budget Start: 2004-09-01
Budget End: 2011-08-31
Support Year:
Fiscal Year: 2004
Total Cost: $339,920
Indirect Cost:
Name: Rutgers University
Department:
Type:
DUNS #:
City: New Brunswick
State: NJ
Country: United States
Zip Code: 08901