Large Neural Networks Predict Protein Structure

Wilcox, George

Abstract

Current DNA and protein sequencing technology allows rapid acquisition of information concerning the primary structure of proteins. The growth of information concerning three-dimensional protein structure, which is critical to understanding protein function, is markedly slower, however. Although the primary structure of a protein completely determines its secondary and tertiary structure, no procedure can yet predict the complete structure of a protein from sequence alone. This project will apply neural network simulations to the protein folding problem. A back-propagation neural network architecture, already implemented and optimized on the Minnesota Cray 2 supercomputer, will be configured to form an association between sequence information and three-dimensional structure for 100 of the smaller proteins with known structure. After the network has "learned" this training set, after perhaps 100-1000 presentations, it will be tested for retention of some of the rules governing protein folding: we will present the network with sequences which are new to it but which have known structures. We will compare its "predictions" with actual structure; successful performance would be 2 Å predictions for 95% of the novel proteins. The neural network program we will use was developed here to model arbitrarily large networks. The program includes a Network Description Language (NDL) which allows an experimenter to configure the network for input-output data sets of arbitrary structure and dimensionality. Using the Minnesota Supercomputer Center Cray 2, its interactive UNICOS operating system and the University Ethernet network, NDL programs can be created and executed interactively and the results of a simulation reviewed graphically at a Sun or Macintosh II workstation. Experiments with the network show that it can learn associations between one-dimensional inputs (e.g. sequence) and multi-dimensional outputs (e.g. 3D structure), and that it can recall some structural elements of small proteins. Support is requested for a year of experimentation with the network using data sets (learning sets) representing relationships between protein sequence and tertiary structure.
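The general idea of associating a one-dimensional sequence with multi-dimensional coordinates by back-propagation can be illustrated with a minimal sketch. This is not the NDL program or the Cray 2 implementation described in the abstract: the one-hot encoding, the single sigmoid hidden layer, the layer sizes, the learning rate, and the toy sequences and target coordinates below are all assumptions made up for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues


def one_hot(sequence):
    """Encode a sequence as a flat vector with 20 inputs per residue (assumed encoding)."""
    vec = []
    for residue in sequence:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


class TinyBackpropNet:
    """Hypothetical network: one sigmoid hidden layer, linear outputs, squared error."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)  # fixed seed so the run is repeatable
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        hidden = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
                  for row, b in zip(self.w1, self.b1)]
        out = [sum(w * h for w, h in zip(row, hidden)) + b
               for row, b in zip(self.w2, self.b2)]
        return hidden, out

    def train_step(self, x, target, lr=0.05):
        """One presentation of one pattern; returns the loss before the update."""
        hidden, out = self.forward(x)
        err = [o - t for o, t in zip(out, target)]  # gradient of 0.5 * sum(err^2)
        # Propagate the output error back to the hidden layer before changing w2.
        d_hidden = [h * (1.0 - h) * sum(err[k] * self.w2[k][j] for k in range(len(err)))
                    for j, h in enumerate(hidden)]
        for k, e in enumerate(err):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= lr * e * h
            self.b2[k] -= lr * e
        for j, d in enumerate(d_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d * xi
            self.b1[j] -= lr * d
        return 0.5 * sum(e * e for e in err)


# Toy "learning set": invented 3-residue sequences with invented x, y, z targets.
toy_set = [
    ("ACD", [0.0, 0.0, 0.0, 1.5, 0.0, 0.0, 3.0, 0.5, 0.0]),
    ("GHK", [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 2.0, 2.0, 0.5]),
]

net = TinyBackpropNet(n_in=60, n_hidden=8, n_out=9)
first_loss = sum(net.train_step(one_hot(s), t) for s, t in toy_set)
for _ in range(500):  # repeated presentations of the training set
    last_loss = sum(net.train_step(one_hot(s), t) for s, t in toy_set)
```

Comparing `first_loss` with `last_loss` shows whether the toy set is being learned. Testing for retained folding rules, as the abstract proposes, would instead call `net.forward` on sequences that were held out of training and compare the outputs with their known structures.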

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Small Research Grants (R03)
Project #: 1R03RR005294-01
Application #: 3431648
Study Section: Biotechnology Resources Review Committee (BRC)

Project Start: 1989-09-30
Project End: 1990-09-29
Budget Start: 1989-09-30
Budget End: 1990-09-29
Support Year: 1
Fiscal Year: 1989

Wilcox, George Latimer
University of Minnesota Twin Cities, Minneapolis, MN, United States
