Much of modern biology is coming to depend on complex models, analyses, and theoretical tests that rely directly on applications of modern mathematics and computer science. For example, the construction of very large DNA sequences and genetic marker maps requires considerable computer analysis, as well as effort to organize and maintain such information in timely, accessible databases. In addition, functions encoded within genetic sequences have been identified using computer-aided comparative sequence methods and complex pattern search methods. The work conducted under this award will support research into the development of these technologies and the validation of their application to fundamental questions in biology, with several major goals: 1) Further automation of the ARIEL pattern induction and domain expert assistant system. The use of "massively parallel" computer hardware will continue. 2) Extension of the pattern system to use more global structural and functional information, e.g., the overall protein structural domain classification or the cellular organelle localization. 3) Development of proper statistical methods for evaluating the likelihood that an induced pattern was obtained by chance. This will involve the study of a number of pattern representation "languages" and the selection of proper negative control sets.