Inferring knowledge from genetic data will drive future advances in the life sciences. However, DNA sequences are being generated faster than they can be analyzed with existing computing technologies and algorithms. The core computations performed by many genomic applications involve pattern matching. This operation is normally implemented using automata-based algorithms and can be efficiently mapped onto non-general-purpose platforms such as Field Programmable Gate Arrays (FPGAs) and Micron's recently announced Automata Processor (AP). However, the lack of high-level programming interfaces for these devices hampers their adoption in the bioinformatics community.

This project fills this gap by developing novel programmatic descriptions of several genomic analyses and mapping them onto these two non-traditional architectures. The work advances the state of the art in several ways. At the algorithmic level, new methods to address the biological problems of genome-scale orthology inference and regulatory motif search are being developed. At the computational abstraction level, the researchers are designing an extended finite automaton abstraction suitable to support diverse computations, and are mapping new and existing computational kernels onto it. At the hardware mapping level, automatic tuning techniques for the effective deployment of automata-based computations on FPGAs and Micron's AP are being developed.

This interdisciplinary project will facilitate the adoption of FPGAs and Micron's Automata Processor by biologists by providing a new library of pattern matching routines and a high-level automata-based programming interface for these platforms. In addition, the researchers are developing instructional material on a variety of topics, such as genomic analysis, pattern matching, automata processing and high-performance computing. Finally, this project provides undergraduate and graduate students with research opportunities and access to pre-production hardware, and supports interdisciplinary training and technology transfer to industry. The results of this research will be made available through the release of software tools and publication in international conferences and journals.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 1421765
Program Officer: Almadena Chtchelkanova
Project Start:
Project End:
Budget Start: 2014-08-01
Budget End: 2017-06-30
Support Year:
Fiscal Year: 2014
Total Cost: $348,219
Indirect Cost:
Name: University of Missouri-Columbia
Department:
Type:
DUNS #:
City: Columbia
State: MO
Country: United States
Zip Code: 65211