We live in an era in which vast amounts of data are generated at low cost across many domains of science and engineering. However, advances in analytics tools have not kept pace with data generation; in particular, existing tools take too long to run. A main reason is that a computer's core memory cannot hold all the data to be analyzed: most of the data must reside in secondary storage (SS), such as solid-state drives and rotating disks. Data access from SS is several orders of magnitude slower than from core memory, so tremendous speedups can be obtained by minimizing the number of SS accesses. In addition, although there has been much recent research on multicore and GPU algorithms for biological problems, only sequential in-core algorithms are known for many of these problems.

This project develops novel out-of-core algorithms for biological big data (BBD) analytics. The proposed parallel algorithms target a range of architectures, including heterogeneous clusters of multicores and GPUs, to solve BBD problems. The resulting scalable algorithms are designed to handle petabytes of data and beyond, with data mining techniques applicable to varied datasets. This interdisciplinary project provides a new computational suite for mining voluminous biological and other data. It also gives graduate and undergraduate students first-hand research experience in the computational aspects of biological data analysis.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1447711
Program Officer: Almadena Chtchelkanova
Project Start:
Project End:
Budget Start: 2014-09-01
Budget End: 2019-08-31
Support Year:
Fiscal Year: 2014
Total Cost: $1,200,000
Indirect Cost:
Name: University of Connecticut
Department:
Type:
DUNS #:
City: Storrs
State: CT
Country: United States
Zip Code: 06269