Several application domains require processing terabytes or petabytes of data. Developing data-intensive applications for such massive datasets poses several challenges with respect to data management, processing, and resource allocation. Furthermore, the algorithms and parameters that yield robust and near-optimal results are often determined through an iterative process for which interactive response times are essential. Thus, there is a need to rapidly create scalable, parallel implementations of a variety of data analysis algorithms.
An emerging and critical area requiring large-scale data analysis is medical imaging. Complex algorithms and novel tools are required to analyze such data. In this project, we consider data obtained from fMRI (functional MRI). Driven by this domain, this project focuses on how algorithm design, API design, and runtime system development can be combined to provide effective data-intensive solutions for spatio-temporal data analysis. In particular, we target the following four questions: 1) How can we exploit the map-reduce paradigm to accelerate advances in neuroimaging and related medical fields? 2) What are some of the challenges in using the map-reduce paradigm for neuroimage analysis? 3) What alternative interface to the current map-reduce API can still provide ease of expression of data analysis algorithms, while enabling better performance? 4) Can we use map-reduce and similar paradigms starting from high-level languages, such as MATLAB?
This project will also make substantial contributions towards teaching, human resource development, and increasing diversity. Most of the requested funds will be used to support Ph.D. students on this project. PI Agrawal expects to involve at least one of his three current female Ph.D. students in this project. PI Machiraju also engages with several undergraduate students at the medical campus of OSU.