Kernel matrices in machine learning and scientific computing describe the relationships between collections of points, which may represent many types of information. The increasing size of data sets across disciplines, together with the increasing computational capability of computer hardware, makes it essential that algorithms and software for kernel matrices be scalable, with execution time that grows linearly, or close to linearly, with the problem size. Otherwise, such large-scale data problems may not be tractable. This project addresses the scaling bottlenecks associated with handling the kernel matrix by exploiting a hierarchical structure that is often found in these matrices. By accelerating computations with kernel matrices, this research enables large-scale data analysis and scientific simulation in diverse areas such as uncertainty quantification, integral equation problems, particle simulations, and geostatistics. High-performance software implementing the newly developed methods will be released in an open-source environment.
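The hierarchical structure mentioned above can be illustrated with a small numerical sketch (not the project's actual method, and all names here are illustrative): for a smooth kernel, the off-diagonal block of the kernel matrix coupling two well-separated clusters of points is numerically low-rank, which is the property hierarchical-matrix methods exploit to approach linear scaling.

```python
import numpy as np

# Two well-separated 1-D point clusters (illustrative example only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)  # cluster A
y = rng.uniform(3.0, 4.0, 200)  # cluster B, separated from A

def gaussian_kernel(a, b, ell=1.0):
    """Dense kernel block K[i, j] = exp(-(a_i - b_j)^2 / (2 ell^2))."""
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2.0 * ell**2))

# The 200 x 200 off-diagonal block coupling the two clusters.
K_ab = gaussian_kernel(x, y)

# Its singular values decay rapidly, so only a few are significant:
# the block can be compressed to a low-rank factorization, which is
# the building block of hierarchical kernel-matrix representations.
s = np.linalg.svd(K_ab, compute_uv=False)
numerical_rank = int(np.sum(s > 1e-10 * s[0]))
print(numerical_rank)  # far smaller than the block dimension of 200
```

Storing and applying such blocks in compressed low-rank form, recursively over a tree of point clusters, is what reduces the cost of kernel-matrix operations from quadratic toward near-linear in the number of points.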
This project specifically addresses high-dimensional problems, the use of specialized kernel functions in machine learning, and the high initial computational cost of constructing a hierarchical representation of a kernel matrix. The new methods will be applied to large-scale problems in one scientific application and one machine learning application: Brownian dynamics and Gaussian process regression, respectively. In machine learning, the new methods will complement existing large-scale approaches for Gaussian processes. The high-performance software will address specific scaling challenges in constructing hierarchical matrices.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.