Spectral graph theory, also known as algebraic graph theory, is the study of the relationship between the eigenvalues and eigenvectors of graphs and their combinatorial properties. Random walks on graphs, expander graphs, clustering, and several other combinatorial aspects of graphs are intimately connected to their spectral properties. Recent approaches to the analysis of high-dimensional data have exploited the fundamental eigenvectors of the data. These data sets are large and ever increasing, requiring accurate "real-time" responses to queries. This creates the need for very fast algorithms that also provide strict theoretical guarantees on their output. Spectral techniques have been applied to image processing, both by computers and in the primary visual cortex of monkeys. A critical component of all these applications is algorithms with efficiency and accuracy guarantees for solving the associated linear systems and finding their fundamental eigenvectors.

A multidisciplinary team consisting of theoretical computer scientists, machine learning scientists, and neuroscientists will develop and apply spectral graph theory to applications ranging from data mining to clustering and image processing. Enabling technology development will include: 1) linear-work or O(m log m)-work algorithms that run in polylogarithmic parallel time for computing extreme eigenvalues and generalized eigenvalues of diagonally dominant matrices, including Laplacian matrices, as well as algorithms of similar complexity for solving the related linear systems; and 2) better estimates for Fiedler values and generalized Fiedler values. Application development will include: 1) improvements in spectral image segmentation; 2) the use of generalized eigenvalues in data mining and image segmentation to combine multiple sources of information; 3) the use of preconditioners for approximate inference in graphical models; and 4) combining insights into the problem of image segmentation gained from spectral algorithms with knowledge from recent experiments on the visual system of monkeys to better understand how the primary visual cortex functions.
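To make the central quantity concrete: the Fiedler value of a graph is the second-smallest eigenvalue of its Laplacian L = D - A, and the corresponding eigenvector underlies spectral partitioning. The following is a minimal illustrative sketch; the example graph and all names are ours, and it uses an off-the-shelf eigensolver rather than the fast algorithms proposed above.

```python
import numpy as np
import scipy.sparse as sp

# Illustrative example: a path graph on 6 vertices, given by its edge list.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
n = 6

# Build the graph Laplacian L = D - A, a symmetric diagonally dominant matrix.
rows, cols = zip(*edges)
A = sp.coo_matrix((np.ones(len(edges)), (rows, cols)), shape=(n, n))
A = A + A.T                                   # symmetrize the adjacency matrix
L = sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A

# For this tiny graph a dense eigensolver suffices; at scale one would use
# sparse iterative methods, which are the subject of the proposal above.
vals, vecs = np.linalg.eigh(L.toarray())      # eigenvalues in ascending order
fiedler_value, fiedler_vector = vals[1], vecs[:, 1]
print(fiedler_value)                          # about 0.268 for the 6-path
print(np.sign(fiedler_vector))                # sign pattern bisects the path
```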

Project Report

The proposal was a collaboration between four institutions: Berkeley, Carnegie Mellon, USC, and Yale. The work and findings described here were performed at CMU. The main goal of this research was to find new, more efficient computer algorithms for many problems that arise in computer science, engineering, the sciences, and medicine. The research focused on three main algorithmic areas: combinatorial graph theory, linear algebra, and computational geometry. The first two areas combine to form an area called spectral graph theory, and we describe our work in that area first.

Spectral graph theory uses graph algorithms to help solve problems in linear algebra, and linear algebra to help solve problems in graph theory. Possibly the most basic question in linear algebra is finding a solution to a system of simultaneous linear equations; humans have been studying this problem for at least two thousand years. The most famous method is Gaussian elimination. Unfortunately, as systems gain more degrees of freedom this method becomes very slow and requires a great deal of space: it runs in time cubic, and space quadratic, in the number of variables being solved. As an example of what this means, suppose we want to solve an image problem coming from an iPhone with about ten million pixels. Gaussian elimination would take about a thousand billion billion steps, a very large number even for a big machine.

Our approach to this problem was to find methods for special yet useful cases of linear system solving. The class of systems we focused on is called symmetric diagonally dominant (SDD) systems. In work partially funded by this grant and by more recent NSF grants, we found an algorithm that solves SDD systems to constant precision in approximately m log m time, where m is the number of nonzero coefficients in the system. For our image example, and ignoring constants, our algorithm would take about a billion steps. This improves on the fairly complicated algorithm of Spielman and Teng, which ran in m times some large number of logarithmic factors. The new algorithm is yet to be implemented, but an earlier algorithm that we developed, also partially funded by this grant, has been implemented, made available to researchers, and used by us to demonstrate the importance of solving the special case of SDD systems. There has been a growing number of applications of fast SDD solvers, including solving the heat equation, image denoising, image segmentation, generating a random spanning tree of a graph, and finding a maximum flow in a graph. For each of these applications, the asymptotically fastest known algorithm relies on the existence of fast SDD solvers.
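As a rough illustration of what solving an SDD system involves, the sketch below applies a generic iterative method to an image-style grid Laplacian. This is not the near-linear-time solver developed under this grant, and the grid size and all names here are our own choices.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def grid_laplacian(k):
    """Laplacian of a k-by-k grid graph, the prototypical SDD matrix
    arising in image processing."""
    P = sp.diags([np.ones(k - 1), np.ones(k - 1)], [-1, 1])   # path adjacency
    I = sp.identity(k)
    A = sp.kron(P, I) + sp.kron(I, P)                         # grid adjacency
    return sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A

k = 64                                  # a small 64x64 "image"
L = grid_laplacian(k).tocsr()
L = L + 1e-8 * sp.identity(k * k)       # regularize: the Laplacian is singular

b = np.random.default_rng(0).standard_normal(k * k)
b -= b.mean()                           # keep b in the range of the Laplacian

# Conjugate gradient costs O(m) work per iteration, where m is the number of
# nonzeros; fast SDD solvers bound the total work at roughly m log m by
# pairing such iterations with carefully built graph preconditioners.
x, info = spla.cg(L, b)
print(info, np.linalg.norm(L @ x - b))  # info == 0 means CG converged
```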
The second main research effort was in the area of computational geometry. In particular, we focused on the problem of efficiently generating finite element meshes, more simply known as mesh generation. As an example, suppose one wants to simulate heat dissipation on a computer chip and determine the temperature at all points on the chip. In the finite element approach one needs a way to represent the function of approximate temperatures to be returned. A standard approach is to decompose the chip into triangles for a 2-dimensional simulation, or into tetrahedra for a full 3-dimensional simulation. After the appropriate system is solved, the answer is a function that is piecewise linear on each element, triangle or tetrahedron. In order for the returned function to be a good approximation to the real solution, the triangles and tetrahedra must also be well shaped. In work funded by this and more recent NSF grants, we found an optimal-time algorithm for the meshing problem: the algorithm runs in n log n + m time, up to constant factors, where n is the description size of the domain and m is the size of the mesh generated. Mesh generation is a critical step in a large number of applications, from more efficient cars, windmills, and aircraft to power plants. Meshes are also used in the design and testing of new medical devices. The existence of robust and fast meshing algorithms should decrease the design time for these applications.
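For concreteness, here is a minimal sketch of the decompose-into-triangles step described above, using an off-the-shelf Delaunay routine on random points (our own toy example). Unlike the meshing algorithms developed in this project, plain Delaunay triangulation of given points does not guarantee well-shaped elements.

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(1)
points = rng.random((200, 2))      # sample points in a unit-square "chip"
tri = Delaunay(points)             # decompose the point set into triangles

print(tri.simplices.shape)         # (number of triangles, 3) vertex indices
# A piecewise-linear finite element solution then stores one value per
# vertex and interpolates linearly inside each triangle.
```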

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0635257
Program Officer: Balasubramanian Kalyanasundaram
Budget Start: 2007-05-01
Budget End: 2012-04-30
Fiscal Year: 2006
Total Cost: $480,000
Name: Carnegie-Mellon University
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213