A fundamental class of problems is collectively named Connect-the-Dots (CTD). CTD problems have the following common components: (1) a random point set, (2) a prescribed functional class, and (3) a goal of finding a maximum subset of the point set that a function from the functional class can pass through. Example problems include maximizing the number of points on a monotone or convex function, or on a surface of bounded curvature. The principal project goal is to develop the computational and probabilistic underpinnings for a new family of statistical tests and solution procedures for CTD. The PIs propose to implement these as tools in MATLAB, making them freely available on the internet. There are two related aspects of the CTD problems: computational and statistical. The computational aspect is to design efficient numerical algorithms to search for the maximum subset and its associated function. This involves finding extensions of low computational complexity algorithms to high dimensions, searching for efficient algorithms for several functional classes, and investigating possibilities of nearly-exact algorithms. Preliminary work indicates that dynamic programming plays a key role, often in a highly non-trivial way. The statistical aspect concerns properties of the maximum subset, such as (1) the limit distribution of the size of the maximum subset of a random point cloud, (2) the `typical' shape of the maximum subset or associated function, and (3) convergence properties and rates. The computational results aid in the statistical work by permitting empirical investigation.

If successful, the proposed work may have impact in many scientific fields: (1) CTD problems are related to the Longest Increasing Subsequence and Random Traveling Salesman problems, which have been studied in probability; (2) CTD problems have a link with Geometric Discrepancy Theory -- a new branch in pure mathematics; (3) CTD problems are related to Filament Detection in image processing, and more generally are a class of pattern recognition problems; (4) CTD problems have applications in Vision, which has grown into an interdisciplinary field between computer science and psychology; (5) CTD can potentially be utilized to analyze and develop Human Interactive Proofs, which are increasingly popular in Internet security applications. (6) CTD problems are related to random matrices (RM), which have been an emerging research topic in statistics. (7) CTD problems have surprising links to applications such as disk scheduling and airplane boarding. (8) CTD problems have connections with problems in statistical physics.

Project Start
Project End
Budget Start
2007-09-01
Budget End
2010-08-31
Support Year
Fiscal Year
2007
Total Cost
$248,741
Indirect Cost
Name
Georgia Tech Research Corporation
Department
Type
DUNS #
City
Atlanta
State
GA
Country
United States
Zip Code
30332