Today's users of scientific computing facilities have easy access to thousands of processors. However, this bounty of processing power has created a data crisis: a conventional computing system often dispatches hundreds or thousands of jobs that simultaneously access a centralized storage server, which inevitably becomes a bottleneck. To support large data-intensive applications, clusters must expose control of their internal storage and computing resources to an external scheduler that can make more informed placement decisions. This technique is called deconstructing clusters.
This project attacks a particular data-intensive problem in high-end biometric research: the pair-wise comparison of hundreds of thousands of face images. The technique of deconstructing clusters will be used to parallelize this workload across large computing clusters. If successful, the project will reduce the time needed to develop and analyze a new biometric matching algorithm from years to days, thus improving the productivity of biometric researchers. The broader impact upon society will be improved accuracy and efficiency of biometric identification for commercial and national-security applications. The software will be released in open-source form to benefit other scientific computations that share this pair-wise computation model.
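To make the pair-wise computation model concrete, the sketch below shows the structure of an all-pairs workload and how its comparison matrix might be partitioned into rectangular blocks, each suitable for dispatch as an independent job near its data. The `similarity` function is a hypothetical stand-in for a real biometric matcher, and the block-partitioning scheme is one illustrative choice, not the project's actual scheduler.

```python
from itertools import combinations

def similarity(a, b):
    # Hypothetical placeholder for a face-matching algorithm:
    # here, simply the fraction of positions where the two byte
    # strings agree. A real matcher would compare image features.
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

def all_pairs(items, compare):
    # Compute the upper triangle of the pair-wise comparison
    # matrix: one score for each unordered pair of items.
    return {(i, j): compare(items[i], items[j])
            for i, j in combinations(range(len(items)), 2)}

def blocks(n, block_size):
    # Partition the n-by-n upper triangle into rectangular blocks.
    # Each block (r0, r1, c0, c1) can run as one cluster job that
    # needs only two slices of the data set, not all of it.
    starts = range(0, n, block_size)
    return [(r, min(r + block_size, n), c, min(c + block_size, n))
            for r in starts for c in starts if c >= r]

if __name__ == "__main__":
    faces = [b"abcd", b"abce", b"xbcd", b"wxyz"]  # toy data set
    matrix = all_pairs(faces, similarity)
    print(len(matrix), len(blocks(len(faces), 2)))
```

For N images the matrix holds N(N-1)/2 comparisons, so hundreds of thousands of images yield tens of billions of pairs; blocking keeps each job's input small while covering the whole triangle.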