This SGER proposes to develop statistical techniques that take current information about the load on a set of clusters, combine it with the known characteristics of a user's job, and estimate the wait time that would result if the job were submitted to each of those clusters. The hope is that this will lead to shorter turnaround times for users and more efficient use of the clusters.
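To make the idea concrete, the following is a minimal sketch, not the project's actual technique. It assumes that each cluster's recent queue history stands in for its current load, that the only job characteristic used is the requested node count, and that an empirical upper quantile of waits for similarly sized jobs serves as the wait-time estimate. All names here (`PastJob`, `estimate_wait`, `pick_cluster`) and the factor-of-two similarity rule are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PastJob:
    nodes: int           # nodes the job requested
    wait_seconds: float  # observed time spent waiting in the queue


def empirical_quantile(values: List[float], q: float) -> float:
    """Nearest-rank empirical quantile of a non-empty list, 0 < q <= 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def estimate_wait(history: List[PastJob], job_nodes: int, q: float = 0.95) -> float:
    """Estimate a wait-time bound from recent jobs of similar size.

    Assumption: "similar" means within a factor of two in node count.
    If no similar job exists, fall back to the whole history.
    """
    similar = [j.wait_seconds for j in history
               if job_nodes / 2 <= j.nodes <= job_nodes * 2]
    waits = similar or [j.wait_seconds for j in history]
    return empirical_quantile(waits, q)


def pick_cluster(histories: Dict[str, List[PastJob]], job_nodes: int,
                 q: float = 0.95) -> Tuple[str, float]:
    """Return the cluster with the smallest estimated wait, and that estimate."""
    estimates = {name: estimate_wait(h, job_nodes, q)
                 for name, h in histories.items() if h}
    best = min(estimates, key=estimates.get)
    return best, estimates[best]


# Illustrative, made-up histories for two hypothetical clusters.
histories = {
    "cluster_a": [PastJob(4, 100.0), PastJob(4, 200.0), PastJob(64, 5000.0)],
    "cluster_b": [PastJob(4, 50.0), PastJob(8, 60.0), PastJob(64, 9000.0)],
}
small_choice = pick_cluster(histories, job_nodes=4)
large_choice = pick_cluster(histories, job_nodes=64)
```

With these made-up numbers, the small job is directed to one cluster and the large job to the other, which illustrates why the estimate has to depend on both the cluster's state and the job's own characteristics.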
Given the rapidly evolving nature of grid computing and the need to develop adequate statistical techniques, this project carries considerable risk. However, the PI's track record in related projects gives confidence that both the problem and the nature of the required statistical techniques are well understood. The key mathematical and statistical work will be done by John Brevik, who has collaborated with the PI on related problems in the application of mathematics to grid computing since at least 2001. Longstanding collaborations with the San Diego Supercomputer Center and with the Texas Advanced Computing Center ensure that the techniques will have access to current, relevant data and will be applied using grid software being developed at the Texas Advanced Computing Center. Thus, there is strong potential for rapid translation of advances in mathematics and statistics into useful software applicable to the Extensible Terascale Facility.