The California-Harvard Astrostatistics Collaboration aims to design and implement fully model-based methods of statistical inference to solve outstanding data analytic problems in high-energy astrophysics. The Collaboration's methods explicitly model the complexities of both astronomical sources and the data generation mechanisms inherent in new high-tech instruments and fully utilize the resulting highly structured models in learning about the underlying astronomical and physical processes. Using these models requires sophisticated scientific computation, advanced methods for statistical inference, and careful model checking procedures. The PIs of the Collaboration (van Dyk and Meng) both have substantial research experience in developing the methods that the Collaboration is extending, employing, and publicizing: inferential and efficient computational methods under highly structured models that involve multiple levels of latent variables and incomplete data. Such models are ideally suited to account for the many physical and instrumental filters that compose the data generation mechanism in high-energy astrophysics. The five consultants on the project (Chiang, Connors, Kashyap, Karovska, and Siemiginowska) all have expertise in the instrumentation and science of high-energy astrophysics, and all have collaborated with statisticians in efforts to develop appropriate methods to address scientific questions. This project has two primary impacts: the effect of more reliable statistical methods on scientific findings in astronomy, and the effect of the new statistical inference and computation methods in a wide range of scientific fields. First, as the Collaboration develops methods and distributes free software for specific inferential tasks, it also educates the astronomical community about the benefits of careful use of sophisticated statistical methods.
(The Collaboration organizes one or two special sessions at meetings of the American Astronomical Society each year.) It is expected that a fundamental impact of the proposed research will be a more general acceptance and more prevalent use of appropriate methods among astronomers. Second, the Collaboration is an example of a new mode of statistical inference. Rather than using off-the-shelf models and methods, it is becoming ever more feasible to develop application-specific models that are designed to account for the particular complexities of the problem at hand. The Collaboration develops inferential and computational methods for handling such multi-level models. As application-specific multi-level models become more prevalent, these methods will have application throughout the natural, social, and engineering sciences.
In recent years, there has been an explosion of new data in observational high-energy astrophysics. Recently launched or soon-to-be launched space-based telescopes that are designed to detect and map ultraviolet, X-ray, and gamma-ray electromagnetic emission are opening a whole new window to study the cosmos. Because the production of high-energy electromagnetic emission requires temperatures of millions of degrees and is an indication of the release of vast quantities of stored energy, these instruments give a completely new perspective on the hot and turbulent regions of the universe. The new instrumentation allows for very high resolution imaging, spectral analysis, and time series analysis. The Chandra X-ray Observatory, for example, produces images at least thirty times sharper than any previous X-ray telescope. The complexity of the instruments, the complexity of the astronomical sources, and the complexity of the scientific questions lead to a subtle inference problem that requires sophisticated statistical tools. For example, data are subject to non-uniform censoring, errors in measurement, and background contamination. Astronomical sources exhibit complex and irregular spatial structure. Scientists wish to draw conclusions as to the physical environment and structure of the source, the processes and laws that govern the birth and death of planets, stars, and galaxies, and ultimately the structure and evolution of the universe. Nonetheless, little effort has been made to bring the strength of modern statistical methods to bear on these problems. The California-Harvard Astrostatistics Collaboration develops statistical methods, computational techniques, and freely available software to address outstanding inferential problems in high-energy astrophysics.
The methods developed are an example of a new mode of statistical inference: rather than using off-the-shelf methods, it is becoming ever more feasible to develop methods that are application-specific and are designed to account for the particular complexities of the problem at hand. The inferential and computational methods designed by the Collaboration for handling the resulting multi-level models have application throughout the natural, social, and engineering sciences.