This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).

The California-Boston AstroStatistics Collaboration is developing a new model-based strategy for statistical inference that embeds computer models in multilevel models that explicitly account for the complexities of both astronomical sources and the data-generation mechanisms inherent in new high-tech telescopes. The resulting highly structured models must be fully exploited in order to learn about the underlying astronomical and physical processes. This strategy requires state-of-the-art scientific computation, advanced methods for statistical inference, and careful model-checking procedures. The Collaboration has a track record of using these methods to solve outstanding data-analytic problems in astronomy. In addition, the PIs (van Dyk, Meng, and Yu) have substantial research experience in developing the methods that the Collaboration will extend, employ, and publicize: inferential and efficient computational methods under highly structured models that involve multiple levels of latent variables and incomplete data. Such models are ideally suited to account for the many physical and instrumental filters of the data-generation mechanism in high-energy astrophysics. The five astronomers (Chiang, Connors, Kashyap, Kelly, and Siemiginowska) all have expertise in the instrumentation and science of high-energy and/or optical astronomy, and all have collaborated with statisticians in efforts to develop appropriate methods to address scientific questions. The Collaboration specifically aims to develop a mixture of parametrized and flexible multi-scale models that can be combined with complex computer models to describe spectral, spatial, and timing data, either marginally or jointly. The models are developed in a fully Bayesian framework that allows us to incorporate external information, provide coherent estimates of uncertainty, and calibrate statistical comparisons of proposed underlying physical models. These methods require the Collaboration to develop sophisticated new statistical computing techniques for Monte Carlo exploration of complex and often multi-modal posterior distributions.

In recent years, technological advances have dramatically increased the quality and quantity of data available to astronomers. Newly launched or soon-to-be-launched space-based telescopes are tailored to data-collection challenges associated with specific scientific goals. These instruments provide massive new surveys resulting in new catalogs containing terabytes of data, high-resolution spectrography and imaging across the electromagnetic spectrum, and incredibly detailed movies of dynamic and explosive processes in the solar atmosphere. The spectrum of new instruments is helping scientists make impressive strides in our understanding of the physical universe, but is at the same time generating massive data-analysis challenges for scientists who study the resulting data. The complexity of the instruments, the complexity of the astronomical sources, and the complexity of the scientific questions lead to many subtle inference problems that require sophisticated statistical tools. For example, data are partially missing, are subject to varying measurement errors, and are contaminated with irrelevant artifacts. Scientists wish to draw conclusions as to the physical environment and structure of the source, the processes and laws that govern the birth and death of planets, stars, and galaxies, and ultimately the structure and evolution of the universe. Sophisticated astrophysics-based computer models are used along with complex mathematical models to predict the data observed from astronomical sources and populations of sources. The California-Boston AstroStatistics Collaboration aims to tackle outstanding statistical problems generated in astrophysics by establishing frameworks for the analysis of complex data using state-of-the-art statistical, astronomical, and computer models. In so doing, the Collaboration will not only develop new methods for astronomy but will also use these problems as a springboard in the development of new general statistical methods, especially in signal processing, multilevel modeling, computer modeling, and computational statistics.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0907185
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2009-07-15
Budget End: 2013-06-30
Support Year:
Fiscal Year: 2009
Total Cost: $378,426
Indirect Cost:
Name: Harvard University
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02138