Classical statistical methods were developed during much of the 20th century to analyze small-to-moderate sized data sets. The ubiquity of computing devices with vastly increased capabilities has led to the collection and storage of massive data sets. In many fields, such as telecommunications and particle physics, the data sets are essentially infinite: new data arise continuously as connections are established, financial transactions are conducted, and particles interact over time. Moreover, the data are likely to involve many highly correlated variables. For example, data on a transaction may include demographic variables on both the purchaser and the supplier in addition to the amount of the transaction, and data on particles in a physics experiment include characteristics such as velocity, momentum, and mass, all of which can affect the characteristics of other particles.
Such massive data sets require the development of new statistical methods. One important area where such methods are needed is high-energy particle physics. The goal of the proposed research is to develop statistical methods and algorithms for viewing, and identifying patterns in, massive amounts of data, with the specific goal of applying these methods to data from particle physics experiments. This work is expected to lead to: (1) advances in the theory of particle physics through novel uses of experimental databases, (2) new statistical methods for analyzing massive data sets, and (3) enhanced educational experiences for students who plan to pursue interdisciplinary research in the physical sciences.
The PI will build on her undergraduate education in particle physics and her work experience in the Statistical Engineering Division at the National Bureau of Standards (now NIST) and in the microwave test products and research division at Hewlett Packard Company (now Agilent), working under the supervision of Professor Robert G. Jacobsen, a lead researcher in the BaBar collaboration at the Stanford Linear Accelerator Center (SLAC) and at Berkeley (Lawrence Berkeley Laboratory and Department of Physics). Upon returning to Denver, she will introduce students in both statistics and physics to problems in high-energy physics and to statistical methodology for the analysis of massive data sets, and she will apply this experience to the assessment of uncertainty in the calibration models for the primary and secondary frequency standards (cesium fountain; five hydrogen masers; four commercial cesium standards) at the Time and Frequency Division of NIST-Boulder.
Intellectual Merit: Specific outcomes of the proposed research include advances in both physics and statistics. In physics, this research should lead to greater precision and better quantification of the uncertainties associated with specific particle decay reactions involving the B meson, studied at SLAC. This knowledge will also be useful in the PI's collaborations with physicists at NIST-Boulder on the uncertainty in the cesium fountain frequency standard. In statistics, this research should lead to new methods for analyzing massive data sets from such experiments.
Broader Impacts: The PI expects that this collaboration will not only benefit physicists' understanding of decay reactions involving the B meson but also lead the way for broader statistical use of data from particle physics experiments, advance the state of knowledge about the Standard Model, lead to new methods of analyzing massive data sets and streams, and enhance the educational experience for students in computational statistics.
This IGMS project is jointly supported by the MPS Office of Multidisciplinary Activities (OMA) and the Division of Mathematical Sciences (DMS).