In the past 200 years, physicists have discovered the basic constituents of ordinary matter and developed a very successful theory to describe the interactions (forces) between them. All atoms, and the molecules from which they are built, can be described in terms of these constituents. The nuclei of atoms are bound together by strong nuclear interactions, and their decays result from strong and weak nuclear interactions. Electromagnetic forces bind electrons and nuclei into atoms, and bind atoms into molecules. The electromagnetic, weak nuclear, and strong nuclear forces are described in terms of quantum field theories. The predictions of these theories can be extremely precise, and they have been validated with equally precise experimental measurements. Most recently, a new fundamental particle required to unify the weak and electromagnetic interactions, the Higgs boson, was discovered at the Large Hadron Collider (LHC), located at the CERN laboratory in Switzerland.

Despite the vast amount of knowledge acquired over the past century about the fundamental particles and forces of nature, many important questions remain unanswered. For example, most of the matter in the universe that interacts gravitationally does not have ordinary electromagnetic or nuclear interactions. Because it has only been observed via its gravitational interactions, it is called dark matter. What is it? Equally interesting, why is there so little anti-matter in the universe when the fundamental interactions we know describe matter and anti-matter as almost perfect mirror images of each other? The LHC was built to discover and study the Higgs boson and to search for answers to these questions. The first data-taking run of the LHC (Run 1, 2010-2012) was a huge success, producing over 1000 journal articles, highlighted by the discovery of the Higgs boson. The current run (Run 2, 2015-present) has already produced many world-leading results; however, the most interesting questions remain unanswered. The LHCb experiment, located at the LHC at CERN, has unique potential to answer some of these questions. LHCb is searching for signals of dark matter produced in high-energy particle collisions at the LHC, and performing high-precision studies of rare processes that could reveal the existence of as-yet-unknown forces that caused the matter/anti-matter imbalance observed in our universe.

The primary goal of this project - supported by the Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering and by the Physics Division and the Division of Mathematical Sciences in the Directorate for Mathematical and Physical Sciences - is to develop and deploy software utilizing machine learning (ML) that will enable the LHCb experiment to significantly improve its discovery potential in Run 3 (2021-2023). Specifically, the ML algorithms developed will greatly increase sensitivity to many proposed types of dark matter and new forces by making it possible to identify and study potential signals much more efficiently using the finite computing resources available.

The data sets collected by the LHC experiments are some of the largest in the world. For example, the sensor arrays of the LHCb experiment, on which both PIs work, produce about 100 terabytes of data per second, close to a zettabyte of data per year. Even after drastic data reduction performed by custom-built read-out electronics, the data volume is still about 10 exabytes per year, comparable to the largest-scale industrial data sets. Such large data sets cannot be stored indefinitely; therefore, all high energy physics (HEP) experiments employ a data-reduction scheme executed in real time by a data-ingestion system - referred to as a trigger system in HEP - that decides whether each event is to be persisted for future analysis or permanently discarded. Trigger-system designs are dictated by the rate at which the sensors can be read out, the computational power of the data-ingestion system, and the available storage space for the data. The LHCb detector is being upgraded for Run 3 (2021-2023), when the trigger system will need to process 25 exabytes per year. Currently, only 0.3 exabytes of the 10 exabytes per year processed by the trigger are analyzed using high-level computing algorithms; the rest is discarded prior to this stage using simple algorithms executed on FPGAs. To process all of the data on CPU farms, ML will be used to develop and deploy new trigger algorithms.

The specific objectives of this proposal are to more fully characterize LHCb data using ML and to build algorithms based on these characterizations that replace the most computationally expensive parts of the event pattern recognition, increase the performance of the event-classification algorithms, and reduce the number of bytes persisted per event without degrading physics performance. Many potential explanations for dark matter and the matter/anti-matter asymmetry of our universe are currently inaccessible due to trigger-system limitations. As HEP computing budgets are projected to remain approximately flat moving forward, the LHCb trigger system must be redesigned for the experiment to realize its full potential. This redesign must go beyond simply scaling up existing technology; radical new strategies are needed.
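To make the event-classification idea above concrete, the following is a minimal sketch, assuming a generic gradient-boosted classifier from scikit-learn and entirely synthetic features, labels, and rates; it is not LHCb's trigger software, but it shows how a per-event ML score combined with a bandwidth-driven threshold implements the persist-or-discard decision.

# Purely illustrative sketch: a toy ML persist-or-discard "trigger" filter.
# The features, labels, rates, and thresholds here are hypothetical and are
# not taken from the LHCb trigger; the point is only to show how an event
# classifier plus a bandwidth-driven threshold yields the decision above.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Toy "events": a few per-event summary features (stand-ins for quantities
# such as track multiplicity or summed transverse momentum).
n_events = 20_000
X = rng.normal(size=(n_events, 3))
# Toy labels: 1 = signal-like event worth persisting, 0 = background.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_events) > 1.5).astype(int)

clf = GradientBoostingClassifier(n_estimators=100, max_depth=3)
clf.fit(X, y)

# In a trigger-like setting the working point is set by the output bandwidth:
# choose a score threshold so that only a fixed fraction of events is
# persisted, and permanently discard the rest.
persist_fraction = 0.03  # hypothetical bandwidth budget
scores = clf.predict_proba(X)[:, 1]
threshold = np.quantile(scores, 1.0 - persist_fraction)
persisted = scores >= threshold

signal_efficiency = (scores[y == 1] >= threshold).mean()
print(f"persisted {persisted.mean():.1%} of events; "
      f"signal efficiency {signal_efficiency:.1%}")

In practice the classifier, its input features, and the persisted fraction would be determined by the detector data and the Run 3 bandwidth budget rather than the synthetic values used in this sketch.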

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1740102
Program Officer: Seung-Jong Park
Project Start:
Project End:
Budget Start: 2017-09-01
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2017
Total Cost: $224,621
Indirect Cost:
Name: University of Cincinnati
Department:
Type:
DUNS #:
City: Cincinnati
State: OH
Country: United States
Zip Code: 45221