A host of problems in scientific research, security, and commerce involve events registered by many devices in multiple locations. The result is fragmented information that must be gathered and built into a coherent whole. In addition, these events may come in rapid succession. When the event rate is high and the number of fragments is large, the problem comes to resemble assembling tens, hundreds, or even thousands of puzzle pieces that are continually being dumped into a common container. Further, puzzle pieces can become damaged or lost, introducing errors into the assembly process. These challenges are well studied in the field of computational (nanoscale) self-assembly, which models processes such as the growth of crystals from organic molecules in solution. This project adapts computational self-assembly models to create a new paradigm that treats pieces of information from multiple sensors like molecules randomly meeting and assembling in solution. The result is a dynamic, fluid database of information chunks that evolve over time into complete, accurate associations. The approach is applied to assembling data from the telescope arrays of very-high-energy gamma-ray observatories. A successful proof of concept in this domain is of interest to more than high-energy astrophysicists: the methods developed here are relevant to high-data-volume experiments in other areas of physics and may have further applications to data transport and mining problems in the economic and security sectors.
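
The self-assembly analogy can be made concrete with a small sketch. The toy Python code below is only illustrative: the names Fragment, Cluster, and COINCIDENCE_NS, and the specific binding rule, are assumptions for this example, not identifiers or rules from the project. Single-fragment "molecules" meet at random in a shared pool and stick together whenever their timestamps agree within a coincidence window and no sensor appears twice.

    import random
    from dataclasses import dataclass, field

    COINCIDENCE_NS = 50          # assumed binding rule: timestamps within 50 ns
    RANDOM_ENCOUNTERS = 10_000   # number of random "collisions" to simulate

    @dataclass
    class Fragment:
        sensor_id: int
        timestamp_ns: int
        payload: bytes = b""

    @dataclass
    class Cluster:
        fragments: list = field(default_factory=list)

        def mean_time(self):
            return sum(f.timestamp_ns for f in self.fragments) / len(self.fragments)

        def binds(self, other):
            # Binding rule: clusters stick if they share no sensor (one fragment
            # per telescope) and their mean timestamps are close enough.
            mine = {f.sensor_id for f in self.fragments}
            theirs = {f.sensor_id for f in other.fragments}
            if mine & theirs:
                return False
            return abs(self.mean_time() - other.mean_time()) <= COINCIDENCE_NS

    def self_assemble(fragments):
        # Every fragment starts as its own single-molecule cluster.
        pool = [Cluster([f]) for f in fragments]
        for _ in range(RANDOM_ENCOUNTERS):
            if len(pool) < 2:
                break
            a, b = random.sample(pool, 2)   # a random encounter in "solution"
            if a.binds(b):
                pool.remove(a)
                pool.remove(b)
                pool.append(Cluster(a.fragments + b.fragments))
        return pool  # surviving clusters approximate the assembled events

Because binding decisions are local and repeated, damaged or late-arriving fragments simply remain in the pool rather than corrupting an already-built event, which is the fault-tolerance property the paradigm aims for.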

This radically different method of fault-tolerant association of information from distributed sensors requires a proof-of-concept study, which will take place over a two-year period. The chosen test case is scientific. Very-high-energy gamma rays and cosmic rays initiate showers of charged particles in Earth's atmosphere, which in turn produce light due to an effect known as Cherenkov radiation. Arrays of atmospheric Cherenkov telescopes sample the light from a shower from multiple directions in order to more accurately infer the origin and energy of a given gamma ray. Assembling data from these telescopes into a description of a single gamma- or cosmic-ray shower (event-building) is typically done only once. Since revisiting the event-building process is impractical for a large (up to 100 petabytes per year) volume of data, errors become frozen into the data archive. This problem is addressed by the algorithmic self-assembly paradigm. Real and simulated data from the operating gamma-ray observatory VERITAS and simulated data from a planned next-generation observatory, the Cherenkov Telescope Array (CTA), are used to develop the concept and iteratively design, prototype, and test simple implementations for these instruments. Novel signal processing techniques will be exploited to rapidly extract information used in the association process. A series of use-case-dependent benchmarks is used to assess the performance. CTA's size, roughly 100 telescopes distributed over a square kilometer, and high (30 gigabytes per second) data rates make it a particularly apt test case, and a successful proof of concept could lead to adoption of this model by CTA.
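
As a hedged illustration of what such a benchmark could look like on simulated data, the helper below computes a pair-counting purity: how often two fragments placed in the same assembled cluster truly come from the same simulated shower. It is an assumed example metric, not one of the project's actual benchmarks, and it does not rely on any real VERITAS or CTA data format.

    from collections import Counter

    def cluster_purity(clusters, true_shower_of):
        """clusters: iterable of lists of fragment ids produced by the assembler;
        true_shower_of: dict mapping fragment id -> simulated shower id."""
        same_shower_pairs = 0
        all_pairs = 0
        for cluster in clusters:
            labels = [true_shower_of[f] for f in cluster]
            n = len(labels)
            all_pairs += n * (n - 1) // 2
            # count pairs of fragments in this cluster that share a true shower
            for count in Counter(labels).values():
                same_shower_pairs += count * (count - 1) // 2
        return same_shower_pairs / all_pairs if all_pairs else 1.0

A completeness score (how many true same-shower pairs end up in a common cluster) would be computed analogously; together the two quantify the accuracy of the associations.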

Agency: National Science Foundation (NSF)
Institute: Division of Physics (PHY)
Application #: 1419259
Program Officer: Bogdan Mihaila
Budget Start: 2014-07-15
Budget End: 2017-06-30
Fiscal Year: 2014
Total Cost: $169,034