There has been a perfect storm of convergent technologies leading to an onslaught of data in digital format, ready to be acquired, stored, accessed, transmitted, and processed. This research develops quantitative analytics that are appropriate for data arising in many novel areas like social networks or urban environments, outside traditional engineering or science. This data is unstructured: it is no longer a single time series or a single image and cannot be naturally arranged in a vector or table. The data is distributed: it originates from many different agents, possibly scattered over a large physical space (e.g., a metro area).
To process unstructured and distributed Big Data, the research extends traditional signal processing methods to distributed signal processing on graphs (DSPG) by associating two graphs with the data: 1) the "physical" graph, whose nodes index the data and whose physical edges capture the relations or dependencies among the data; and 2) the "cyber" graph, possibly different from the physical graph, whose nodes index distributed processing units and whose cyber edges represent (local) communication channels among these units. To extend DSPG analytics to distributed Big Data, the investigators address the following challenges: 1) discover the structure of the underlying physical graph; and 2) design scalable consensus+innovations distributed algorithms to accurately process the distributed Big Data. Examples of important analytics include: forward filtering, which computes y = Hx, where H is a graph filter or graph transform and x is the data associated with the agents; and inverse problems, which compute solutions to linear systems Hx = y, where H and y are given and x is reconstructed.
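The two analytics named above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the project: the five-node ring graph, the choice of H as a polynomial in the adjacency matrix A (H = h0 I + h1 A + h2 A^2), the filter taps, and the helper names ring_adjacency, apply_filter_locally, and solve_inverse. The inverse problem is solved here with a plain Richardson iteration, which is not the consensus+innovations algorithm; it is used only because each step needs nothing more than exchanges between neighboring nodes.

```python
import numpy as np

def ring_adjacency(n):
    """Adjacency matrix of an undirected ring with n nodes (hypothetical example graph)."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = 1.0
        A[i, (i - 1) % n] = 1.0
    return A

def apply_filter_locally(A, taps, x):
    """Forward filtering y = Hx with H = sum_k taps[k] * A^k.

    H is never formed explicitly. Each multiplication by A corresponds to
    one round in which every node sums the current values of its neighbors.
    """
    shifted = x.copy()
    y = taps[0] * shifted
    for h in taps[1:]:
        shifted = A @ shifted      # one round of neighbor exchanges
        y = y + h * shifted
    return y

def solve_inverse(A, taps, y, step=0.5, iters=200):
    """Reconstruct x from Hx = y with a Richardson iteration (assumed solver).

    Update: x <- x + step * (y - Hx). For a symmetric positive definite H
    this converges when step * lambda_max(H) < 2.
    """
    x = np.zeros_like(y)
    for _ in range(iters):
        x = x + step * (y - apply_filter_locally(A, taps, x))
    return x

n = 5
A = ring_adjacency(n)
taps = [1.0, 0.5, 0.25]            # hypothetical filter coefficients h0, h1, h2
x_true = np.array([1.0, -2.0, 0.5, 3.0, 0.0])

# Explicit H, built only to check the local computation against y = Hx.
H = taps[0] * np.eye(n) + taps[1] * A + taps[2] * (A @ A)

y = apply_filter_locally(A, taps, x_true)   # forward filtering
x_rec = solve_inverse(A, taps, y)           # inverse problem

print("forward filter matches Hx:", np.allclose(y, H @ x_true))
print("reconstruction matches x: ", np.allclose(x_rec, x_true))
```

With these taps the eigenvalues of H on the ring lie between roughly 0.85 and 3, so a step size of 0.5 satisfies the convergence condition stated in the docstring. A different graph or different taps would require rechecking that condition.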