Rapid advances in sensor, communication, and storage technologies have made data available on an unprecedented scale. Depending on the source, these data may take the form of measurements, images, text, time series, or a variety of other formats. Significant challenges remain in translating this growing volume of data into useful and actionable information. The objective of this project is to address this information dilemma through the lens of modern large-scale optimization. The project supports research on methods for effectively processing large-scale, unstructured, complex data so that they can be used in applications such as bioinformatics, smart energy systems, manufacturing, and healthcare. The project will also engage graduate students in the research activities and will support outreach to undergraduate STEM students through an existing program at the PI's university.
This project will focus on constructing a general optimization and computational framework that enables a number of promising but challenging large-scale, data-intensive applications. The research comprises two major thrusts. The first will build and analyze a novel optimization-based primal-dual decomposition framework that transforms a large, tightly coupled, non-convex problem into a sequence of independent subproblems that can be solved on parallel machines. The second will apply the decomposition framework to a number of important emerging data-intensive applications, including high-dimensional clustering, topic modeling, and robust high-dimensional regression. Fundamental questions, such as optimality, convergence rates, and scalability in high dimensions, will be investigated. The project will test the developed methods using data from two important energy applications: smart energy meters and real-time residential photovoltaic inverters.
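To make the idea of primal-dual decomposition concrete, the sketch below shows one well-known instance of the general pattern, consensus ADMM, applied to a least-squares problem whose data are split across several blocks. It is an illustration only and not the framework proposed in this project: the problem here is convex, whereas the project targets non-convex problems, and the function name `consensus_admm`, the penalty parameter `rho`, and the synthetic data are all assumptions chosen for the example. Each block's primal update depends only on that block's data, so in principle the updates could be dispatched to separate machines; here they run in a simple loop.

```python
import numpy as np


def consensus_admm(blocks, rho=1.0, iters=500):
    """Illustrative consensus ADMM for min_x sum_i 0.5 * ||A_i x - b_i||^2.

    blocks: list of (A_i, b_i) pairs, one per (hypothetical) worker.
    rho:    penalty parameter of the augmented Lagrangian (assumed value).
    """
    n = blocks[0][0].shape[1]
    z = np.zeros(n)                      # shared consensus variable
    us = [np.zeros(n) for _ in blocks]   # scaled dual variables, one per block

    # Quantities that depend only on each block's own data.
    lhs = [A.T @ A + rho * np.eye(n) for A, _ in blocks]
    atb = [A.T @ b for A, b in blocks]

    for _ in range(iters):
        # Primal step: each solve uses only local data, so the solves are
        # independent and could run in parallel.
        xs = [np.linalg.solve(L, r + rho * (z - u))
              for L, r, u in zip(lhs, atb, us)]
        # Consensus step: average the local estimates plus duals.
        z = np.mean([x + u for x, u in zip(xs, us)], axis=0)
        # Dual step: accumulate each block's disagreement with the consensus.
        us = [u + x - z for x, u in zip(xs, us)]
    return z


# Synthetic example: four blocks of 30 observations each, 5 unknowns.
rng = np.random.default_rng(0)
x_true = rng.normal(size=5)
blocks = []
for _ in range(4):
    A = rng.normal(size=(30, 5))
    b = A @ x_true + 0.01 * rng.normal(size=30)
    blocks.append((A, b))

x_admm = consensus_admm(blocks, rho=30.0)

# Reference: solve the stacked (centralized) least-squares problem directly.
A_all = np.vstack([A for A, _ in blocks])
b_all = np.concatenate([b for _, b in blocks])
x_ref = np.linalg.lstsq(A_all, b_all, rcond=None)[0]
```

In this convex setting the decomposed solution `x_admm` should agree with the centralized solution `x_ref` up to numerical tolerance. Whether and how fast such schemes converge for non-convex problems is among the fundamental questions the project sets out to investigate.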