From phase imaging in microscopy to semantic analysis of neural signals, scientific discovery relies on identifying models from massive amounts of noisy, incomplete, and corrupted data. One critical challenge for model identification is the nonlinearity inherent in most scientific models and data analysis tasks. However, as suggested by several successful approaches developed across machine learning, statistics, and optimization, nonlinear problems can become linear when "lifted" into higher-dimensional spaces. Exploring the power and limitations of such "linearization" techniques plays an important role in complex model identification.
The research develops a unified framework for formulating nonlinear problems as infinite-dimensional linear problems using measures, a foundational concept of modern mathematics that abstracts the notions of volume and mass. In many measure estimation problems arising in the physical and information sciences, the observations are linear functions of the measure's moments, and the true measure describing the system is atomic with small support, that is, a weighted sum of a few point masses. The work first delineates the class of model identification tasks that can be formulated as measure estimation from moments. Notable examples include tensor decomposition and completion, non-negative matrix factorization, and solving high-order multivariate polynomial equations. Second, the project develops a unified theoretical and computational approach to solving measure estimation problems using semidefinite programming with guaranteed performance. Finally, the project provides efficient and scalable algorithms for applications in computational optics and large-scale data analysis, where problems of measure estimation from moments arise frequently.
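As a concrete illustration of measure estimation from moments, the minimal sketch below takes a one-dimensional atomic measure, computes its first few moments (which are linear in the weights of the measure), and recovers the atoms and weights from those moments. It uses the classical Prony method purely for illustration; the project's approach is based on semidefinite programming, which this sketch does not implement. The function names, the use of NumPy, and the example atoms and weights are assumptions introduced here, and the sketch presumes exact, noise-free moments and distinct real atoms.

```python
import numpy as np


def moments_of_atomic_measure(atoms, weights, num_moments):
    """Return m_k = sum_j weights[j] * atoms[j]**k for k = 0..num_moments-1.

    The map from the weights to the moments is linear, even though the
    dependence on the atom locations is nonlinear.
    """
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    powers = np.arange(num_moments)[:, None]      # shape (num_moments, 1)
    return (atoms[None, :] ** powers) @ weights   # shape (num_moments,)


def prony_recover(moments, r):
    """Recover r atoms and weights from the first 2r moments (Prony's method).

    Illustrative only: assumes exact moments and r distinct real atoms.
    """
    m = np.asarray(moments, dtype=float)
    # Hankel system H p = -m[r:2r], where H[k, l] = m[k + l].
    H = np.array([[m[k + l] for l in range(r)] for k in range(r)])
    p = np.linalg.solve(H, -m[r:2 * r])
    # The atoms are the roots of x^r + p[r-1] x^(r-1) + ... + p[0].
    roots = np.roots(np.concatenate(([1.0], p[::-1])))
    atoms = np.sort(roots.real)
    # With the atoms fixed, the weights solve a linear Vandermonde system.
    V = np.vander(atoms, N=r, increasing=True).T  # V[k, j] = atoms[j]**k
    weights = np.linalg.solve(V, m[:r])
    return atoms, weights


# Hypothetical example: three point masses on the real line.
true_atoms = [-0.5, 0.2, 0.9]
true_weights = [1.0, 2.0, 0.5]
observed = moments_of_atomic_measure(true_atoms, true_weights, 6)
est_atoms, est_weights = prony_recover(observed, r=3)
```

The nonlinear part of the problem (locating the atoms) is reduced to linear algebra on the moment sequence, which is the kind of "lifting" the abstract describes. Prony's method is known to be sensitive to noise, so this sketch should not be read as a substitute for the robust estimation the project targets.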