Many experiments in molecular biology designed for discovering cellular functions and organization, or estimating the efficiency of drug treatments, are prohibitively expensive to be systematically conducted on large scale systems. It is therefore of paramount importance to develop analytical methods for predicting which experiments are likely to provide the most informative outcomes of biological tests, based only on a small sample of incomplete and noisy measurements. Frequently, the incomplete data are random or pseudo-random samples of a matrix, or more generally, a multidimensional array (tensor). This suggests the use of a new algorithmic and analytic paradigm, termed low-rank matrix and tensor completion, for solving the inference problems at hand.
This research involves the development of novel tensor and nonlinear completion methods for inference of protein-protein interaction networks and design of synthetic lethality experiments. The methods used represent a combination of information-theoretic and algorithmic approaches centered around constrained optimization on Grassman manifolds. This research program has the potential to benefit many branches of molecular biology, neuroscience, computer vision, control, economics and signal processing in terms of answering fundamental inference questions for sparse systems and reducing the cost and time associated with system testing and experimental design. It will also provide unprecedented opportunities to graduate students in the electrical engineering and mathematics departments at UIUC to explore state-of-the art technologies and research problems at the intersection of molecular biology, bio-informatics, and signal processing.