One of the major concerns that practitioners have about theoretical machine learning is the focus on distributions where the attributes are independent: in other words, knowing one attribute gives no information about any others. As an example, while it is plausible that height and eye color are independent, it is much less believable that height and weight are independent. Thus, the output given by any algorithm that is based on the assumption that the attributes of a person (such as height and weight) are independent cannot be trusted. The goal of this project is to extend what we know about the theory of such problems while removing some of the mathematically convenient assumptions such as independence.
The tools used focus on discrete Fourier analysis, but involve many other techniques from mathematics such as functional analysis and representation theory of finite groups. One recurring technique is the application of the "noise sensitivity" method, which quantifies the complexity of a function based on how similar the value of the function is on some input to the values of that input's neighbors. In many cases, the goal is to show that the Fourier spectrum of certain classes of functions is predictable; often, this predictability is a key component of algorithms for machine learning.
The broader goal of this project is to discover new connections between mathematics and computer science with a special focus on questions motivated by machine learning. Answers to the underlying questions would be useful to theoreticians and could lead to better applied machine learning algorithms. Also, the mathematical questions raised are interesting independently of the machine learning connection. The problems considered in this project will provide an invigorating research opportunity for undergraduate and Master's students.