In the Big Data era, the large numbers of observations and variables pose unprecedented challenges: complex forms of dependency and high computational cost. These challenges arise in important problems in Astronomy, Biology, Economics, Engineering, Finance, Genetics, Genomics, and Neuroscience, among other fields. To meet them, the PI proposes to study these problems through a novel framework of binary expansion statistics. The overall objectives of the project are (i) to provide an in-depth understanding of complex dependency in Big Data with new theory and methods, and (ii) to build a stronger connection between Statistics and Computer Science. The PI anticipates achieving these goals through an integration of research and education plans.
The binary expansion statistics framework takes a "divide and conquer" approach to any complex dependency: it approximates and decomposes nonlinear dependency into interactions of Bernoulli variables in the binary expansion filtration, and then aggregates the information to produce nonparametric inference of dependence. This approach connects the inference problems to important concepts in Statistics and Computer Science such as multiple testing, the Hadamard transform, and bitwise operations. The research agenda is to further develop this framework and to study several fundamental problems in order to develop optimal theory, methodologies, and algorithms. The PI also has comprehensive plans for educating graduate and undergraduate students and for disseminating the research results to the broader scientific community.
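As a rough illustration of the decomposition described above, the following Python sketch truncates two variables on [0, 1) to their first few binary digits, codes each digit as a +/-1 Bernoulli variable, and sums the products of every pair of bit interactions using bitwise operations. It is a minimal sketch under stated assumptions, not the project's actual method: the function names, the default depth of 2, the sign convention, and the assumption that both margins are already uniform on [0, 1) are all illustrative choices, and the aggregation step (for example, a multiple-testing adjustment) is omitted.

```python
from itertools import product


def bits_code(u, depth):
    """Pack the first `depth` binary digits of u in [0, 1) into an integer.

    Assumption: u lies in [0, 1); the first binary digit becomes the
    most significant bit of the returned code.
    """
    return int(u * (1 << depth)) & ((1 << depth) - 1)


def sign_of(code, mask):
    """Product of the +/-1 coded bits selected by `mask`.

    Each bit b is coded as (-1)**b (an illustrative convention), so the
    product is -1 exactly when an odd number of selected bits equal 1.
    """
    return -1 if bin(code & mask).count("1") % 2 else 1


def symmetry_statistics(us, vs, depth):
    """Sum of interaction products for every pair of nonempty bit subsets."""
    cu = [bits_code(u, depth) for u in us]
    cv = [bits_code(v, depth) for v in vs]
    stats = {}
    # Masks 1 .. 2**depth - 1 enumerate the nonempty subsets of bits.
    for a, b in product(range(1, 1 << depth), repeat=2):
        stats[(a, b)] = sum(
            sign_of(x, a) * sign_of(y, b) for x, y in zip(cu, cv)
        )
    return stats


def strongest_interaction(us, vs, depth=2):
    """Return (mask_u, mask_v, normalized statistic) with the largest magnitude."""
    stats = symmetry_statistics(us, vs, depth)
    (a, b), s = max(stats.items(), key=lambda kv: abs(kv[1]))
    return a, b, s / len(us)


if __name__ == "__main__":
    # Perfectly dependent toy data: v equals u on an evenly spaced grid.
    us = [i / 16 + 1 / 32 for i in range(16)]
    print(strongest_interaction(us, us, depth=2))
```

Under independence with uniform margins, each product would be a fair +/-1 coin flip, so a normalized statistic far from zero points to dependence in that particular interaction. This sketch loops over all pairs of masks directly; it does not use a fast Hadamard transform.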
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.