The abundance of data in science, medicine and commerce, and the current state of computing technologies give us opportunities in statistical modeling never seen before. We are able to build powerful predictive models for the risk of breast cancer, heart disease or stroke, for example, using genomic markers. We can predict the risk of credit-card default or fraudulent insurance claims. Predictive models are able to recommend movies or music to a customer, based on their past behavior and preferences and those of customers like them. Using data on locations of sightings of multiple animal or plant species, we can build distribution maps over a geographical domain. With large amounts of data, these models must be built automatically; the goal of this project is to ensure that the resulting products remain interpretable.

Generalized additive models are both interpretable and somewhat powerful, but were originally intended for a relatively small set of predictor variables. This project will use methods in convex optimization to automatically build such models using potentially thousands of variables. The method will automatically omit irrelevant variables and select the amount of nonlinearity needed for each variable it retains. Convex methods will also be used to incorporate side information in matrix completion problems, as well as in a variety of multivariate methods where we have traditionally worked with low-rank representations. Ecologists often struggle to combine data from multiple species and different sampling schemes. This project will provide a unified framework, using inhomogeneous Poisson process models, for combining these data and producing high-quality distribution maps.
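To make the idea of omitting irrelevant variables through convex optimization concrete, here is a minimal sketch. It is not the additive-model method this project proposes: it fits a plain lasso (an L1-penalized linear regression) by proximal gradient descent on simulated data, and it does not select nonlinearity. The function names, the penalty value `lam`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the L1 penalty: shrinks toward zero, exact zeros allowed.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    # Hypothetical helper: minimizes (1/(2n)) * ||y - X b||^2 + lam * ||b||_1
    # by proximal gradient descent (ISTA).
    n, p = X.shape
    # Lipschitz constant of the gradient of the squared-error term.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - grad / lipschitz, lam / lipschitz)
    return beta

# Simulated data (an assumption): only the first two of 50 predictors matter.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(n)

beta = lasso_ista(X, y, lam=0.3)
selected = np.flatnonzero(beta)  # indices of predictors kept in the model
print("selected predictors:", selected)
```

The L1 penalty sets many coefficients exactly to zero, which is what makes the fitted model sparse. Choosing between a linear and a nonlinear term for each retained variable would need a richer penalty over basis expansions, which this sketch does not attempt.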

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1407548
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2014-08-01
Budget End: 2019-10-31
Support Year:
Fiscal Year: 2014
Total Cost: $499,996
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305