Decision trees are one of the earliest machine learning models. They make a prediction for a given input by asking a series of simple questions that lead to a predicted value. This discrete structure makes decision trees very special: they are among the most interpretable of all models (in that the tree can be inspected to understand or manipulate its predictions), and they are very fast (since only one root-leaf path is followed to make a prediction for a given input). Trees can also model highly nonlinear functions. However, there is a large gap between the theoretical power of decision trees and their practical performance, which is due to the lack of an effective way to learn a decision tree from data. This problem is much harder than learning other models because the tree defines a discontinuous function and gradient-based optimization over the nodes' parameters is not applicable. The algorithms used today to learn trees were invented about 50 years ago, and result in suboptimal trees with low accuracy, which makes them not competitive with models such as kernel machines or neural nets, for which a number of effective optimization algorithms exist. This project seeks to redress this situation by developing an effective optimization algorithm for decision trees. This will make it possible to deploy decision trees in far more applications and to combine them with other models. The project will develop open-source software and teaching materials for decision trees, and train graduate and undergraduate students in machine learning and optimization.

The project develops the "tree alternating optimization (TAO)" algorithm, based on iteratively optimizing the node parameters over subsets of nondescendant nodes in a tree of fixed structure. TAO sidesteps the need for gradients and capitalizes on existing algorithms to train individual nodes. Starting from any given initial tree, each TAO iteration monotonically decreases the training loss function. This makes trees trainable like other parametric models (such as kernel machines or neural nets). The project will develop TAO for different loss functions and machine learning tasks (the traditional classification and regression, but also dimensionality reduction, semisupervised learning, structured inputs and others); for different regularization via penalty or constraints (such as those promoting sparse or nonnegative parameters); and for different node models (such as linear, kernel machines, neural nets or even decision trees themselves). Further, the project will explore TAO to learn the structure of a tree, to ensemble trees into forests, and to investigate interpretability and fairness of tree-structured models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start
Project End
Budget Start
2020-10-01
Budget End
2023-09-30
Support Year
Fiscal Year
2020
Total Cost
$425,000
Indirect Cost
Name
University of California - Merced
Department
Type
DUNS #
City
Merced
State
CA
Country
United States
Zip Code
95343