This SBIR project aims to produce superior methods and software for classification and regression when there are many potential predictor variables to choose from. The methods should (1) produce stable results, where small changes in the data do not produce major changes in the variables selected or in model predictions, (2) produce accurate predictions, (3) facilitate scientific interpretation by selecting a small subset of predictors that provides the best predictions, (4) allow continuous and categorical variables, and (5) support linear regression, logistic regression (predicting a binary outcome), survival analysis, and other types of regression. This project is based on least angle regression, which unifies and provides a fast implementation for a number of modern regression techniques. Least angle regression has great potential, but the state of the art is limited to linear regression with continuous or binary variables, and uses numerically unstable calculations. The outcome of this project should be software that is more robust and more widely applicable. This software would apply broadly, including to medical diagnosis, cancer detection, feature selection in microarrays, and modeling of patient characteristics such as blood pressure. Phase I work will demonstrate feasibility by extending least angle regression in three key directions: categorical predictors, logistic regression, and a numerically accurate implementation. Phase II will extend the work to other types of explanatory variables (e.g. polynomial or spline functions, and interactions between variables), and to survival analysis and other regression models. The proposed software will enable medical researchers to obtain high prediction accuracy, along with stable and interpretable results, in high-dimensional situations.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #
1R43GM074313-01
Application #
6933500
Study Section
Special Emphasis Panel (ZRG1-HOP-B (10))
Program Officer
Lyster, Peter
Project Start
2005-05-15
Project End
2006-05-14
Budget Start
2005-05-15
Budget End
2006-05-14
Support Year
1
Fiscal Year
2005
Total Cost
$99,685
Indirect Cost
Name
Insightful Corporation
Department
Type
DUNS #
150683779
City
Seattle
State
WA
Country
United States
Zip Code
98109