Spline smoothing offers substantial advantages in its simple implementation and fast computation, and it has become one of the most prominent techniques in semiparametric and nonparametric regression modeling. The main objective of this proposal is to investigate the inferential aspects of two spline-based methods. One is quasi-likelihood estimation for categorical response data in generalized regression; the other is empirical likelihood estimation, which offers efficiency properties analogous to those of parametric likelihood while retaining the distribution-free character of nonparametric procedures. Specifically, the PI proposes to i) develop robust estimation and testing procedures for generalized spline regression models; ii) employ the equivalence between linear mixed models and penalized splines for linearity tests in generalized additive models; iii) extend free-knot splines to generalized regression in order to improve the empirical behavior of polynomial spline estimators; and iv) investigate spline confidence regions for the linear coefficients in partially linear models via empirical likelihood, allowing the number of constraints to grow with the sample size.
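
As a rough illustration of the penalized spline idea mentioned above, the following minimal sketch fits a linear truncated-power spline with a ridge penalty on the knot coefficients. This is the standard textbook form in which the penalized knot coefficients play the role of random effects in a linear mixed model. It is not the PI's procedure: the basis, the knot placement, the smoothing parameter `lam`, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=1):
    """Design matrix [1, x, ..., x^degree, (x - k_1)_+^degree, ...]."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly = np.vander(x, degree + 1, increasing=True)
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    return np.hstack([poly, trunc])

def fit_penalized_spline(x, y, knots, lam, degree=1):
    """Minimize ||y - B c||^2 + lam * (sum of squared knot coefficients)."""
    B = truncated_power_basis(x, knots, degree)
    # Only the truncated-power (knot) coefficients are penalized;
    # the polynomial part is left unpenalized, like fixed effects.
    D = np.diag([0.0] * (degree + 1) + [1.0] * len(knots))
    return np.linalg.solve(B.T @ B + lam * D, B.T @ y)

# Simulated data (hypothetical; not from the project)
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

knots = np.linspace(0.0, 1.0, 12)[1:-1]   # 10 interior knots, illustrative
coef = fit_penalized_spline(x, y, knots, lam=1e-3)
fitted = truncated_power_basis(x, knots) @ coef
```

As `lam` grows, the knot coefficients shrink toward zero and the fit approaches a straight line, which is the intuition behind using the variance component of the mixed-model form to test linearity.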

The proposed projects are expected to be of broad interest to researchers in a wide range of applied and social science fields, including biochemistry, biostatistics, epidemiology, and economics. For example, the research findings are applicable to a cancer study of the dose-response relationship between ethanol and the risk of cancer with binary outcomes, and to a fauna study of the relationship between the number of species on the sea bed and spatial coordinates, where the error distribution is not fully specified. The proposed procedures serve as new, highly usable tools for curve estimation and model diagnosis in regression-based data analysis. For educational purposes, the PI plans to develop a new advanced topics course related to the proposed research in order to mentor undergraduate and graduate students and involve them in the proposed research and related projects.

Project Report

Statistical data analysis usually begins with a specified parametric model, because parametric effects can be interpreted intuitively. This project aims to deepen the understanding of spline techniques applied to parametric model diagnosis through the construction of simultaneous confidence bands. We have developed robust and stable spline estimators with optimal asymptotic properties. We have found that, in both classical and generalized regression models, bias reduction and consistent wild bootstrap procedures help improve the coverage probabilities of spline confidence bands, increase the accuracy of model checking, and hence reduce model misspecification error in parametric data analysis. Detailed implementation procedures and computational algorithms have been developed for spline confidence bands under the regular setup. Extensive simulations were run to investigate the finite-sample behavior of the proposed bands; optimization, reparameterization, and penalization techniques are the key computational tools in these simulations. The findings corroborate the proposed theoretical results. These new statistical methods can be applied in many research fields with data that include binary or categorical responses and linear predictors.

On the educational side, we have developed a new topics course on generalized regression models for graduate students with diverse backgrounds. The course mainly covers popular regression smoothing methods, their theoretical properties, and their computational challenges for generalized regression models. Under the advising or co-advising of the PI, two female students have obtained doctoral degrees, in statistics and economics respectively. One dissertation focuses on the inferential and computational sides of model diagnosis using bootstrapped spline confidence bands. The other dissertation concerns the detection of nonlinearity in the prediction of stock returns using kernel simultaneous confidence bands. One undergraduate student supervised by the PI investigated climate change data with quantile regression methods in a Capstone project. Part of his research has been published in a peer-reviewed journal, and the student won an undergraduate research award for the poster presenting his project results.
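
The wild bootstrap spline confidence bands described in this report can be illustrated with a minimal sketch. It assumes an unpenalized linear regression spline, Rademacher bootstrap multipliers, and a studentized sup-statistic over a grid; the bias reduction step mentioned in the report is not included. The function names, knots, grid, and simulated heteroscedastic data are hypothetical and do not reproduce the project's algorithms.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns: 1, x, and (x - k)_+ for each knot k."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(x), x, trunc])

def wild_bootstrap_band(x, y, knots, grid, n_boot=500, level=0.95, seed=0):
    """Simultaneous band for the spline fit on `grid` via wild bootstrap."""
    rng = np.random.default_rng(seed)
    B = linear_spline_basis(x, knots)
    G = linear_spline_basis(grid, knots)
    ls_op = np.linalg.pinv(B)            # least-squares operator, shape (p, n)
    coef = ls_op @ y
    fit_grid = G @ coef
    resid = y - B @ coef
    # Wild bootstrap: flip residual signs with Rademacher multipliers,
    # which preserves each observation's own residual scale.
    signs = rng.choice([-1.0, 1.0], size=(n_boot, y.size))
    y_star = B @ coef + signs * resid    # shape (n_boot, n)
    boot_grid = (y_star @ ls_op.T) @ G.T # refitted curves, (n_boot, len(grid))
    dev = boot_grid - fit_grid
    se = dev.std(axis=0, ddof=1)         # bootstrap pointwise standard error
    sup_stat = np.max(np.abs(dev) / se, axis=1)
    crit = np.quantile(sup_stat, level)  # simultaneous critical value
    return fit_grid, fit_grid - crit * se, fit_grid + crit * se

# Simulated heteroscedastic data (hypothetical; not from the project)
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 300))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2 + 0.3 * x)
knots = np.linspace(0.0, 1.0, 8)[1:-1]
grid = np.linspace(0.02, 0.98, 50)
fit, lower, upper = wild_bootstrap_band(x, y, knots, grid)

# Model check: does a fitted straight line stay inside the band everywhere?
slope, intercept = np.polyfit(x, y, 1)
line = intercept + slope * grid
line_inside = bool(np.all((line >= lower) & (line <= upper)))
```

In this kind of diagnosis, a parametric fit that leaves the band somewhere on the grid is taken as evidence against the parametric model at the chosen level.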

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1107017
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2011-08-15
Budget End: 2014-07-31
Support Year:
Fiscal Year: 2011
Total Cost: $80,448
Indirect Cost:
Name: University of Illinois at Chicago
Department:
Type:
DUNS #:
City: Chicago
State: IL
Country: United States
Zip Code: 60612