Quantile regression is gradually evolving into a comprehensive strategy for the analysis of statistical models with univariate response, complementing the exclusive focus of least-squares-based methods with a general approach to estimating conditional quantile functions. This project would continue the development of quantile regression methods and intensify research on penalty methods for density estimation. In addition to continued work on survival/duration models and models with endogenous covariates, this research includes:
1. Penalty methods play an increasingly important role throughout statistics and are recognized as a critical tool of modern data analysis. The investigator and Ivan Mizera have been exploring such methods for both quantile regression and density estimation. For densities, total variation roughness penalties and related shape-constrained estimators seem especially promising. A major thrust of this proposal is to broaden the scope of this inquiry, producing improved algorithms and inferential capabilities, a better understanding of asymptotic behavior, and more flexible penalties.
2. Panel data methods in econometrics are still predominantly the province of Gaussian random effects models; however, applications often provide strong motivation for also estimating conditional quantile models. Growth curve data offer leading examples in biostatistics, and program evaluation offers numerous examples in econometrics. Expanding upon the close relationship between random effects estimation and penalty methods in Gaussian settings, research is proposed on penalty approaches to quantile regression estimation for longitudinal data, with a particular focus on dynamic panel models.
3. Although there is already an extensive literature on quantile regression methods for time series, most of the attention has focused on models in which lagged responses exert a pure location shift effect on the distribution of the current response. The investigator and Zhijie Xiao have been investigating more general specifications, focusing initially on a class of models that exhibit some features of persistent random-walk behavior while also exhibiting an episodic form of mean reversion. These models offer considerable potential for broadening the scope of applied time series analysis.
Broader Impact: Conventional statistical methods, since Quetelet, have sought to estimate the effects of policy treatments on the "average man." But such effects are often quite diverse: medical treatments may improve life expectancy but also impose serious short-term risks; reducing class sizes may improve the performance of good students but not help weaker ones. Quantile regression methods help to explore these heterogeneous effects, as do improved methods for density estimation. These methods have broad applicability throughout science.
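The contrast between average and heterogeneous effects can be made concrete with a small sketch. It uses the standard check (pinball) loss, rho_tau(u) = u * (tau - 1{u < 0}), whose minimizer over a sample is the tau-th sample quantile; with a single binary treatment indicator, the quantile regression slope reduces to a difference of group quantiles. The data are synthetic, and the treatment rule and all function names below are hypothetical illustrations, not part of the project or of the "quantreg" package.

```python
import random


def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile(ys, tau):
    """Return the data point that minimizes the summed check loss.

    The objective is convex and piecewise linear, so a minimizer always
    exists among the observations; a brute-force search is fine for
    small samples.
    """
    return min(sorted(ys), key=lambda q: sum(check_loss(y - q, tau) for y in ys))


def quantile_treatment_effect(control, treated, tau):
    """Difference of tau-th quantiles between treated and control groups."""
    return sample_quantile(treated, tau) - sample_quantile(control, tau)


random.seed(0)
# Synthetic test scores (assumption: Gaussian, mean 50, sd 10).
control = [random.gauss(50, 10) for _ in range(401)]
# Hypothetical treatment: no gain at or below 50, growing gain above it.
treated = [y + max(0.0, 0.5 * (y - 50)) for y in control]

mean_effect = sum(treated) / len(treated) - sum(control) / len(control)
print(f"mean effect: {mean_effect:.2f}")
for tau in (0.1, 0.5, 0.9):
    qte = quantile_treatment_effect(control, treated, tau)
    print(f"tau = {tau}: quantile effect = {qte:.2f}")
```

In this simulation the mean effect is a single positive number, while the quantile effects show that the gain is concentrated in the upper tail and absent in the lower tail, which is the kind of heterogeneity a mean-based summary hides.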
My overarching objective for this project was the development of more flexible and more robust statistical methods for economic and biological applications. I have continued to develop methods for quantile regression, which are designed to estimate how the entire distribution of a given response variable, such as educational test scores or unemployment durations, changes in response to changes in other relevant variables. New methods that relax the stringent linearity assumptions underlying earlier methods, and improved methods for assessing the accuracy of these methods, were developed and have been incorporated into my open-source software package "quantreg" as part of the R language initiative. New methods of nonparametric density estimation were also developed as an integral part of the project. These methods are related to a variety of applied problems in duration analysis, many of which rely on models with log-concave or monotone hazard functions. In recent work we have extended these methods to more general classes of distributions, including heavier-tailed densities that would be more relevant in financial applications. In new work on empirical Bayes methods we have made substantial progress in improving methods for computing the Kiefer-Wolfowitz nonparametric maximum likelihood estimator, and these improvements have important applications in a wide variety of high-dimensional data analytic contexts.