This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5). The PI develops methods for function estimation and inference using shape-restricted regression splines. The work covers three broad areas. First, generalized multiple regression models are investigated, in which the mean response function is assumed to be smooth and to satisfy a shape restriction such as monotonicity or convexity. Second, a new maximum-likelihood method for smoothed unimodal density estimation is developed that allows for heavy tails, such as those in the Pareto family of densities. An application is a new robust regression method that estimates the error density nonparametrically, simultaneously with the regression function. Finally, the proportional hazards model is developed, where the hazard function is assumed to be smooth and to satisfy a shape restriction such as monotonicity or convexity.
Many problems in data analysis involve estimation of a function. Standard methods require the function to be specified up to a few parameters, but a priori knowledge about the function is often vague and qualitative. For example, the researcher might know that a growth curve is smooth, increasing, and concave. The expected number of nesting sites at a lake might be a decreasing function of some pollution measure. Perhaps a hazard rate function is known to be increasing and convex, as in modeling wear-out of a mechanical part, or bathtub-shaped, as in modeling organ transplant failures. Nonparametric methods for function estimation are appealing because they require minimal assumptions, but developing practical inference methods for them is more difficult. Many methods that assume only smoothness of the function are sensitive to user-defined choices of smoothing parameters such as the bandwidth, the number of knots, or the penalty parameter, and the user cannot rely on inference results that change with these choices. However, when the researcher can also assume a shape such as increasing or convex, the fits to the data become more robust, for the simple reason that the shape constraint prevents the "wiggling" associated with over-fitting. The PI develops inference methods in three important areas. First, generalized regression models are treated, such as when the response is a count or binary. Second, a new method for robust regression is developed, where the error density is assumed to be unimodal and symmetric, to allow for either heavy-tailed or thin-tailed errors. Finally, the proportional hazards model is developed under shape and smoothness assumptions for the hazard function. These models are often used in medical studies to compare treatments while accounting for possible mitigating factors, and in industry to model mechanical systems. All three of these research projects result in basic data-analysis tools that can be used in virtually any area of science.
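As a rough illustration of how a shape assumption alone can stabilize a fit, the sketch below computes a least-squares nondecreasing fit with the classical pool-adjacent-violators algorithm. This is not the PI's method: it is unsmoothed isotonic regression, without splines or smoothness constraints, and the function name `monotone_fit` is hypothetical. It assumes equally weighted responses already sorted by the predictor.

```python
from typing import List, Sequence


def monotone_fit(y: Sequence[float]) -> List[float]:
    """Least-squares nondecreasing fit to y (pool-adjacent-violators).

    Assumes y is ordered by the predictor and all points have equal weight.
    """
    # Each block stores [sum of responses, number of points];
    # the fitted value for a block is its mean.
    blocks: List[List[float]] = []
    for value in y:
        blocks.append([float(value), 1.0])
        # Merge backwards while the newest block's mean is below its predecessor's,
        # which would violate the nondecreasing constraint.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Expand each block back to one fitted value per original point.
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


if __name__ == "__main__":
    noisy = [1.0, 3.0, 2.0, 4.0]
    print(monotone_fit(noisy))
```

The fit involves no bandwidth, knot count, or penalty parameter; the monotonicity constraint by itself averages out local reversals in the data, which is the robustness property described above.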