Linear model analysis is one of the most appealing statistical methods because of its directly interpretable results. The accelerated failure time (AFT) and censored quantile regression (QR) models serve as counterparts for censored data of the classical linear model and uncensored QR, and they complement the Cox proportional hazards model. Censored QR, in particular, enriches linear model analysis for censored data by allowing non-constant covariate effects across the distribution of event times. Other regression methods unduly constrain the covariate effects to be constant and fail to provide consistent results in such settings. In contrast, censored QR allows the treatment effect to be negative for more severe cases (those with shorter event-free survival times) but positive in other cases. The AFT and censored QR models are, however, under-utilized because flexible and general methods for estimation, variable selection, and inference do not exist. This investigation includes developing (A) flexible estimation methods that work under less stringent conditions than those required by existing methods, (B) methods for variable selection, including for high-dimensional data, and (C) general empirical likelihood (EL) methods parallel to those for the uncensored case. In addition, although the proposed methods are developed under a random right-censoring mechanism, the general ideas are applicable to truncation and other censoring types.
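As an illustration of the non-constant covariate effects that QR permits, the following sketch (uncensored case for simplicity, with hypothetical simulated data) fits quantile regression at two quantile levels by minimizing the Koenker-Bassett check loss; in this design the covariate's effect on the lower tail differs markedly from its effect on the upper tail, which a constant-effect model would average away.

```python
# Sketch: quantile regression recovers covariate effects that change
# across the outcome distribution (uncensored, simulated data).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
n = 2000
x = rng.uniform(0.0, 2.0, n)
# Heteroscedastic model: the conditional tau-quantile of y has slope
# 0.5 + 0.5 * z_tau, negative for small tau and positive for large tau.
y = 1.0 + 0.5 * x + (0.2 + 0.5 * x) * rng.normal(size=n)

def check_loss(beta, tau):
    """Check (pinball) loss for intercept/slope beta at quantile level tau."""
    u = y - (beta[0] + beta[1] * x)
    return np.sum(u * (tau - (u < 0)))

def fit_qr(tau):
    """Minimize the check loss; Nelder-Mead suffices for this convex 2-D problem."""
    return minimize(check_loss, x0=np.zeros(2), args=(tau,),
                    method="Nelder-Mead").x

slope_low = fit_qr(0.10)[1]   # covariate effect for shorter event times
slope_high = fit_qr(0.90)[1]  # covariate effect for longer event times
print(slope_low, slope_high)
```

This is only a conceptual sketch: the censored-data estimators discussed in the abstract require additional machinery to handle the censoring mechanism, which is the subject of the proposed research.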
Improving statistical models for predicting medical outcomes is always an important part of statistical research. Thanks to recent advancements in high-throughput technologies, a vast amount of potentially useful information, including patients' gene profiles, is available and anticipated to lead to much-improved prediction. The proposed study investigates novel methods to incorporate those data in building better statistical models that predict a patient's survival more accurately. The types of models to be investigated are also more sophisticated: instead of predicting only an "average" person's survival, they allow prediction for the "top 10%" or "bottom 10%", while allowing survival to be impacted very differently by the gene profile.
Survival data, or time-to-event outcomes, are of interest in many research studies. Linear regression model analysis is one of the most appealing methods because of its directly interpretable results. However, it is under-utilized for censored time-to-event data because we do not have sufficiently flexible and general methods for estimation, variable selection, and inference, among others. The proposed research addresses these issues by developing (A) flexible estimation methods that work for a broader class of cases and under less stringent conditions than those required by existing methods, (B) methods for variable selection and model building, and (C) general empirical likelihood methods parallel to those for the uncensored case. The individual themes of this project are important in their own right and also from a broader perspective. The results will be widely applicable to many scientific and medical problems for which a lack of estimation, variable selection, and inference tools has largely precluded linear model analysis under censoring. The empirical likelihood project is an improvement on the so-called Generalized Method of Moments (GMM), which was cited in the 2013 Nobel Prize in economics.

On the other hand, survival time or time to relapse commonly appears as the primary outcome in many biomedical studies. Standard statistical analytic tools are constrained by rigid assumptions and arbitrary requirements. One such constraint is the proportionality of hazards, which imposes that the hazard of one group is uniformly higher than those of the comparison groups throughout the lifespan of interest. In reality such uniformity is rare; the hazard of one group may be higher or lower in the short term but converge to the hazards of the comparison groups over time. We have developed modeling and inference tools for such changing hazards.
We allow potentially different short-term and long-term hazards, which is conceptually intuitive and interpretable. The proposed research will also contribute to advancing other sciences, where continuing development of high-throughput technologies is making high-dimensional data routinely available. In particular, many studies in biomedical science aim to link a time-to-event outcome with genomic, proteomic, or imaging data to obtain a system-level understanding of the underlying biological/biochemical process and to build a predictive model that aids decision making for future events. The variable selection methods proposed in this research can be readily extended to regularized estimation problems for such data. They will lead to better model selection methods and thus help build more accurate prediction models, for example for predicting the survival times of cancer patients or the time until a second cardiac event.
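The crossing-hazards phenomenon described above, where one group's hazard is higher short-term but lower long-term, can be sketched with two hypothetical Weibull groups (parameter values chosen purely for illustration, not taken from the proposed methodology):

```python
# Sketch: two Weibull hazards that cross over time, violating the
# proportional hazards assumption.
import numpy as np

def weibull_hazard(t, shape, scale=1.0):
    """Weibull hazard h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    t = np.asarray(t, dtype=float)
    return (shape / scale) * (t / scale) ** (shape - 1.0)

# Group A: decreasing hazard (shape < 1); group B: increasing hazard (shape > 1).
t_early, t_late = 0.1, 5.0
hr_early = weibull_hazard(t_early, 0.5) / weibull_hazard(t_early, 2.0)
hr_late = weibull_hazard(t_late, 0.5) / weibull_hazard(t_late, 2.0)
print(hr_early, hr_late)  # hazard ratio flips from about 7.9 to about 0.022
```

A Cox model would summarize these two groups with a single constant hazard ratio, which describes neither the early nor the late behavior; the short-term/long-term hazards framework above is designed to capture exactly this pattern.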