The ready availability of public-use data from large national population-based complex surveys has immense potential to support the assessment of (1) the population frequency of cancer (incidence and prevalence); (2) hospital length of stay and related costs for treatment; (3) cancer screening rates; and (4) newly discovered associations between risk factors (e.g., screening rates, diet) and different cancers. The goal of this project is to demonstrate this potential using novel statistical methods applied to at least seven United States complex surveys. Specifically, we will use the Behavioral Risk Factor Surveillance System and the Health Information National Trends Survey to describe screening rates; the National Health and Nutrition Examination Survey to explore behaviors (diet, smoking, etc.) in current and future cancer patients; the Nationwide Inpatient Sample and the Medical Expenditure Panel Survey to describe hospital length of stay and related costs for treating cancer; the National Home and Hospice Care Survey to explore end-of-life care for cancer patients; and the National Health Interview Survey to examine follow-up of cancer survivors. Complex sample surveys present some unique problems, and we will develop appropriate models and methods for them. Our proposal has three broad aims of significance to medical researchers: (1) new statistical approaches for small subgroup analyses in which the standard large-sample complex survey methods can be inappropriate; (2) new statistical procedures for databases that are too large for the usual complex survey approaches to be feasible; and (3) complex survey methods for skewed data. An additional goal is to make the newly developed statistical/epidemiological methodology widely accessible to non-statisticians. For the methods described in each aim, we plan to create macros and procedures which can be used with existing, widely used statistical packages (e.g., SAS).
Statistical macros and procedures will be documented and made available on the Internet, together with instructions on how to apply these macros to the examples analyzed in the resulting publications.
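
To give a sense of why skewed responses and small numbers of sampling units matter in complex survey analysis, the following is a minimal Python sketch. It is not the methodology proposed in this project. The toy records, the function names, and the choice of an unstratified delete-one-PSU jackknife are all assumptions made for illustration. It compares a design-weighted mean with a design-weighted median for a skewed outcome (for example, a cost) and attaches a jackknife standard error to each.

```python
import math

# Hypothetical toy data: (psu_id, outcome, survey_weight). Not real survey records.
records = [
    ("A", 2.0, 100.0), ("A", 3.0, 100.0),
    ("B", 4.0, 200.0), ("B", 40.0, 50.0),   # one large, skewed outcome
    ("C", 1.0, 160.0), ("C", 5.0, 100.0),
]

def weighted_mean(rows):
    """Design-weighted mean: sum(w * y) / sum(w)."""
    total_w = sum(w for _, _, w in rows)
    return sum(y * w for _, y, w in rows) / total_w

def weighted_median(rows):
    """Smallest outcome whose cumulative weight reaches half the total weight."""
    half = sum(w for _, _, w in rows) / 2.0
    running = 0.0
    for y, w in sorted((y, w) for _, y, w in rows):
        running += w
        if running >= half:
            return y
    raise ValueError("no rows supplied")

def jackknife_se(rows, estimator):
    """Delete-one-PSU jackknife standard error, ignoring stratification (an assumption)."""
    psus = sorted({psu for psu, _, _ in rows})
    n = len(psus)
    if n < 2:
        raise ValueError("need at least two PSUs")
    full = estimator(rows)
    # Re-estimate with each PSU removed in turn.
    replicates = [estimator([r for r in rows if r[0] != psu]) for psu in psus]
    variance = (n - 1) / n * sum((rep - full) ** 2 for rep in replicates)
    return math.sqrt(variance)

mean_est = weighted_mean(records)
median_est = weighted_median(records)
print("weighted mean:", mean_est, "SE:", jackknife_se(records, weighted_mean))
print("weighted median:", median_est, "SE:", jackknife_se(records, weighted_median))
```

In this toy example the single large outcome pulls the weighted mean well above the weighted median, which is one reason median-type summaries are attractive for skewed data. With only three sampling units, the jackknife standard errors are themselves unstable, which is the kind of small-sample difficulty the first aim refers to.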

Public Health Relevance

National complex survey data are often used in cancer epidemiology. We propose new approaches for analyzing such data that are theoretically valid, technically simple, and implementable within most standard sample survey packages.

National Institutes of Health (NIH)
Research Project (R01)
Study Section: Epidemiology of Cancer Study Section (EPIC)
Program Officer: Liu, Benmei
Brigham and Women's Hospital
United States
Fraser, Raphael André; Lipsitz, Stuart R; Sinha, Debajyoti et al. (2016) Approximate median regression for complex survey data with skewed response. Biometrics 72:1336-1347
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Arriaga, Alex et al. (2015) Using the jackknife for estimation in log link Bernoulli regression models. Stat Med 34:444-53
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Sinha, Debajyoti et al. (2015) Testing for independence in J×K contingency tables with complex sample survey data. Biometrics 71:832-40
Carter, Stacey C; Lipsitz, Stuart; Shih, Ya-Chen T et al. (2014) Population-based determinants of radical prostatectomy operative time. BJU Int 113:E112-8
Fitzmaurice, Garrett M; Lipsitz, Stuart R; Arriaga, Alex et al. (2014) Almost efficient estimation of relative risk regression. Biostatistics 15:745-56
Fitzmaurice, Garrett; Lipsitz, Stuart; Natarajan, Sundar et al. (2014) Simple methods of determining confidence intervals for functions of estimates in published results. PLoS One 9:e98498
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Regenbogen, Scott E et al. (2013) Bias correction for the proportional odds logistic regression model with application to a study of surgical complications. J R Stat Soc Ser C Appl Stat 62:233-250
Natarajan, Sundar; Lipsitz, Stuart R; Fitzmaurice, Garrett M et al. (2012) An extension of the Wilcoxon Rank-Sum test for complex sample survey data. J R Stat Soc Ser C Appl Stat 61:653-664