This project supports research on statistical methods for epidemiologic and surveillance analyses of cancer and other diseases using national health surveys. We have developed innovative statistical methods and software for a wide range of analyses of data from weighted cluster samples drawn under complex survey designs, including displaying scatter plots for weighted data, estimating kernel density smoothers to obtain conditional mean and percentile plots, estimating directly adjusted estimates (predictive margins) from linear and nonlinear models, and estimating population variance components. A problem that arises in logistic regression analysis of risk factors for disease is determining how well the estimated logistic model fits the data. We have developed a method to test the goodness-of-fit of a logistic regression model with survey data. In this approach, the distribution of a Wald test statistic that compares the observed and expected counts within deciles of risk is simulated under the null hypothesis. This approach is particularly promising for logistic models with small numbers of outcomes, where the asymptotic distribution of the Wald statistic is not accurate. We are extending this simulation approach to tests of regression coefficients from logistic regression when the outcomes in covariate cells are sparse. The simulation approach is being compared with score tests under these same sparse-data conditions. When we use regression analysis such as multiple linear, logistic, or Cox regression, it is useful to estimate the average response that the regression would predict for each level of a risk factor if everyone in the population had been exposed to that level. This is called a predictive margin. We have developed variance estimates for predictive margins when the sample data come from a survey. We are developing methods for making inferences about superpopulation parameters.
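The predictive margin described above can be illustrated with a minimal sketch. It assumes a weighted linear regression and uses only NumPy; the function name predictive_margins and its arguments are hypothetical and are not part of the project's software. The sketch gives point estimates only and does not reproduce the survey-based variance estimates developed in this project.

```python
import numpy as np

def predictive_margins(X_other, group, y, weights, levels):
    """Point estimates of predictive margins from a weighted linear model.

    Illustrative only: assumes at least two levels of the risk factor and
    ignores strata and clusters, so it yields no design-based variances.
    """
    def design(g):
        # Intercept, indicators for the non-reference levels, other covariates.
        dummies = np.column_stack([(g == lv).astype(float) for lv in levels[1:]])
        return np.column_stack([np.ones(len(g)), dummies, X_other])

    # Weighted least squares via rescaling rows by the square root of the weights.
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(design(group) * sw[:, None], y * sw, rcond=None)

    margins = {}
    for lv in levels:
        # Set everyone to level lv, keep their other covariates as observed,
        # and take the weighted average of the predicted responses.
        X_counterfactual = design(np.full(len(group), lv))
        pred = X_counterfactual @ beta
        margins[lv] = float(np.sum(weights * pred) / np.sum(weights))
    return margins
```

For a nonlinear model such as logistic regression, the same averaging step would be applied to the predicted probabilities rather than to the linear predictor.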
We have developed adjustments to classical finite population variance estimators that can provide accurate variances for superpopulation means. We have extended these variance estimators to ratio and regression parameters and are applying them to the National Health Interview Survey, the National Hospital Discharge Survey, and the Third National Health and Nutrition Examination Survey. We are researching methods for using latent class theory to analyze dietary survey data. We have developed jackknife methods for estimating standard errors for estimators of latent class parameters and have investigated Wald procedures for testing hypotheses about these parameters. These methods have been successfully applied to dietary intake data from the USDA Continuing Survey of Food Intakes by Individuals to estimate the proportion of individuals who meet NCI guidelines for consuming fruits and vegetables. We are developing design-based consistent estimators of population variance components that improve on existing inconsistent estimators. Simulation studies are under way to investigate the small-sample properties of these design-based estimators. There has been much interest in determining the extent and causes of racial/ethnic and economic disparities in cancer-related behaviors such as screening for cancer. Surveys offer important sources of data for measuring differences in screening behaviors among racial and economic subgroups of the US population. Estimating how much of the difference in cancer screening rates between advantaged and disadvantaged groups is due to known risk factors is analogous to the Peters-Belson (PB) approach used in wage discrimination cases. In that setting, minority wages are compared with their expected values obtained from a regression-type model (e.g., linear or logistic regression) fit to "whites".
We are developing methods for applying the PB approach to complex survey data with appropriate standard errors and confidence intervals, and are applying these methods to estimate socioeconomic and racial differences in regular mammography, colorectal, and PSA screening using data from the recent National Health Interview Surveys.
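As an illustration of the PB idea described above, the following is a minimal sketch that assumes a weighted linear regression fit to the advantaged group only. The function name peters_belson, its arguments, and the returned keys are hypothetical. The sketch gives point estimates only; it does not account for strata or clusters and so does not produce the design-based standard errors and confidence intervals that this project is developing.

```python
import numpy as np

def weighted_mean(values, weights):
    return float(np.sum(weights * values) / np.sum(weights))

def peters_belson(X_adv, y_adv, w_adv, X_dis, y_dis, w_dis):
    """Split the advantaged-minus-disadvantaged gap in a weighted mean outcome.

    Illustrative only: the model is fit to the advantaged group and then
    used to predict what the disadvantaged group's outcomes would be expected
    to be given their own covariates.
    """
    Xa = np.column_stack([np.ones(len(y_adv)), X_adv])
    Xd = np.column_stack([np.ones(len(y_dis)), X_dis])

    # Weighted least squares on the advantaged group only.
    sw = np.sqrt(w_adv)
    beta, *_ = np.linalg.lstsq(Xa * sw[:, None], y_adv * sw, rcond=None)

    mean_adv = weighted_mean(y_adv, w_adv)
    mean_dis = weighted_mean(y_dis, w_dis)
    expected_dis = weighted_mean(Xd @ beta, w_dis)

    return {
        "total": mean_adv - mean_dis,            # observed gap
        "explained": mean_adv - expected_dis,    # part due to covariate differences
        "unexplained": expected_dis - mean_dis,  # part not accounted for by covariates
    }
```

With a binary outcome such as screening status, a logistic model could replace the linear fit, with the expected values taken as averaged predicted probabilities.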