This project will investigate the theory, methods and applications of mathematical statistics and probability, with particular emphasis on problems arising from data collected by NICHD. The current focus is on 1) the analysis of data arising from longitudinal studies with repeated measurements, 2) nonparametric procedures, 3) likelihood approaches to the nonparametric two-sample problem for right-censored data, 4) sequential clinical trials, and 5) general methodology for reproductive and perinatal epidemiology. Examples of NICHD projects on longitudinal studies are the Successive Small-for-Gestational Age Study I and Study II in Alabama and Scandinavia, and the Longitudinal Study of Vaginal Flora. A host of statistical procedures for estimation and hypothesis testing will be proposed for time-varying coefficient models and investigated through their asymptotic properties and simulations. Applications will be developed to address questions in perinatal and reproductive epidemiology. New and rigorous statistical methods and algorithms will be generated and validated through investigation of their statistical and probabilistic properties. Computer-intensive techniques such as the bootstrap will be investigated for the relevant problems. Applications of the developed methodology include fetal growth, maternal risk factors and pregnancy outcomes. Regression models for unbalanced longitudinal ordinal data will be studied; the major motivation and application come from the Longitudinal Study of Vaginal Flora. Another direction is to develop sequential methodologies for clinical trials, with particular focus on estimation problems following the termination of a clinical trial. Adaptive designs in clinical trials will be studied. Also under investigation is the incorporation of partial overrunning into the final analysis of a sequential clinical trial.
Longitudinal analysis for discrete data and sequential adaptive designs will be the major focus for the near future. Point and interval estimation for two-stage adaptive procedures will be studied. A two-stage adaptive procedure will be designed for the selection of the best diagnostic biomarker. General linear mixed models are very broad and constitute an important class of statistical models for longitudinal studies in many areas of biomedical and epidemiological research. The associated statistical procedures and their properties are numerous, so we carve out a small area for more detailed and deeper investigation. For the general linear mixed models, including some special cases, the least squares estimators and the analysis of variance estimators of the fixed effects and the variance are studied. Necessary and sufficient conditions are derived for the simultaneous optimality of both types of estimators. Variance components for the general linear mixed effects models are studied, and the analysis of variance estimators are compared with the estimators under spectral decomposition. Analysis of variance procedures for the general linear mixed models under heteroscedasticity will be investigated. Parametric and semi-parametric regression models are used to evaluate the change of the mean response over time and the effects of the explanatory variables of interest on the mean response. Because such models are sensitive to model specification, inadequate parametric or semi-parametric modeling may lead to erroneous conclusions. A global goodness-of-fit test assesses the adequacy of a chosen parametric or semi-parametric model against a more general nonparametric varying coefficient model. The test statistics, based on the squared distances of the smoothed residuals under the parametric or semi-parametric model, have asymptotically normal distributions under the null hypotheses as well as under the alternatives.
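As a rough illustration of a statistic built from squared smoothed residuals, the following Python sketch tests a parametric mean model against a smooth nonparametric alternative. It is a simplified stand-in, not the procedure developed in this project: it assumes independent observations (no within-subject correlation), a straight-line null model in time, a Gaussian kernel with a fixed bandwidth, and a wild-bootstrap calibration in place of the asymptotic normal critical values. The function names and the simulated data are hypothetical.

```python
import numpy as np


def smoothed_residual_statistic(t, y, bandwidth=0.1):
    """Sum of squared kernel-smoothed residuals from a straight-line fit in t."""
    # Assumed parametric null model: y = b0 + b1 * t + error.
    X = np.column_stack([np.ones_like(t), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    # Nadaraya-Watson smoothing of the residuals with a Gaussian kernel.
    diffs = (t[:, None] - t[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    weights /= weights.sum(axis=1, keepdims=True)
    smoothed = weights @ resid
    return float(np.sum(smoothed**2)), fitted, resid


def monte_carlo_p_value(t, y, n_rep=200, bandwidth=0.1, seed=0):
    """Calibrate the statistic with a sign-flip wild bootstrap (an assumption)."""
    rng = np.random.default_rng(seed)
    observed, fitted, resid = smoothed_residual_statistic(t, y, bandwidth)
    exceed = 0
    for _ in range(n_rep):
        # Random sign flips destroy any systematic pattern in the residuals,
        # which mimics data generated under the null model.
        signs = rng.choice([-1.0, 1.0], size=resid.size)
        y_star = fitted + resid * signs
        stat_star, _, _ = smoothed_residual_statistic(t, y_star, bandwidth)
        exceed += stat_star >= observed
    return observed, (exceed + 1) / (n_rep + 1)


# Simulated example: one data set follows the null model, one does not.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 80)
noise = rng.normal(scale=0.3, size=t.size)
y_null = 1.0 + 2.0 * t + noise
y_alt = 1.0 + 2.0 * t + 1.5 * np.sin(2 * np.pi * t) + noise

stat_null, p_null = monte_carlo_p_value(t, y_null)
stat_alt, p_alt = monte_carlo_p_value(t, y_alt)
print("null model data:  statistic", stat_null, "p-value", p_null)
print("alternative data: statistic", stat_alt, "p-value", p_alt)
```

Smoothing averages out pure noise but preserves systematic departures from the fitted model, so a large value of the statistic points to lack of fit. The bandwidth controls which departures the test is sensitive to and would need to be chosen with care in any real analysis.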
Approximate critical values and the power functions of the tests can be derived from the asymptotic distributions. Finite sample properties of our test procedures are investigated through a Monte Carlo simulation. Application of our test procedure is demonstrated through the Small-for-Gestational Age Study in Alabama. These goodness-of-fit tests can also be applied to other data sets. In growth studies in general, including studies of fetal growth, a unit is typically under observation and a certain attribute is measured repeatedly at successive times. The data collected consist of the growth measurements and the corresponding growth times. There are situations in which the growth times are not completely known to the investigators when the measurements are made, although the inter-measurement times are always known. The investigation of growth dynamics without complete knowledge of the growth times is principally motivated by the Small-for-Gestational Age Study in Alabama. Ultrasound measurements of the fetuses are made at various times during fetal growth. When the measurements are made, the growth times (gestational ages) are at best estimated; however, the times elapsed between measurements are completely determined. There is a spectrum of levels of knowledge about the growth times. They can be (a) completely unknown, (b) completely known, or (c) partially known, for example when an instrument is available to measure the growth time but is not accurate enough to give reliable readings. Appropriate parametric models are used to describe the growth dynamics of the fetus, and the different levels of knowledge of the growth times will be considered. We propose an approach to hypothesis testing that bridges the classical likelihood ratio test and the full Bayes factor. Statistical properties of the approach will be examined in terms of classical criteria.
In particular, we shall examine when the approach reduces to the classical likelihood ratio test and when it yields a full Bayes factor. Under some conditions, the significance level can be controlled.
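For orientation, the two endpoints named above can be written in standard textbook form. The notation is an assumption introduced here for illustration and does not come from this summary: x denotes the data, L(θ; x) the likelihood, Θ0 and Θ1 the parameter sets under the null and alternative hypotheses, and π0 and π1 prior densities on those sets. The bridging construction itself is not specified in this summary and is not reproduced.

```latex
\Lambda(x) \;=\; \frac{\sup_{\theta \in \Theta_0} L(\theta; x)}
                      {\sup_{\theta \in \Theta_0 \cup \Theta_1} L(\theta; x)},
\qquad
B_{01}(x) \;=\; \frac{\int_{\Theta_0} L(\theta; x)\, \pi_0(\theta)\, d\theta}
                     {\int_{\Theta_1} L(\theta; x)\, \pi_1(\theta)\, d\theta}.
```

The likelihood ratio maximizes the likelihood over each parameter set, while the Bayes factor averages it against a prior; in both cases small values count as evidence against the null hypothesis.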