Proposal Number: DMS-9802261
PI: Peter Bickel
Institution:
Project: Research on sieve approximations to non and semiparametric models, Hidden Markov models and comparison of phylogenetic tree biologies.

Abstract: The project is a theoretical investigation of the "plug-in" property in the context of non- and semiparametric models. The investigators intend to characterize non- and semiparametric models in which the outcomes of appropriate fitting procedures can be safely plugged in for a broad range of uses, and to study model selection criteria when the loss function reflects the goal of fitting some features of the data well rather than achieving a global fit. This project includes:
*Further development of a theory for testing parametric or semiparametric hypotheses in a semi- or nonparametric context.
*Further development of the theory of inference for Hidden Markov Models. The investigators propose extension to state space models.
*Development of new procedures and analysis of existing procedures for estimating fixed effects and predicting random effects using semiparametric models for longitudinal and/or "pharmacokinetic" data.
*Further development of the theory and practice of selecting m in the m-out-of-n bootstrap.
*An examination of the sensitivity to the choice of stochastic model in the construction of phylogenetic trees.

Tests and diagnostics for semiparametric models, such as those the investigators intend to continue to develop, are useful in a number of areas. For instance, the Black-Scholes option pricing formula is widely used in finance. One of the methods the investigators have already developed shows the invalidity of the formula for large data sets and points to plausible, more realistic models. Phylogenetic trees are used not only for representing evolutionary relationships among species of animals and plants but also, as in the case we are going to study, among important families of proteins.
Studying the types of models that lead to plausible evolutionary trees should also lead to pattern recognition algorithms that will be useful in classifying protein families and hence in relating new proteins to families whose properties are known. This is a major activity in the search for new bioactive compounds in biotechnology.