This proposal focuses on mathematical models for the selection of experimental or observational units in sample surveys, traffic studies, ecological surveys and clinical trials. The PI studies the auto-generation of units: a marked point process for the occurrence of events, which are interpreted as observational units, with the mark incorporating all relevant quantitative information. This is intended as a probabilistic model for situations such as marketing studies and animal behaviour surveys, where each sampling unit is an event having a random type or mark that includes both covariate and response. Since the number of units and the covariate configuration are both random, care must be taken in calculating the response distribution, because different schemes for sampling the point process yield different sampling distributions. The sampling scheme may be biased in the sense that it favours larger values of the response. Less obvious biases may occur in samples drawn from binary random-effects models. For example, there may be interference from the covariate values of other events. Moreover, the response distribution for a unit taken by quota sample from a fixed covariate stratum need not be the same as the conditional distribution for an auto-generated unit having the same covariate value.
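
As a rough illustration of the simplest bias mentioned above, a scheme that favours larger responses, the following Python sketch compares a uniform sample of events with a size-biased one. It is not the PI's model: the Exponential(1) response distribution, the sample sizes, and the function names `simulate_marks` and `size_biased_sample` are all assumptions made for illustration. For this particular choice, the size-biased mean is E[Y^2]/E[Y] = 2, twice the population mean of 1.

```python
import random


def simulate_marks(n_events, rng):
    """Draw hypothetical response marks for n_events events.

    Exponential(1) responses are an illustrative assumption,
    not part of the proposal.
    """
    return [rng.expovariate(1.0) for _ in range(n_events)]


def size_biased_sample(marks, k, rng):
    """Sample k events with probability proportional to the response."""
    return rng.choices(marks, weights=marks, k=k)


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


rng = random.Random(0)                 # fixed seed for reproducibility
marks = simulate_marks(20000, rng)     # the "population" of events

uniform = rng.choices(marks, k=5000)   # every event equally likely
biased = size_biased_sample(marks, 5000, rng)

population_mean = mean(marks)
uniform_mean = mean(uniform)
biased_mean = mean(biased)

print(f"population mean:  {population_mean:.3f}")
print(f"uniform sample:   {uniform_mean:.3f}")
print(f"size-biased mean: {biased_mean:.3f}")
```

The uniform sample should reproduce the population mean closely, while the size-biased sample overstates it. This shows only that the sampling scheme alone can change the response distribution among sampled units. The subtler biases in binary random-effects models are not captured by this sketch.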

In the traditional statistical formulation taken from field trials, the set of units is regarded as fixed in advance, and treatment is assigned at random to those fixed units. In survey work, the sample units are selected from the population by an objective randomization mechanism independent of the response. By contrast, in animal behaviour studies, each behaviour event is a unit, and neither the number nor the configuration of units is pre-specified. In clinical trials, the patients are not fixed in advance, nor are they selected by random sampling. Instead, they are volunteers, to some extent self-selected, and frequently subject to pre-screening to improve compliance rates. The aim of this work is to develop probabilistic models tailored to biased-sampling schemes of this sort, where the response distribution among sampled units may differ from the response distribution in the broader population.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0906592
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2009-08-01
Budget End: 2013-07-31
Support Year:
Fiscal Year: 2009
Total Cost: $200,000
Indirect Cost:
Name: University of Chicago
Department:
Type:
DUNS #:
City: Chicago
State: IL
Country: United States
Zip Code: 60637