Network-based investigations and interventions are hindered in part by uncertainties about sampling. Hidden, elusive populations are not amenable to the master-frame, top-down sampling that permits valid estimation and hypothesis testing. One result has been a burgeoning of theoretical approaches and simulation. Completing the loop from theoretical result to empirical verification has received less attention. We propose that the theoretical concerns about empirical network sampling may not be justified, and we will compare three different sampling approaches in two different areas (Dar es Salaam, Tanzania and Atlanta, GA) to determine whether different sampling approaches produce similar network configurations. We will conduct the study in two phases. Phase One, an intensive geographic risk assessment, will use key informant workshops that focus on mental maps of the community, collection of GPS coordinates for sites identified by these maps, and rapid assessment surveys to facilitate an understanding of where different types of risk activity are occurring within the communities in which we plan to sample. These data will then be entered into ArcGIS and analyzed in order to select the venues and informal areas from which to recruit seed participants in Phase Two. In Phase Two, in the selected high-risk activity areas, we will conduct three different sampling approaches: time-space; short chain-link (one seed and referral of three risk contacts); and long chain-link (one seed and a subsequent chain of nine risk contacts). In all three approaches, we will elicit information on all contacts (social, sexual, or drug-using) in each respondent's personal network, and follow up with interviews of the risk contacts, in accordance with the specific design.
In keeping with our hypotheses, we anticipate that the groups created by each design will "link up" to form a large connected component, and will be characterized by similar network attributes (such as degree distribution, clustering, mixing, concurrency, component distribution, and geographic contiguity), despite differences in sampling. In addition, we posit that only a relatively small number of persons is required to elucidate the underlying network configuration. If these hypotheses are substantiated, they will provide a better empirical basis for verifying theoretical relationships and for developing network-based interventions.
This study will use empirical observations to test the presumed inadequacy of network sampling. The study specifically aims to: 1) use three different sampling schemes (time-space, short chain-link, long chain-link) in two urban areas to describe the underlying social/sexual/drug-using network configuration for persons presumed to be at risk for HIV; 2) in each study area, test each of the three methods for concordance of network measures and for demonstration of the underlying network configuration; and 3) determine the point at which the network configuration "stabilizes" (i.e., the point at which the underlying configuration becomes evident and no longer changes with additional recruitment) in each area. Should our hypotheses regarding the effects of sampling be sustained, they will provide researchers and public health workers with a basis for rapid network ascertainment and the use of network information for public health interventions.