New, emerging, and re-emerging infectious diseases pose an ever-increasing threat to public health, with an attendant escalation of health care costs. In response, the Centers for Disease Control and Prevention (CDC) recommended a front-line strategy involving expanded use of molecular epidemiology. Before this strategy can be implemented successfully, however, multiple analytic issues in the statistical handling of the resulting data must be resolved. This study proposes to address concerns surrounding the analysis of fragment data band patterns (DNA fingerprints) that arise from numerous genotyping techniques applied to infectious organisms. These concerns differ substantially from those in the far more developed forensic use of DNA fingerprints in humans. Although the proposed approaches focus on tuberculosis, the resulting statistical methodology will generalize across organisms and genotyping techniques. Such analytic tools are crucial to fully realizing the potential of the rich molecular epidemiologic data that are of such vital importance and, accordingly, are being widely collected.
The Specific Aims of this application are: 1) to evaluate methods for comparing microbial DNA fingerprint patterns, including accommodating sources of measurement error, developing and comparing similarity/distance measures, extending these measures to multiple genotyping systems and to systems where band intensity is consequential, and assessing the significance of matches between individual fingerprints and large fingerprint databases; 2) to examine the properties of statistical techniques for representing these data, including clustering and phylogenetic algorithms; and 3) to integrate these analyses with epidemiologic and clinical data to identify interpersonal transmission of pathogens and bacterial clones that have distinct pathogenic properties. Fragment data are currently used extensively because of their many advantages over DNA sequence data, including technical simplicity and relatively low cost, which permit use in epidemiologic studies with large sample sizes. A disadvantage, however, is the lack of a well-developed range of statistical approaches for analyzing this type of data. This application seeks to redress this deficiency, thereby enhancing the utility of the associated molecular genotyping techniques in the collection of data for combating infectious disease.
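
As a rough illustration of what a similarity measure that accommodates measurement error might look like, the sketch below computes a Dice coefficient between two band patterns, treating two bands as the same when their fragment sizes agree within a relative tolerance. This is not the methodology of the application: the function names, the 2% tolerance, the greedy matching rule, and the example fragment sizes are all assumptions made for illustration.

```python
from typing import Sequence


def count_matched_bands(a: Sequence[float], b: Sequence[float], tol: float) -> int:
    """Count one-to-one band matches between two fingerprints.

    Bands are fragment sizes (hypothetically, in base pairs). Two bands
    match when they differ by at most `tol` times the larger of the two.
    Matching is a greedy two-pointer sweep over the sorted sizes.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    i = j = matches = 0
    while i < len(a_sorted) and j < len(b_sorted):
        x, y = a_sorted[i], b_sorted[j]
        if abs(x - y) <= tol * max(x, y):
            matches += 1
            i += 1
            j += 1
        elif x < y:
            i += 1  # band x has no partner in b; skip it
        else:
            j += 1  # band y has no partner in a; skip it
    return matches


def dice_similarity(a: Sequence[float], b: Sequence[float], tol: float = 0.02) -> float:
    """Dice coefficient: 2 * (matched bands) / (total bands in both patterns)."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty patterns are identical
    return 2 * count_matched_bands(a, b, tol) / (len(a) + len(b))


# Made-up fragment sizes, for illustration only.
isolate_1 = [1000, 2000, 3500]
isolate_2 = [1010, 2100, 3490, 5000]
similarity = dice_similarity(isolate_1, isolate_2)
distance = 1.0 - similarity
```

A distance of this kind (one minus the similarity) could then be passed to a clustering routine. The choice of tolerance strongly affects which bands are declared identical, which is one reason measurement error needs explicit treatment.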

Agency
National Institutes of Health (NIH)
Institute
National Institute of Allergy and Infectious Diseases (NIAID)
Type
Research Project (R01)
Project #
5R01AI040906-03
Application #
6137216
Study Section
Epidemiology and Disease Control Subcommittee 2 (EDC)
Program Officer
Morens, David M
Project Start
1998-01-01
Project End
2002-12-31
Budget Start
2000-01-01
Budget End
2002-12-31
Support Year
3
Fiscal Year
2000
Total Cost
$202,658
Indirect Cost
Name
University of California San Francisco
Department
Public Health & Prev Medicine
Type
Schools of Medicine
DUNS #
094878337
City
San Francisco
State
CA
Country
United States
Zip Code
94143
Segal, Mark R; Dahlquist, Kam D; Conklin, Bruce R (2003) Regression approaches for microarray data analysis. J Comput Biol 10:961-80
Keles, Sunduz; Segal, Mark R (2002) Residual-based tree-structured survival analysis. Stat Med 21:313-26
Segal, M R; Cummings, M P; Hubbard, A E (2001) Relating amino acid sequence to phenotype: analysis of peptide-binding data. Biometrics 57:632-42
Xiao, Z; Greaves, M F; Buffler, P et al. (2001) Molecular characterization of genomic AML1-ETO fusions in childhood leukemia. Leukemia 15:1906-13
Salamon, H; Behr, M A; Rhee, J T et al. (2000) Genetic distances for the study of infectious disease epidemiology. Am J Epidemiol 151:324-34
Tanaka, M M; Small, P M; Salamon, H et al. (2000) The dynamics of repeated elements: applications to the epidemiology of tuberculosis. Proc Natl Acad Sci U S A 97:3532-7
Small, P M (1999) Tuberculosis in the 21st century: DOTS and SPOTS. Plenary lecture given at the 29th World Conference of the International Union Against Tuberculosis and Lung Disease, Bangkok, Thailand, 23-26 November 1998. Directly observed therapy. Int J Tuberc Lung Dis 3:949-55
Singh, S P; Salamon, H; Lahti, C J et al. (1999) Use of pulsed-field gel electrophoresis for molecular epidemiologic and population genetic studies of Mycobacterium tuberculosis. J Clin Microbiol 37:1927-31