Translation of Automated Sequencer Data to DNA Sequences

Tibbetts, Clark

Abstract

The objective of this proposal is to significantly improve automated determination of DNA sequences. Practical performance limits of automated DNA sequencers are determined by the separation of oligonucleotides effected by polyacrylamide gel electrophoresis. Designs of contemporary instruments are basically similar. As oligomers in a DNA sequencing ladder pass the detector(s), multi-component analysis specifies the radioactive or fluorescent label associated with each oligomer. Under ideal conditions, determination of the sequence of terminal nucleotides is straightforward. When separations of oligomers or signal levels are not optimal, ambiguities or errors are likely. These appear as miscalled, extra, missing, or unidentified bases in the DNA sequence file, typically at about 1 to 3 errors per 100 bases. An error rate near 1% is a common target for DNA sequencing performance, since comparison with complementary-strand sequence data should then reduce errors to about 1 per 10,000 base pairs. This is only possible if every mismatch of the sequence and its complement is identified and correctly reconciled. Even then, error rates from 0.01% to 0.1% approximate the variation among alleles in a gene pool, and some such alleles can correlate with severe burdens of inherited pathology. Small improvements in single-strand error rate will have a substantial impact on the quality of finished sequences, in the range from 1 error per 10,000 bp to 1 per 1,000,000 bp. Improvements are needed if automated systems are to provide longer spans of DNA sequences with fewer errors. The emphasis of this proposal is on raw data acquisition and new methods for translation of the raw data to finished DNA sequences. A rule-based expert system method will be developed to reinforce conventional translation of raw data to DNA sequences. An independent pattern-recognition system will also be developed and tested, using techniques for construction and training of neural nets.
We will also evaluate two new approaches to utilize single label, single data channels for more efficient determination of DNA sequences. Alternative approaches to oligonucleotide separation for sequence analysis will also be investigated. In pursuit of these specific aims we will take advantage of the relative separations and intensities of successive oligomers in DNA sequencing ladders, as independent determinants of DNA sequence-specific data stream patterns.
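
The error-rate arithmetic in the abstract can be illustrated with a minimal sketch. It assumes a simple model that the abstract implies but does not spell out: errors on the two strands are independent, and an error survives in the finished sequence only when both strands are wrong at the same position, so the finished rate is roughly the square of the single-strand rate. The function name finished_error_rate is hypothetical and used only for illustration.

```python
def finished_error_rate(single_strand_rate: float) -> float:
    """Approximate error rate after comparing both strands.

    Assumption (not stated explicitly in the abstract): strand errors are
    independent, and every mismatch between the strands is detected and
    correctly reconciled, so only coinciding errors remain.
    """
    return single_strand_rate ** 2


# Single-strand rates spanning the range discussed in the abstract.
for rate in (0.03, 0.01, 0.001):
    finished = finished_error_rate(rate)
    print(f"{rate:.1%} per strand -> about 1 error per {round(1 / finished):,} bp")
```

Under this model a tenfold improvement in single-strand accuracy (1% to 0.1%) gives a hundredfold improvement in the finished sequence, which matches the 1/10,000 bp to 1/1,000,000 bp range quoted above.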

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG000562-02
Application #: 3333740
Study Section: Genome Study Section (GNM)

Project Start: 1992-02-01
Project End: 1995-01-31
Budget Start: 1993-02-01
Budget End: 1994-01-31
Support Year: 2
Fiscal Year: 1993
Total Cost
Indirect Cost

Institution

Name: Vanderbilt University Medical Center
Department
Type: Schools of Medicine
DUNS #: 004413456

City: Nashville
State: TN
Country: United States
Zip Code: 37212

Related projects


NIH 1999 R01 HG	Enhanced Perf and Throughput of Automated DNA Sequence Tibbetts, Clark / George Mason University
NIH 1999 R01 HG	Enhanced Perf and Throughput of Automated DNA Sequence Tibbetts, Clark / Virginia Polytechnic Institute and State University
NIH 1998 R01 HG	Enhanced Perf and Throughput of Automated DNA Sequence Tibbetts, Clark / George Mason University
NIH 1997 R01 HG	Enhanced Perf and Throughput of Automated DNA Sequence Tibbetts, Clark / George Mason University
NIH 1996 R01 HG	Translation of Automated Sequencer Data to DNA Sequences Tibbetts, Clark / Vanderbilt University Medical Center
NIH 1995 R01 HG	Translation of Automated Sequencer Data to DNA Sequences Tibbetts, Clark / Vanderbilt University Medical Center
NIH 1994 R01 HG	Translation of Automated Sequencer Data to DNA Sequences Tibbetts, Clark / Vanderbilt University Medical Center
NIH 1993 R01 HG	Translation of Automated Sequencer Data to DNA Sequences Tibbetts, Clark / Vanderbilt University Medical Center
NIH 1992 R01 HG	Translation of Automated Sequencer Data to DNA Sequences Tibbetts, Clark / Vanderbilt University Medical Center

Publications

Lauer, Kim P; Llorente, Isabel; Blair, Eric et al. (2004) Natural variation among human adenoviruses: genome sequence and annotation of human adenovirus serotype 1. J Gen Virol 85:2615-25

Benamira, M; Johnson, K; Chaudhary, A et al. (1995) Induction of mutations by replication of malondialdehyde-modified M13 DNA in Escherichia coli: determination of the extent of DNA modification, genetic requirements for mutagenesis, and types of mutations induced. Carcinogenesis 16:93-9

Boylan, K B; Cornblath, D R; Glass, J D et al. (1995) Autosomal dominant distal spinal muscular atrophy in four generations. Neurology 45:699-704

Soares, V M; Brzustowicz, L M; Kleyn, P W et al. (1993) Refinement of the spinal muscular atrophy locus to the interval between D5S435 and MAP1B. Genomics 15:365-71

Golden 3rd, J B; Torgersen, D; Tibbetts, C (1993) Pattern recognition for automated DNA sequencing: I. On-line signal conditioning and feature extraction for basecalling. Proc Int Conf Intell Syst Mol Biol 1:136-44

Brzustowicz, L M; Kleyn, P W; Boyce, F M et al. (1992) Fine-mapping of the spinal muscular atrophy locus to a region flanked by MAP1B and D5S6. Genomics 13:991-8
