High-throughput analysis has become an essential tool in genomic research. One area that has not been heavily examined is how errors generated in the early steps of these analyses propagate through downstream analyses. For many complex, high-throughput genomic analyses, sequence alignment is among the very first steps. This proposal is for a pilot study to examine the feasibility of using computational sequence simulation to study the effects and propagation of alignment error in high-throughput comparative and functional genomic sequence analysis. The study has three objectives: (1) to profile the accuracy of DNA sequence alignment, including both pairwise and multiple alignments, in order to capture the breadth of simulation and study necessary for answering questions about downstream genomic sequence analysis; (2) to define the factors that need to be included in a simulation of sequences in order to encompass a realistic level of biological complexity, without overparameterizing or adding unnecessary complications, and to create a computer program that can perform these simulations; and (3) to conduct a case study of the effects of alignment error on the estimation of evolutionary distances among sequences. The results of this project will be used to define and plan a large-scale study of the downstream effects of alignment fidelity in high-throughput sequence analysis, including approaches for downstream analysis that take into account the presumed error of the hypothesized alignment.
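
As a rough illustration of the third objective, the sketch below shows how a misplaced gap can change an estimated evolutionary distance. It is not the project's software: the function names `p_distance` and `jc69_distance`, the toy sequences, and the choice of the Jukes-Cantor (JC69) correction are all assumptions made for this example.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of mismatched sites over columns where neither row has a gap."""
    compared = 0
    diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (pairwise deletion)
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs / compared


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differences."""
    if p >= 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy data: the same two sequences under two alignments.
reference = "ACGTACGTAC"
true_row = "ACGT--GTAC"        # gap placed where the deletion is assumed to be
misaligned_row = "ACGTGT--AC"  # same residues, gap shifted two columns right

p_true = p_distance(reference, true_row)
p_bad = p_distance(reference, misaligned_row)

print(f"true alignment:       p = {p_true:.3f}, JC69 = {jc69_distance(p_true):.3f}")
print(f"misaligned alignment: p = {p_bad:.3f}, JC69 = {jc69_distance(p_bad):.3f}")
```

In this toy case the shifted gap creates two apparent substitutions, so the estimated distance rises even though the underlying sequences are unchanged. A real study would use simulated sequences with a known true alignment and compare against alignments produced by actual alignment programs.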

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Small Research Grants (R03)
Project #
5R03LM008637-02
Application #
7073353
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2005-07-01
Project End
2008-06-30
Budget Start
2006-07-01
Budget End
2008-06-30
Support Year
2
Fiscal Year
2006
Total Cost
$69,485
Indirect Cost
Name
Arizona State University-Tempe Campus
Department
Other Basic Sciences
Type
Schools of Arts and Sciences
DUNS #
943360412
City
Tempe
State
AZ
Country
United States
Zip Code
85287
Ogden, T Heath; Rosenberg, Michael S (2007) How should gaps be treated in parsimony? A comparison of approaches using simulation. Mol Phylogenet Evol 42:817-26
Ogden, T Heath; Rosenberg, Michael S (2007) Alignment and topological accuracy of the direct optimization approach via POY and traditional phylogenetics via ClustalW + PAUP*. Syst Biol 56:182-93
Ogden, T Heath; Rosenberg, Michael S (2006) Multiple sequence alignment accuracy and phylogenetic inference. Syst Biol 55:314-28
Rosenberg, Michael S (2005) Multiple sequence alignment accuracy and evolutionary distance estimation. BMC Bioinformatics 6:278