The NCBI CoreTools now contain our code, which calculates all parameters of the modified Gumbel distribution (the Gumbel scale parameter λ, the pre-factor k, and the finite-size correction) to practical accuracy in less than 1 second. The BLAST group has used our faster calculations to generate the modified Gumbel parameters for several new DNA scoring schemes. Our collaboration with Dr. Martin Frith has extended our methods to next-generation sequence matching, including frameshifts in DNA, a subject relevant to the NCBI BLAST services. In a practical test of our methods, our frameshift statistics found many novel human pseudogenes.

Support Year: 15
Fiscal Year: 2013
Total Cost: $267,102
Name: National Library of Medicine
Sheetlin, Sergey; Park, Yonil; Frith, Martin C et al. (2016) ALP & FALP: C++ libraries for pairwise local alignment E-values. Bioinformatics 32:304-5
Carroll, Hyrum D; Williams, Alex C; Davis, Anthony G et al. (2015) Improving Retrieval Efficacy of Homology Searches Using the False Discovery Rate. IEEE/ACM Trans Comput Biol Bioinform 12:531-7
Sheetlin, Sergey L; Park, Yonil; Frith, Martin C et al. (2014) Frameshift alignment: statistics and post-genomic applications. Bioinformatics :
Park, Yonil; Sheetlin, Sergey; Ma, Ning et al. (2012) New finite-size correction for local alignment score distributions. BMC Res Notes 5:286