Dr Maricel Kann and her group at the University of Maryland had compiled a list of mutations found in disease, but lacked a statistical test to decide whether each mutation was harmless or disease-causing. In joint work with Drs DoHwan Park and Junyong Park of the Statistics Department at the University of Maryland, we used extensive simulations to establish the validity of a proposed test. The test used the asymptotic regression that we developed for estimating BLAST statistical parameters, and the simulations showed that it was more powerful than methods based on Efron's local false-discovery rate. Dr Kann published work applying the statistical test to the mutational data. We also developed a benchmarking domain database, MultiDomainBenchmark, to evaluate the results of programs that search for specific domain architectures, which correspond to proteins with specific functions.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIALM000088-20
Application #: 10007523
Support Year: 20
Fiscal Year: 2019
Name: National Library of Medicine

Gauran, Iris Ivy M; Park, Junyong; Lim, Johan et al. (2018) Empirical null estimation using zero-inflated discrete mixture distributions and its application to protein domain data. Biometrics 74:458-471
Sheetlin, Sergey; Park, Yonil; Frith, Martin C et al. (2016) ALP & FALP: C++ libraries for pairwise local alignment E-values. Bioinformatics 32:304-5
Carroll, Hyrum D; Williams, Alex C; Davis, Anthony G et al. (2015) Improving Retrieval Efficacy of Homology Searches Using the False Discovery Rate. IEEE/ACM Trans Comput Biol Bioinform 12:531-7
Sheetlin, Sergey L; Park, Yonil; Frith, Martin C et al. (2014) Frameshift alignment: statistics and post-genomic applications. Bioinformatics :
Park, Yonil; Sheetlin, Sergey; Ma, Ning et al. (2012) New finite-size correction for local alignment score distributions. BMC Res Notes 5:286
Sheetlin, Sergey; Park, Yonil; Spouge, John L (2011) Objective method for estimating asymptotic parameters, with an application to sequence alignment. Phys Rev E Stat Nonlin Soft Matter Phys 84:031914
Park, Yonil; Sheetlin, Sergey; Spouge, John L (2009) Estimating the Gumbel scale parameter for local alignment of random sequences by importance sampling with stopping times. Ann Stat 37:3697