Dr Maricel Kann and her group at the University of Maryland had a list of mutations observed in disease, but no statistical test to decide whether a given mutation was harmless or disease-causing. Joint work with Drs DoHwan Park and Junyong Park in the Statistics Department at the University of Maryland used extensive simulations to establish the validity of a proposed test. The test used the asymptotic regression we developed for estimating BLAST statistical parameters, and the simulations showed that it was more powerful than methods based on Efron's local false-discovery rate. Dr Kann published work applying the statistical test to the mutational data. We also developed a benchmark domain database, MultiDomainBenchmark, for evaluating programs that search for specific domain architectures, which correspond to proteins with specific functions.
Gauran, Iris Ivy M; Park, Junyong; Lim, Johan et al. (2018) Empirical null estimation using zero-inflated discrete mixture distributions and its application to protein domain data. Biometrics 74:458-471
Sheetlin, Sergey; Park, Yonil; Frith, Martin C et al. (2016) ALP & FALP: C++ libraries for pairwise local alignment E-values. Bioinformatics 32:304-5
Carroll, Hyrum D; Williams, Alex C; Davis, Anthony G et al. (2015) Improving Retrieval Efficacy of Homology Searches Using the False Discovery Rate. IEEE/ACM Trans Comput Biol Bioinform 12:531-7
Sheetlin, Sergey L; Park, Yonil; Frith, Martin C et al. (2014) Frameshift alignment: statistics and post-genomic applications. Bioinformatics :
Park, Yonil; Sheetlin, Sergey; Ma, Ning et al. (2012) New finite-size correction for local alignment score distributions. BMC Res Notes 5:286
Sheetlin, Sergey; Park, Yonil; Spouge, John L (2011) Objective method for estimating asymptotic parameters, with an application to sequence alignment. Phys Rev E Stat Nonlin Soft Matter Phys 84:031914
Park, Yonil; Sheetlin, Sergey; Spouge, John L (2009) Estimating the Gumbel scale parameter for local alignment of random sequences by importance sampling with stopping times. Ann Stat 37:3697