We will continue to develop methods for recognizing and representing functional domains in biological sequences. This includes methods that identify regulatory sites in DNA starting from unaligned sequences and that build models allowing new sites to be predicted accurately. This work will involve adopting better statistical models so that the most significant alignments can be obtained more readily. We will also develop improved methods for recognizing functional motifs in RNA that are composed of both sequence and structure. These methods will be useful for identifying regulatory domains that operate post-transcriptionally and for determining the common motifs in RNAs selected in vitro for particular activities. We will further enhance methods for representing conserved domains in protein families so that new members of the families can be identified more reliably. This will involve neural network methods that optimize the discrimination of protein family members from other sequences in the database that are not members of the family. Finally, we will continue several collaborations with biologists who can take advantage of our methods in their work, and we will develop new collaborations as opportunities arise.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
University of Colorado at Boulder
Schools of Arts and Sciences
United States
Ruan, Shuxiang; Stormo, Gary D (2018) Comparison of discriminative motif optimization using matrix and DNA shape-based models. BMC Bioinformatics 19:86
Chang, Yiming K; Zuo, Zheng; Stormo, Gary D (2018) Quantitative profiling of BATF family proteins/JUNB/IRF hetero-trimers using Spec-seq. BMC Mol Biol 19:5
Hu, Caizhen; Malik, Vikas; Chang, Yiming Kenny et al. (2017) Coop-Seq Analysis Demonstrates that Sox2 Evokes Latent Specificities in the DNA Recognition by Pax6. J Mol Biol 429:3626-3634
Roy, Basab; Zuo, Zheng; Stormo, Gary D (2017) Quantitative specificity of STAT1 and several variants. Nucleic Acids Res 45:8199-8207
Xiao, Shu; Lu, Jia; Sridhar, Bharat et al. (2017) SMARCAD1 Contributes to the Regulation of Naive Pluripotency by Interacting with Histone Citrullination. Cell Rep 18:3117-3128
Zuo, Zheng; Roy, Basab; Chang, Yiming Kenny et al. (2017) Measuring quantitative effects of methylation on transcription factor-DNA binding affinity. Sci Adv 3:eaao1799
Ruan, Shuxiang; Stormo, Gary D (2017) Inherent limitations of probabilistic models for protein-DNA binding specificity. PLoS Comput Biol 13:e1005638
Ruan, Shuxiang; Swamidass, S Joshua; Stormo, Gary D (2017) BEESEM: estimation of binding energy models using HT-SELEX data. Bioinformatics 33:2288-2295
Chang, Yiming K; Srivastava, Yogesh; Hu, Caizhen et al. (2017) Quantitative profiling of selective Sox/POU pairing on hundreds of sequences in parallel by Coop-seq. Nucleic Acids Res 45:832-845
Stormo, Gary D; Roy, Basab (2016) DNA Structure Helps Predict Protein Binding. Cell Syst 3:216-218
