Covariance models are a powerful and popular method for searching genomic databases for new members of functional RNA families. Using prior information in covariance model parameter estimation is crucial, because many RNA families have only a few known examples, or the known examples all belong to a single sub-family of the actual family. Experimental thermodynamic measurements of RNA structures point to a number of regularities in molecular stability that are not captured in current covariance modeling practice. Among these are the dependence of stability on the identities of the hairpin closing pair and the loop-end nucleotides, as well as on hairpin loop length. Preliminary evidence suggests that incorporating these effects into priors and model structure may improve covariance model performance. This project proposes to investigate additional thermodynamics-inspired improvements to covariance model parameter estimation and structure, to combine the prior information with in-family observed-frequency data in an optimal fashion, and to expand testing of the performance of the model structure changes and parameter estimation methods.
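As a rough illustration of what combining prior information with in-family observed-frequency data can look like, the sketch below computes posterior-mean base-pair emission probabilities under a single Dirichlet prior. This is an assumption made for illustration only: the proposal does not specify the form of the prior, and the pseudocount values, the observed counts, and the names `posterior_mean`, `alpha`, and `FAVORED` are all hypothetical.

```python
from itertools import product

BASES = "ACGU"
# All 16 ordered base pairs that a pair-emitting state could emit.
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]


def posterior_mean(counts, alpha):
    """Posterior-mean emission probabilities under a single Dirichlet prior.

    counts: observed pair counts from the family alignment (missing = 0).
    alpha:  prior pseudocounts, one per pair (hypothetical values here).
    """
    total = sum(counts.get(p, 0) + alpha[p] for p in PAIRS)
    return {p: (counts.get(p, 0) + alpha[p]) / total for p in PAIRS}


# Hypothetical prior: favor Watson-Crick and G-U wobble pairs.
FAVORED = {"AU", "UA", "GC", "CG", "GU", "UG"}
alpha = {p: (1.0 if p in FAVORED else 0.1) for p in PAIRS}

# Hypothetical observations at one paired column of a four-sequence family.
counts = {"GC": 3, "AU": 1}

probs = posterior_mean(counts, alpha)
```

With only four observed sequences, the prior keeps unobserved but plausible pairs such as CG at a non-negligible probability, while a larger family would let the observed frequencies dominate. A thermodynamics-informed prior of the kind proposed would presumably make the pseudocounts depend on context, for example on whether the pair closes a hairpin.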
This project improves the effectiveness of programs that search genomic databases for genes specifying RNA molecules that function biologically without being translated into protein. These molecules are involved in human disease and aging processes. Finding additional genes of this class will improve the ability to design drug and other therapies for human disease and age-related health degeneration.