The majority of human genetic mutations known to cause disease are single amino acid substitutions generated by a nonsynonymous single nucleotide polymorphism (nsSNP), even though nsSNPs make up less than 1% of the approximately 10 million SNPs in the genome. How do the amino acid substitutions generated by nsSNPs affect proteins to cause disease? The set of known disease-causing nsSNPs is biased toward severely deleterious, rare mutations with easily identified phenotypes, while mildly deleterious mutations with higher population frequencies, which are more likely to cause prevalent diseases such as hypertension, go unnoticed. We seek to harness the growing wealth of comparative genomic information by developing a stochastic model of protein sequence evolution that starts with an ancestral sequence and ends with a human protein sequence and its homologs. In this model, the likelihood of observing an amino acid represents its evolutionary fitness, and substitutions to residues with low fitness are more likely to cause disease. The proposed project will develop a better understanding of the relationship between human disease and protein sequence evolution while identifying the genetic basis of human diseases.
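The idea that an amino acid's likelihood among homologs reflects its fitness can be illustrated with a much simpler stand-in than the stochastic evolutionary model proposed here. The sketch below is an assumption-laden simplification: it ignores the ancestral sequence and the phylogeny entirely and estimates per-position residue probabilities from raw column frequencies in a toy alignment, with a uniform pseudocount. The function names (`column_fitness`, `substitution_score`), the pseudocount value, and the alignment itself are hypothetical and are not taken from the project.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_fitness(column, pseudocount=1.0):
    """Estimate residue probabilities at one alignment column.

    Assumption: observed frequency (plus a uniform pseudocount) is used
    as a crude proxy for evolutionary fitness; gaps and unknown symbols
    are ignored.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}

def substitution_score(alignment, position, wild_type, mutant, pseudocount=1.0):
    """Log-ratio of mutant to wild-type probability at a position.

    More negative values mean the mutant residue is less likely than the
    wild type among the homologs, i.e. lower estimated fitness.
    """
    column = [seq[position] for seq in alignment]
    fitness = column_fitness(column, pseudocount)
    return math.log(fitness[mutant] / fitness[wild_type])

# Toy alignment: a human sequence followed by three hypothetical homologs.
alignment = ["MKVLA", "MKVLG", "MRVLA", "MKILA"]

# Position 0 is fully conserved (M); position 1 varies (K or R).
score_conserved = substitution_score(alignment, 0, "M", "W")
score_variable = substitution_score(alignment, 1, "K", "R")
print(score_conserved, score_variable)
```

Under these assumptions, a substitution at the fully conserved position scores lower than a substitution to a residue already seen among the homologs, which matches the intuition that substitutions to low-fitness residues are more likely to be deleterious. A model along the lines described above would replace the raw column frequencies with likelihoods derived from the evolutionary process.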
Rodríguez-Flores, Juan L; Zhang, Kuixing; Kang, Sun Woo et al. (2010) Conserved regulatory motifs at phenylethanolamine N-methyltransferase (PNMT) are disrupted by common functional genetic variation: an integrated computational/experimental approach. Mamm Genome 21:195-204