The majority of human genetic mutations known to cause disease are single amino acid substitutions generated by a nonsynonymous single nucleotide polymorphism (nsSNP), even though nsSNPs make up less than 1% of the approximately 10 million SNPs in the genome. How do the amino acid substitutions generated by nsSNPs affect proteins to cause disease? The set of known disease-causing nsSNPs is biased towards severely deleterious rare mutations with easily identified phenotypes, while mildly deleterious mutations with higher population frequencies, which are more likely to cause prevalent diseases such as hypertension, go unnoticed. We seek to harness the growing wealth of comparative genomic information by developing a stochastic model of protein sequence evolution that starts with an ancestral sequence and ends with a human protein sequence and its homologs. In this model, the likelihood of observing an amino acid in the process represents its evolutionary fitness, and substitutions to residues with low fitness are more likely to cause disease. The proposed project will develop a better understanding of the relationship between human disease and protein sequence evolution while identifying the genetic basis of human diseases.
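
The idea that an amino acid's likelihood at a position reflects its fitness can be illustrated with a much simpler stand-in than the stochastic evolutionary model proposed here. The sketch below assumes a toy alignment of homologs and treats each column independently, using smoothed residue frequencies as a rough fitness proxy; it ignores the ancestral sequence and the phylogeny entirely. The names `column_frequencies`, `substitution_score`, and `toy_alignment`, as well as the pseudocount value, are hypothetical and not taken from the project.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column, pseudocount=1.0):
    """Smoothed amino acid frequencies for one alignment column.

    Assumption: a uniform pseudocount stands in for a proper prior;
    gaps and unknown characters are ignored.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def substitution_score(alignment, position, wild_type, mutant, pseudocount=1.0):
    """Log ratio of mutant to wild-type frequency at one position.

    More negative values mean the mutant residue is rarer among the
    homologs, which this toy proxy reads as lower fitness.
    """
    column = [seq[position] for seq in alignment]
    freqs = column_frequencies(column, pseudocount)
    return math.log(freqs[mutant] / freqs[wild_type])


# Hypothetical toy alignment: a human sequence followed by three homologs.
toy_alignment = ["MKLV", "MKLI", "MRLV", "MKLV"]

# V -> I at a position where I already occurs in a homolog.
score_conservative = substitution_score(toy_alignment, 3, "V", "I")
# L -> P at a position where only L is observed.
score_radical = substitution_score(toy_alignment, 2, "L", "P")
```

In this toy case the substitution to a residue never seen among the homologs receives the lower score. A model along the lines described in the abstract would also have to account for how the homologs are related, which a column-frequency count cannot do.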

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Predoctoral Individual National Research Service Award (F31)
Project #
3F31HG004247-03S1
Application #
7924989
Study Section
Special Emphasis Panel (ZRG1-GGG-G (29))
Program Officer
Graham, Bettie
Project Start
2006-09-01
Project End
2010-08-31
Budget Start
2008-09-01
Budget End
2010-08-31
Support Year
3
Fiscal Year
2009
Total Cost
$25,176
Indirect Cost
Name
University of California San Diego
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
804355790
City
La Jolla
State
CA
Country
United States
Zip Code
92093