Following the completion of the human genome sequence, there has been growing interest in identifying which proteins accrued adaptive sequence changes on the human lineage through positive selection. Identifying such positively selected proteins entails statistically comparing protein-coding DNA sequences from humans and other primates. With the recent completion of the chimpanzee genome sequence, this approach has become possible on a large scale. A major limitation, however, is that humans and many other primate species are too similar at the DNA sequence level for conventional statistical methods to achieve their theoretically optimal power, so a large number of positively selected genes may be overlooked. The proposed research develops a new statistical solution to this problem that is expected to have significantly higher power to identify selected genes than existing approaches.
The specific objectives of the proposed research are: 1) to develop a new method to detect positive selection that maximizes statistical information by pooling genes within mutationally similar regions of the genome; 2) to characterize the new method's power to detect selection; and 3) to apply the method to detect positive selection using the whole-genome assemblies from human and chimpanzee, as well as the forthcoming macaque genome.
The proposed research is expected to considerably improve statistical power to detect genes of adaptive significance between closely related species, such as human and chimpanzee. The methods developed will be used to generate a genome-wide portrait of positively selected genetic changes on the human and chimpanzee lineages. This resource should provide a strong starting point for numerous evolutionary and functional studies by researchers worldwide. All data and computer programs developed in this project will be made freely available to the research community over the Internet.
The computational resources required for this project will enhance the research infrastructure at the host institution. As dissertation research, this project will contribute to the education and training of a doctoral candidate. The results of this study will be widely disseminated through publication in scientific journals and presentation at appropriate scientific conferences. This project will contribute to society by helping to fulfill the long-term goals of the Human Genome Project through an enhanced understanding of human genetics and origins.

Budget Start: 2004-07-15
Budget End: 2005-06-30
Fiscal Year: 2004
Total Cost: $9,250
Name: SUNY at Albany
City: Albany
State: NY
Country: United States
Zip Code: 12222