Large-scale duplication, from chromosomal fragments to the entire genome, followed by mutation is considered a major force driving functional diversity in vertebrates. Isolated examples of duplicated regions support this theory, but a genome-wide study of mammalian gene duplication and subsequent functional diversification has not been attempted. A better understanding of the functional similarities between duplicated genes would greatly enhance the power of paralogous relationships in predicting gene function and understanding genetic disease. The candidate's long-term career goal is to establish an independent research program in academia, using computational genomics to study the role of gene duplication in the evolution, structure and function of mammalian genomes. The specific goals of the current proposal constitute the first step in this program and will allow the candidate to demonstrate the feasibility of her interdisciplinary approach. They are (1) to construct a spatially ordered set of all discernible duplicated genes in the mouse and human genomes and to estimate the time of duplication for each; (2) to develop algorithms to identify the number of large-scale duplications that took place and determine the sequence of rearrangements that subsequently fragmented them; (3) to determine, using probabilistic models of rearrangements, to what extent the spatial organization of duplicated regions is preserved; and (4) to annotate the duplication data with functional data in preparation for studying the processes of functional differentiation following duplication.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Career Transition Award (K22)
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Graham, Bettie
Carnegie-Mellon University
Schools of Arts and Sciences
United States
Vernot, Benjamin; Stolzer, Maureen; Goldman, Aiton et al. (2008) Reconciliation with non-binary species trees. J Comput Biol 15:981-1006
Raghupathy, Narayanan; Hoberman, Rose; Durand, Dannie (2008) Two plus two does not equal three: statistical tests for multiple genome comparison. J Bioinform Comput Biol 6:1-22
Song, Nan; Joseph, Jacob M; Davis, George B et al. (2008) Sequence similarity network reveals common ancestry of multidomain proteins. PLoS Comput Biol 4:e1000063
Vernot, B; Stolzer, M; Goldman, A et al. (2007) Reconciliation with non-binary species trees. Comput Syst Bioinformatics Conf 6:441-52
Durand, Dannie; Hoberman, Rose (2006) Diagnosing duplications--can it be done? Trends Genet 22:156-64
Przytycka, Teresa; Davis, George; Song, Nan et al. (2006) Graph theoretical insights into evolution of multidomain proteins. J Comput Biol 13:351-63
Durand, Dannie; Halldorsson, Bjarni V; Vernot, Benjamin (2006) A hybrid micro-macroevolutionary approach to gene tree reconstruction. J Comput Biol 13:320-35
Hoberman, Rose; Sankoff, David; Durand, Dannie (2005) The statistical analysis of spatially clustered genes under the maximum gap criterion. J Comput Biol 12:1083-102
Durand, Dannie; Sankoff, David (2003) Tests for gene clustering. J Comput Biol 10:453-82
Durand, Dannie (2003) Vertebrate evolution: doubling and shuffling with a full deck. Trends Genet 19:2-5