DNA sequences are now routinely used to reconstruct the branching evolutionary relationships among species, often displayed graphically as a phylogenetic tree. Phylogenetic trees are often based on just one or a few genetic loci, each representing a very small portion of the full genetic complement of a species. These small samples may produce misleading results due to complications arising from processes such as random genetic drift, natural selection, and hybridization. For some time, it has been recognized that the ideal solution would be to construct phylogenies using a large number of genetic loci sampled throughout the genome, but the cost of this approach has been prohibitive. Using two groups of bird species with different modes of reproduction and rates of speciation, this research will test the efficacy of a new "next-generation" sequencing method as a low-cost solution for recovering and sequencing essentially the same set of thousands of genetic loci from related species.

Phylogenetic trees are critical to understanding the phenomenal diversity and evolutionary history of life on earth. The potentially powerful new tool evaluated by this research is likely to be of broad interest as it should be applicable with little or no modification to almost any organism and will produce data sets orders of magnitude larger than in the best current studies.

Project Report

Biologists use phylogenetic trees to organize the diversity of living things. These trees illustrate the ancestor-descendant relationships among species and provide the basis for understanding the similarities and differences among species in morphology, physiology, and behavior, fundamental knowledge that informs a broad range of biological and biomedical investigations. Phylogenies were originally based primarily on morphological traits, but biologists have found that genetic data, and in particular DNA sequence data, often provide more robust inferences about evolutionary relationships. Until recently, phylogenetic analyses based on molecular data have been limited to relatively small data sets, representing a tiny fraction of a given organism's genetic information. Our study contributes to the rapidly developing field of phylogenomics, in which modern high-throughput DNA sequencing technology is being used to generate larger and more robust data sets in a cost-effective manner. Recent theoretical work in phylogenetics has led to the conclusion that analyses should include multiple DNA segments from different portions of the genome (i.e., loci) in order to produce more robust and accurate phylogenetic trees. Collecting such data, however, has historically been limited by the requisite laboratory procedures, which are slow, labor-intensive, and costly. Fortunately, the development of "next-generation" DNA sequencing technology has produced machines capable of quickly and cost-effectively generating huge amounts of DNA sequence data (i.e., billions of base pairs in a few days), but challenges remain for the application of this technology in the context of phylogenetic analysis. Our project comprised the development and testing of a modified version of restriction site-associated DNA sequencing (RAD-seq).
Our approach uses two restriction enzymes to target specific loci in the genome in a manner that is repeatable across samples, including samples of closely related species. This allows us to sequence a broadly overlapping set of genomic loci from closely related species; the data set represents a small portion of the overall genome, but is nonetheless orders of magnitude larger than has been the standard in this field for the past 20 years. In addition, the method allows 100 or more samples or species to be processed in a single run.

We also explored the utility of the RAD-seq method and the resulting data for phylogenetic analysis. As a test case, we collected RAD-seq data from two groups of African birds with contrasting natural histories: 1) Lagonosticta firefinches, which build nests and raise their own young, and 2) Vidua whydahs and indigobirds, which are brood parasites and lay their eggs in the nests of other species. We used these data and newly developed analytical methods to generate phylogenetic trees for each group. These trees are robustly supported by the data and likely represent the most accurate hypotheses of relationships for these species to date. We also explored how the contrasting biology of these birds has influenced patterns of genetic diversity across the genome, a question of general interest in evolutionary biology and one that has received substantial study in historical analyses of the human population.

This project has had the following positive impacts. Both the principal investigator and the PhD student gained expertise in new laboratory, analytical, and computational methods. The graduate student received training in modern phylogenetic analyses by attending the workshop "Estimating species trees: a phylogenetic paradigm for the 21st century" at The Ohio State University. We developed a cost-effective laboratory protocol and computational pipeline to generate RAD-seq data, both of which are described in a peer-reviewed scientific publication. We collected RAD-seq data and conducted phylogenetic analyses of two groups of African birds with contrasting natural histories. This work resulted in a second scientific publication, now in preparation, which we expect will be an important resource for biologists studying these birds or conducting phylogenetic analyses of other organisms using RAD-seq data. We trained graduate students, post-doctoral associates, and faculty members at Boston University and four other universities in our laboratory and analytical methods. This training has led to additional phylogenetic studies of other birds (ducks and loons) and of Adelpha butterflies that implement the method we developed.

Agency: National Science Foundation (NSF)
Institute: Division of Environmental Biology (DEB)
Type: Standard Grant (Standard)
Application #: 1011517
Program Officer: Joseph Miller
Project Start:
Project End:
Budget Start: 2010-06-01
Budget End: 2013-05-31
Support Year:
Fiscal Year: 2010
Total Cost: $13,223
Indirect Cost:

Name: Boston University
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02215