Humans are diploid organisms with two sets of chromosomes: 22 pairs of autosomes and one pair of sex chromosomes. The two chromosomes in a pair of autosomes are homologous, i.e., they have similar DNA sequences and carry essentially the same type of information but are not identical. The most common type of variation between the chromosomes in a pair is a difference in the base at a specific location, i.e., the corresponding alleles on the homologous chromosomes differ. Complete information about DNA variation in an individual genome is provided by haplotypes, the lists of alleles at contiguous sites in a region of a single chromosome. Haplotype information is essential for medical and pharmaceutical studies, including the study of variation in gene expression and of recombination patterns.

Intellectual Merit:

This research aims to develop and analyze novel algorithms for haplotype assembly from next-generation sequencing data. It consists of three main thrusts: (1) Haplotype assembly from next-generation sequencing data is computationally challenging. The first thrust proposes branch-and-bound algorithms that exploit structural features of the problem to find the exact solution efficiently. (2) As the size of the haplotype assembly problem grows, the exact solution becomes increasingly difficult to obtain. The second thrust focuses on developing fast heuristic methods with guaranteed performance bounds that enable explicit complexity-accuracy trade-offs. (3) Existing haplotype assembly schemes process DNA fragments comprising nucleotides whose order has already been determined by the sequencing platform. The third thrust focuses on developing algorithms that solve the base-calling and haplotype assembly problems jointly, enabling significant improvements in accuracy.
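
To make the assembly problem concrete, the following is a minimal sketch under stated assumptions, not the method proposed in this project. It assumes the commonly used minimum error correction (MEC) criterion, which this abstract does not name, and it uses plain exhaustive search rather than branch-and-bound. The function names, the 0/1 allele encoding, and the toy fragments are all hypothetical.

```python
from itertools import product


def mec_cost(haplotype, fragments):
    """MEC cost of a candidate haplotype (assumed objective, see lead-in).

    haplotype: tuple of 0/1 alleles, one per heterozygous site.
    fragments: list of dicts mapping site index -> observed allele (0/1).
    """
    complement = tuple(1 - a for a in haplotype)
    total = 0
    for frag in fragments:
        # Count mismatches assuming the read came from either chromosome,
        # then charge the fragment to whichever chromosome fits it better.
        d1 = sum(1 for site, allele in frag.items() if haplotype[site] != allele)
        d2 = sum(1 for site, allele in frag.items() if complement[site] != allele)
        total += min(d1, d2)
    return total


def assemble_exhaustive(fragments, n_sites):
    """Exact search over all haplotypes; feasible only for tiny inputs."""
    best, best_cost = None, None
    # Fix the first allele to 0: a haplotype and its complement
    # describe the same pair of chromosomes.
    for rest in product((0, 1), repeat=n_sites - 1):
        candidate = (0,) + rest
        cost = mec_cost(candidate, fragments)
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical toy reads over four heterozygous sites.
fragments = [
    {0: 0, 1: 1},
    {1: 1, 2: 0},
    {0: 1, 1: 0},
    {1: 0, 2: 1, 3: 0},
    {2: 0, 3: 1},
    {2: 0, 3: 0},  # conflicts with the read above: one of them has an error
]
haplotype, cost = assemble_exhaustive(fragments, n_sites=4)
print(haplotype, cost)
```

The search space doubles with every additional site, which is why exact approaches need pruning of the kind the first thrust describes and why the second thrust turns to heuristics for larger problems.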

Broader Impact:

The results of this research will have a major impact on a number of fields that rely on accurate haplotype assembly, including medicine and pharmacogenomics, and will enrich the educational experience of engineering students at the University of Texas at Austin.

Project Start:
Project End:
Budget Start: 2013-09-01
Budget End: 2018-05-31
Support Year:
Fiscal Year: 2013
Total Cost: $400,000
Indirect Cost:
Name: University of Texas Austin
Department:
Type:
DUNS #:
City: Austin
State: TX
Country: United States
Zip Code: 78759