The mechanisms of persistence of hepatitis C virus (HCV) infection are poorly understood, but the highly mutable viral genome offers a window on the natural history of this process. Whereas inadequate in vitro and animal models have hampered efforts to observe the virus more directly, sequence analysis has identified strains with higher pathogenic potential and resistance to pharmacologic treatment. It is not surprising that HCV sequences reveal clues to pathogenesis, because HCV exists in each infected host as a quasispecies (a swarm of distinct but related variants), which is subject to Darwinian selection by the host's immune system. We have developed a method to identify distinct variants in each infected individual, allowing us to acquire an accurate nucleotide sequence sample of the quasispecies at greatly reduced cost. We also have access to specimens from ALIVE, a large cohort of injecting drug users with clearly defined clinical and virologic outcomes. Our preliminary studies of acute infection suggest that self-limited viremia is associated with a less complex swarm of HCV variants, greater host selective pressure on more conserved regions (based on the ratio of within-quasispecies non-synonymous to synonymous diversity), and higher positive charge at the highly variable N-terminus of the envelope protein E2. Our central hypothesis is that an immune response directed against more conserved epitopes is associated with clearance of HCV viremia following acute infection. By testing this hypothesis in a well-characterized cohort with years of clinical and virologic data, we expect to gain a greater understanding of antigenic variation as it relates to quasispecies diversity, and of the mechanisms of HCV persistence. Greater understanding of these aspects of HCV pathogenesis would have a significant impact on vaccine development and rational drug design.
By combining our epidemiologic and molecular resources with novel tools, we aim to confirm and extend our preliminary observations by (1) studying a validation cohort, (2) collecting longitudinal sequence data, and (3) expanding the scope of the analysis to include more conserved regions of the HCV genome. This application is efficient and has a high likelihood of success because it makes use of a well-characterized cohort, for which years of clinical and virologic data have already been assembled.