The genome of the AIDS retrovirus (AIDS RV), the etiologic agent of AIDS, codes for several polypeptides through an extremely complex scheme of mRNA biogenesis. Although several of the virus-coded polypeptides have been identified and localized on the genetic map by partial amino acid sequencing, direct functional correlation of the viral proteins can be greatly facilitated by genetic analysis of viral mutants produced during the natural course of acute infection. Since classical genetic approaches have been hampered by the lack of appropriate viral plaquing procedures, we have begun to study viral gene expression in persistently infected cell lines carrying a single copy of the integrated viral genome. To this end, we cloned cells from a mass of lymphocytes surviving acute infection. One such clone (8E5) was found to contain a single copy of integrated provirus but produced no viable virus. The lack of reverse transcriptase activity and the absence of two polypeptides of 64 and 34 kd in 8E5 cells prompted us to sequence these proteins in cells infected with wild-type virus. The N-termini of the 64, 51, and 34 kd polypeptides present in acutely infected cells were localized within the pol reading frame of the proviral DNA. The deduced N-termini of the 64 and 34 kd proteins were located 156 and 716 residues, respectively, from the beginning of the pol open reading frame, which spans ca. 1000 residues. They correspond, respectively, to the reverse transcriptase and the endonuclease proteins of the virus. The 51 kd protein had the same N-terminus as the 64 kd protein. Studies are in progress to determine the exact processing pathway of the gag and gag-pol precursor proteins. Other viral proteins, notably a 17 kd putative protease and a novel 41 kd virus-coded protein, are also being sequenced.