The electronic medical record (EMR) holds great allure for both the medical informatics and health services research communities. In this project, we propose to enhance the capability of EMR systems by creating and evaluating tools to extract clinical vocabularies as well as patient data from narrative text reports. We will apply advanced natural language processing tools from the CLARIT system to both problems. We contend that fast and robust automated text processing methods are the only way that the problems of vocabulary construction and narrative text extraction can be solved. We will address the clinical vocabulary problem by utilizing the thesaurus extraction techniques already present in the CLARIT system. Using several gigabytes of narrative text, including discharge summaries, progress notes, radiology reports, and other clinical text, we plan to: 1. Identify empirically the terminology used in medicine. 2. Compare the coverage of that terminology in several existing large medical vocabularies: UMLS, SNOMED, and the Medical Entities Dictionary. 3. Discern the semantic characteristics of that terminology, both to give other structured vocabularies a richer substrate of terms and to give us the opportunity to implement a clinical vocabulary schema based on the methods of the MedSORT-II Project. 4. Evaluate how well our tools assist our own vocabulary-building efforts and those of others. We will approach the narrative extraction problem differently than in the past, building on the efforts of previous investigators but changing the perspective by focusing on the development of tools specific to researchers and others who need to extract data from narrative text. This approach will be applied in two domains: 1. Consortium-based research in the use of esophagogastroduodenoscopy (EGD). 2. Practice guidelines implementation in blood product transfusion.