In the broadest terms, the goal of the proposed work is to make it easier for researchers to apply robust, scalable, entity-centered, heterogeneous data access to the biomedical literature. 'Entity-centered' means that information is indexed by the underlying entity, irrespective of what a surface mention looks like in any given data source. For example, there is a gene in FlyBase with synonyms in text as diverse as 'Foil' and 'Mel(3)10', generic nominal referring expressions like 'the gene', pronouns like 'it', and a FlyBase database ID of CG5490 [Morgan et al. 2002]. The Phase I proposal breaks down into two major efforts. First, the existing LingPipe suite of linguistic processing tools will be extended to address the challenges of bioinformatics, resulting in LingPipe-Bio. LingPipe-Bio will be distributed to the research and entrepreneurial communities as an open source suite of tools under dual open source/commercial licensing. Second, an existing interface for entity-centered data access (ThreatTracker, built for intelligence analysts) will be adapted into BioTracker, based on the needs of biomedical researchers.