The goal of this project is to build a software toolkit that will enable a biologist to create, from a collection of online articles, a database of protein subcellular localization information that can be queried, browsed, or used to support data-mining activities. We have developed a system, called SLIF, which can harvest fluorescence microscope images from online papers, analyze them using image-processing methods, and annotate them with information appearing in the accompanying textual description. We propose to improve and extend this system so as to produce a robust, comprehensive toolkit for extracting information about subcellular localization from the text and images found in online journals, as well as analyzing, verifying and querying the resulting body of information.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM078622-02
Application #
7241547
Study Section
Special Emphasis Panel (ZRG1-BST-A (02))
Program Officer
Deatherage, James F
Project Start
2006-07-01
Project End
2009-06-30
Budget Start
2007-07-01
Budget End
2008-06-30
Support Year
2
Fiscal Year
2007
Total Cost
$256,309
Indirect Cost
Name
Carnegie Mellon University
Department
Biology
Type
Schools of Arts and Sciences
DUNS #
052184116
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213
Ahmed, Amr; Arnold, Andrew; Coelho, Luis Pedro et al. (2010) Structured Literature Image Finder: Parsing Text and Figures in Biomedical Literature. Web Semant 8:151-154
Coelho, Luis Pedro; Ahmed, Amr; Arnold, Andrew et al. (2010) Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature. Lect Notes Comput Sci 6004:23-32
Ahmed, Amr; Xing, Eric P; Cohen, William W et al. (2009) Structured Correspondence Topic Models for Mining Captioned Figures in Biological Literature. KDD 2009:39-48
Qian, Yuntao; Murphy, Robert F (2008) Improved recognition of figures containing fluorescence microscope images in online journal articles using graphical models. Bioinformatics 24:569-76
Kou, Zhenzhen; Cohen, William W; Murphy, Robert F (2007) A stacked graphical model for associating sub-images with sub-captions. Pac Symp Biocomput:257-68