This NSF EAGER project will develop novel prototypes for (1) iconic image analysis and recognition for the retrieval, classification, annotation, and analysis of iconic digital imagery of Cypriot cultural heritage materials, and (2) searching and exploring the Ancient Cypriot Secretariat Corpus. (The corpus contains works from antiquity to the early Christian era, written in period text and covering scientific and philosophical topics, social commentary, and other subjects.) The project will involve computer scientists, archeologists, and art historians from Penn State and The Cyprus Institute. The systems developed will allow users to search for iconic images representative of Cypriot culture at various times in history, based on characteristics such as image content, shape sketches, and metadata. From the Secretariat Corpus, end users will be able to search for items belonging to categories such as daily life at different periods, mythology, religion, politics, language, landscape, and events. The Cyprus Institute has pledged strong support for this project. The NSF Office of International Science and Engineering will co-sponsor the award.
The project is at the intersection of science, technology, and the humanities. Machine recognition of key features found on Cypriot icons demonstrates the utility of objective analysis of art, something that has traditionally been left to subjective humanistic interpretation. The discovery of attributes that allow objective assignment of specimen icons to a period or style, and the identification of attributes that indicate or rule out cases of fraud, are likely to impact the study of iconography in productive ways. We implemented a search engine that allows searching in English over mixed-language documents, parts of which are written in an archaic, obsolete language, Ancient Greek. Our work shows that even when large-scale parallel corpora are not available, training a machine translation system on a small number of Ancient Greek-English parallel corpora and augmenting it with modern Greek-English parallel corpora works reasonably well, although not perfectly. Further investigation and development of algorithms to address this scenario are needed. This project expands the technical contributions information technology can make to the study of iconography in particular and art history more generally. It can be expanded and adapted to also impact archaeology and related fields. It also expands our knowledge of using machine translation for ancient languages when only a small parallel corpus is available. Student participants in the project have moved on, or will soon move on, to areas of employment where the skills developed as part of this project will enhance their career development. One of the PhD students funded under this project completed her thesis in summer 2013 on a related topic, image annotation. In her thesis work, she focused on enhancing the training collections obtained from tagged Web images for machine-based image annotation. She developed a new scheme using statistical computing methods.
Experiments indicate that the scheme is more resilient to noise than several other schemes. The work has the potential to substantially improve the performance of automated image annotation methods, with applications in Web information retrieval, image database management, and image analysis and recognition. A master's student and another PhD student funded under this project have completed or are near completion of their theses. They have been trained in machine translation, search engines, and cross-lingual search technologies. Their work has advanced the state of the art in a small way and raised interesting questions in these areas that should be addressed in more detail in the future.