The proposed project aims to develop a globally distributed, internet-based digital library in Sanskrit, one of the world's richest culture-bearing languages. At the same time, the project will explore OCR and language attributes of certain other non-Roman scripts, such as Arabic, as the technologies to be developed are generic at the image-recognition and other linguistic levels. Online access to digital content in many languages is at present severely hampered because available information-processing technologies have developed primarily in the environment of the Roman alphabet. Sanskrit is a highly inflected language, and its words undergo contextual phonetic variation. These properties make discerning word boundaries and searching textual materials difficult, and require the development of phonological and inflectional software. Maximizing the utility of digital libraries of Sanskrit and a number of other widely used texts involves overcoming linguistic problems concerning script and encoding, phonology, grammar (inflection, agglutinative structures, derivation), and lexicon. This project integrates currently independent projects to create Sanskrit digital archives, digital lexica, and linguistic software, and explores text-encoding standards to enhance access to ancient and medieval manuscripts. The project will also advance OCR technology, display software, and Unicode-compliant text-editing software. Tools and techniques developed in this project will provide powerful new methods of access to scanned Sanskrit and other texts and will make widely available works that were previously accessible only to highly trained specialists. This access will be profoundly valuable to students, scholars, and the wider public concerned with such fields as historical and general linguistics, philosophy and religious studies, pharmacology and medicine, history of science and mathematics, and the general history and literature of South Asia.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0535038
Program Officer: Stephen Griffin
Project Start:
Project End:
Budget Start: 2005-12-01
Budget End: 2008-11-30
Support Year:
Fiscal Year: 2005
Total Cost: $202,888
Indirect Cost:
Name: Suny at Buffalo
Department:
Type:
DUNS #:
City: Buffalo
State: NY
Country: United States
Zip Code: 14260