This project will explore the use of high-dimensional visualization for analyzing text structure and patterns for scholars in the humanities. Computing in text study has not yet exploited virtual reality and 3D visualization to a degree that could significantly extend scholarly examination. This project suggests new, untried approaches to the analysis of significant foundational texts. The test document will be the Korean Buddhist Canon, the oldest complete set of the texts that make up the Buddhist canon for East Asia. The document consists of more than 83,000 blocks of text, which have been digitized with each character marked up with a Unicode designation. Millions of glyphs will be converted into an image that will have the same metadata as the glyph, rendering it capable of being linked to the appropriate description in the text. In this way, patterns of visualized word distribution and frequency can be used to examine more complex document structures. Ancient literature generally exhibits multiple structures that are critical to the interpretation of origin and meaning. New methods for revealing and analyzing structure are of great interest to social scientists, linguists, and humanists. These present primary challenges to computer science in multiple areas.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0741556
Program Officer: Stephen Griffin
Project Start:
Project End:
Budget Start: 2007-09-01
Budget End: 2009-08-31
Support Year:
Fiscal Year: 2007
Total Cost: $99,685
Indirect Cost:
Name: University of California Berkeley
Department:
Type:
DUNS #:
City: Berkeley
State: CA
Country: United States
Zip Code: 94704