This innovative project proposes to explore new hypotheses about handwriting analysis by examining the written works of native and non-native writers of a particular script or alphabet. One hypothesis is that handwriting is analogous to speech, in the sense that emphasis might be indicated in the script itself without special notation. A second hypothesis is that an individual's handwriting is a mixture of style influences that can be decomposed into identifiable constituents. Insight into these issues could expand our ability to identify important aspects of written works, which could benefit research and applications in a variety of disciplines including computing and computational sciences, forensics, biometrics and the humanities. Humanists would gain powerful new tools and interfaces for analyzing large collections of handwritten documents in many alphabets and scripts, and would thus be able to ascertain critical information for such tasks as chronological ordering, categorization and determination of geographic origins. The research will cover a number of scripts and alphabets, including Arabic, Oriental scripts, Roman and numerous others with significant feature differences.

Project Report

Accent is a term that has traditionally been defined in the context of spoken language. It signifies differences in the articulation habits of people from different geographic locations, ethnicities or socio-economic backgrounds. Intonation, pronunciation and stress on different parts of the spoken word are some of the articulation habits that differ with each type of accent. Although different accents exist among people with the same first (native) language, accents are more pronounced among speakers who are not speaking their native language. This project takes the definition from speech and applies it to handwriting. The premise is that a group of people who start learning to write in a particular language will develop stylistic tendencies peculiar to that language. When they write in a different language, these peculiarities will be exhibited as writing accents. As in speech, the hypothesis is that a person's first (native) language plays a major role in determining her handwriting style, particularly in a second (non-native) language. Handwriting analysis, which encompasses the challenges of handwriting recognition, writer identification, writer verification, and document indexing and retrieval, has traditionally approached these problems by treating each person's handwriting as unique to that person, with no components shared between people. In contrast, the accents-in-handwriting approach postulates that a person's handwriting is an amalgamation of cultural and genetic factors. A hierarchical framework is proposed in which accent identification is the first step. First, a topic modeling approach is used to determine the accents in handwriting; then, on the basis of the accent identification step, the handwriting analysis problem is tackled.
Experiments were performed on two data sets to verify the hypothesis: (i) an in-house data set collected exclusively for the accents-in-handwriting task, and (ii) the UNIPEN data set, which has the necessary annotations. The proposed hierarchical approach significantly outperforms the state-of-the-art baseline (82% versus 73%).
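
The two-stage structure of the hierarchical framework described in the report can be illustrated with a minimal sketch. This is not the project's implementation. It assumes that the downstream task is writer identification, that handwriting samples are already reduced to small numeric feature vectors, and that a simple nearest-centroid classifier stands in for the topic model used in the project. The class name, feature values, accent labels and writer labels are all hypothetical.

```python
from collections import defaultdict
from math import dist


def centroids(samples, labels):
    """Return the mean feature vector for each label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return {y: tuple(sum(col) / len(col) for col in zip(*xs))
            for y, xs in groups.items()}


def nearest(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: dist(model[label], x))


class HierarchicalWriterModel:
    """Stage 1 guesses the accent; stage 2 uses an accent-specific model."""

    def fit(self, samples, accents, writers):
        # Stage 1: one centroid per accent (stand-in for the topic model).
        self.accent_model = centroids(samples, accents)
        # Stage 2: a separate writer model trained within each accent group.
        by_accent = defaultdict(lambda: ([], []))
        for x, a, w in zip(samples, accents, writers):
            by_accent[a][0].append(x)
            by_accent[a][1].append(w)
        self.writer_models = {a: centroids(xs, ws)
                              for a, (xs, ws) in by_accent.items()}
        return self

    def predict(self, x):
        accent = nearest(self.accent_model, x)
        writer = nearest(self.writer_models[accent], x)
        return accent, writer


# Invented 2-D feature vectors, for illustration only.
samples = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.4),
           (0.9, 0.8), (0.8, 0.9), (0.7, 0.6)]
accents = ["A", "A", "A", "B", "B", "B"]
writers = ["w1", "w1", "w2", "w3", "w3", "w4"]

model = HierarchicalWriterModel().fit(samples, accents, writers)
print(model.predict((0.15, 0.15)))  # ('A', 'w1')
```

The point of the sketch is the routing: the second-stage model only has to discriminate among writers who share the predicted accent, which is the sense in which accent identification is the first step of the hierarchy.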

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1014540
Program Officer: William Bainbridge
Project Start:
Project End:
Budget Start: 2010-09-15
Budget End: 2013-08-31
Support Year:
Fiscal Year: 2010
Total Cost: $149,986
Indirect Cost:
Name: Suny at Buffalo
Department:
Type:
DUNS #:
City: Buffalo
State: NY
Country: United States
Zip Code: 14228