Sign languages are the primary means of communication for millions of Deaf people worldwide, including roughly 350,000–500,000 users of American Sign Language (ASL) in the US. While the hearing population has benefited from advances in speech technologies such as speech recognition and spoken web search, far less progress has been made on sign language interfaces. Advances depend on improved technology for analyzing sign language from video. In addition, the linguistics of sign language is less well understood than that of spoken language. This project addresses both of these needs with an interdisciplinary approach that will contribute to research in linguistics, language processing, computer vision, and machine learning. Applications of the work include better access to ASL social media video archives, interactive recognition and search applications for Deaf individuals, and ASL-English interpretation assistance.
This project focuses on handshape in ASL, in particular on one constrained but very practical component: fingerspelling, or the spelling out of a word as a sequence of handshapes and the trajectories between them. Fingerspelling comprises up to 35% of ASL, depending on the context, and involves 72% of ASL handshapes, making it an excellent testing ground. The project addresses gaps in existing work by focusing on handshape under a variety of conditions, including fast, highly coarticulated signing. The main project activities include the development of (1) robust automatic detection and recognition of fingerspelled words using new handshape models, including segmental and "multi-segmental" graphical models of ASL phonological features; (2) techniques for generalizing across signers, styles, and recording conditions; (3) an improved phonetics and phonology of handshape, in particular contributing to an articulatory phonology of sign; and (4) publicly released multi-signer, multi-style fingerspelling data with associated semi-automatic annotations.
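To give a concrete sense of the kind of sequence modeling involved in fingerspelling recognition, the sketch below shows a toy, lexicon-constrained Viterbi decoder over per-frame handshape scores. It is purely illustrative and is not the project's method: the letter alphabet, the three-word lexicon, the self-loop probability, and the synthetic frame scores are all assumptions, and the segmental and "multi-segmental" graphical models described above are considerably richer (phonological features, coarticulation, signer adaptation).

```python
# Illustrative only: a toy lexicon-constrained Viterbi decoder over per-frame
# handshape scores, loosely in the spirit of segmental models for
# fingerspelling recognition. All names, scores, and the tiny lexicon are
# hypothetical.
import numpy as np

LETTERS = "abcdefghijklmnopqrstuvwxyz"
LEXICON = ["cat", "cab", "act"]  # hypothetical word list

def word_log_likelihood(frame_logprobs, word, self_loop=np.log(0.6)):
    """Score a word as a left-to-right sequence of letter states.

    frame_logprobs: (T, 26) array of per-frame log-probabilities over letters
    (in practice these would come from a handshape classifier on video frames).
    Each letter state may repeat (self-loop) to absorb multiple frames.
    """
    states = [LETTERS.index(c) for c in word]
    T, S = frame_logprobs.shape[0], len(states)
    move = np.log(1.0 - np.exp(self_loop))
    # dp[t, s] = best log-score of explaining frames 0..t ending in state s
    dp = np.full((T, S), -np.inf)
    dp[0, 0] = frame_logprobs[0, states[0]]
    for t in range(1, T):
        for s in range(S):
            stay = dp[t - 1, s] + self_loop
            advance = dp[t - 1, s - 1] + move if s > 0 else -np.inf
            dp[t, s] = max(stay, advance) + frame_logprobs[t, states[s]]
    return dp[T - 1, S - 1]  # must end in the final letter state

def recognize(frame_logprobs):
    """Return the lexicon word best matching the frame-level letter scores."""
    return max(LEXICON, key=lambda w: word_log_likelihood(frame_logprobs, w))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake a 12-frame clip of "cab": noisy, roughly one-hot letter scores.
    target = "ccccaaaabbbb"
    scores = rng.normal(0, 0.5, size=(len(target), 26))
    for t, c in enumerate(target):
        scores[t, LETTERS.index(c)] += 3.0
    frame_logprobs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    print(recognize(frame_logprobs))  # expected: "cab"
```

In this simplified setup each letter is a single state with a self-loop; a segmental model would instead score whole variable-length segments per letter and could condition on phonological features rather than raw letter identities.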