Video-enabled cell phones have the potential to enable Deaf people to converse in their community's native language (which in the United States is American Sign Language, or ASL), while gaining the freedom, flexibility, and comfort afforded by the wireless phone revolution. Real-time video-enabled cell phones based on higher-bandwidth cell phone technology are available in Japan and parts of Europe, and Deaf people in those countries have already begun using these devices to communicate with each other. In the United States, however, providing real-time video-enabled cell phones remains a major scientific and technical challenge. The PIs' goal in their ongoing MobileASL effort has been to develop and evaluate low-bandwidth, high-fidelity, error-resilient, and low-complexity software video encoders tailored for compressing ASL video; achieving such advances requires novel laboratory and field studies with Deaf individuals. The specific objectives of the current project are fourfold: to implement improvements to the MobileASL codec, including combined stabilization and compression algorithms for hand-held captured video, packet loss mitigation techniques, and further complexity reduction; to finalize the real-time implementation of the ASL codec on cell phones; to determine ideal default settings and desirable options for encoding through laboratory studies of the features of the PIs' ASL codec, including region-of-interest coding, variable frame rate encoding, lower-complexity encoding, and ASL-specific error concealment strategies; and to conduct an extended field study with deployed cell phones in the Deaf community.
Broader Impacts: Left out of the cell phone revolution are the approximately 500,000 Deaf people in the United States who use ASL. This community has embraced Internet-enabled video phone technology, but its members still cannot speak with each other on cell phones in their natural language. This project will rectify that situation by bringing mobile ASL communication to the Deaf community. By incorporating the Deaf community's input into the design of all aspects of the system, the PIs will help ensure that ASL video cell phones penetrate the Deaf community. By employing Deaf undergraduates as summer research interns, the PIs will inspire these students to continue their research careers and seek higher degrees. Real-time video-enabled cell phones will also be useful for the general population, who may wish to communicate face-to-face in a mobile environment.
This project studied how people who are Deaf use mobile video to communicate. We developed software for real-time video communication, called MobileASL, on the Windows Mobile 6 operating system, providing real-time video two years before Apple's FaceTime. We conducted a field study in Summer 2010 with eleven Deaf and Hard-of-Hearing individuals in Seattle. The phone used was an HTC TyTN II, released in 2007. Participants preferred visual communication over texting, but they wanted to use a sleek smartphone instead of the TyTN II. As a result, we ported MobileASL to the Android platform and are finishing polishing the technology. We published a number of findings of value to scientists and engineers at conferences such as ASSETS, CHI, UIST, and the IEEE Data Compression Conference, as well as a journal paper in Signal, Image and Video Processing.

We conducted a number of web studies in collaboration with Video Relay Services such as Sorenson Communications and ZVRS. The studies examined the feasibility of implementing power-saving techniques in real-time coding of ASL video; whether people prefer video with lower spatial resolution at very low bit rates; and what bit rate/frame rate combinations people prefer for ASL video. Ultimately, our project demonstrated that: 1) people who are deaf prefer mobile communication via video; 2) the bit rate and frame rate combinations that the ITU suggests are needed for mobile ASL video communication are much too high: the ITU calls for 100 kilobits/second or higher at 25 frames/second, whereas we have demonstrated many intelligible two-way conversations at bit rates of 30 kilobits/second and frame rates of 8-12 frames/second; and 3) people prefer 10 frames/second over 15 frames/second at fixed bit rates of 15, 30, 60, and 120 kilobits/second, with little further improvement when the bit rate rises from 60 to 120 kilobits/second. Our recommendation is therefore two-way mobile video at 60 kilobits/second and a frame rate of 10 frames/second.

Work on the user studies associated with the MobileASL project repeatedly raised the need for statistical techniques not commonly used within the field of human-computer interaction (HCI). Our exploration led to the discovery of a little-known technique for conducting multi-factor nonparametric repeated measures analyses called the Aligned Rank Transform (ART). While the ART had been developed for up to two factors in the statistics literature, it had not appeared in the HCI literature. One of our team members, Dr. Jacob Wobbrock, contacted Dr. James Higgins of Kansas State University's Department of Statistics, an early developer of the ART, and worked with him to generalize the technique from two to N factors. We then built a downloadable Windows-based software tool and a web-based software tool to provide the data preprocessing necessary to carry out the ART.

Finally, we gave numerous presentations on MobileASL to the public, including K-12 and college students. Nearly fifteen undergraduates worked on the project, including women and underrepresented minorities in Electrical Engineering and Computer Science. One student received an MS degree on the MobileASL project and three students received Ph.D. degrees. The MS student is expected to finish her Ph.D. in June 2014.
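To make the recommended operating point concrete, the following is a minimal sketch of encoding a pre-recorded signing clip at 60 kilobits/second and 10 frames/second. It assumes the ffmpeg command-line tool with the libx264 encoder is available; the file names and option choices are illustrative assumptions, not the MobileASL encoder or its actual settings.

```python
# Illustrative only: transcode a clip at the settings recommended above
# (60 kbit/s, 10 frames/s) using ffmpeg with libx264. File names are
# hypothetical placeholders; this is not the MobileASL encoder.
import subprocess

def encode_low_rate(src: str, dst: str, kbps: int = 60, fps: int = 10) -> None:
    """Transcode `src` to H.264 at the given bit rate (kbit/s) and frame rate."""
    cmd = [
        "ffmpeg", "-y",
        "-i", src,
        "-an",                       # ASL video carries no audio track
        "-c:v", "libx264",
        "-preset", "veryfast",       # favor low encoding complexity
        "-tune", "zerolatency",      # settings suited to conversational video
        "-r", str(fps),              # target frame rate
        "-b:v", f"{kbps}k",          # average bit rate
        "-maxrate", f"{kbps}k",      # cap the instantaneous rate
        "-bufsize", f"{2 * kbps}k",  # rate-control buffer
        dst,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    encode_low_rate("signing_clip.mp4", "signing_60k_10fps.mp4")
```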
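Because the ART is still unfamiliar to many HCI researchers, the following is a minimal sketch of the align-then-rank preprocessing for a two-factor design: subtract cell means to obtain residuals, add back the estimated effect of interest, then rank. It is an illustration under stated assumptions, not the ARTool software; the column names and toy data are hypothetical.

```python
# Minimal sketch of Aligned Rank Transform preprocessing for two factors.
# Columns "A", "B", and "Y" are hypothetical; this is not ARTool.
import pandas as pd

def art_two_factor(df: pd.DataFrame, a: str, b: str, y: str) -> pd.DataFrame:
    """Return df with aligned-and-ranked columns for A, B, and the A x B interaction."""
    out = df.copy()
    grand = out[y].mean()
    cell = out.groupby([a, b])[y].transform("mean")   # per-cell means
    mean_a = out.groupby(a)[y].transform("mean")      # marginal means of A
    mean_b = out.groupby(b)[y].transform("mean")      # marginal means of B
    resid = out[y] - cell                             # residuals (all effects removed)

    # Estimated effect to add back for each effect of interest.
    effects = {
        f"ART_{a}": mean_a - grand,
        f"ART_{b}": mean_b - grand,
        f"ART_{a}x{b}": cell - mean_a - mean_b + grand,
    }
    for col, effect in effects.items():
        out[col] = (resid + effect).rank(method="average")  # average ranks for ties
    return out

if __name__ == "__main__":
    # Toy data: two factors (e.g., frame rate and bit rate) with a numeric response.
    data = pd.DataFrame({
        "A": ["10fps"] * 4 + ["15fps"] * 4,
        "B": ["30k", "30k", "60k", "60k"] * 2,
        "Y": [7, 6, 8, 9, 5, 4, 6, 7],
    })
    print(art_two_factor(data, "A", "B", "Y"))
```

Each ART_* column would then be submitted to a conventional repeated measures ANOVA, interpreting only the effect for which that column's data were aligned.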