The past decade has seen a dramatic increase in the availability of on-line academic lecture material. Yet, in contrast to many other communicative activities, lecture processing has so far benefited little from advances in human language technology. The goal of this proposal is to enable fast, accurate and easy access to lecture content. We will develop new technologies in the areas of speech recognition, structure induction and summarization.

Our work will contribute to a better understanding of the relationship between written and spoken language, a long-standing issue in linguistics that has seen limited empirical research. We will address this gap through an extensive corpus-based study of the relationship at different levels, ranging from vocabulary to discourse variations.

The tools we propose to develop will be integrated and tested in the framework of the MIT Open CourseWare Initiative, a large, publicly available on-line repository of teaching material from 500 MIT courses. We will also incorporate our tools into the Liberated Learning Initiative, which works to integrate students with disabilities into mainstream higher education.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0415865
Program Officer: Kenneth C. Whang
Project Start:
Project End:
Budget Start: 2004-10-01
Budget End: 2008-09-30
Support Year:
Fiscal Year: 2004
Total Cost: $825,000
Indirect Cost:
Name: Massachusetts Institute of Technology
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139