Digital audio is widely used for research in areas ranging from biology field studies to video games, yet the software platforms that support such research are largely incompatible, often poorly designed, and rarely portable. Software bugs and unimplemented features discourage new users, and researchers waste vast resources duplicating one another's efforts. In this project the PI will create a suite of coordinated and interoperable software libraries and applications, giving researchers easy access to audio data. Low-level audio interfaces will fill an important need for well-designed, cross-platform, open-source libraries that provide direct access to audio data. Libraries will be developed for audio input and output, access to sound files, MIDI input and output, and access to standard MIDI files. This effort will not create new hardware or device drivers; rather, it will add a thin abstraction layer above existing operating system audio APIs that hides the ugly details and operating system dependencies, making audio access simpler. A high-level application will support users who need a general-purpose tool for audio visualization, providing instruments for audio analysis and annotation that do not currently exist. Audio visualization capabilities will include waveforms, spectrograms, other 1D and 2D functions of time, text and graphical annotations, and piano-roll displays of music data. The annotation facility will allow users to sketch on displays, edit graphical overlays, and save annotation data in simple text-based representations for exchange with other software. Analysis software will include an integration of the Marsyas project, which is widely used in audio classification research; indeed, the PI will build upon existing software wherever possible, both to reduce the total effort and to increase the impact by tapping into established communities and loyalties.

Broader Impacts: This work will establish de facto standards and benefit diverse research areas including not only music and audio technology but also human-computer interfaces, assistive technologies, acoustics, human perception, surveillance and security, patient monitoring and care, and education, to name but a few. Project outcomes will both facilitate the incorporation of audio in applications and reduce duplicated effort by allowing audio software to be shared across platforms, thereby making it easier, for example, for teachers to incorporate "hands-on" audio processing into their curricula, and supporting innovative efforts relating to universal access and collaborative systems.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0534370
Program Officer: Ephraim P. Glinert
Project Start:
Project End:
Budget Start: 2006-06-15
Budget End: 2011-05-31
Support Year:
Fiscal Year: 2005
Total Cost: $283,000
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213