Despite recent advances in speech recognition technology, successful recognition is limited to co-operative speakers using close-talking microphones. There are, however, many other situations in which speech recognition would be useful, for instance to provide transcripts of meetings or other archive audio. Speech researchers at ICSI, UW, SRI, and IBM are very interested in new application domains of this kind, and we have begun to work with recorded meeting data.
The first stage in investigating speech recognition for meetings is to collect some data. At ICSI, we have equipped a meeting room with a multichannel, studio-quality recording system and have begun to collect pilot recordings of meetings, primarily between speech group members. At the time of writing (2001 February), we have collected 40 hours of 16-channel pilot data, of which ten hours have been hand-transcribed. Further information on Meeting Recorder data collection is available, covering both the mechanics of the meeting recorder setup at ICSI and some initial forays into processing the recordings. The data were transcribed using a set of transcription conventions designed for speed and accuracy of data input and encoding.
Transcribers used a version of the "Transcriber" interface, modified to handle multi-channel inputs and overlapping speech. For information on our modifications of the Transcriber tool, including screen shots, see here. An example of the differences between a high-quality, near-field, head-worn microphone and a far-field, desktop microphone is also available.
A key issue in the project is to specify the goals and applications. While the basic idea is to develop recognition that could transcribe conventional meetings, this would be useful only in so far as it supports applications such as searching for particular information or producing automatic summaries. Here is an introduction to Meeting Recorder: Portable Speech Recognition, which discusses in particular the applications for a meeting recorder that could be made portable, i.e. like a PDA.
This project is a collaboration between the ICSI speech group (aka Realization), the SSLI lab at the University of Washington (as part of their Communicator project work), and SRI's STAR Lab. Primary funding currently comes from DARPA, and IBM will be providing further support via both collaboration and funding.
A recent instrumented meeting at ICSI. Note the microphones down the center of the table, and the headsets being worn by participants. A close-up of the PDA mockup is available.
Dan Ellis - $Header: /n/www/export/http/htdocs/real/RCS/mtgrcdr.html,v 1.3 2000/06/26 20:49:16 dpwe Exp $