Extensions to Transcriber for Meeting Recorder Transcription

The Transcriber transcription tool provides a friendly and quick interface for transcription work (note to avoid confusion: ISIP developed an unrelated, similar tool, which is also named Transcriber). We have modified Transcriber here at ICSI to meet the needs of the Meeting Recorder project for transcription of multiple speakers recorded on multiple audio channels. The changes are described below. This software is available at ftp.icsi.berkeley.edu in the pub/speech/download/channeltrans directory.

The ICSI Meeting Transcripts corpus released by the Linguistic Data Consortium in 2004 uses the MRT file format for transcripts. The version of the channeltrans tool available for download as described above uses a different file format, which is a very slightly modified version of the original file format of the Transcriber tool.

Switching Playback Between Audio Files

A "MultiWav" menu has been added that allows the user to switch playback between a number of audio files (which are all assumed to be time synchronized). This menu is shown in the screenshot below. During playback, the name of the audio file being played is displayed in the status line at the bottom of the Transcriber window.

[Screenshot: the MultiWav menu in the Transcriber window.]

In our Meeting Recorder project, we record each participant on their own microphone; in other words, each participant is associated with an audio channel. Each audio channel is recorded to a separate file, and we use the MultiWav menu to switch between the files. Usually transcribers listen to a file which contains a mix of all the channels (with channel volumes adjusted for a more equal mix).

While audio is playing, the name of the audio file in use is displayed at the end of the status line at the bottom of the program's window (you may need to resize the window to make the status line visible). If you are unsure which audio file is being played, check the status line.

Multi-Channel Transcription

The green segmentation bar displayed under the waveform in the screenshot above shows the transcribed speech. Vertical divisions in the segmentation bar mark utterance boundaries. As the screen layout suggests, those vertical divisions are drawn on the same time scale as the waveform.

The modified version of Transcriber allows multiple channels of transcription, each with its own segmentation bar. At ICSI, N+1 segmentation bars are used for a meeting with N participants, with one bar being used as a "default" for transcribing sounds such as background noises or unidentified speech. This is shown in the screenshot below, which illustrates the transcription of a meeting with four participants.

[Screenshot: transcription of a meeting with four participants.]

In the screenshot, the currently selected segment, which starts with the text "Which", is in the third segmentation bar from the top. The currently selected segment is highlighted in both the corresponding segmentation bar and in the text editing window above the waveform display. The text editing window only shows the text associated with the channel of the currently selected segment.

The selected segmentation bar in the screenshot corresponds to channel 1. The two bars above it correspond to the "default" channel and channel 0, and the two bars below it correspond to channels 2 and 3. The name of the current channel is displayed in the status line at the bottom of the window. The "Channel labels..." dialog in the Options menu can be used to change the channel names displayed in the status line (e.g., to meeting participants' names). This dialog is shown below.

The red and blue bars under the waveform display represent the Section and Turn segmentation levels, which are not functional in our modified version of Transcriber. (Neither is the Background segmentation level.) We do not use these levels in our transcription work, and making them function properly in the new multi-channel model would have required significant extra coding.

Usage Tips

Users of our modified version of Transcriber may benefit from these usage tips.

Loading Transcriptions Created with the Original Transcriber

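To load a transcription file created with the original Transcriber, edit the file to change "trans-13.dtd" to "trans-13-icsi.dtd" and add an initial Sync for each channel, including the "default" channel. For a three-channel transcription this means replacing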


<Sync time="0"/>

with

<Sync chan="default" time="0.000"/>
<Sync chan="0" time="0.000"/>
<Sync chan="1" time="0.000"/>
<Sync chan="2" time="0.000"/>
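
The two edits above can also be scripted. The following Python sketch is only an illustration and is not part of the channeltrans distribution: the function names are hypothetical, and it assumes the original file contains an initial Sync written as `<Sync time="0"/>` (or with a value such as "0.000"). It performs only the two steps described above and leaves the rest of the file untouched.

```python
import re

OLD_DTD = "trans-13.dtd"
NEW_DTD = "trans-13-icsi.dtd"

# Assumption: the initial Sync is written like <Sync time="0"/> or <Sync time="0.000"/>.
INITIAL_SYNC = re.compile(r'<Sync\s+time="0(?:\.0+)?"\s*/>')


def channel_names(num_channels):
    """Return channel ids in the order used above: "default", then "0", "1", ..."""
    return ["default"] + [str(i) for i in range(num_channels)]


def convert_transcription(text, num_channels):
    """Apply the DTD rename and the per-channel initial Syncs to `text`."""
    if not INITIAL_SYNC.search(text):
        raise ValueError("no initial zero-time Sync found")
    syncs = "\n".join(
        '<Sync chan="%s" time="0.000"/>' % name
        for name in channel_names(num_channels)
    )
    text = text.replace(OLD_DTD, NEW_DTD)
    # Replace only the first zero-time Sync; later Sync tags are left as they are.
    return INITIAL_SYNC.sub(lambda match: syncs, text, count=1)
```

To use it, read the original transcription file into a string, call `convert_transcription(text, 3)` for a three-channel transcription, and write the result to a new file so that the original is preserved.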

For more information about this software, email mrcontact@icsi.berkeley.edu.
