Documentation for the Prosodic Stress Transcription


The data from the OGI and Switchboard corpora are much the same.  The difference is that in the OGI corpus the .phn files were stress-labeled, since that corpus was labeled at the phone level.  Since the bulk of the Switchboard data are transcribed at the syllable level, .syl files were used for stress labeling.

The format for the .syl files is as follows:

HEADER (several lines of information about the creation of the file etc.)
    time    121    syl
    time    121    syl
    time    121    syl
...

where time is the time at which the syllable ends (in seconds, as the example values below indicate), 121 is color information for xwaves, and syl is the phonetic transcription of the syllable ending at that time.  (Note that the duration of a syllable can be calculated by subtracting the end time of the previous syllable from that of the current one.)
e.g.

    0.490071  121 w eh lg
    0.712399  121 ax t s
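
As a rough illustration of the format and of the duration calculation, the lines above could be read as sketched below.  This is not part of the corpus tools: the function names are hypothetical, header lines are assumed to be any lines whose first two fields are not numbers, and the first syllable is assumed to start at time 0.

```python
def parse_label_lines(lines):
    """Return (end_time, color, label) tuples for lines shaped like
    'time  color  label'; anything else is assumed to be header text."""
    entries = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            end_time = float(fields[0])
            color = int(fields[1])
        except ValueError:
            continue  # assumption: non-numeric lines belong to the header
        entries.append((end_time, color, " ".join(fields[2:])))
    return entries


def durations(entries, start=0.0):
    """Pair each label with its duration: its end time minus the previous
    end time.  `start` is the assumed beginning of the first unit."""
    result = []
    previous = start
    for end_time, _color, label in entries:
        result.append((label, end_time - previous))
        previous = end_time
    return result


sample = [
    "    0.490071  121 w eh lg",
    "    0.712399  121 ax t s",
]
entries = parse_label_lines(sample)
for label, duration in durations(entries):
    # Durations come out in the same time unit as the file itself.
    print(f"{label}: {duration:.6f}")
```

The same sketch applies to any file with one unit per line, since it only relies on each line carrying an end time.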

Stress is marked with a number following (without a space) the last phone of the syllable, e.g.

    0.490071  121 w eh lg1

1 denotes primary stress; 2 denotes secondary stress.
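
A minimal sketch of how the stress digit might be separated from a syllable label follows.  The function name is hypothetical, returning 0 for an unmarked syllable is a convention chosen here rather than one from the corpus, and the sketch assumes that no phone symbol itself ends in 1 or 2.

```python
import re


def split_syllable_stress(label):
    """Split a syllable label such as 'w eh lg1' into its phones and a
    stress level: 1 (primary), 2 (secondary), or 0 (no mark; our convention)."""
    phones = label.split()
    if not phones:
        return [], 0
    # The stress digit is attached, without a space, to the last phone.
    match = re.fullmatch(r"(.+?)([12])", phones[-1])
    if match:
        phones[-1] = match.group(1)
        return phones, int(match.group(2))
    return phones, 0


print(split_syllable_stress("w eh lg1"))
print(split_syllable_stress("ax t s"))
```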

For the .phn files, the vowel (nucleus) of the stressed syllable is marked with a +1 or +2 (for primary and secondary stress, respectively), e.g.

    0.413000  121 er+1

The format is just the same as for the .syl files except that each line is a phone rather than a syllable.
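
For the phone-level labels, the +1 / +2 mark could be pulled off in a similar way.  Again this is only a sketch: the function name is hypothetical, and 0 for an unmarked phone is a convention chosen here.

```python
def split_phone_stress(label):
    """Split a phone label such as 'er+1' into (phone, stress), where
    stress is 1 (primary), 2 (secondary), or 0 (no mark; our convention)."""
    label = label.strip()
    phone, sep, mark = label.rpartition("+")
    if sep and mark in ("1", "2"):
        return phone, int(mark)
    return label, 0


print(split_phone_stress("er+1"))
print(split_phone_stress("t"))
```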



To decompress the files (in a UNIX environment), run the following two commands in order:

    gunzip file.tar.gz
    tar xf file.tar