ICSI Speech FAQ:
3.9 What are the dictionary data formats?

Answer by: dpwe - 2000-07-30


Dictionary by convention refers to the file that defines the pronunciation of each word known to the recognizer (it is also sometimes called the lexicon, which I think more properly means just the set of words known to the recognizer). As described under HMM formats, the y0 decoder has its pronunciations defined directly as HMM state sequences. Large-vocabulary decoders like noway and chronos find it more convenient to have a separate and more compact representation of pronunciations.

A noway/chronos format dictionary file consists of a set of lines of the following form:

ABANDON(1.00)           ax bcl b ae n dcl d ax n

First is the word, in the form that it will appear at the decoder output (i.e. all uppercase, or all lowercase if you prefer). In parens following the word is a prior on that particular pronunciation. Multiple pronunciations for the same word are handled by repeated definitions, with each alternative weighted by the prior; the priors should sum to one:

seven(0.1) s eh v ih n
seven(0.35) s eh v ah n
seven(0.4) s eh v ax n
seven(0.15) s eh v eh n

Following the prior (which I believe defaults to 1.0 if it is absent) comes the pronunciation in terms of the phone models defined either in the noway phonemodel file, or in the noway/chronos phi file. Where phonemodel files have been used to define context-dependent durational variants, the phone symbols may not be quite identical with the basic phoneset, e.g.

addison(1.000000) ae5 dcl2 d1 ih3 s5 ax4 n2
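
As a rough illustration of the line format described above, here is a minimal Python sketch that splits a dictionary line into word, prior, and phone list, and sums the priors per word. The names parse_line and prior_sums are made up for this example and are not part of noway or chronos; the fallback to a prior of 1.0 when none is given follows the (unconfirmed) belief stated above.

```python
import re
from collections import defaultdict

# WORD, optionally followed by "(prior)", then whitespace-separated phones.
LINE_RE = re.compile(r"^(\S+?)(?:\(([0-9.]+)\))?\s+(.+)$")


def parse_line(line):
    """Return (word, prior, phones) for one dictionary line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unparseable dictionary line: {line!r}")
    word, prior, phones = m.groups()
    # Assumption: a missing prior is treated as 1.0.
    return word, (float(prior) if prior is not None else 1.0), phones.split()


def prior_sums(lines):
    """Sum the priors of all pronunciations listed for each word."""
    totals = defaultdict(float)
    for line in lines:
        if line.strip():  # skip blank lines
            word, prior, _ = parse_line(line)
            totals[word] += prior
    return dict(totals)


if __name__ == "__main__":
    entries = [
        "seven(0.1) s eh v ih n",
        "seven(0.35) s eh v ah n",
        "seven(0.4) s eh v ax n",
        "seven(0.15) s eh v eh n",
        "addison(1.000000) ae5 dcl2 d1 ih3 s5 ax4 n2",
    ]
    for word, total in prior_sums(entries).items():
        # Flag words whose priors do not sum to one (within rounding).
        status = "ok" if abs(total - 1.0) < 1e-6 else "priors do not sum to 1"
        print(word, round(total, 6), status)
```

A check like this can be run over a dictionary before handing it to the decoder to catch words whose alternative pronunciations are mis-weighted.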

Note on sort order: For efficiency, in -- dare I say it? -- typical SoftSound fast-and-loose style, the SoftSound decoders chronos and efsgd assume that their dictionary files are sorted into 'correct' lexicographic order (and don't work properly if they are not). It turns out that the standard sort command cannot be relied upon to generate the required order, because of the specifics of how it handles letter case and non-alphabetic characters. Thus, Eric wrote a little script just for sorting dictionaries: see the man page for sort_dict(1).
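
To see why the choice of comparison matters, the following small Python sketch sorts the same made-up word list two ways: by raw character code, and with case folded. It only shows that mixed case and punctuation make "lexicographic order" ambiguous; it does not reproduce the order that chronos or sort_dict actually expects, which is not specified here.

```python
# Hypothetical word list mixing case and non-alphabetic characters.
words = ["Zebra", "apple", "'em", "A.M.", "able"]

# Plain code-point order: punctuation and uppercase sort before lowercase.
code_point_order = sorted(words)

# Case-folded order: uppercase and lowercase letters are interleaved.
case_folded_order = sorted(words, key=str.lower)

if __name__ == "__main__":
    print(code_point_order)
    print(case_folded_order)
    print("orders agree:", code_point_order == case_folded_order)
```

Here the two orderings disagree about where "Zebra" belongs, which is the kind of discrepancy that would upset a decoder relying on one particular order.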



Generated by build-faq-index on Tue Mar 24 16:18:15 PDT 2009