ICSI Speech FAQ:
8.1 What are grammars for?

Answer by: fosler - 2000-08-10

A grammar is a probabilistic model of word sequences. Typically, a decoder requires an estimate of the probability of the word it is hypothesizing given the previous words it has hypothesized. The basic grammar type supported by all of the decoders at ICSI is called an n-gram grammar. This type of grammar assumes that the probability of a word is conditioned only on the n-1 prior words. The commonly used term "bigram" refers to a 2-gram grammar, and "trigram" refers to a 3-gram grammar. "Unigram" means the unconditional probability of a word in a corpus (roughly its relative frequency in the corpus). Perhaps confusingly, this means that a bigram grammar gives the probability of a word given the one previous word, and a trigram likewise gives the probability of a word given the two previous words.
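As a concrete sketch of the n-gram idea, the snippet below estimates bigram probabilities by maximum likelihood (count of the word pair divided by count of the first word) from a tiny made-up corpus. The corpus, the `<s>`/`</s>` boundary tokens, and the function names are illustrative assumptions, not part of any ICSI tool:

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence start and end (illustrative only).
corpus = [
    ["<s>", "my", "dog", "has", "fleas", "</s>"],
    ["<s>", "my", "dog", "has", "spots", "</s>"],
    ["<s>", "my", "cat", "has", "fleas", "</s>"],
]

# Count single words and adjacent word pairs.
unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter(
    (w1, w2) for sent in corpus for w1, w2 in zip(sent, sent[1:])
)

def bigram_prob(word, prev):
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

# "my dog" occurs twice and "my" three times, so P(dog | my) = 2/3.
print(bigram_prob("dog", "my"))
```

Real grammars are trained on far larger corpora and smoothed so that unseen word pairs do not get probability zero; the raw-count estimate here is only the starting point.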

Here's a quick example of a trigram grammar in action. Suppose we want to evaluate the likelihood of three utterances:

  • My dog has fleas.
  • My flea has dogs.
  • Has my fleas dog.

Intuitively, we know that the first sentence is pretty likely, the second is syntactically correct but semantically weird, and the third is just not very likely. A trigram model would evaluate the sentences thus:

  • P(sent1)=P(MY|start)P(DOG|start MY)P(HAS|MY DOG)P(FLEAS|DOG HAS)P(end|HAS FLEAS)
  • P(sent2)=P(MY|start)P(FLEA|start MY)P(HAS|MY FLEA)P(DOGS|FLEA HAS)P(end|HAS DOGS)
  • P(sent3)=P(HAS|start)P(MY|start HAS)P(FLEAS|HAS MY)P(DOG|MY FLEAS)P(end|FLEAS DOG)
(Note the inclusion of the probability of ending the sentence given its last words -- it discounts incomplete sentences like "MY DOG HAS".)

The terms for the first sentence are all relatively likely, given a grammar of general English. (Not so if your grammar is based on the Numbers corpus!) In the second sentence, most terms would be likely, except probably P(DOGS|FLEA HAS). Consequently, it would probably score lower than the first sentence. The third sentence would score much lower than the first two, since the terms P(FLEAS|HAS MY), P(DOG|MY FLEAS), and P(end|FLEAS DOG) would all have low probability.
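The scoring above can be sketched in code: chain the trigram terms together, working in log space as decoders do to avoid underflow. The probability values below are invented purely for illustration (a real model would be trained on a corpus), and the two `<s>` padding tokens stand in for the "start" context in the equations:

```python
import math

# Hypothetical trigram probabilities -- the values are made up, not trained.
trigram = {
    ("<s>", "<s>", "my"): 0.1,
    ("<s>", "my", "dog"): 0.2,
    ("my", "dog", "has"): 0.3,
    ("dog", "has", "fleas"): 0.2,
    ("has", "fleas", "</s>"): 0.4,
}

def log_prob(sentence, model, floor=1e-8):
    """Sum log P(w | two previous words); unseen trigrams get a tiny floor
    probability so the log is defined (a crude stand-in for smoothing)."""
    words = ["<s>", "<s>"] + sentence + ["</s>"]
    total = 0.0
    for i in range(2, len(words)):
        p = model.get((words[i - 2], words[i - 1], words[i]), floor)
        total += math.log(p)
    return total

# "My dog has fleas" uses only seen trigrams; the other two sentences
# fall back to the floor and score far lower, matching the intuition above.
print(log_prob(["my", "dog", "has", "fleas"], trigram))
print(log_prob(["has", "my", "fleas", "dog"], trigram))
```

Summing log probabilities instead of multiplying raw probabilities is standard practice: the product of many terms below 1 quickly underflows floating point.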

