ICSI Speech FAQ:
8.1 What are grammars for?

Answer by: fosler - 2000-08-10


A grammar is a probabilistic model of word sequences. Typically, a decoder requires an estimate of the probability of a word that it is hypothesizing, given the previous words it has hypothesized. The basic grammar type supported by all of the decoders at ICSI is called an n-gram grammar. This type of grammar assumes that the probability of a word depends only on the n-1 prior words. The commonly used term "bigram" refers to a 2-gram grammar, and "trigram" refers to a 3-gram grammar. "Unigram" means the unconditional probability of a word in a corpus (roughly its relative frequency in the corpus). Perhaps confusingly, this means that a bigram grammar gives the probability of a word given the one previous word, and a trigram likewise gives the probability of a word given the two previous words.
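
As a rough illustration of where such probabilities come from, the sketch below estimates unigram and bigram probabilities from a tiny made-up corpus by simple counting. The corpus, the function names, the "<s>" and "</s>" boundary markers, and the lack of any smoothing are all assumptions made for this example; it is not how the ICSI decoders store or load their grammars, and real grammars are smoothed so that unseen word pairs do not get zero probability.

```python
from collections import Counter

# Toy corpus, invented for illustration; a real grammar is trained on far more text.
corpus = [
    "my dog has fleas",
    "my dog has a bone",
    "my cat has fleas",
]

unigram_counts = Counter()   # how often each word occurs
bigram_counts = Counter()    # how often each (previous word, word) pair occurs
history_counts = Counter()   # how often each word occurs as a "previous word"

for line in corpus:
    words = line.split()
    unigram_counts.update(words)
    # Pad with hypothetical sentence-boundary markers so the first word
    # and the end of the sentence also get a bigram.
    padded = ["<s>"] + words + ["</s>"]
    for prev, word in zip(padded, padded[1:]):
        bigram_counts[(prev, word)] += 1
        history_counts[prev] += 1

total_words = sum(unigram_counts.values())


def unigram_prob(word):
    """Unconditional probability: relative frequency of the word in the corpus."""
    return unigram_counts[word] / total_words


def bigram_prob(word, prev):
    """P(word | prev), estimated by counting; 0.0 if prev was never seen."""
    if history_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / history_counts[prev]


print("P(dog)         =", unigram_prob("dog"))
print("P(dog | my)    =", bigram_prob("dog", "my"))
print("P(fleas | has) =", bigram_prob("fleas", "has"))
```

A trigram grammar is estimated the same way, except that the counts are keyed on the two previous words instead of one.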

Here's a quick example of a trigram grammar in action. Suppose we want to evaluate the likelihood of three utterances:

  • My dog has fleas.
  • My flea has dogs.
  • Has my fleas dog.

Intuitively, we know that the first sentence is pretty likely, the second is syntactically correct, but semantically weird, and the third is just not very likely. A trigram model would evaluate the sentences thus:

  • P(sent1)=P(MY|start)P(DOG|start MY)P(HAS|MY DOG)P(FLEAS|DOG HAS)P(end|HAS FLEAS)
  • P(sent2)=P(MY|start)P(FLEA|start MY)P(HAS|MY FLEA)P(DOGS|FLEA HAS)P(end|HAS DOGS)
  • P(sent3)=P(HAS|start)P(MY|start HAS)P(FLEAS|HAS MY)P(DOG|MY FLEAS)P(end|FLEAS DOG)

(Note the inclusion of the probability of ending a sentence given its last words -- it penalizes incomplete sentences like "MY DOG HAS".)

The terms for the first sentence are all relatively likely, given a grammar of general English. (Not so if your grammar is based on the Numbers corpus!) In the second sentence, most terms would be likely, except probably P(DOGS|FLEA HAS). Consequently, it would probably score lower than the first sentence. The third sentence would score much lower than the first two, since the terms P(FLEAS|HAS MY), P(DOG|MY FLEAS), and P(end|FLEAS DOG) would all likely have low probability.
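
The scoring above can be sketched in a few lines of code. Everything numeric here is invented: the probabilities in TOY_TRIGRAMS are hand-picked only to mimic the intuitions described above, and FLOOR is a crude stand-in for the backoff or smoothing that a real grammar would use for unseen word sequences. The function name and the table layout are likewise hypothetical. Scores are summed as log probabilities, which avoids multiplying many small numbers together.

```python
import math

# Hand-picked toy probabilities, invented purely for illustration.
# Keys are (history, word); the history is the one or two previous tokens.
TOY_TRIGRAMS = {
    (("start",), "MY"): 0.1,
    (("start", "MY"), "DOG"): 0.05,
    (("MY", "DOG"), "HAS"): 0.1,
    (("DOG", "HAS"), "FLEAS"): 0.05,
    (("HAS", "FLEAS"), "end"): 0.3,
    (("start", "MY"), "FLEA"): 0.001,
    (("MY", "FLEA"), "HAS"): 0.1,
    (("HAS", "DOGS"), "end"): 0.2,
    (("start",), "HAS"): 0.01,
}

# Assumed probability for any trigram missing from the table; a real
# grammar would back off to bigram and unigram estimates instead.
FLOOR = 1e-6


def sentence_log_prob(sentence):
    """Sum of log10 trigram probabilities, including the end-of-sentence term."""
    tokens = ["start"] + sentence.upper().split() + ["end"]
    log_prob = 0.0
    for i in range(1, len(tokens)):
        # Up to two previous tokens; the first word sees only "start".
        history = tuple(tokens[max(0, i - 2):i])
        prob = TOY_TRIGRAMS.get((history, tokens[i]), FLOOR)
        log_prob += math.log10(prob)
    return log_prob


for sentence in ["my dog has fleas", "my flea has dogs", "has my fleas dog", "my dog has"]:
    print(f"{sentence_log_prob(sentence):8.2f}  {sentence}")
```

With these toy numbers the three example sentences come out in the order described above, and the truncated "my dog has" scores below "my dog has fleas" because its end-of-sentence term falls through to the floor value.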



    Generated by build-faq-index on Tue Mar 24 16:18:17 PDT 2009