next up previous
Next: Complexity issues Up: Null productions Previous: Completion with null productions

Eliminating null productions

Given these added complications one might consider simply eliminating all tex2html_wrap_inline7691 -productions in a preprocessing step. This is mostly straightforward and analogous to the corresponding procedure for non-probabilistic CFGs [Aho and Ullman1972, Algorithm 2.10,]. The main difference is the updating of rule probabilities, for which the tex2html_wrap_inline7691 -expansion probabilities are again needed.

  1. Delete all null productions, except on the start symbol (in case the grammar as a whole produces tex2html_wrap_inline7691 with non-zero probability). Scale the remaining production probabilities to sum to unity.
  2. For each original rule tex2html_wrap_inline8593 that contains a nonterminal Y such that tex2html_wrap_inline8597 :
    1. Create a variant rule tex2html_wrap_inline7787
    2. Set the rule probability of the new rule to tex2html_wrap_inline8601 . If the rule tex2html_wrap_inline7787 already exists, sum the probabilities.
    3. Decrement the old rule probability by the same amount.
    Iterate these steps for all RHS occurrences of a null-able nonterminal.

The crucial step in this procedure is the addition of variants of the original productions that simulate the null productions by deleting the corresponding nonterminals from the RHS. The spontaneous dot shifting described in the previous sections effectively performs the same operation on the fly as the rules are used in prediction and completion.



Andreas Stolcke
Sat Jun 29 21:49:02 PDT 1996