next up previous
Next: Completion (probabilistictransitive) Up: Coping with recursion Previous: Prediction (probabilistictransitive)

Completion loops

As in prediction, the completion step in the Earley algorithm may imply an infinite summation, and could lead to an infinite loop if computed naively. However, only unit productionsgif can give rise to cyclic completions.

The problem is best explained by studying an example. Consider the grammar

eqnarray5365

where q = 1 - p. Presented with the input a (the only string the grammar generates), after one cycle of prediction, the Earley chart contains the following states.

eqnarray5367

The tex2html_wrap_inline8305 factors are a result of the left-corner sum tex2html_wrap_inline8307 .

After scanning tex2html_wrap_inline8309 , completion without truncation would enter an infinite loop. First tex2html_wrap_inline8311 is completed, yielding a complete state tex2html_wrap_inline8313 , which allows tex2html_wrap_inline8315 to be completed, leading to another complete state for S, etc. The non-probabilistic Earley parser can just stop here, but as in prediction, this would lead to truncated probabilities. The sum of probabilities that needs to be computed to arrive at the correct result contains infinitely many terms, one for each possible loop through the tex2html_wrap_inline8319 production. Each such loop adds a factor of q to the forward and inner probabilities. The summations for all completed states turn out as

eqnarray5403

The approach taken here to compute exact probabilities in cyclic completions is mostly analogous to that for left-recursive predictions. The main difference is that unit productions, rather than left-corners, form the underlying transitive relation. Before proceeding we can convince ourselves that this is indeed the only case we have to worry about.

lemma5425

We now formally define the relation between nonterminals mediated by unit productions, analogous to the left-corner relation.

defn5440

As before, a matrix inversion can compute the relation tex2html_wrap_inline7703 in closed form:

displaymath8298

The existence of tex2html_wrap_inline7703 is shown in Appendix A.

The modified completion loop in the probabilistic Earley parser can now use the tex2html_wrap_inline7703 matrix to collapse all unit completions into a single step. Note that we still have to do iterative completion on non-unit productions.




next up previous
Next: Completion (probabilistictransitive) Up: Coping with recursion Previous: Prediction (probabilistictransitive)

Andreas Stolcke
Sat Jun 29 21:49:02 PDT 1996