next up previous
Next: Forward and inner probabilities Up: Probabilistic Earley Parsing Previous: Stochastic context-free grammars

Earley paths and their probabilities

 

In order to define the probabilities associated with parser operation on a SCFG, we need the concept of a path, or partial derivation, executed by the Earley parser.

defn5119

Note that the definition of path length is somewhat counter-intuitive, but is motivated by the fact that only scanned states correspond directly to input symbols. Thus, the length of a path is always the same as the length of the input string it generates.

A constrained path starting with the initial state contains a sequence of states from state set 0 derived by repeated prediction, followed by a single state from set 1 produced by scanning the first symbol, followed by a sequence of states produced by completion, followed by a sequence of predicted states, followed by a state scanning the second symbol, and so on. The significance of Earley paths is that they are in a one-to-one correspondence with left-most derivations. This will allow us to talk about probabilities of derivations, strings and prefixes in terms of the actions performed by Earley's parser. From now on, we will use ``derivation'' to imply a left-most derivation.

  lemma5131

(a) is the invariant underlying the correctness and completeness of Earley's algorithm; it can be proved by induction on the length of a derivation [Aho and Ullman1972, Theorem 4.9,]. The slightly stronger form (b) follows from (a) and the way possible prediction steps are defined.

Since we have established that paths correspond to derivations, it is convenient to associate derivation probabilities directly with paths. The uniqueness condition (b) above, which is irrelevant to the correctness of a standard Earley parser, justifies (probabilistic) counting of paths in lieu of derivations.

defn5148

  lemma5150

Note that when summing over all paths ``starting with the initial state,'' summation is actually over all paths starting with S, by definition of the initial state tex2html_wrap_inline8043 . (a) follows directly from our definitions of derivation probability, string probability, path probability and the one-to-one correspondence between paths and derivations established by Lemma 1. (b) follows from (a) by using S as the start nonterminal. To obtain the prefix probability in (c), we need to sum the probabilities of all complete derivations that generate x as a prefix. The constrained paths ending in scanned states represent exactly the beginnings of all such derivations. Since the grammar is assumed to be consistent and without useless nonterminals, all partial derivations can be completed with probability one. Hence the sum over the constrained incomplete paths is the sought-after sum over all complete derivations generating the prefix.


next up previous
Next: Forward and inner probabilities Up: Probabilistic Earley Parsing Previous: Stochastic context-free grammars

Andreas Stolcke
Sat Jun 29 21:49:02 PDT 1996