next up previous
Next: Appendix: Implementation Notes Up: An Efficient Probabilistic Context-Free Previous: Conclusions

Appendix: Existence of tex2html_wrap_inline7701 and tex2html_wrap_inline7703

 

In Section 4.5 we defined the probabilistic left-corner and unit-production matrices tex2html_wrap_inline7701 and tex2html_wrap_inline7703 , respectively, to collapse recursions in the prediction and completion steps. It was shown how these matrices could be obtained as the result of matrix inversions. In this appendix we give a proof that the existence of these inverses is assured if the grammar is well-defined in the following three senses. The terminology used here is taken from Booth:73.

It is useful to translate consistency into ``process'' terms. We can view an SCFG as a stochastic string-rewriting process, in which each step consists of simultaneously replacing all nonterminals in a sentential form with the right-hand sides of productions, randomly drawn according to the rule probabilities. Booth:73 show that the grammar is consistent if and only if the probability that stochastic rewriting of the start symbol S leaves nonterminals remaining after n steps, goes to 0 as . More loosely speaking, rewriting S has to terminate after a finite number of steps with probability 1, or else the grammar is inconsistent.

We observe that the same property holds not only for S, but for all nonterminals, if the grammar has no useless terminals. If any nonterminal X admitted infinite derivations with non-zero probability, then S itself would have such derivations, since by assumption X is reachable from S with non-zero probability.

To prove the existence of tex2html_wrap_inline7701 and tex2html_wrap_inline7703 , it is sufficient to show that the corresponding geometric series converge:

Interestingly, a SCFG may be inconsistent and still have converging left-corner and/or unit-production matrices, i.e., consistency is a stronger constraint. For example

eqnarray6158

is inconsistent for any choice of , but the left-corner relation (a single number in this case) is well-defined for all q < 1, namely . In this case the left fringe of the derivation is guaranteed to result in a terminal after finitely many steps, but the derivation as a whole may never terminate.


next up previous
Next: Appendix: Implementation Notes Up: An Efficient Probabilistic Context-Free Previous: Conclusions

Andreas Stolcke
Sat Jun 29 21:49:02 PDT 1996