A stochastic context-free grammar (SCFG) extends the standard context-free formalism by adding probabilities to each production:
where the rule probability p is usually written as . This notation to some extent hides the fact that p is a conditional probability, of production being chosen, given that X is up for expansion. The probabilities of all rules with the same nonterminal X on the LHS must therefore sum to unity. Context-freeness in a probabilistic setting translates into conditional independence of rule choices. As a result, complete derivations have joint probabilities that are simply the products of the rule probabilities involved.
The probabilities of interest mentioned in Section 1 can now be defined formally.
In the following, we assume that the probabilities in a SCFG are proper and consistent as defined in Booth:73, and that the grammar contains no useless nonterminals (ones that can never appear in a derivation). These restrictions ensure that all nonterminals define probability measures over strings, i.e., is a proper distribution over x for all X. Formal definitions of these conditions are given in Appendix A.