A symbol (either variable/nonterminal or terminal) appearing in a context-free grammar is said to be useful if it appears in some sentential form in a derivation of a terminal string from the grammar's start symbol. That is, X is useful if there is a derivation of the form
where S is the start symbol and z is a string of terminal symbols.
To be useful, a symbol must be both fruitful and reachable. X is fruitful if there is a derivation X ⇒* z, where z is a terminal string. X is reachable if there is a derivation S ⇒* αXβ.
If grammar G' is obtained from G by removing all productions involving non-useful symbols, L(G') = L(G).
The first step is to identify the grammar's fruitful variables/nonterminals:
|
Upon completion of the above, the set F contains all fruitful variables. Eliminate from grammar G any production whose left-hand side is a non-fruitful variable or whose right-hand side includes any non-fruitful variable. Call the resulting grammar G1.
The second step is to identify reachable nonterminals/variables. If S (the start symbol of G) is not among the fruitful variables of G just computed, then L(G) = ∅ and we are finished. (Take G' to be the grammar with start symbol S and having no productions!)
Otherwise, construct a directed graph whose nodes are labeled by the variables of G1. For nodes labeled A and B, place an edge going from A to B iff B occurs on the right-hand side of a production (in G1) whose left-hand side is A. (That is, (A,B) is an edge in the graph iff G1 has a production A → αBβ for some α and β.)
Now perform a search in the directed graph (e.g., breadth-first using a queue or depth-first using a stack or recursion) to find which nodes are reachable from S (the node labeled by the start symbol of G1). To obtain G', remove from G1 every production involving a variable that is not reachable.
S | ⟶ | aSa | bB | bAA | (1) (2) (3) |
A | ⟶ | abb | SbA | aB | (4) (5) (6) |
B | ⟶ | AB | CaB | (7) (8) |
C | ⟶ | cC | Sa | bD | (9) (10) (11) |
D | ⟶ | dD | λ | (12) (13) |
Solution: First we identify fruitful nonterminal symbols, i.e., those from which at least one terminal string can be derived.
Productions (4) and (13) imply that A and D, respectively, can produce a terminal string in one step. Hence, at the conclusion of the first loop in the algorithm, both A and D will be members of F and will be on the queue.
Iterations of the second loop correspond to the following analysis.
All the nonterminals on the right-hand side of Production (3) (namely, A) are fruitful, and thus its left-hand side (S) is fruitful. Similarly, Production (11) implies that C is fruitful. The remaining nonterminal is B, but none of its productions has a right-hand side composed entirely of known-to-be-fruitful symbols. (Indeed, each such right-hand side contains B itself.) Thus, we conclude that B is the lone non-fruitful nonterminal symbol and we eliminate all productions involving it. That leaves us with the grammar
S | ⟶ | aSa | bAA | (1) (2) |
A | ⟶ | abb | SbA | (3) (4) |
C | ⟶ | cC | Sa | bD | (5) (6) (7) |
D | ⟶ | dD | λ | (8) (9) |
For our small grammar, it was hardly necessary to explicitly build the graph, because it is obvious that the only nonterminal reachable from S, other than itself, is A. Thus, nonterminals C and D are unreachable and we can eliminate all productions in which they play a part. The resulting grammar is
S | ⟶ | aSa | bAA | (1) (2) |
A | ⟶ | abb | SbA | (3) (4) |