<- previous index next ->
Use the "simplification" steps to get to a Chomsky Normal Form. Use the algorithm on P140, Figure 6.8 Given a CFG grammar G in Chomsky Normal Form and a string x of length n Group the productions of G into two sets { A | A -> a } target is a terminal { A | A -> BC } target is exactly two variables V is a two dimensional matrix. Each element of the matrix is a set. The set may be empty, denoted phi, or the set may contain one or more variables from the grammar G. V can be n by n yet only part is used. x[i] represents the i th character of the input string x Parse x using G's productions for i in 1 .. n V[i,1] = { A | A -> x[i] } for j in 2..n for i in 1 .. n-j+1 { V[i,j] = phi for k in 1 .. j-1 V[i,j] = V[i,j] union { A | A -> BC where B in V[i,k] and C in V[i+k,j-k]} } if S in V[1,n] then x is in CFL defined by G. In order to build a derivation tree, a parse tree, you need to extend the CYK algorithm to record (variable, production number, from a index, from B index, from C index) in V[i,j]. V[i,j] is now a set of five tuples. Then find one of the (S, production number, from a, from B, from C) entries in V[1,n] and build the derivation tree starting at the root. Notes: The parse is ambiguous if there is more than one (S,...) in V[1,n] Multiple levels of the tree may be built while working back V[*,k] to V[*,k-1] and there may be more than one choice at any level if the parse is ambiguous. Example: given a string x = baaba given grammar productions A -> a S -> AB B -> b S -> BC C -> a A -> BA B -> CC C -> AB V[i,j] i 1(b) 2(a) 3(a) 4(b) 5(a) 1 B A,C A,C B A,C 2 S,A B S,C S,A j 3 phi B B 4 phi S,A,C 5 S,A,C ^ |_ accept Derivation tree S B C b A B a C C A B a a b
<- previous index next ->