Just a reference from here http://en.wikipedia.org/wiki/Context-free_grammar

A context-free grammar G is defined by the 4-tuple

G = (V\,, \Sigma\,, R\,, S\,) where

  1. V\, is a finite set; each element  v\in V is called a non-terminal character or a variable. Each variable represents a different type of phrase or clause in the sentence. Variables are also sometimes called syntactic categories. Each variable defines a sub-language of the language defined by G\, .
  2. \Sigma\, is a finite set of terminals, disjoint from V\,, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar G\, .
  3. R\, is a finite relation from V\, to (V\cup\Sigma)^{*}, where the asterisk represents the Kleene star operation. The members of R\, are called the (rewrite) rules or productions of the grammar. (also commonly symbolized by a P\,)
  4. S\, is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of V\,.

Production rule notation

A production rulein R\, is formalized mathematically as a pair (\alpha, \beta)\in R, where \alpha \in V is a non-terminal and \beta \in (V\cup\Sigma)^{*} is a string of variables and/or terminals; rather than using ordered pair notation, production rules are usually written using an arrow operator with \alpha as its left hand side and \beta as its right hand side: \alpha\rightarrow\beta.

It is allowed for \beta to be the empty string, and in this case it is customary to denote it by ε. The form \alpha\rightarrow\varepsilon is called an ε-production.[4]

It is common to list all right-hand sides for the same left-hand side on the same line, using | (the pipe symbol) to separate them. Rules \alpha\rightarrow \beta_1 and \alpha\rightarrow\beta_2 can hence be written as \alpha\rightarrow\beta_1\mid\beta_2.

Rule application

For any strings u, v\in (V\cup\Sigma)^{*}, we say u\, directly yields v\,, written as u\Rightarrow v\,, if \exists (\alpha, \beta)\in R with \alpha \in V and u_{1}, u_{2}\in (V\cup\Sigma)^{*} such that u\,=u_{1}\alpha u_{2} and v\,=u_{1}\beta u_{2}. Thus, \! v is the result of applying the rule \! (\alpha, \beta) to \! u.

Repetitive rule application

For any u, v\in (V\cup\Sigma)^{*}, we say u yields v written as u\stackrel{*}{\Rightarrow} v (or u\Rightarrow\Rightarrow v\, in some textbooks), if \exists \ u_{1}, u_{2}, \cdots u_{k}\in (V\cup\Sigma)^{*}, k\geq 0 such that u\Rightarrow u_{1}\Rightarrow u_{2}\cdots\Rightarrow u_{k}\Rightarrow v

Context-free language

The language of a grammar G = (V\,, \Sigma\,, R\,, S\,) is the set

L(G) = \{ w\in\Sigma^{*} : S\stackrel{*}{\Rightarrow} w\}

A language L\, is said to be a context-free language (CFL), if there exists a CFG G\,, such that L\,=\,L(G).

Proper CFGs

A context-free grammar is said to be proper, if it has

  • no inaccessible symbols: \forall N \in V: \exists \alpha,\beta \in (V\cup\Sigma)^*: S \stackrel{*}{\Rightarrow} \alpha{N}\beta
  • no unproductive symbols: \forall N \in V: \exists w \in \Sigma^*: N \stackrel{*}{\Rightarrow} w
  • no ε-productions: \forall N \in V, w \in \Sigma^*: (N, w) \in R \Rightarrow w \neq ε
    (the right-arrow in this case denotes logical “implies” and not grammatical “yields”)
  • no cycles: \neg\exists N \in V: N \stackrel{*}{\Rightarrow} N

Example

The grammar G = (\{S\} , \{a, b\}, P, S), with productions

S → aSa,
S → bSb,
S → ε,

is context-free. It is not proper since it includes an ε-production. A typical derivation in this grammar is

S → aSa → aaSaa → aabSbaa → aabbaa.

This makes it clear that L(G) = \{ww^R:w\in\{a,b\}^*\}. The language is context-free, however it can be proved that it is not regular.

Leave a Reply