Context-Free Grammar

2 min read Updated Fri Apr 24 2026 07:36:29 GMT+0000 (Coordinated Universal Time)

A formal system used to generate strings of a language using production rules. Denoted as CFG. Defined as a 4-tuple (V,Σ,S,P)(V, \Sigma, S, P) where:

  • VV - finite set of variables (non-terminals)
  • Σ\Sigma - finite set of terminal symbols
  • SVS \in V - start symbol
  • PP - finite set of productions

VV and Σ\Sigma are disjoint.

Production Rule

A production has the form AαA \rightarrow \alpha where:

  • AVA \in V (a non-terminal)
  • α(VΣ)\alpha \in (V \cup \Sigma)^*

This means a non-terminal (regardless of the surrounding) can be replaced by a string of terminals and/or non-terminals.

Example: SΛ    Sa    SbS → Λ\;|\;Sa\;|\;Sb

Derivations

The process of generating strings from the start symbol using production rules.

Single-step derivation

Denoted as αβ\alpha \Rightarrow \beta. Means β\beta is obtained from α\alpha using a production.

Multi-step derivation

Denoted as αβ\alpha \Rightarrow^* \beta. Means β\beta is derived from α\alpha in zero or more steps.

Nullable Variable

A non-terminal symbol AA is a nullable variable iff there is a production AϵA \Rightarrow* \epsilon.

FindNull Algorithm

  • Start with N0N_0, a set of variables directly producing Λ\Lambda
  • Define Ni+1=Ni{non-terminals which produce elements of Ni}N_{i+1} = N_i \cup \set {\text{non-terminals which produce elements of } N_i}
  • Repeatedly calculate Ni+1N_{i+1} until NiNi+1N_i \neq N_{i+1}
  • NiN_i is the set of all nullable non-terminals

Regular Grammar

A restricted CFG where productions have one of the forms:

  • BaCB → aC
  • BaB → a

where:

  • B,CB, C are non-terminals
  • aa is a terminal

CFG for a Regular Expression

Example regular language: (0111)(01)(011|1)^*(01)^*

Equivalent CFG:

A → 011 | 1
B → AB | Λ
D → 01
C → DC | Λ
S → BC

CFG for a Finite Automata

For each transition P --a--> Q in the FA, add production PaQP \rightarrow aQ. If state QQ is an accepting state, add PaP \rightarrow a.