Regular languages are the simplest class of languages in the Chomsky hierarchy.
They are built using simple operations on basic languages.
Basic Languages
For a given alphabet , its basic languages are:
- for each symbol
- (empty language)
- (language containing the null string)
Construction of Regular Languages
Regular languages are obtained from basic languages using:
- Union
- Concatenation
- Kleene star
Using only concatenation gives single-string languages. Union and * allow infinite languages.
Regular Expressions
A regular expression is a symbolic representation of a regular language. Every regular language has a regular expression.
A regular expression describes the typical form of strings in the language.
Example: means any number of s followed by .
Notation Rules
{ }are removed or replaced by( )- Union is written as
- Concatenation is written by juxtaposition
Example: →
Recursive Definition of Regular Expressions
Let be an alphabet.
Regular expressions are defined as follows:
- is a regular expression and denotes
- is a regular expression and denotes
- For each , is a regular expression denoting
- If and are regular expressions:
- denotes union
- denotes concatenation
- denotes Kleene star
Only expressions formed using these rules are valid.
Operator Precedence
To reduce parentheses:
- has highest precedence
- Concatenation has next highest
- has lowest precedence
Examples:
- →
- →
Equivalence of Regular Expressions
2 regular expressions are equal iff they denote the same language.
Examples:
Recognizing a Language
Language recognition means deciding whether a given string belongs to a language.
Recognition is done by reading input left to right in one pass.
The machine must remember limited information.