Notes from COSC 341/2 1/27/2015
Regular expressions
A regular expression (RE) is one of
- a character (from the alphabet)
- ε // the empty string
- RE1 ∙ RE2 // concatenation -- RE1 followed by RE2
  // where RE1 and RE2 are regular expressions
  // the ∙ symbol may be implied (like the multiplication symbol)
- RE1 | RE2 // or, alternative -- RE1 or RE2
- RE* // Kleene closure -- 0 or more occurrences of RE
- Binary operators: ∙ and |
- Unary postfix operator: *
Operator precedence (highest to lowest): * > ∙ > |
For example, ab|c* is read as (a∙b) | (c*).
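
A minimal sketch of how the precedence rule could be applied by a parser. This is not from the lecture: the function name parse, the tuple tags ('char', 'cat', 'alt', 'star'), and the use of parentheses for grouping are assumptions made for illustration. ε is left out to keep the sketch short.

```python
def parse(pattern):
    """Parse a regular expression into a nested-tuple syntax tree.

    One function per precedence level, lowest first:
        alt  := cat ('|' cat)*      # |  binds loosest
        cat  := star star*          # implied concatenation
        star := atom '*'*           # *  binds tightest
        atom := character | '(' alt ')'
    """
    pos = 0

    def peek():
        # Next unread character, or None at the end of the pattern.
        return pattern[pos] if pos < len(pattern) else None

    def alt():
        nonlocal pos
        node = cat()
        while peek() == '|':
            pos += 1
            node = ('alt', node, cat())
        return node

    def cat():
        node = star()
        # Keep concatenating until '|', ')' or the end of the pattern.
        while peek() is not None and peek() not in '|)':
            node = ('cat', node, star())
        return node

    def star():
        nonlocal pos
        node = atom()
        while peek() == '*':
            pos += 1
            node = ('star', node)
        return node

    def atom():
        nonlocal pos
        ch = peek()
        if ch is None or ch in '|)*':
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        if ch == '(':
            node = alt()
            if peek() != ')':
                raise ValueError("missing ')'")
            pos += 1
            return node
        return ('char', ch)

    tree = alt()
    if pos != len(pattern):
        raise ValueError(f"unexpected {pattern[pos]!r} at position {pos}")
    return tree


# * binds tighter than concatenation, which binds tighter than |
print(parse("ab|c*"))
# ('alt', ('cat', ('char', 'a'), ('char', 'b')), ('star', ('char', 'c')))
```

Because each precedence level gets its own function, and the lower levels call the higher ones, the tree comes out grouped as (a∙b) | (c*) without any explicit parentheses in the input.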
Finite Automata
A finite automaton (FA, plural finite automata) is represented as a directed graph.
A vertex represents a state. An edge represents a transition.
The label on an edge is the (input) character that causes a transition from
the source state to the target state.
If a transition is labelled with ε, the automaton can move from the source
state to the target state without consuming any input (an input string of 0 length).
- A deterministic finite automaton (DFA) is one where there is no choice about
which transition will apply. From a given state, with a given input, only one
transition is possible, and there are no ε transitions.
- A nondeterministic finite automaton (NFA) is one where there is a choice
about which transition applies. From a given state, with a given input, there
may be multiple transitions that qualify, and ε transitions are allowed.
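
A minimal sketch of the difference, assuming a dictionary representation that is not from the lecture: a DFA maps (state, character) to one state, while an NFA maps (state, character) to a set of states. The names dfa_accepts, nfa_accepts, epsilon_closure and EPSILON are hypothetical. Accepting states are not covered in the notes above; they are used here in the usual textbook sense (the input is accepted if the automaton ends in one of them).

```python
EPSILON = ""  # assumption: the empty string stands for an ε label


def dfa_accepts(delta, start, accepting, text):
    """Run a DFA: exactly one current state at every step."""
    state = start
    for ch in text:
        if (state, ch) not in delta:
            return False  # no transition defined: reject
        state = delta[(state, ch)]
    return state in accepting


def epsilon_closure(delta, states):
    """All states reachable from `states` using only ε transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPSILON), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_accepts(delta, start, accepting, text):
    """Run an NFA by tracking every state it could be in at once."""
    current = epsilon_closure(delta, {start})
    for ch in text:
        moved = set()
        for s in current:
            moved |= delta.get((s, ch), set())
        current = epsilon_closure(delta, moved)
    return bool(current & accepting)


# NFA for (a|b)*ab: in state 0, reading 'a' offers a choice (stay in 0 or go to 1).
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
print(nfa_accepts(nfa, 0, {2}, "aab"))  # True
print(nfa_accepts(nfa, 0, {2}, "aba"))  # False

# A DFA for the same language: one target per (state, character).
dfa = {("A", "a"): "B", ("A", "b"): "A",
       ("B", "a"): "B", ("B", "b"): "C",
       ("C", "a"): "B", ("C", "b"): "A"}
print(dfa_accepts(dfa, "A", {"C"}, "aab"))  # True
```

The NFA simulation has to carry a set of possible states because of the choice at state 0; the DFA never needs more than a single current state.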
Example RE --> NFA
NFA --> DFA conversion
Algorithm
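- Start from the NFA's start state, together with any states reachable from it
by ε transitions alone. This set is the start state of the DFA.
- For every state in the NFA, determine all reachable states for every input
symbol (including states reached by following ε transitions afterwards).
- Each set of reachable NFA states forms a single state in the DFA.
- Find the reachable states for each new DFA state until no more new states can
be found.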
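
The steps above can be sketched as follows. This is an illustrative sketch rather than the version given in lecture: it assumes the NFA is a dictionary from (state, character) to a set of states, with the empty string standing for ε, and the names nfa_to_dfa and epsilon_closure are hypothetical. Each DFA state is a frozenset of NFA states.

```python
from collections import deque

EPSILON = ""  # assumption: the empty string stands for an ε label


def epsilon_closure(nfa, states):
    """All NFA states reachable from `states` using only ε transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, EPSILON), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_to_dfa(nfa, start, alphabet):
    """Build DFA transitions whose states are sets of NFA states."""
    start_set = frozenset(epsilon_closure(nfa, {start}))
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])  # DFA states still to be explored
    while queue:
        current = queue.popleft()
        for ch in alphabet:
            # Every NFA state reachable from the current set on ch...
            moved = set()
            for s in current:
                moved |= nfa.get((s, ch), set())
            # ...plus whatever ε transitions lead to afterwards.
            target = frozenset(epsilon_closure(nfa, moved))
            if not target:
                continue  # no transition on ch from this DFA state
            dfa[(current, ch)] = target
            if target not in seen:  # a new DFA state was found
                seen.add(target)
                queue.append(target)
    return start_set, dfa


# NFA for (a|b)*ab
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
start_set, dfa = nfa_to_dfa(nfa, 0, "ab")
for (state, ch), target in dfa.items():
    print(sorted(state), ch, "->", sorted(target))
```

For this NFA the loop discovers three DFA states, {0}, {0,1} and {0,2}, and stops once no transition leads to a set that has not already been seen.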