Regex Engine
GitHubThis library implements a simple regular expression engine in Scala.
Implementation
The library uses Scala’s postfix operations to implement basic regular expressions as a DSL.
// matches double zero
val simpleConcatenation: Nfa = "00"
// matches 0 or 1
val zeroOrOne: Nfa = "0" | "1"
// matches 1 zero or more times
val kleene: Nfa = "1"*
// we can also concatenate expressions together by (ab)using the apply method
// matches 0 or 1 followed by at least three 1s
val nfa: Nfa = zeroOrOne("111")(kleene)
The DSL produces an NFA, which can then be compiled into DFA.
val dfa = nfa.compile()
At which point we can obtain a stream of matches from a given input.
// matches == "1111" #:: "01111" #:: "01111"
val matches = dfa.matches("111101111001111")
Matches are greedy i.e. if there are two possible matches from a given point in the input, the longer match will be returned from the stream.
Examples
// exclude Predef to avoid conflicts
import scala.Predef.{augmentString => _}
// matches any sequence of at least one zero
val expression: Nfa = "0"("0"*);
val dfa = expression.compile()
val matches = dfa.matches("101000101100010")
// if isEmpty returns false, we have a matches
val test = !matches.isEmpty // true
// if we only need one match, we can use head
val first = matches.head // "0"
// if we want to use all the matches, we have the full range of Stream methods available
// prints "0" "000" "0" "000" "0" across 5 lines
matches.foreach { println(_) }
// expressions can be arbitrarily complex
val complexExpression: Nfa = ("011"|"10")("1"*)|("00"|"11")("0"("1"*)|"")
Further Reading
For my introduction to regular expressions, click here.
For more information about the process of writing regex-engine, see my series of articles: