Recently, I had to use a functional programming language to implement a regular expression matcher. I have never used a functional programming languages before, so I though I would share some of the things I discovered by working through a small example that uses a functional language called Lisp.
First of all, what exactly are regular expressions? Wikipedia gives a precise definition:
“In computing, a regular expression provides a concise and flexible means for “matching” (specifying and recognizing) strings of text, such as particular characters, words, or patterns of characters.”
Regular expressions can become quite complex and in order to keep technical jargon in this text to a minimum, let’s define a more simple regular expression with basic functionality:
SimpleRegex is a regular expression that returns true if a string can be matched against a pattern, where a pattern is a list of terms. A term is either a word, a word place holder donated as ‘?’, or a sentence place holder donated as ‘$’.
The following table gives a few examples of the string “This is a string” matched against different patterns:
|This is a string||True|
|This ? a string||True|
|This ? is a string||True|
|This is also a string||False|
|That $ string||False|
All fine and well. Let’s look at what functional programming languages are. The paradigm of functional programming is that all computations can be evaluated by using mathematical functions. In other words, a function maps a domain set to a range set. The domain set is specified by arguments passed to a function while the return values of a function specifies the range set.
You might ask how functional programming languages differ from languages such as C, Java or Python, since they also have functions and return values. The difference is that expressions such as function calls only depend on the arguments passed to the function. Therefore, a function will always return the same value for an argument x, eliminating all side effects. This is called “referential transparency”. In order to guarantee referential transparency, pure functional languages don’t allow assignment statements. On the other hand, the same expression in an imperative language can result in different values at different times depending on the state of the executing program.
So, let’s look at the functional programming language Lisp. Lisp is not a pure functional language but the code below is purely functional.
Everything is Lisp is an expression consisting of lists or atoms. Atoms are things such as strings, numbers, etc. Lists are donated by
(). When Lisp evaluates a list it looks at the first symbol, which is normally a function and then the expression inside the list is evaluated.
(+ 2 5) is list which returns
7. Here the
+ function is called to which
5 are passed.
Since everything is Lisp surround lists, there are a lot of built-in list functions.
(first some-list) returns the first element of
A good tutorial is available here for more information.
Below is a function that implements SimpleRegex for those readers that are interested. The argument
patterns that is passed to the function matching is a list of terms. Similarly the argument
text is a list of words.
(defun matching(pattern text) (if (null pattern) (null text) (if (null text) (if (> (get-type (first pattern)) 0) t nil ) (cond ((= (get-type (first pattern)) 0) (if (equal (first pattern) (first text)) (matching (rest pattern) (rest text)) nil ) ) ((= (get-type (first pattern)) 1) (if (matching (rest pattern) (rest text)) t (matching (rest pattern) text) ) ) ((= (get-type (first pattern)) 2) (or (matching pattern (rest text)) (matching (rest pattern) (rest text)) ) ) (t (nil)) ) ) ) ) (defun get-type(value) (cond ((eq (char value 0) #\?) 1) ((eq (char value 0) #\$) 2) (t 0) ) )
This looks quite confusing. So why use a language such as Lisp and what are the benefits of using a functional over an imperative programming language?
The main benefit of functional languages is that the order of execution of side effect free functions is not important. This makes it easier to parallelise blocks of code. Furthermore, functional programming lets the compiler take care of basic programming ideas such as list comprehensions, caching and to some degree parallelisation. Nowadays, with CPUs that boast a number of cores it is beneficial if the compiler can take care of parallelisation tasks that otherwise have to be manually programmed using complicated parallelisation libraries.
Since functions in functional languages behave similar to mathematical functions, it is easier to translate them into code. In addition, if a function produced the right answer today, it will produce the correct answer again in future.