Introduction to Functional Programming

Recursion and recursive thinking


Learning Objectives

  • You understand why recursion is essential in functional programming.
  • You know what a recursive data type is.
  • You understand how the list data type works from first principles.
  • You know how to implement basic recursive functions in π•Šπ•‹π•ƒβ„‚++.

Recursion

As you have learned in the previous chapters, a key difference between functional and imperative programming languages is immutability. This usually forces us to rethink e.g. iterative problems in a more mathematical fashion β€” quickly leading to recursion.

For example, consider calculating the sum of integers from 1 to 𝑛. In an imperative programming language, a typical solution could look like this:

int sum = 0;
for (int i = 1; i <= n; ++i) {
    sum += i;
}

However, if we attempt to translate this directly to π•Šπ•‹π•ƒβ„‚++, or any functional language for that matter, we quickly run into issues. First of all, there is no for or while expression in π•Šπ•‹π•ƒβ„‚++, and even if there were, we still wouldn’t have a way to update the values of i and sum as we can’t mutate variables in π•Šπ•‹π•ƒβ„‚++. It’s worth noting that many functional programming languages support something1 that closely resembles the above C-code, it just requires a bit of getting used to.

An idiomatic implementation in π•Šπ•‹π•ƒβ„‚++ could look something like this:

sumTo : Int -> Int
sumTo = fun n : Int,
    if n <= 0 then 0 // (1)
    else n + sumTo (n - 1) // (2)

In this code, instead of using a loop, we call the function being defined within its definition, which is called recursion. Substituting loops with recursion is very common when translating imperative code to functional code.

Recursive (or inductive) thinking is characterized as follows:

  • We begin by defining a function that takes an input x.
  • By assuming that the function being defined works correctly for strictly smaller inputs than x, we are allowed to make recursive calls with smaller inputs to compute the desired output for the current input x.

For example, on line (2), we use the sumTo function recursively with the assumption that it returns the correct result for inputs strictly smaller than the current input, i.e. n β€” this assumption is often called an inductive hypothesis. From this, we know that the β€œcorrect thing to do” is to add our n to the previous result β€” because the previous result is correct by the induction hypothesis.

The reason recursion works, is mathematical induction (or specifically induction over natural numbers in this case):

  1. We show the base case (1) is correct, i.e. sumTo 0 evaluates to 0.
  2. Inductive/recursive case (2): assuming sumTo (n - 1) is valid because n - 1 is strictly less than n, we produce the result of n + sumTo (n - 1).

Strictly speaking induction also requires that the concept of β€œsmaller” (a less-than relation) is well-founded. Conveniently, most data types we run into in programming have a well-founded less-than relation. As a nonexample, rational numbers with the usual less-than relation is not well-founded, so inductively defining a function for a rational number 𝑛 can not recursively evaluate itself at 𝑛2.

Loading Exercise...
Loading Exercise...
Loading Exercise...
Loading Exercise...

The Linked List data type

In addition to primitive data types, π•Šπ•‹π•ƒβ„‚++ implements the list data type. It is implemented as a linked list, meaning that each element refers to the next one. A list consists of 0 or more elements of the same type but in this chapter we’ll be using the list of Ints denoted by List Int or [Int]. Lists of type T have two constructors: cons : T -> [T] -> [T] and nil T : [T]. Notice how the cons constructor takes another list of the same type as the second argument β€” this makes list a recursive data type.

π•Šπ•‹π•ƒβ„‚++ doesn’t have a Hindley-Milner type system nor unification of any kind. It requires type parameters explicitly when it comes to constructors like nil. An alternative way to write an empty list of type T is [:T].

No files opened Select a file to edit

The notation ..= is syntax sugar for an inclusive range of integers. Likewise, .. is used for an exclusive range.

To deconstruct or pattern match a list in π•Šπ•‹π•ƒβ„‚++, we use the lcase expression. The lcase has two branches (starting with the | character), the nil branch, matching an empty list, and the cons head tail branch, matching a non-empty list. In the spirit of the language, both of the branches must be present and therefore a list must always be matched exhaustively.

No files opened Select a file to edit

If you don’t want to bind a value to a variable in π•Šπ•‹π•ƒβ„‚++, you can use an _ where the variable name would go

Loading Exercise...
Loading Exercise...
Loading Exercise...
Loading Exercise...

Lists and Recursion

Since the list is a recursive data type, it makes sense that a lot of list operations are implemented with recursion.

No files opened Select a file to edit

A sum over a list of integers can be implemented as shown above. The nil branch corresponds to the base case (sum of an empty list is zero) and the cons branch corresponds to the recursive case, where the sum of a list is defined as the value of the first element plus the sum of its tail.

h and t stand for head and tail. Other common variable names used after cons are x and xs.

Unlike with arrays, which you can easily index with an arbitrary offset, lists must always be accessed from the beginning, one element at a time. This means that accessing an element at an arbitrary index and many other list operations are generally 𝑂(𝑛). In functional programming, if you find yourself in a situation where you’re trying to index a list, it’s perhaps a good idea to think about your approach for a second.

Tail-call optimization

If you’re a performance-minded programmer, you might already be feeling a bit sick about all the performance implications of the recursion. The nature of linked lists already has some performance constraints due to the fact that we don’t have constant time random access and that the elements are not located sequentially in memory. There isn’t really an easy solution for this, and the most common solution to just use something else than a linked list.

Unlike the nature of linked lists, we can actually do something about the growing amount of recursive function calls. To this problem, we can apply an optimization technique called tail-call optimization (TCO). It is an optimization technique often implemented by optimizing compilers of functional-style programming languages. TCO is generally viable for function that make a function call as their final action (hence the name).

Here’s an example of a trivial recursive sum function in Compiler Explorer. The example is provided both in Rust and in C++ (GCC) and you can follow along by adjusting the optimization levels on the right side of the screen. It’s interesting to analyze what happens on different optimization levels.

-C opt-level=0 and -O1 (starting point)

Both compilers generate recursive calls to the sum function (look for the call instruction in the output) This kind of code will become problematic on deep recursive chains, especially if the function has a large stack frame.

-C opt-level=1 and -O2

Both compilers optimize the recursive calls into some kind of loop. Rust generates something like this:

pub fn sum_loop(mut to: i32) -> i32 {
    let mut ans = to;
    to -= 1;
    while to != 0 {
        ans += to;
        to -= 1;
    }
    ans
}

While GCC generates something a bit simpler:

int sum(int to) {
    int sum = 0;
    while (to) {
        sum += to;
        to--;
    }
    return sum;
}

-C opt-level=2 and -O3

Here GCC starts vectorizing2 the code while Rust3 does something even more clever. It seems to get rid of the loop entirely and it uses a closed formula instead.

sum(𝑛)≔𝑛⋅(𝑛+1)2

This is a neat example of how an optimizing compiler can take a stack-hungry recursive function and turn it into a constant time operation.

Loading Exercise...

Higher-order functions and mapping over recursive structures

Since lists are a recursive data structure, and we often want to call a function on its elements exactly once, higher-order functions emerge naturally.

As we know, functions are not restricted to taking regular values4 as arguments. They can also take functions as arguments. Higher-order functions allow programmers to, for example, β€œcustomize the behavior” of a function.

One of the more common functions in functional style code is the map function, which applies a function to each element in a collection returning the β€œmapped” version of the collection. Below is an example of the map function for our List of Ints.

map : (Int -> Int) -> [Int] -> [Int]
map = fun f : (Int -> Int), fun xs : [Int],
    lcase xs of
    | nil => xs
    | cons h t => f h :: map f t

This map function takes in a function with type signature Int→Int, but it could theoretically also take in Int→Bool or Int→[Int] (or really a function of any return type).

While map β€œpreserves” the structure (i.e. the order and shape) of the collection, there exist also a plethora of useful functions that β€œalter” the structure, for instance, the filter function that takes in a predicate Intβ†’Bool and returns a collection that consists only of elements in the original collection for which the predicate returns true.

No files opened Select a file to edit
Loading Exercise...
Loading Exercise...

Error Handling

Error handling, at compile time and runtime, is something functional programming languages tend to be really good at. When it comes to errors that are raised during runtime, most languages have a try-catch mechanism to handle them. This is not the case with functional languages, because functions can’t β€œreturn early” to raise an error β€” control flow is more visible in functional languages.

As always, there is a way to raise and catch errors, but the style is quite different compared to Python, Java, JavaScript etc. Errors are implemented as ordinary data, typically using sum types called Option or Result, which are covered in a later chapter. Such a style is also present in languages such as Rust and Java, where instead of raising an error, the function returns Option::None or Result::Err.

One of the reasons that sum types are not used everywhere is that they add more complexity and work for the caller to handle the different cases. Even in functional languages some functions prefer not to handle errors and panic (i.e. crash) instead. Such functions belong to the class of non-total functions.

Exceptions and Panicking

π•Šπ•‹π•ƒβ„‚++ supports exception with using panic terms. The term panic T msg tells the type checker that this term has type T β€” which can be any type you choose.

When evaluating a panic term, all computations are halted and the evaluator immediately returns. To the outside, this looks like the panic term β€œinfects” all other computations causing the panic term to β€œbubble up”. Because the panic halts the program, no computation can deduce anything from its value, so it’s not even considered as a value.

π•Šπ•‹π•ƒβ„‚++ doesn’t have a way to handle exceptions, and that’s why they are called panics.

To demonstrate the idea, let’s look at the head function for lists.

head : [Int] -> Int
head = fun xs : [Int],
    lcase xs of
    | nil => panic Int "head called on empty list"
    | cons h _ => h

head is implemented in a fashion that panics on empty lists.

In the following code, where head is used, the panic term infects the computation + 10 which means that evaluating main returns the panic term.

main : Int
main = head [:Int] + 10

The code type checks because head [:Int] : Int and so adding another integer results in an integer. The type checker is not aware of the panic and cannot deduce whether it occurs or not.

Total Function with Panics

Another noteworthy detail is that the presence of panic in code does not always mean that the function is non-total. For example, the following function is total despite it containing a panic term:

total_42 : Int -> Int
total_42 = fun _ : Int,
    lcase [42:Int] of
    | nil => panic Int "unreachable"
    | cons answer _ => answer

Next chapter goes into more detail on what it means for programs to be correct, and how to prove properties of programs.

Totality

As mentioned in the overview, total functions are those relations which map every input to at least one output.

Reminder: all functions in mathematics are both functional and total.

Functions like head and tail from the standard library are non-total because they panic on empty lists, i.e. they do not map [] to any output. This is a widely accepted as a β€œfeature and not a bug” even though it’s completely incorrect and absurd from a mathematical perspective. Total versions of head and tail could be created by returning a default value for empty lists (or a junk value as is preferred in proof assistants5) or using the aforementioned sum types Option or Result.

Another cause of non-totality in programming is the use of recursion without a base case. Defining such a function is comparable to writing an equation of the form 𝑓(π‘₯)=𝑓(π‘₯βˆ’1)+1 β€” calculating the value of any 𝑓(π‘₯) diverges computationally. Even if there β€œseemingly” is a base case, totality requires termination on all inputs.

factorial : Int -> Int
factorial = fun n : Int,
    if n == 1 then n
        else n * factorial (n - 1)

Consider the implementation of factorial in the above code, it does not return a value for inputs n <= 0. For those inputs, the function ends up in an infinite chain of recursive calls.

By this counterexample, we’ve shown that our poorly written implementation of factorial is non-total.

Showing that a function is non-total is actually rather difficult or even impossible in general, as famously proven by Alan Turing in 1930s. This is why a programming language or a type system cannot decide whether a function is total in general.

There are still many cases where totality is guaranteed, for example when doing structural recursion. Even more complex recursive functions can still be proven to be total. Turing’s proof simply showed us that no single algorithm can decide correctly whether a program halts or a function is total across all inputs.

Totality of Infinite Loops

Problems related to totality are not only present in functional programming and recursive function, but also more widely with loops in imperative code. For example, calling the following function with a nonpositive input will lead6 to an integer overflow error.

pub fn sum_loop(mut to: i32) -> i32 {
    let mut ans = to;
    to -= 1;
    while to != 0 {
        ans += to;
        to -= 1;
    }
    ans
}

Likewise, even very simple functions such as

pub fn add1(x: i32) -> i32 {
    x + 1
}

are not total because adding 1 to the maximum 32 bit integer (2_147_483_647) also results in a panic.

When writing recursive list functions, one most commonly makes a recursive call to the tail of the list, which by getting strictly shorter each time ensures that the recursion terminates. Common list functions like map, filter and fold are all total assuming that we only pass in total functions.

Inhabited and Empty Types

A type is said to be inhabited if one can write a term of that type. As the panic term in π•Šπ•‹π•ƒβ„‚++ can be of any type, it immediately follows that in π•Šπ•‹π•ƒβ„‚++ every type is inhabited.

However, this doesn’t mean that every type contains at least one value, because to be a value is to be the result of evaluation. Evaluating a panic or an infinite recursive loop does not yield a value.

This distinction is actually why we don’t call terms values β€” not every term has a value!

π•Šπ•‹π•ƒβ„‚++ has a type called Void in its standard library, which is the same as in Haskell β€” it’s the bottom type. For example, the function signature Int -> Void says that the function must result in an exception, or diverge in an infinite recursion.

In some functional programming languages, particularly in those where all functions must be total, the bottom type is actually uninhabited. In those languages every term has a value, so the distinction between terms and values is not as big β€” a term is a value, or evaluates to one.

You can read more about the bottom type on Wikipedia.

Loading Exercise...

Footnotes

Footnotes

  1. Many functional programming languages actually have for loops, e.g. forM in Haskell. Blurring the line between imperative and function, Lean not only has for and while loops, but also supports mutable variables, breaking out of loops and even early returns.

    The imperative features provided by Lean’s do-notation allow many programs to very closely resemble their counterparts in languages like Rust, Java, or C#. β€” Functional Programming in Lean

    ↩
  2. We’re not too worried about vectorization on this course, but if you’re interested, we strongly encourage enrolling on the CS-E4580 Programming Parallel Computers course ↩

  3. In this case, it is most likely LLVM that ends up doing the heavy-lifting for this optimization. ↩

  4. Functions are also values, but here we are referring to values of base types such as integers or other data types. ↩

  5. Proof assistants like Lean return an empty list from List.tail if the list is empty, but to call List.head!, the list element type must be inhabited to produce a junk value if the list is empty. ↩

  6. In Rust, overflow checks are enabled in β€œdebug builds” by default. If disabled, overflow is allowed and sum_loop and add1 are considered total again. Then, the value of sum_loop(0) is -2147483648 as it β€œshould be”. ↩