Introduction to Functional Programming

Calculation and Proof


Learning Objectives

  • You know of sets, domains and codomains, predicates and relations.
  • You can perform function application step by step using syntactic substitution.
  • You know why pure functions enable optimizations like memoization.
  • You understand the definition of a function, and how functionality and totality are reflected in programming.
  • You know a notion of program correctness and can reason about properties of functional programs.

The essence of this chapter is to cover how programs are evaluated, and how that relates to calculation in mathematics. We also define a function more rigorously and finally discuss program correctness from a functional point-of-view.

Functions, Sets and Defining Equations

A function 𝑓:𝐴→𝐡 maps elements from its domain 𝐴 to elements in its codomain 𝐡. The image of a function is the set of all outputs it actually produces β€” a subset of the codomain. Given a subset π‘†βŠ†π΄, we write 𝑓(𝑆) to mean the image of 𝑆 under 𝑓, i.e. {𝑓(π‘₯)|π‘₯βˆˆπ‘†} in set-builder notation.

Set-Builder Notation

Throughout this chapter, we use set-builder notation to describe sets. The notation {π‘₯|𝑃(π‘₯)} means β€œthe set of all π‘₯ such that 𝑃(π‘₯) holds”.

For example:

  • {𝑛|𝑛>0} is the set of positive integers
  • {2𝑛|π‘›βˆˆβ„€} is the set of even integers
  • {π‘₯2|π‘₯∈{1,2,3}} equals {1,4,9}
  • For any function 𝑓 and predicate 𝑃, {𝑓(π‘₯)|𝑃(π‘₯)}=𝑓({π‘₯|𝑃(π‘₯)})

The vertical bar | can be read as β€œsuch that” or β€œwhere”, and can be thought of as filtering a set. In the last example, 𝑓 maps over the set filtered by 𝑃. Set-builder notation is just a convenient way of composing filter and map from functional programming, but for sets!
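To make this analogy concrete, here is a small sketch in Python (used here purely for illustration): a set comprehension plays the role of set-builder notation, and the same set is built by composing filter and map.

```python
# The set-builder expression {f(x) | x ∈ S, P(x)} as a comprehension,
# and the same set built by composing filter and map.
S = {1, 2, 3, 4, 5, 6}

def P(x):           # predicate: "x is even"
    return x % 2 == 0

def f(x):           # the function being mapped
    return x * x

builder = {f(x) for x in S if P(x)}          # {f(x) | x ∈ S, P(x)}
filter_then_map = set(map(f, filter(P, S)))  # f({x | x ∈ S, P(x)})

print(builder)                     # {16, 4, 36} (sets are unordered)
print(builder == filter_then_map)  # True
```

The comprehension filters S by P and then maps f over the result, exactly as described above.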

Let’s consider the following function from integers to integers

f : Int -> Int
f = fun x : Int, 2 * x

We can look at the image of the function acting on any subset of the domain. For example

𝑓({0}) = {0}
𝑓({0,1,2}) = {0,2,4}
𝑓({𝑛|𝑛>0}) = {2𝑛|𝑛>0} = {2,4,6,…}
𝑓(β„€) = {2𝑛|π‘›βˆˆβ„€} = {…,βˆ’4,βˆ’2,0,2,4,…}

The defining equation of the function is 𝑓(π‘₯)=2π‘₯ for all π‘₯:Int.

Now consider a second function.

f : Int -> Int
f = fun x : Int, 2 * (x - x)

The above function takes each integer π‘₯ to 2(π‘₯βˆ’π‘₯). By calculating, we can see that 𝑓(0)=2(0βˆ’0)=0, 𝑓(1)=2(1βˆ’1)=0 and so on, but we can even prove that 𝑓(π‘₯)=0 for all π‘₯. To do this we can employ simplification.

Simplification

Simplification is a mathematical process most people learn early in their education. More precisely, simplification is the process of transforming a term into an equivalent term using a set of simplification rules. The terms are equal if they evaluate to the same value.

For example, in a context where π‘₯:Int, we can simplify π‘₯βˆ’π‘₯ into 0. Because of referential transparency, this means that everywhere where π‘₯:Int we can β€œreplace” π‘₯βˆ’π‘₯ by 0 without affecting the behavior of the code.

Using this simplification rule, we can rewrite the defining equation of 𝑓 as follows

𝑓 π‘₯=2(π‘₯βˆ’π‘₯)⏟simplify to0=2β‹…0⏟simplify=0

If the simplification rules are correct, this proves that 𝑓(π‘₯)=0 for all π‘₯.

Problematic Optimization
f : Int -> Int
f = fun x : Int, x * x / x

What is 𝑓(β„€)?

At first glance, you might simplify π‘₯β‹…π‘₯/π‘₯ to just π‘₯, making 𝑓(β„€)=β„€. But there’s a catch: what happens when π‘₯=0?

Mathematically speaking, division by zero is undefined, so 𝑓(0) doesn’t mean anything β€” 0 is not in the domain of 𝑓. Therefore the image of β„€ under 𝑓 is undefined:

𝑓(β„€) = 𝑓({0}) βˆͺ 𝑓(β„€βˆ–{0})

where the first part, 𝑓({0}), is illegal.

In π•Šπ•‹π•ƒβ„‚++, and in most programming languages, division by zero causes a runtime panic.

This brings up an important point: simplification in the presence of partial functions is problematic as we must not simplify away situations where the code would crash. This is quite an important problem in the area of optimizing compilers: how to ensure that the optimized code still preserves the wanted side effects of the code.

As an interesting case study, a user on the Rust forums asked how to stop the compiler from optimizing away the following code:

let mut delay: u32 = 0xffffff;
while delay > 0 {
    delay -= 1
}

Mathematically speaking, the code does absolutely nothing, and we can even prove that delay = 0 after the code block, so the compiler optimizes it away. However the code has the desired side effect of delaying the next computation, i.e. sleeping, so we don’t want the code to get optimized away.

This brings up an interesting philosophical question: is the code pure (functional) or impure?

The next chapter goes into more detail on simplification.

Calculation

Simplification is not the only way to reduce a term to another equal term. When all variables have been given concrete values, we can simply calculate the result. The process of calculating a resulting value from a term is quite simple and fundamentally recursive: we look at each subterm and calculate its value.

2 * (5 + 1)
β”œβ”€ 2
└─ 5 + 1
   β”œβ”€ 5
   └─ 1

For example, the subterms of 2 * (5 + 1) are 2 and 5 + 1. We first calculate the value of 2 which, because it’s an atomic term, is 2. However, 5 + 1 is not atomic, so we proceed recursively, calculating the value of its subterms and adding them to get 6. The result of the calculation is therefore 12.

Reminder: atomic terms are characterized by not consisting of subterms. For example, variable names and other literals are atomic terms.
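The recursive procedure can be sketched as a tiny evaluator, here in Python, with terms modeled as nested tuples (a representation invented for this illustration, not part of π•Šπ•‹π•ƒβ„‚++):

```python
# Atomic terms (integers) are their own value; compound terms are evaluated
# by first recursively calculating the values of their subterms.
def evaluate(term):
    if isinstance(term, int):   # atomic term: its value is itself
        return term
    op, left, right = term      # compound term: recurse on the subterms
    lv, rv = evaluate(left), evaluate(right)
    if op == "+":
        return lv + rv
    if op == "*":
        return lv * rv
    raise ValueError(f"unknown operator: {op}")

# 2 * (5 + 1): subterms are 2 and 5 + 1, which in turn has subterms 5 and 1
print(evaluate(("*", 2, ("+", 5, 1))))  # 12
```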

Calculation, like simplification, creates a chain of equal terms: 2 * (5 + 1) = 2 * 6 = 12. The terms are equal because they have the same value.

Clearly the process of calculating went from left to right, so the rightmost termΒΉ should be the atomic term whose value is what we are looking for. Therefore we can even say that calculating a term results in another term β€” which is akin to simplification. This realization lifts the concept of a value to the term level, and is done in lambda calculus to make calculation purely syntactic: the underlying meaning of terms like 2 and 5 + 1 is almost made invisible. Of course the underlying values are still present in defining the operators such as + and *.

Arithmetic is not the heart of functional programming β€” rather functions and function applications are. Next, let’s look at how calculating with functions works in the spirit of lambda calculus.

Calculating with Functions

Let’s look at a simple function add1.

add1 : Int -> Int
add1 = fun x : Int, x + 1

To understand how we get add1 5 = 6, we need to understand how to calculate with functions. As we are in the realm of programming, we prefer to use the word evaluate instead of calculate, although they effectively mean the same thing.

From the defining equation of add1 we get add1 5 = (fun x : Int, x + 1) 5. Now we have an application of two terms. As with arithmetic operators, we recursively evaluate the values of both sides β€” but this immediately leads to the following question: what is the value of a function?

The standard answer in lambda calculus and functional programming is: every function is a valueΒ²!

Now we are in a similar situation as when evaluating 5 + 1: both subterms are values. The next step is to perform syntactic substitution: we substitute the function’s parameter by the right-hand-side value in the function’s body.

Skipping some details regarding substitution, we observe that substituting x by 5 in x + 1 results in 5 + 1. Therefore

(fun x : Int, x + 1) 5 = 5 + 1 = 6

Let’s frame this cornerstone of functional programming in a definition.

Definition (Function Application)
Let 𝑓, π‘Ž be terms. To evaluate the application term 𝑓 π‘Ž, we first evaluate the values of 𝑓 and π‘Ž and get 𝑣𝑓 and π‘£π‘Ž respectively. Because 𝑣𝑓 is a function value, it must be of the shape fun π‘₯ : 𝑇, 𝑑₁ for some variable name π‘₯, type 𝑇 and term 𝑑₁. If 𝑑₁[π‘₯ ↦ π‘£π‘Ž] is the result of substituting π‘₯ with π‘£π‘Ž in 𝑑₁, then we say that calculating 𝑓 π‘Ž yields the term 𝑑₁[π‘₯ ↦ π‘£π‘Ž].

Notice that the definition says nothing about the types involved. This is because type checking is a pre-condition to evaluation β€” i.e. the code must type check before it’s run. Therefore we can (and do) assume that the types match, i.e. 𝑓 has a function type and π‘Ž is in the domain of 𝑓.

In untyped programming languages, including the untyped lambda calculus, there is no type checker, but pretty much the same definition for function applications is still used.

Next, let’s look at a function called twice : (Int -> Int) -> Int -> Int. Looking at its signature, it takes an integer function as input and returns an integer function as output (remember that -> associates to the right).

twice : (Int -> Int) -> Int -> Int
twice = fun f : Int -> Int, fun x : Int, f (f x)

main : Int
main = twice add1 10

Applying twice to the add1 function creates a function which adds 2 to the argument.

We want to evaluate the value of main, so we start with its definition: twice add1 10. Function application associates left, so we first evaluate twice add1:

twice add1 = (fun f : Int -> Int, fun x : Int, f (f x)) add1
           = fun x : Int, add1 (add1 x)

The second equality comes from the definition of function application: add1, a value, is substituted in place of f. We don’t unfold the definition of add1 to keep the calculation cleaner.

As fun x : Int, add1 (add1 x) is a function, it’s a value, and therefore we are ready to evaluate the outer application.

twice add1 10 = (fun x : Int, add1 (add1 x)) 10
              = add1 (add1 10)
              = add1 11
              = 12

Here, the second equality comes from the function application and the third comes from calculating the subterm add1 10 which we have skipped for brevity.

In lambda calculus, each of the above steps is called a 𝛽-reduction, and its rules will be covered in detail later in this course.
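Because functions are first-class values in Python as well, the same calculation can be replayed there (an illustrative sketch, not π•Šπ•‹π•ƒβ„‚++ code):

```python
def add1(x):
    return x + 1

def twice(f):
    # twice f = fun x, f (f x): applying twice returns a new function value
    return lambda x: f(f(x))

g = twice(add1)  # corresponds to fun x : Int, add1 (add1 x)
print(g(10))     # 12
```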


Predicates and Relations

Let’s establish some mathematical notions and vocabulary that helps us reason about programs. These concepts are very important and used throughout the course.

Definition (Predicate)
Let 𝐴 be a set. A predicate on 𝐴 is a logical formula 𝑃(π‘₯) which depends on a variable π‘₯ whose values range over 𝐴.

As shown in the previous chapter, we can write predicates on a type 𝐴 by writing functions from 𝐴 to booleans.

Here are some examples of predicates:

  • positive is a predicate on integers, and defined as follows: positive(π‘₯)≔π‘₯>0.
  • even is a predicate on integers, and can be defined recursively (as shown next), or by using the remainder operator: even(π‘₯)≔π‘₯%2=0.
  • empty is a predicate on lists, and defined as follows: empty(xs) is true if and only if xs=[].

In π•Šπ•‹π•ƒβ„‚++, predicates are written as functions returning Bool:

positive : Int -> Bool
positive = fun n : Int, n > 0

even : Int -> Bool
even = fun n : Int, let n = int.abs n in
    if n == 0 then true else ! (even (n - 1))

empty : [Int] -> Bool
empty = fun xs : [Int], lcase xs of
    | nil => true
    | cons _ _ => false
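For comparison, the same predicates can be written in Python (a sketch mirroring the π•Šπ•‹π•ƒβ„‚++ definitions above):

```python
def positive(n):
    return n > 0

def even(n):
    # recursive definition: strip the sign, then peel off one at a time
    n = abs(n)
    return True if n == 0 else not even(n - 1)

def empty(xs):
    return xs == []

print(positive(3), even(4), empty([]))  # True True True
```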

A similar concept, where instead of one variable we have two, is a relation.

Definition (Relation)
Let 𝐴 and 𝐡 be sets. A relation 𝑅 over 𝐴 and 𝐡 is a subset of 𝐴×𝐡, i.e. of the set of pairs {(π‘Ž,𝑏)|π‘Žβˆˆπ΄,π‘βˆˆπ΅}. We denote an element (π‘Ž,𝑏)βˆˆπ‘… of the relation using infix-style syntax: π‘Ž 𝑅 𝑏. If 𝐴 and 𝐡 are the same, we simply say the relation is on (or over) 𝐴 rather than over 𝐴 and 𝐴.

Here are some familiar relations.

  • Let 𝐴 be a set. Equality (=) is a relation on 𝐴: 3=3 is true (i.e. (3,3)∈=), but 3=4 is not
  • Less than (<) is a relation on β„€: 2<5 holds, but 5<2 does not
  • Set membership (∈) is a relation between elementsΒ³ and sets: 2∈{1,2,3} holds

To express relations in code, we write them as functions of type 𝐴→𝐡→Bool.

lt : Int -> Int -> Bool
lt = fun x : Int, fun y : Int, x < y

contains : Int -> [Int] -> Bool
contains = fun x : Int, fun xs : [Int], lcase xs of
    | nil => false
    | cons y xs => if x == y then true else contains x xs

As contains is a relation, it’s mathematically a set of pairs. For example, the following are true statements about contains:

  • (1,[])βˆ‰contains
  • (1,[1])∈contains
  • (1,[0,1,2])∈contains
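These three statements can be checked directly with a Python transcription of contains (an illustrative sketch):

```python
# contains decides membership of the pair (x, xs) in the relation.
def contains(x, xs):
    if not xs:             # nil case
        return False
    y, *rest = xs          # cons case
    return True if x == y else contains(x, rest)

print(contains(1, []))         # False: (1, []) βˆ‰ contains
print(contains(1, [1]))        # True:  (1, [1]) ∈ contains
print(contains(1, [0, 1, 2]))  # True:  (1, [0, 1, 2]) ∈ contains
```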

Sometimes predicates and relations end up mixed together. The convention used in this course is that a predicate is a statement over one free variable, and a relation is a statement over at least two:

  • When writing π‘ƒβŠ†π΄, we can think of 𝑃 as just the predicate 𝑃(π‘₯)≔π‘₯βˆˆπ‘ƒ.
  • A relation π‘…βŠ†π΄Γ—π΅ can be thought of as a predicate over a pair: 𝑅(π‘₯,𝑦)≔(π‘₯,𝑦)βˆˆπ‘…, and is mathematically equivalent to a relation.
  • Likewise for any triple π‘„βŠ†π΄Γ—π΅Γ—πΆ, we can think of it as a predicate over a triple, which is equivalent to a relation over 𝐴, 𝐡 and 𝐢.

Functionality and Totality in Depth

So far we have been talking about both functionality and totality quite informally. Let’s now give the definition of a function, and after that conceptualize those ideas using programming.

Definition (Functional)
Let 𝑅 be a relation over any two sets 𝐴 and 𝐡. 𝑅 is said to be functional if the following holds: for all π‘₯∈𝐴 and 𝑦,π‘§βˆˆπ΅, if π‘₯ 𝑅 𝑦 and π‘₯ 𝑅 𝑧, then 𝑦=𝑧.

A functional relation is also known as a partial function, denoted 𝐴⇀𝐡, as it doesn’t require the following ingredient.

Definition (Left Total)
Let 𝑅 be a relation over any two sets 𝐴 and 𝐡. 𝑅 is said to be left total if the following holds: for all π‘₯∈𝐴 there exists a π‘¦βˆˆπ΅ such that π‘₯ 𝑅 𝑦.

These two concepts are now ready to be packaged together in the key definition.

Definition (Function)
Let 𝑓 be a relation over any two sets 𝐴 and 𝐡. 𝑓 is called a function from 𝐴 to 𝐡 if it is functional and left total.

Notice that all functions are also partial functions.

Next, we prove that we are allowed to write 𝑓(π‘₯) for the unique value 𝑦 that 𝑓 relates π‘₯ with.

Remark
Let 𝑓 be a function from 𝐴 to 𝐡 and fix some π‘₯∈𝐴. As 𝑓 is left total, we have some π‘¦βˆˆπ΅ such that π‘₯ 𝑓 𝑦. We claim that 𝑦 is unique and can therefore be denoted by 𝑓(π‘₯).
Proof: Assume by way of contradiction that 𝑦 is not unique, i.e. there exists a π‘§βˆˆπ΅ not equal to 𝑦 such that π‘₯ 𝑓 𝑧. Now, because 𝑓 is functional, we have that 𝑦=𝑧, which is a contradiction. ☐

The mathematical definition of functionality is often referred to as β€œpurity” in the realm of programming. Some functions in imperative languages can still be pure, i.e. functional, from the mathematical perspective. This is one reason why functional programming is more than a classification of programming languages: it’s a paradigm where we choose to restrict ourselves to the constrained definition given in mathematics.

Let’s explore some practical implications of these concepts.

Functionality

Simply put, functionality allows us to replace β€œequals with equals”: if we see two identical subterms in a program, we can assume that they are equal, compute the value of just one of them and, if the value exists, substitute the result in both places. In the program below, for instance, we could evaluate expensive_computation 42 once and then reuse that value for the second operand of the addition.

expensive_computation : Int -> Int
expensive_computation = // something expensive

main : Int
main = expensive_computation 42 + expensive_computation 42

More generally, in cases like this, the interpreter could apply the memoization optimization technique in order to avoid re-evaluating terms.
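As a sketch of what such an optimization looks like in practice, here is manual memoization in Python (the body of expensive_computation is invented for the example; only its purity matters). Python’s functools.lru_cache decorator automates the same idea.

```python
calls = 0

def expensive_computation(n):
    global calls
    calls += 1                           # count how often we actually compute
    return sum(i * i for i in range(n))  # stand-in for something expensive

cache = {}

def memoized(n):
    # safe only because expensive_computation is pure
    if n not in cache:
        cache[n] = expensive_computation(n)
    return cache[n]

result = memoized(42) + memoized(42)
print(calls)  # 1: the second occurrence reused the cached value
```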

At the time of writing, π•Šπ•‹π•ƒβ„‚++ does not do this, but it could very well do so, because evaluation in the language is mostly pure⁴.

Let’s explore some examples of trying similar optimizations on impure terms.

rng : Unit -> Int
rng = fun unused : Unit, random.randInt 1 100

main : Int
main = rng () + rng ()

In the example above, we assume that the standard library provides a function random.randInt, similarly to how, for example, the Python standard library does. If we treated this computation, and by extension random.randInt, as a pure function, we would behave accordingly and evaluate it only once. This would probably not give the result we expected, since we would just reuse the value of the first rng () term in place of the second. A random number generator that always generates the same number isn’t particularly useful.

Side Effects and State

In programming, the way to distinguish impure and pure functions is the presence of side effects. A side effect is anything a function does besides returning a value: reading from or writing to files, modifying global variables, printing to the console, making network requests, or generating random numbers.

Perhaps surprisingly, impure functions are not more expressive than pure functions β€” although they might be more concise. Any program with side effects can be rewritten as a pure program by making the β€œhidden” state explicit. The trade-off is verbosity: pure code must explicitly thread state through every function call.

Side effects include any and all expressions that rely on or modify a global state, that is, some value that is not explicitly passed as an argument but is somehow implicitly included. In the above example, the rng β€œfunction” clearly doesn’t depend only on the Unit value passed to it, but also on some implicitly passed argument. Let’s sketch out a version where we control this argument as an explicit argument.

rng : RandomState -> Int
rng = fun s : RandomState, random.randInt s 1 100

main : RandomState -> Int
main = fun s : RandomState, rng s + rng s

Now we’re passing around a random state but we’re not completely there yet (we also got rid of the Unit argument since we were not using it in any case). As you probably noticed, the rng function is still called twice with exactly the same argument and now, since it is a function, we still have the same problem with memoization as above. We’re still missing one crucial part.

We know that a random number generator in addition to relying on a global state also modifies it. Therefore our rng function should return the new random state. With this adjustment our code now looks something like this:

rng : RandomState -> (RandomState, Int)
rng = fun s : RandomState, random.randInt s 1 100 // assume randInt now also returns the updated state

main : RandomState -> (RandomState, Int)
main = fun s : RandomState,
    let result1 = rng s in
    let result2 = rng (fst result1) in
    (fst result2, (snd result1) + (snd result2))

In the above code, we’ve successfully taken an impure program and turned it into a pure program. This same exact technique of spelling out the implicit state as explicit arguments can be used to convert any impure program into a pure one, and in fact, one could say that the impure programs are just pure programs that lie about their inputs. Therefore, being able to distinguish pure and impure programs boils down to the ability to model state and side effect explicitly rather than implicitly. If there exists no implicit state, then all functions are pure.
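The same transformation can be sketched in Python, with a toy linear congruential generator standing in for random.randInt (its constants are arbitrary, chosen only for this illustration):

```python
def rng(s):
    # pure: the new state is returned alongside the generated number
    new_s = (1103515245 * s + 12345) % 2**31
    return new_s, 1 + new_s % 100   # a number in 1..100

def main(s):
    s1, v1 = rng(s)    # explicitly thread the state...
    s2, v2 = rng(s1)   # ...into the second call
    return s2, v1 + v2

# Purity check: the same seed always yields the same result.
print(main(0) == main(0))  # True
```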

Side effects, and how to achieve them with pure code, are discussed in more detail in Input/Output and Side Effects.

Pure side effects with the State monad

The pattern of passing state through tuples (NewState, Result) works, but it’s verbose and error-prone. What if we forget to use the updated state? What if we accidentally use the old state twice?

Haskell solves this elegantly with the monad pattern and do-notation. In the case of the State monad, the type State a represents β€œa locally stateful computation that depends on a state of type a”.

Below is our rng example implemented with a RandomState monad and do-notation. The state is automatically threaded through subsequent calls of rng inside the do block via the <- syntax. Finally, the return function is used to produce the resulting value, while the final state is threaded implicitly.

The function example implements the same behavior as our main function above, while example2 generates 3 random values and returns a list of them.

import Control.Monad.State

type RandomState = State Int

rng :: RandomState Int
rng = do
  n <- get
  put (n + 1) -- Sophisticated RNG logic
  return $ n + 1 

example :: RandomState Int
example = do
  result1 <- rng
  result2 <- rng
  return $ result1 + result2

example2 :: RandomState [Int]
example2 = do
  result1 <- rng
  result2 <- rng
  result3 <- rng
  return [result1, result2, result3]

main :: IO ()
main = do
  print $ evalState example 0 -- <- rng seed
  print $ evalState example2 0

This topic is revisited when the IO monad is covered in the chapter on Input and Output.

Totality

Totality of functions can be expressed simply via the following statement: a function is total if it cannot panic or go into an infinite loop.

These are both perhaps common sense requirements for β€œhigh quality code” even in programming languages where functions are not functional in general, as is the case with Rust. In Rust, a function which panics while inside its β€œoperating parameters” is usually considered buggy. However, a panic caused by a problem outside the operating parameters, like memory or disk space running out, is not usually itself a bug.

Let’s return back to the factorial example from the previous chapter.

factorial : Int -> Int
factorial = fun n : Int,
    if n == 1 then n
        else n * factorial (n - 1)

Even though the function doesn’t return a value for inputs less than one, it still works for all positive integers β€” it’s β€œtotal on inputs β‰₯1”, but partial overall. As π•Šπ•‹π•ƒβ„‚++ doesn’t have a type for positive integers, we can’t express this assumption in the domain of the function signature.
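A direct Python transcription exhibits the same partiality: it is fine on positive integers, but recurses endlessly below 1, which Python eventually cuts off with a RecursionError (an illustrative sketch):

```python
def factorial(n):
    # mirrors the STLC++ definition: there is no case for n < 1
    return n if n == 1 else n * factorial(n - 1)

print(factorial(5))  # 120

try:
    factorial(0)     # recurses through 0, -1, -2, ... until the limit
except RecursionError:
    print("factorial is not defined at 0")
```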

Even if π•Šπ•‹π•ƒβ„‚++ had a type for positive integers, what would be the result of 1 - 1? Options are: 0, 1 or a panic.

In functional languages, one can trust the function signatures, up to termination. The type Int -> Int means that, provided with an integer, the function may return an integer. It might work for all inputs, for some particular inputs, or for none at all β€” a finer notion of correctness is needed to reason about functions.

Totality at the Cost of Incompleteness

In languages including Agda, Idris 2, Lean and Rocq, core/kernel⁡ functions must be total. This means that the core of these languages cannot be Turing complete, because every function and evaluation must terminate. However, they also support partiality outside the logical core of the language/type system, so they are far from useless for real-life programming applications.

These languages are usually dependently typed languages with proof assistant capabilities, as one sometimes needs to prove to the language that a function is total when the language can’t figure it out on its own.

Partial functions in these languages are sometimes called colored functions, as partiality β€œspreads”: calling a partial function requires the caller to be marked as partial too. By contrast, total functions are not colored, i.e. they impose no such requirement on the caller.

The same philosophy works for pure and impure as well: an impure function (in any programming language) is allowed to use pure functions, but not vice versa.

These ideas are mostly explained by the following basic properties of relation composition:

  • the composition of two left-total relations is left-total, and
  • the composition of two functional relations is functional.

Proving these properties is left as an exercise for the reader.

Program Correctness

We now have all the tools and knowledge to learn about program correctness, a widely studied field of computer science. For our purposes it suffices to look at the correctness of functional programs β€” or just functions.

First, to even define what β€œcorrect” means, we need a notion of a specification. A specification for a function of type 𝐴→𝐡 consists of two predicates: 𝑃(π‘₯) and 𝑄(π‘₯,𝑦), where π‘₯ is a variable of type 𝐴 and 𝑦 is of type 𝐡.

Definition (Correctness)
A partial function 𝑓:𝐴⇀𝐡 is correct according to the specification (𝑃,𝑄) if the following holds: for all π‘₯∈𝐴, if 𝑃(π‘₯) then 𝑓 is defined at π‘₯ and 𝑄(π‘₯,𝑓(π‘₯)) holds.

The two predicates 𝑃 and 𝑄 are often called the pre-condition and the post-condition. The definition of correctness intuitively says that all inputs that satisfy the pre-condition are mapped to values that satisfy the post-condition.

The trivial pre-condition π‘₯=π‘₯ together with the post-condition π‘¦βˆˆπ΅ forms a specification which holds for all total functions.

The range of properties expressible with only types is quite limited in π•Šπ•‹π•ƒβ„‚++. This is why the pre- and post-conditions are expressed at the meta-level instead of using types. We also avoid treating exceptions or panics separately, and consider a panic at input x to mean that the function is not defined at x.

As an example, let’s show that for all positive integers, factorial returns the factorial of the input.

Example
factorial is correct according to the specification (π‘₯>0, 𝑦=π‘₯!).
Proof: Let π‘₯βˆˆβ„€ and assume the pre-condition π‘₯>0 holds. We proceed by induction on π‘₯. If π‘₯=1, we observe that factorial 1 evaluates to 1=1!. If π‘₯=π‘š+1 for some positive integer π‘š, then factorial x evaluates to x * factorial (x - 1). By the induction hypothesis, factorial m evaluates to π‘š!. By the mathematical definition of factorial, we have that (π‘š+1)β‹…π‘š!=(π‘š+1)!. Therefore

factorial (m + 1) = (π‘š+1) β‹… factorial m = (π‘š+1) β‹… π‘š! = (π‘š+1)!

☐

In the proof, we were able to move freely between mathematical values and π•Šπ•‹π•ƒβ„‚++ terms because the language is functional, and, using the assumption π‘₯>0, every use of β€œevaluates to” produces a value.

Notice that even though factorial wasn’t defined for negative integers, it’s still correct, according to the given specification. Relaxing the precondition π‘₯>0 would have made the requirement that 𝑓 is defined at π‘₯ false on negative inputs.
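The proof covers all positive integers, but we can also spot-check the specification against Python’s math.factorial (a sanity check, not a proof):

```python
import math

def factorial(n):
    return n if n == 1 else n * factorial(n - 1)

# P(x): x > 0;  Q(x, y): y = x!
for x in range(1, 11):
    assert factorial(x) == math.factorial(x)
print("specification holds on 1..10")
```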


Footnotes


  1. The rightmost term in a sequence of calculation steps is guaranteed to exist for finite arithmetic expressions. However, with more complex terms involving functions, think about whether a final value is guaranteed to exist. ↩

  2. In some textbooks a function is a value only if its body is β€œnormalized”. This leads to interesting problems when dealing with recursive functions which we won’t discuss here. On this course we take the stance that functions are values regardless of their bodies. ↩

  3. In set theory, β€œeverything” is a set, so elements are sets too. See for example this Wikipedia article where natural numbers are defined recursively starting from zero which is the empty set. ↩

  4. π•Šπ•‹π•ƒβ„‚++ has trace, which although functional, does print to the standard output β€” which we consider a side effect. Because mathematically it doesn’t mean anything to print to the standard output, and because trace doesn’t introduce an outside value into the evaluation, we choose to ignore the side effects of trace when analyzing π•Šπ•‹π•ƒβ„‚++ programs. This is standard in functional programming languages. See Debug.Trace in Haskell. ↩

  5. You can find a description of the kernel of the Lean programming language in the Lean language reference. ↩