Introduction to Functional Programming

Fun and Functional Programming


Learning Objectives

  • You can distinguish between expressions, values, and declarations in π•Šπ•‹π•ƒβ„‚++ and more generally.
  • You understand the core characteristics of functional programming (immutable data, functions as values, higher-order functions, and expression-oriented code).
  • You can read and write basic π•Šπ•‹π•ƒβ„‚++ programs, and understand the role of type checking.
  • You can work with higher-order functions (functions that take/return functions), and you recognize common patterns such as map/filter/fold at a conceptual level.

Execution and Evalution

A mathematical function does not execute any code β€” it just knows the answer β€” Bartosz Milewski, Category Theory for Programmers

As Bartosz says in his book Category Theory for Programmers, mathematical functions aren’t executed, rather they are evaluated. Evaluation means turning an expression, or a block of code, into a final value through a sequence of evaluation rules. Evaluation rules say what the value of a certain kind of expression is, they give β€œmeaning” to words.

For example the expression 1 + 2 evaluates to the value 3 through the following steps. We will use the convention that text formatted as code, e.g. 1 + 2, is used to denote expressions whereas mathematical looking text, e.g. 3, denotes values.

  1. The expression is an arithmetic add expression with two subexpressions 1 and 2.
  2. To evaluate the expression, we first need to evaluate the subexpressions:
    • evaluating the subexpression 1 results in the value 1
    • evaluating 2 results in the value 2
  3. Now that we know the values of the subexpressions, we can derive the value of 1 + 2, which is 1+2=3

The distinction between code expressions and values is important to keep in mind when analyzing functional programs.

Every piece of syntax/code has a similar procedure for determining its value. …Well almost every β€” for instance, a function definition isn’t an expression, but an equation assigning functions to names. This is how top-level names are defined in π•Šπ•‹π•ƒβ„‚++ as well as in the pure functional programming language Haskell.

Declarations vs. Statements vs. Expressions

Most programming languages have three categories of code: declarations, statements and expressions. A declaration defines a name for an object that’s at the top-level of the file (or module). For example, functions are often declared starting with a keyword such as def, function or fn followed by the name of the function. In π•Šπ•‹π•ƒβ„‚++ (and in Haskell), no such keyword is used β€” instead, everything at top-level is a declaration or an import statement.

num : Int
num = 0

square : Int -> Int
square = fun x : Int, x * x

The above code declares two names num and square to represent different expressions. An expressions is a piece of code that we can imagine evaluating to a value. For example, x * x is certainly an expression. A declaration, however, is not an expression, but the name of a declaration is an expression. In functional languages we can write function expressions like fun x : Int, x * x.

Declarations are like mathematical equalities: one can replace equals by equals in functional languages without changing the semantics, i.e. the meaning, of the code.

Expressions and Terms

The square function is a name for the expression fun x : Int, x * x, but as we know, expressions are just strings of characters. Intuitively we understand that code is structured and consists of smaller pieces. We will use the word term to refer to the abstract tree structure that a language uses internally.

Just like how many expressions consist of subexpressions, terms can have multiple subterms. A term which has no subterms is called an atomic term. Variable names and other literals are examples of atomic terms.

The following diagram represents the abstract tree of the expression fun x : Int, x * x:

fun x : Int, x * xxIntx * xxx

Blue nodes are variable names, but they are not terms even though variable names are expressions. Types (nodes with purple color) are not terms in π•Šπ•‹π•ƒβ„‚++, but might be in other functional languages. The uncolored nodes represent the subterms. The variable name β€œx” occurs three times in the expression, but only twice as a subterm β€” the first occurrence is not a term.

The expression fun x : Int, x * x can be parsed as the following term

Fun("x", Int, Mul(Var("x"), Var("x")))

To sum up, the key distinction between expressions and terms is that terms have more structure, whereas expressions are just strings with no extra structure. We will cover terms in more detail in later parts, where we actually start implementing programming languages.

Immutability

Contrary to popular belief about functional programming, many functional languages actually do support statefulness1 and mutation of variables2 in a local context. Some languages, such as Clojure, go as far as allowing mutation of certain kinds of global variables, but that breaks functionality β€” not all functions in Clojure are therefore mathematical functions.

π•Šπ•‹π•ƒβ„‚++ doesn’t have mutation. Instead of mutating variables, we can shadow them.

No files opened Select a file to edit

On line 5, we shadow the variable x with a new variable.

Loading Exercise...
Loading Exercise...

Basic Types and Type Checking

When writing let expressions (i.e. variable assignments) in π•Šπ•‹π•ƒβ„‚++, the system infers the type of the right-hand-side term. Type inference works recursively by checking the types of subterms using pre-defined inference rules. If the types don’t match any rule, a type error is raised.

The following example attempts to compute 5 + true, which π•Šπ•‹π•ƒβ„‚++β€˜s type checker refuses.

No files opened Select a file to edit

And to be clear, the point of type checking is that it always happens before evaluation, which means that it can eliminate entire classes of runtime errors.

As an example of a β€œtype error” which happens at runtime rather than beforehand, we have Python’s legendary TypeError.

No files opened Select a file to edit

Trying to run the above code produces the following error message:

Traceback (most recent call last):
  File "/app/src/main.py", line 1, in <module>
    [] + 1
    ~~~^~~
TypeError: can only concatenate list (not "int") to list

We can see the difference more clearly by examining how conditionals affect type checking.

No files opened Select a file to edit

Try removing + true in the else-branch to see what happens.

In the example, π•Šπ•‹π•ƒβ„‚++ infers 3 < 5 to be a boolean, but fails to infer the type of the entire if-then-else term. Compare this with Python, where no β€œtype error” is raised in a similar situation.

No files opened Select a file to edit

Try changing the condition to evaluate to False.

Loading Exercise...
Loading Exercise...

Dynamic and Static Typing

Programming languages like Python, where types are compared at runtime, are commonly called β€œdynamically typed” languages. Some languages are neither dynamically nor statically typed, for example x86 assembly language has no concept of types. Many languages, like Python, Java and Rust, have both dynamic and static typing capabilities, i.e. a runtime value can carry type information, but the language still provides some means of static type checking.

Loading Exercise...

Assigning Functions to Variables

  • functions as β€œfirst-class citizens”

In π•Šπ•‹π•ƒβ„‚++, functions are just ordinary data. We can assign functions to variables just like we can assign the value 5.

Let-bindings also allow us to specify the type of a variable using a colon between the variable name and equals sign.

Function Signatures β€” How Functions Sign Their Name

A function is declared in mathematical notation as follows

𝑓:Intβ†’Int

So far we’ve mostly looked at functions from the functionality perspective β€” trying to answer β€œwhat does this function do”. In (typed) functional programming it is often useful to abstract a function to its signature. A function’s signature is a description of the type(s) that it takes as arguments and the type that it outputs. For instance, a function with the type signature Intβ†’Bool maps a value of type Int to a value of type Bool.

A common practice when developing functional programs is to first determine a function’s signature and worry about the implementation later. It is in fact so common, that there’s an entire development methodology called type-driven development that takes it above and beyond.

Hoogle – Searching functions by signature

The community of the influential functional programming language Haskell maintains a database of functions implemented by different packages. This database can be indexed with its own search engine, hoogle.

In addition to searching by name, hoogle allows one to search functions by their signature. For example, looking for a function that takes in a [a] (a list of a) and returns a [[a]] (a list of lists of a) can be done with the following query: [a] -> [[a]]. This query returns functions such as inits and tails.

Searching functions by signature proves to be extremely useful in cases where naming the function is ambiguous. The database is not only useful for finding functions in Haskell, but also finding generally agreed upon names for common functions.

Multi-Parameter Function Types and Currying

The signature of functions with arity greater than one, i.e. which have multiple parameters, is usually written using parentheses with all the parameter types separated by commas.

For example, a function plus which takes two integer arguments would have the following type.

plus:(Int,Int)β†’Int

This concept of partially applying functions one argument at a time is called currying (named after the mathematician Haskell Curry).

The function plus_alt in the example has the signature Int→Int→Int. This can be expressed mathematically with

plus_alt:Int→Int→Int

and is often used to mean exactly the same thing as (Int,Int)β†’Int. To read chains of arrows, you can add parentheses associating to the right like so:

plus_alt:Int→(Int→Int)

Crucially, this is not the same as

not_plus_alt:(Int→Int)→Intalso_not_plus_alt:Int→(Int,Int)
plus_alt : Int -> Int -> Int
plus_alt = fun x : Int, fun y : Int, x + y

not_plus_alt : (Int, Int) -> Int
not_plus_alt = fun x : (Int, Int), fst x

also_not_plus_alt : Int -> (Int, Int)
also_not_plus_alt = fun x : Int, (x, x)
Loading Exercise...
Loading Exercise...

Composing Functions

Composing is the process of taking two functions and combining the into one. In mathematics it’s denoted using the ∘ symbol, but in π•Šπ•‹π•ƒβ„‚++, as it’s easier to write ASCII, the composition operator is the period ..

DefinitionCompositionLet 𝐴,𝐡,𝐢 be sets, 𝑓:𝐴→𝐡 and 𝑔:𝐡→𝐢. The composition ofthe functions π‘”βˆ˜π‘“ is defined as(π‘”βˆ˜π‘“)(π‘₯)≔𝑔(𝑓(π‘₯))for all π‘₯∈𝐴.

When talking about sets, like in the above definition, one can during this course substitute β€œtype” for the word β€œset”3.

Composition is baked into π•Šπ•‹π•ƒβ„‚++ for technical reasons, i.e. it’s a primitive operator, but it’s no escape hatch for the type checker. The type checker ensures that the composition is legal, i.e. the types match. For example composing g : Char -> Bool after f : String -> Int would have to be rejected as Int and Char are different types.

No files opened Select a file to edit

If you change Int to Char or vice versa, the code type checks but panics when evaluated as g and f are only declared but not defined.

This gets us to one of the core principles of (functional) programming: abstraction i.e. implementation hiding. The main function has type String -> Bool, not String -> Int -> Bool or whatever else. If a user would only ever see the name and signature of the main function, they would have no way to identify the β€œintermediate” type that f and g use to communicate β€” assuming they don’t know of f and g in the first place.

As the developer of a library, we have the power to choose which types are seen and which are hidden to the outside world.

π•Šπ•‹π•ƒβ„‚++ does not have visibility rules for controlling this, however most programming languages do.

Higher-order Functions

Functions that take other functions as arguments are referred to as higher-order functions. They are a key feature of functional programming languages, adopted even in many not-so-functional programming languages (such as C++).

The usage of higher-order functions often leans heavily on generics, which will be discussed in the chapter about polymorphism.

Below is a simple higher-order function which uses concrete types (instead of generics) that takes in a function 𝑓:Intβ†’Int and a value Int and returns the result of the function applied to the value.

No files opened Select a file to edit

trace and ? are useful ways to debug and inspect code in π•Šπ•‹π•ƒβ„‚++. trace takes an integer literal which says how many steps to reduce. ? is shorthand for trace 99999999.

Like apply_int, list.map takes in a function Int -> Int and applies it to each integer in the list.

No files opened Select a file to edit

Common higher-order functions

Here is a table of some common higher-order functions which exists in almost every functional programming language. The most commonly used higher-order functions are often implemented by the language’s standard library. Such is the case also for π•Šπ•‹π•ƒβ„‚++.

NameSignatureDescriptionapply(𝐴→𝐡)→𝐴→𝐡Applies a function to anelementmap(𝐴→𝐡)β†’[𝐴]β†’[𝐡]Applies the argument functionto each element of the argumentfilter(𝐴→Bool)β†’[𝐴]β†’[𝐴]Returns a new list consistingonly of elements in the originallist for which the predicate istruefold𝐡→(𝐡→𝐴→𝐡)β†’[𝐴]→𝐡A general operation ofconverting a collection to avalue by constructing a resultelement by elementcurry((𝐴,𝐡)→𝐢)→𝐴→𝐡→𝐢A conversion from a functionthat takes two arguments to asingle parameter function whichreturns another singleparameter functionuncurry(𝐴→𝐡→𝐢)β†’(𝐴,𝐡)→𝐢The inverse of curry
Loading Exercise...

Footnotes

Footnotes

  1. See for example Haskell’s State monad. ↩

  2. See for example Lean’s do notation. Lean even has early returns and for-loops while being purely functional. There is a nice paragraph written about imperative versus functional programming in Lean and a research paper β€˜do’ Unchained: Embracing Local Imperativity in a Purely Functional Language (Functional Pearl) that goes into detail how it works in Lean. ↩

  3. Some types which have quantifiers (known as polymorphic types) can not be sets, in the usual sense. However, restricting ourselves to constructive mathematics does give us a way to reason about types as sets.

    To learn more, see this question on StackExchange. Also see the HoTT book which is one of the most successful books that shows how type theory can be the foundations of mathematics. ↩