Theory of Lambda Calculi

Big Steps, Small Steps


Learning Objectives

  • You understand big-step call-by-name evaluation semantics for lambda calculus.
  • You can derive evaluation trees using the call-by-name rules.
  • You understand single-step Ξ²-reduction and weak-head normal forms.
  • You can implement call-by-name evaluation and Ξ²-reduction step functions in code.

Introduction

Now that we understand and have implemented capture-avoiding substitution, we can define and implement the semantics of lambda calculus. As there are only three kinds of terms, the semantics are really simple on the surface. In fact, most of the complexity lies in the capture-avoiding substitution.

In Variables and Abstraction we briefly mentioned evaluation in lambda calculus. We begin by defining big-step operational semantics and then move to defining small-step semantics for lambda calculus.

From now on, we use the convention that lambda terms are equal, if they are equal up to 𝛼-equivalence. This means that when we are working with the term πœ†π‘₯.π‘₯ it is the same as πœ†π‘¦.𝑦 β€” we say they are mathematically equal.

Big-Step Semantics for Lambda Calculus

Unlike the languages studied in the previous chapter, lambda terms don’t evaluate to an integer or a Value. Instead, lambda terms evaluate to lambda terms, and being a value, almost self-referentially, means being the outcome of evaluation.

To actually make any sense of this idea, let’s look at some examples next. We denote big-step evaluation by the usual ⇓ symbol. The statements have a β€œRust analog” using closures next to them.

Exampleπ‘₯⇓π‘₯x=xπœ†π‘₯.π‘₯β‡“πœ†π‘₯.π‘₯|x| x=|x| x(πœ†π‘₯.π‘₯)𝑦⇓𝑦(|x| x) y=y(πœ†π‘₯.π‘₯)(π‘Žπ‘π‘)β‡“π‘Žπ‘π‘(|x| x) med (a(b)(c))=a(b)(c)(πœ†π‘₯.π‘₯π‘₯)𝑦⇓𝑦𝑦(|x| x(x)) y=y(y)(πœ†π‘₯.πœ†π‘¦.π‘₯)π‘§β‡“πœ†π‘¦.𝑧(|x| |y| x) z=|y| z

Now, let’s define the semantics of lambda calculus. As there are a lot of different ways to define semantics for the lambda calculus, called reduction strategies1 we opt for one of the simplest: call-by-name or CBN.

As we saw in the example, variables don’t evaluate, or rather they evaluate to themselves β€” variables only play a role in substitution. In fact, if the term is β€œwell-typed” we should never need to evaluate a free variable. This is because values are always closed when evaluating well-typed terms.

What it means for a term to be well-typed is defined in the next chapter on the simply typed lambda calculus. For now you can think of well-typed terms primarily as closed terms.

As abstractions house an inner term, we could have a rule to evaluate inside the abstraction. However, this is unnecessary, and done in normalization strategies (which we study in the next chapter) as opposed to the call-by-name strategy.

We will have the following two base rules for the call-by-name evaluation.

π‘₯⇓π‘₯πœ†π‘₯.π‘‘β‡“πœ†π‘₯.𝑑

They state that any variable or abstraction evaluates to itself.

Another evaluation rule in CBN is one that involves both an abstraction and an application. This is where the β€œmagic” happens. If 𝑑1 evaluates to an abstraction, we have an abstraction applied to a term β€” so we can perform a substitution!

π‘“β‡“πœ†π‘₯.𝑓1[π‘₯β†¦π‘Ž]𝑓1β‡“π‘£π‘“π‘Žβ‡“π‘£

Take a moment to read what the rule actually says. Are all the assumptions necessary? It’s important that the substitution is capture-avoiding β€” why?

Finally, we add an evaluation rule to use the result of evaluating 𝑓 in the previous rule. This rule can only be used if 𝑓 doesn’t evaluate to an abstraction (or rather evaluates to a nonabstraction), which we denote with π‘“β‡“π‘“β€²β‰ πœ†π‘₯.βˆ’.

π‘“β‡“π‘“β€²β‰ πœ†π‘₯.βˆ’π‘“π‘Žβ‡“π‘“β€²π‘Ž

As an example, let’s write down the evaluation tree of an identity function applied to the identity function

(πœ†π‘¦.𝑦)(πœ†π‘₯.π‘₯)β‡“πœ†π‘₯.π‘₯ Exampleπœ†π‘¦.π‘¦β‡“πœ†π‘¦.π‘¦πœ†π‘₯.π‘₯β‡“πœ†π‘₯.π‘₯[π‘¦β†¦πœ†π‘₯.π‘₯]π‘¦β‡“πœ†π‘₯.π‘₯(πœ†π‘¦.𝑦)(πœ†π‘₯.π‘₯)β‡“πœ†π‘₯.π‘₯The first and second subderivations are instances of the base rule. As[π‘¦β†¦πœ†π‘₯.π‘₯]𝑦=πœ†π‘₯.π‘₯, the rightmost subderivation is also an instance ofthe base rule. Therefore the derivation is valid.

As πœ†π‘₯.π‘₯ is 𝛼-equivalent with πœ†π‘¦.𝑦, it doesn’t matter what the actual bound variables are. Of course a computational implementation has to decide which bound variables to use (or which fresh variables to use in capture-avoiding substitution), but that’s just an implementation detail which has no effect on the theory.

Here are some more examples of call-by-name evaluation:

Example(πœ†π‘₯.πœ†π‘¦.π‘₯)(πœ†π‘¦.πœ†π‘§.𝑦)β‡“πœ†π‘¦.πœ†π‘¦.πœ†π‘§.𝑦(πœ†π‘₯.π‘₯)((πœ†π‘₯.π‘₯)𝑦)⇓𝑦(πœ†π‘₯.πœ†π‘¦.π‘₯)(𝑧𝑦)β‡“πœ†π‘¦β€².𝑧𝑦(πœ†π‘₯.πœ†π‘¦.π‘₯)(πœ†π‘§.𝑦)β‡“πœ†π‘¦β€².πœ†π‘§.𝑦((πœ†π‘₯.π‘₯)𝑦)(πœ†π‘§.𝑧)⇓𝑦(πœ†π‘§.𝑧)(πœ†π‘₯.π‘₯𝑦)(πœ†π‘§.𝑧)⇓𝑦

Notice that in the examples, evaluating terms that contain free variables may result in a term which is a variable or even an application. The set of terms that result from CBN evaluation are said to be in weak-head normal form.

ExampleThe following terms are in weak-head normal form as they evaluate tothemselves.πœ†π‘₯.π‘₯πœ†π‘₯.πœ†π‘¦.π‘₯π‘₯(πœ†π‘¦.𝑦)π‘₯𝑦𝑧π‘₯((πœ†π‘¦.𝑦)𝑧)(π‘₯𝑧)(πœ†π‘¦.𝑦)

Implementing Call-By-Name Evaluation

As the call-by-name evaluation is just a relation, we would like to turn it into at least a partial function.

The function, called cbn evaluates a term 𝑑 by computing the value of ? that 𝑑 evaluates to:

𝑑⇓?

This makes us ask a couple of important questions about the ⇓ relation:

  1. Does ? always exist? If so, ⇓ is left-total.
  2. Is the value of ? unique for a given 𝑑? If so, ⇓ is functional.
  3. Can we write an algorithm to compute ??

We put questions 1 and 2 aside for now and instead focus on question 3 by implementing an algorithm that does call-by-name evaluation.

Loading Exercise...

Loading Exercise...

Single-Step Semantics

Now that we have a basic understanding of evaluation of lambda terms, we expand it with the notion of small-step semantics by studying the single-step 𝛽-reduction. Instead of evaluating the whole term at once, 𝛽-reduction happens step-by-step. 𝛽-reduction also tends to be nonfunctional meaning that a term can take different reduction steps, e.g. reducing the function or the argument position in an application β€” or any reducible subterm. We denote (single-step) 𝛽-reduction using an arrow symbol ⟢.

𝛽-reduction has many slightly different formulations1 and has a lot of literature written about it. We highly recommend reading chapters 2.4 and 2.5 from Selinger: Lecture Notes on the Lambda Calculus.

The notion of 𝛽-reduction is much more general than defined here. We will learn another, stronger version of it in the next chapter.

Our first version will be like the call-by-name evaluation studied earlier as it doesn’t evaluate the arguments at all. It gets the name weak, because it doesn’t reduce under abstractions and produces weak-head normal forms when reducing until no reductions can be made.

The first, and most important, rule called AppAbs is the β€œheart of the lambda calculus”. Let’s go over it thoroughly. The rule states that an application term (πœ†π‘₯.𝑑1)𝑑2 where the left-hand-side is an abstraction reduces to the substitution of π‘₯ by 𝑑2 in 𝑑1 which is the term [π‘₯↦𝑑2]𝑑1. Here it is written as an inference base rule without premises:

AppAbs(πœ†π‘₯.𝑑1)𝑑2⟢[π‘₯↦𝑑2]𝑑1 ExampleEach of the following is a valid instance of the AppAbs rule.(πœ†π‘₯.π‘₯)(πœ†π‘§.𝑧)βŸΆπœ†π‘§.𝑧(πœ†π‘₯.π‘₯)(πœ†π‘§.𝑧)βŸΆπœ†π‘¦.𝑦(=π›Όπœ†π‘§.𝑧)(πœ†π‘₯.πœ†π‘¦.π‘₯)(πœ†π‘¦.πœ†π‘§.𝑦)βŸΆπœ†π‘¦.πœ†π‘¦.πœ†π‘§.𝑦(πœ†π‘₯.π‘₯)((πœ†π‘₯.π‘₯)𝑦)⟢(πœ†π‘₯.π‘₯)𝑦(πœ†π‘₯.πœ†π‘¦.π‘₯)(𝑧𝑦)βŸΆπœ†π‘¦β€².𝑧𝑦(πœ†π‘₯.π‘₯𝑦)(πœ†π‘§.𝑧)⟢(πœ†π‘§.𝑧)𝑦

Notice that the AppAbs rule allows us to only perform one substitution at a time, so e.g.

(πœ†π‘₯.π‘₯𝑦)(πœ†π‘§.𝑧)⟢(πœ†π‘§.𝑧)π‘¦βŸΆπ‘¦

but (πœ†π‘₯.π‘₯𝑦)(πœ†π‘§.𝑧)βŸΆπ‘¦ is not a 𝛽-reduction step directly.

Next, we explore the other single-step 𝛽-reduction rules. These rules are like β€œadapters” that let us reduce subterms. For 𝛽-reduction to do anything useful, we need a way to reduce the left subterm of an application:

𝑑1βŸΆπ‘‘β€²1App1𝑑1𝑑2βŸΆπ‘‘β€²1𝑑2

This is because the only other rule AppAbs requires an abstraction as the left subterm of an application. Reducing the right subterm does not get us any closer to being able to use AppAbs.

One can think of AppAbs as the rule that drives progress. Every other rule is there to get us closer to using AppAbs or to simplify the lambda term to a nicer form.

Loading Exercise...

Loading Exercise...

Loading Exercise...

Now that you hopefully found a working implementation, the following quizzes will probe your understanding a bit further. A key detail to unlock the correct answers is to think about the relationship between the AppAbs and App1 rules. For example, is it possible for 𝑑1 to be an abstraction in the premise 𝑑1βŸΆπ‘‘β€²1 of App1?


Loading Exercise...

Loading Exercise...

Footnotes

Footnotes

  1. See for example Sestoft: Demonstrating Lambda Calculus Reduction or Wikipedia: reduction strategy. ↩ ↩2