Programming Language Foundations

Evaluation and Big-Step Semantics


Learning Objectives

  • You understand the difference between evaluation as a function and evaluation as a relation.
  • You know what big-step semantics means and how it differs from small-step semantics.
  • You can define evaluation rules using inference rules.
  • You can construct evaluation trees to show how terms evaluate to values.

Evaluation and Reduction

Evaluation is the process of converting an expression in a programming language to some kind of a value according to some evaluation strategy. If instead of expressions we evaluate terms, and value are also terms, the evaluation process is also called reduction. We will learn about how reduction works in lambda calculus in the next part. For now we will consider values to be separate from terms.

As evaluation takes a term to a value, we typically call it a function. A significant and interesting subset of evaluation relations are not functions: they can be nondeterministic or non-total. For example a random number generator that generates random coin flips is nondeterministic, and therefore not functional — but it’s still a relation and can be reasoned about in a mathematical way.

In this chapter we will observe non-totality when defining how certain kinds of “invalid computations” evaluate. Therefore from now on we use the term evaluation relation rather than function.

Evaluation relations are key in the analysis of the dynamics of programming languages.

Reminder: the word dynamics refers to how the language behaves during evaluation or execution — it is unrelated to the concept of dynamic typing or dynamic programming.

Big-Step semantics

Programming languages where evaluation produces a value immediately from a term have so called big-step semantics. “Big-step” refers to the property of evaluation where a computation is done all at once, in one (usually recursive) function call.

For example the calculate function from Chapter 2: Calculator Semantics

calculate:AST

evaluates the entire term (represented by the AST in this case) to an integer value in one function call by evaluating the subterms recursively:

calculate(Num(𝑛))=𝑛calculate(Add(𝑙,𝑟))=calculate(𝑙)+calculate(𝑟)calculate(Sub(𝑙,𝑟))=calculate(𝑙)calculate(𝑟)calculate(Mul(𝑙,𝑟))=calculate(𝑙)calculate(𝑟)

Therefore the calculator has big-step semantics — the user only needs to press the calculate button once.

In contrast, small-step semantics describes the dynamics of a language as state transitions or individual reductions. In small-step semantics, instead of describing the set of values separately (like ), values are represented by a subset of the states or terms.

Small-step semantics is explored in more detail in the next part of the course.

Next, let’s look at how to define an evaluation relation using inference rules.

We will analyze the dynamics of the calculator, i.e. the calculate function using an evaluation relation denoted 𝑡𝑣 where 𝑡 is a term and 𝑣 is a value. For now, it suffices to identify terms with the abstract syntax trees, but we will see later that we would like to extend terms to be more than just the AST.

The definition of the relation for the calculator is as follows:

DefinitionCalculator Evaluation ()The binary relation for the calculator is defined as follows.If 𝑡 is a term (an AST) and 𝑣, then𝑡𝑣if and only ifcalculate(𝑡)=𝑣

Let’s look at a couple of examples.

ExampleThe following statements are true:Num(1)1Add(Num(1),Num(2))3Sub(Num(1),Mul(Num(2),Num(3)))5Num(𝑛)𝑛for all𝑛Num(𝑛)𝑛1Num(𝑛)𝑛2𝑛1=𝑛2

The last statement is interesting, as it’s an immediate consequence of functionality of the relation, which we get from the fact that calculate is a function.

Evaluation Rules

Next, let’s give an alternative definition of without using calculate directly. To do this we use inference rules where judgments have the form 𝑡𝑣 (i.e. 𝑡 evaluates to 𝑣) where 𝑡 is a term and 𝑣 is a value. In our case terms are AST trees and values are just integers.

The calculate function is defined using four equations, so we should be able to define the evaluation relation with three rules and one (set of) “axioms”. First, the “axiom” which states that: for each 𝑛 we have

Num(𝑛)𝑛

To avoid confusion, instead of calling the above an axiom, we call it a base rule (of the relation we are defining). The base rule can be equivalently expressed as an inference rule without premises.

NumNum(𝑛)𝑛

Whether or not one draws the line above the base rule is only a matter of preference.

Specializing the rule for concrete values of 𝑛 gives us all of the following valid derivations.

Num(1)1Num(0)0Num(1)1
Loading Exercise...

Next, the evaluation rule for Add. It should state that Add(𝑙,𝑟) evaluates to the sum of what 𝑙 and 𝑟 evaluate to. In the definition of calculate this was expressed with recursion:

calculate(Add(𝑙,𝑟))=calculate(𝑙)+calculate(𝑟)

So the natural translation to get an evaluation rule is to first give a name for each value:

calculate(Add(𝑙,𝑟))?==calculate(𝑙)𝑛𝑙++calculate(𝑟)𝑛𝑟

We observe, that to get ? we need to know 𝑛𝑙 and 𝑛𝑟 which are results of evaluation. As evaluation is not a function in general, we need the assumptions 𝑙𝑛𝑙 and 𝑟𝑛𝑟. Now, putting things together, we get the evaluation rule

𝑙𝑛𝑙𝑟𝑛𝑟AddAdd(𝑙,𝑟)𝑛𝑙+𝑛𝑟

Let’s look at a couple of examples next. First, let’s prove that "1+1=2", i.e. Add(Num(1),Num(1))2. A direct proof (as opposed to an indirect proof) starts from the base rules and hypotheses (which we have none).

First, we observe that Num(1)1 is an instance of the base rule. Now, applying the Add rule to this, we get

Num(1)1Num(1)1AddAdd(Num(1),Num(1))1+1

whose conclusion, Add(Num(1),Num(1))1+1, is what we tried to prove.

Even though they are syntactically different, 1+1 on right hand side is equal to 2 as values.

Concrete Syntax in Rules

As the language of prefix arithmetic has a concrete syntax, we often find it more convenient to use that in the evaluation rules. For example we would write + 1 2 instead of Add(Num(1),Num(2)). Even though in reality, the evaluation rules are not concerned with the concrete syntax, but the terms (or AST) instead, the concrete syntax is easy for humans to read — AST on the otherhand is easy for the computer to read.

This notation comes with the cost of ambiguity as Num(𝑛)𝑛 would be written just n𝑛. Here one has to know the type signature of the relation beforehand to know that on the left side we have terms and on the right side we have integers, and these two are completely different.

Now, the base rule becomes deceptively simple:

n𝑛

And the Add rule can be shortened quite a bit:

l𝑛𝑙r𝑛𝑟Add+ l r𝑛𝑙+𝑛𝑟

One has to always keep in mind the type of the left and right hand side when working with relations.

The evaluation rules for Sub and Mul are similar to the Add rule.

l𝑛𝑙r𝑛𝑟Sub- l r𝑛𝑙𝑛𝑟l𝑛𝑙r𝑛𝑟Mul* l r𝑛𝑙𝑛𝑟

It’s very important to note that these are all the rules for evaluation (for this language), or the evaluation relation is the closed under these rules. This means that we can say exactly which rule was/is used during evaluation.

As a final note about the logic involved, we observe that the following derivation is an instance of the Add rule with l1,r=2,𝑛𝑙=5,𝑛𝑟=8

1528+ 1 25+8

However, the derivation is not a valid derivation of the statement because the premises are not instances of the base rule.

Ex falso quodlibet, is the principle of logic that states that anything can be proven from a contradiction. Using this principle we must have that the conclusion follows from the premises, as the premises are false in the first place.

Loading Exercise...

Evaluation Trees

As we learned in the previous chapter, a proof tree consists of applications of inference rules such that the conclusion is at the root of the tree and the leaves are just instances of axioms (or a base rule). A proof tree shows the complete picture of every rule used, and is easily verified, especially if the inference lines are named.

Likewise, an evaluation tree is the derivation tree consisting of evaluation rules.

Let’s walk through an example. The term whose evaluation tree we are interested in is + 1 * 2 3 which has the AST Add(Num(1),Mul(Num(2),Num(3))).

We build up the tree starting from the leaves. From the base rule, we get

11,22and33

Which we can combine to get

2233Mul* 2 323

And finally the complete proof

112233Mul* 2 323Add+ 1 * 2 31+23

As 1+23=7 we have a valid derivation of + 1 * 2 37.

The reason every step worked out so smoothly was because of the fact that our relation was defined using a function (a functional and total relation). This fact made every subtree uniquely determined (by functionality) and ensured there was at least one evaluation rule we could use to progress the subtree (totality).

Loading Exercise...

Loading Exercise...

Computation That Fails

As mentioned at the beginning of this chapter, evaluation being a relation makes it possible to define it such that some terms don’t evaluate to a value at all. We first add a fallible operation, such as integer division, to the language. It’s syntax is expressed using the following formation tree:

𝑡𝒯𝑢𝒯/𝑓/ t u𝒯

If we try to add the following evaluation rule

l𝑛𝑙r𝑛𝑟Div/ l r𝑛𝑙/𝑛𝑟

we get a “mathematical type error” 𝑛𝑟 must not be 0, because division by zero is not defined.

Notice that 𝑛𝑙/𝑛𝑟 means integer division (calculating the quotient) in this context. This is also known as Euclidean division.

Therefore we need an extra assumption, which we write as just 𝑛𝑟0 above the inference line:

l𝑛𝑙r𝑛𝑟𝑛𝑟0Div/ l r𝑛𝑙/𝑛𝑟

As an example, here’s the evaluation tree of / 2 + 1 1:

223344Add+ 3 43+43+40Div/ 2 + 3 42/(3+4)

Now as one of the subderivations is a mathematical statement outside the rules of evaluation, we can borrow axioms from mathematics, like the Peano axioms described in the previous chapter. By appealing to Peano axiom 1, we get

3+4=7=𝑆(𝑆(𝑆(𝑆(𝑆(𝑆(𝑆(0)))))))0

Stuck terms

The next interesting subject to discuss is how should we deal with evaluating terms like / 1 0. Our evaluation rules, and by extension, the evaluation relation, associate no value with such terms. These terms are called stuck as they cannot be evaluated.

Reminder (progress theorem/property): every term that is not already a value can evaluate to another term, i.e. terms never get “stuck”.

Our language now fails to have “progress”, which was a key component of type safety. As division is such a useful operation, losing type safety over that seems unfortunate.

To understand more why it’s an issue, consider the following term: * 0 / 1 0, or 010 in more mathematical style. Now this term is also stuck, as it would have to divide by zero to continue. An incorrectly done optimization might however simplify the multiplication by zero to just zero, which is not stuck!

This is one of the motivating reasons why we also need to model errors in our evaluation. A key requirement of a correct optimizing compiler is that it must not optimize away any code that produces an error!

Loading Exercise...

Error Values

To model an error that can happen during evaluation, we simply add an Error value which may additionally contain some error message.

In Rust, this is typically done by wrapping the type in a Result. In the next exercise you get to practice exactly that.

Loading Exercise...

As Result is a predefined type, we might prefer to create our own Value type to e.g. overload operations in any way we like. This is what you will practice in the next exercise.

Loading Exercise...