Programming Language Foundations

Evaluation and Big-Step Semantics

Learning Objectives

You understand the difference between evaluation as a function and evaluation as a relation.
You know what big-step semantics means and how it differs from small-step semantics.
You can define evaluation rules using inference rules.
You can construct evaluation trees to show how terms evaluate to values.

Evaluation and Reduction

Evaluation is the process of converting an expression in a programming language to some kind of a value according to some evaluation strategy. If instead of expressions we evaluate terms, and value are also terms, the evaluation process is also called reduction. We will learn about how reduction works in lambda calculus in the next part. For now we will consider values to be separate from terms.

As evaluation takes a term to a value, we typically call it a function. A significant and interesting subset of evaluation relations are not functions: they can be nondeterministic or non-total. For example a random number generator that generates random coin flips is nondeterministic, and therefore not functional — but it’s still a relation and can be reasoned about in a mathematical way.

In this chapter we will observe non-totality when defining how certain kinds of “invalid computations” evaluate. Therefore from now on we use the term evaluation relation rather than function.

Evaluation relations are key in the analysis of the dynamics of programming languages.

Reminder: the word dynamics refers to how the language behaves during evaluation or execution — it is unrelated to the concept of dynamic typing or dynamic programming.

Big-Step semantics

Programming languages where evaluation produces a value immediately from a term have so called big-step semantics. “Big-step” refers to the property of evaluation where a computation is done all at once, in one (usually recursive) function call.

For example the calculate function from Chapter 2: Calculator Semantics

evaluates the entire term (represented by the AST in this case) to an integer value in one function call by evaluating the subterms recursively:

Therefore the calculator has big-step semantics — the user only needs to press the calculate button once.

In contrast, small-step semantics describes the dynamics of a language as state transitions or individual reductions. In small-step semantics, instead of describing the set of values separately (like ), values are represented by a subset of the states or terms.

Small-step semantics is explored in more detail in the next part of the course.

Next, let’s look at how to define an evaluation relation using inference rules.

We will analyze the dynamics of the calculator, i.e. the calculate function using an evaluation relation denoted where is a term and is a value. For now, it suffices to identify terms with the abstract syntax trees, but we will see later that we would like to extend terms to be more than just the AST.

The definition of the relation for the calculator is as follows:

Let’s look at a couple of examples.

The last statement is interesting, as it’s an immediate consequence of functionality of the relation, which we get from the fact that calculate is a function.

Evaluation Rules

Next, let’s give an alternative definition of without using calculate directly. To do this we use inference rules where judgments have the form (i.e. evaluates to ) where is a term and is a value. In our case terms are AST trees and values are just integers.

The calculate function is defined using four equations, so we should be able to define the evaluation relation with three rules and one (set of) “axioms”. First, the “axiom” which states that: for each we have

To avoid confusion, instead of calling the above an axiom, we call it a base rule (of the relation we are defining). The base rule can be equivalently expressed as an inference rule without premises.

Whether or not one draws the line above the base rule is only a matter of preference.

Specializing the rule for concrete values of gives us all of the following valid derivations.

Loading Exercise...

Next, the evaluation rule for . It should state that evaluates to the sum of what and evaluate to. In the definition of calculate this was expressed with recursion:

So the natural translation to get an evaluation rule is to first give a name for each value:

We observe, that to get we need to know and which are results of evaluation. As evaluation is not a function in general, we need the assumptions and . Now, putting things together, we get the evaluation rule

Let’s look at a couple of examples next. First, let’s prove that "", i.e. . A direct proof (as opposed to an indirect proof) starts from the base rules and hypotheses (which we have none).

First, we observe that is an instance of the base rule. Now, applying the rule to this, we get

whose conclusion, , is what we tried to prove.

Even though they are syntactically different, on right hand side is equal to as values.

Concrete Syntax in Rules

As the language of prefix arithmetic has a concrete syntax, we often find it more convenient to use that in the evaluation rules. For example we would write + 1 2 instead of . Even though in reality, the evaluation rules are not concerned with the concrete syntax, but the terms (or AST) instead, the concrete syntax is easy for humans to read — AST on the otherhand is easy for the computer to read.

This notation comes with the cost of ambiguity as would be written just . Here one has to know the type signature of the relation beforehand to know that on the left side we have terms and on the right side we have integers, and these two are completely different.

Now, the base rule becomes deceptively simple:

And the rule can be shortened quite a bit:

One has to always keep in mind the type of the left and right hand side when working with relations.

The evaluation rules for and are similar to the rule.

It’s very important to note that these are all the rules for evaluation (for this language), or the evaluation relation is the closed under these rules. This means that we can say exactly which rule was/is used during evaluation.

As a final note about the logic involved, we observe that the following derivation is an instance of the rule with

However, the derivation is not a valid derivation of the statement because the premises are not instances of the base rule.

Ex falso quodlibet, is the principle of logic that states that anything can be proven from a contradiction. Using this principle we must have that the conclusion follows from the premises, as the premises are false in the first place.

Loading Exercise...

Evaluation Trees

As we learned in the previous chapter, a proof tree consists of applications of inference rules such that the conclusion is at the root of the tree and the leaves are just instances of axioms (or a base rule). A proof tree shows the complete picture of every rule used, and is easily verified, especially if the inference lines are named.

Likewise, an evaluation tree is the derivation tree consisting of evaluation rules.

Let’s walk through an example. The term whose evaluation tree we are interested in is + 1 * 2 3 which has the AST .

We build up the tree starting from the leaves. From the base rule, we get

Which we can combine to get

And finally the complete proof

As we have a valid derivation of .

The reason every step worked out so smoothly was because of the fact that our relation was defined using a function (a functional and total relation). This fact made every subtree uniquely determined (by functionality) and ensured there was at least one evaluation rule we could use to progress the subtree (totality).

Loading Exercise...

Computation That Fails

As mentioned at the beginning of this chapter, evaluation being a relation makes it possible to define it such that some terms don’t evaluate to a value at all. We first add a fallible operation, such as integer division, to the language. It’s syntax is expressed using the following formation tree:

If we try to add the following evaluation rule

we get a “mathematical type error” must not be , because division by zero is not defined.

Notice that means integer division (calculating the quotient) in this context. This is also known as Euclidean division.

Therefore we need an extra assumption, which we write as just above the inference line:

As an example, here’s the evaluation tree of / 2 + 1 1:

Now as one of the subderivations is a mathematical statement outside the rules of evaluation, we can borrow axioms from mathematics, like the Peano axioms described in the previous chapter. By appealing to Peano axiom 1, we get

Stuck terms

The next interesting subject to discuss is how should we deal with evaluating terms like / 1 0. Our evaluation rules, and by extension, the evaluation relation, associate no value with such terms. These terms are called stuck as they cannot be evaluated.

Reminder (progress theorem/property): every term that is not already a value can evaluate to another term, i.e. terms never get “stuck”.

Our language now fails to have “progress”, which was a key component of type safety. As division is such a useful operation, losing type safety over that seems unfortunate.

To understand more why it’s an issue, consider the following term: * 0 / 1 0, or in more mathematical style. Now this term is also stuck, as it would have to divide by zero to continue. An incorrectly done optimization might however simplify the multiplication by zero to just zero, which is not stuck!

This is one of the motivating reasons why we also need to model errors in our evaluation. A key requirement of a correct optimizing compiler is that it must not optimize away any code that produces an error!

Loading Exercise...

Error Values

To model an error that can happen during evaluation, we simply add an Error value which may additionally contain some error message.

In Rust, this is typically done by wrapping the type in a Result. In the next exercise you get to practice exactly that.

Loading Exercise...

As Result is a predefined type, we might prefer to create our own Value type to e.g. overload operations in any way we like. This is what you will practice in the next exercise.

Loading Exercise...

← Theory of Programming Languages

Booleans and Conditional Computation →