Alpha Equivalence and Capture-Avoiding Substitution
Learning Objectives
- You understand alpha-equivalence and when terms are equivalent up to renaming of bound variables.
- You can apply the formal inference rules to determine if two terms are alpha-equivalent.
- You understand the variable capture problem and how capture-avoiding substitution solves it.
- You can implement variable renaming and capture-avoiding substitution in code.
Introduction
In the previous chapter, we defined a naΓ―ve substitution operation and noticed it had some problems when it came to free variables in .
Instead of denying illegal substitutions, we will implement an algorithm that performs the substitution while carefully renaming variables so that every substitution becomes legal. This is called capture-avoiding substitution and is key to making lambda calculus work (when the substituted value is not closed).
If the substituted value is closed, substitution is always legal as the set of free variables in the substitute is empty.
We start with what it means to rename variables. As a reminder, bound variables are the names bound by abstractions in the term, whereas free variables are those which are not bound by any abstraction.
To get warmed up, hereβs an exercise for computing the set of bound/binding variables of a term denoted .
Alpha Equivalence
Consider these two definitions for the function .
Clearly, by mathematical intuition, they are the same definition, because the names of bound variable occurrences shouldnβt make a difference. Now, consider the following mathematical expressions.
Again, the expressions are the same up to renaming of bound variables. This kind of equivalence between terms (in lambda calculus but also more generally) is called -equivalence and denoted using .
Renaming a free variable, however, is not allowed to creat an -equivalent term. For example is not the same as .
This is because one should always be aware of the surrounding context. In this case, the surrounding context likely binds the or variable, for example using another sum (or using a quantifier like ). One has to also rename the binding variable to get an -equivalent term.
As another example, bound variables should not be renamed to ones that occur free inside the term.
Here are some examples of alpha-equivalence in lambda calculus.
In the last example, and are different free variables, so the terms are not -equivalent.
Alpha-Equivalence Rules
As -equivalence is a relation, we can define it conveniently using inference rules. However, itβs not as simple as defining the rules inductively it would be impossible to keep track of the context.
We take the approach presented in chapter 3 of the article A simple formalization of alpha-equivalence which defines a helper relation which also keeps track of the context as a list.
As usual, we start with the base rules. First, the helper relation has the following base rules:
The pair consists of the context (written as a list) and a variable. As the context is empty, it states that a free variable is -equivalent with itself. The authors use the convention that lists grow to the right (like a vector), and pushing an element to is denoted (which is reversed with regard to lists in Haskell).
Next, if and are lists of variables, then the following base rule says that if and were the latest (bound) variables introduced to the context, then they are -equivalent.
The context always grows to the right as we traverse both of the terms.
Next, the authors of the article write a rule called ThereVA which states that two variables bound earlier are -equivalent if they are -equivalent in the context without the latest variable.
This crucially requires that the latest bound variable in on each is not the same as the variables we are comparing. If and , then we can just use the second base rule instead.
Another subtle detail of
ThereVAis that both contexts become shorter by exactly one element. This means that for two free variables to be -equivalent, their contexts must have the same length.
This concludes the definition of the helper relation and we can use that in the definition of -equivalence β well almost. We also need to keep track of the context in the -equivalence relation, so define the relation over a pair consisting of a list of variables denoting the context and a lambda term. So letβs define -equivalence inductively over each term kind: variables, applications and abstractions.
When comparing variables and we use the -relation. If two variables are -equivalent, then they are also -equivalent in the respective contexts:
When comparing applications and (where are lambda terms) we simply compare the left and right terms:
Notably the context doesnβt change, as applications donβt introduce any new variables.
Finally, comparing an abstraction to needs that is -equivalent with when the left and right contexts have been updated with respectively:
AbsA is the most important rule to understand as this is the only place in the rules where the context grows by one.
Compare this with ThereVA, where the context shrinks by one.
The AbsA rule gives the list the meaning of the context (of bound variables).
As the left and right terms may use different variable names, we need to keep track of the context separately on both sides.
Here are some examples of -equivalence or -equivalence between variables. You can read the list as a sequence of lambdas to get an intuition. For example the pair represents the term .
From now on, when we say that two terms are -equivalent, we mean that they are -equivalent with regard to the empty context, i.e. and denote it using .
Next, your task is to implement in code the above relation for -equivalence between lambda terms.
Capture-Avoiding Substitution
We are finally ready to solve the substitution problem. The idea is to implement substitution in such a way that no free variables get captured and that the result is unique up to -equivalence.
Letβs look at some examples first. The following substitution is illegal and produces an incorrect term.
Instead, substitution should rename the bound βout of the wayβ to e.g. so that instead of , it results in . Clearly substitution needs some way to come up with new variables, but as we are not limited to characters and there are an infinite number of strings to choose from, we can just pick a βfreshβ string. To do this automatically requires some considerations β we want to ensure that renaming the bound variables stays -equivalent with the original.
Firstly, we need a mechanism to rename variables.
Now that we can rename variables, we can define capture-avoiding substitution, which is what we will refer to as from now on.
Notice the renaming of the body done in the last equation. When substituting into an abstraction, e.g. , the free variable must be picked so that it doesnβt conflict with or free variables in or any (free, bound or binding) variables in the body of the abstraction.
Notably must not equal . This is where Selinger makes a mistake in his lecture notes. He writes β should be chosen to be the least variable that does not occur in either or β but this does not account for the above situation.
Another important detail is that the fresh must not be in . For example, consider the substitution , as and we take the branch, where renaming needs to be done. If the freshness condition was without the (where ), the set of variables the fresh variable should not be is
choosing would be allowed. However, during renaming of to inside the body (), the following happens
The that was originally bound by the outer gets bound by the inner after the renaming, which is clearly incorrect.
In the next exercise, you are to implement capture-avoiding substitution.