STLC Implementation
Learning Objectives
- You can map STLC typing rules to Rust enums and helper methods.
- You can implement type inference for STLC with booleans using a typing context.
- You can implement deterministic call-by-value reduction for STLC terms.
Introduction
In the previous chapter, we studied the simply typed lambda calculus with booleans from a mathematical perspective. We defined hypothetical typing which gave us a rich framework to define how typing works in the language.
This chapter will then focus on writing an actual implementation of the simply typed lambda calculus at the abstract syntax tree level. We wonβt write a concrete parser for the language to keep things focused on whatβs important.
As a preview of what we are about to build, here is a snippet of code that tests that the function has type .
let term = t!(fun x : Bool, x);
assert_eq!(term.infer_type(), Some(ty!(Bool -> Bool)));
The code uses the macros t! and ty! to construct terms and types of STLC using the familiar syntax of .
Before diving in to the implementation, please review the typing rules of STLC with booleans. While reading the implementation, make sure you understand the connection between the rules and the code. In the following chapters you will practice translating various typing and evaluation rules to a Rust implementation.
Type Inference Implementation
We begin by defining the Term and Type enums in Rust.
#[derive(Debug, Clone, PartialEq, Eq)]
/// A term of the simply typed lambda calculus with booleans.
pub enum Term {
Var(String),
Abs {
var: String,
ty: Type,
body: Box<Term>,
},
App(Box<Term>, Box<Term>),
// Boolean extension
True,
False,
Ite {
cond: Box<Term>,
if_true: Box<Term>,
if_false: Box<Term>,
},
}
#[derive(Debug, Clone, PartialEq, Eq)]
/// A type in simply typed lambda calculus with booleans.
pub enum Type {
Bool,
Arrow(Box<Type>, Box<Type>),
}
Next we add type checking for booleans.
The method Term::bool_ty checks if the term is True or False and returns Some(Bool) if it is, and None otherwise.
use Term::*;
use Type::*;
impl Term {
/// Typing rules for [`Term::True`] and [`Term::False`]
pub fn bool_ty<'t>(&'t self, _ctx: &mut Vec<(&'t str, &'t Type)>) -> Option<Type> {
match self {
True | False => Some(Bool),
_ => None,
}
}
}
Here is a reference for the typing rules for you to compare with the implementation.
The method Term::ite_ty infers the type of an Ite term.
Please carefully compare the code to the rule presented in the info box above to make sure you understand the connection between the mathematical description and the implementation.
/// Typing rule for [`Term::Ite`]
pub fn ite_ty<'t>(&'t self, ctx: &mut Vec<(&'t str, &'t Type)>) -> Option<Type> {
match self {
Ite {
cond,
if_true,
if_false,
} => {
if cond.infer_type_ctx(ctx)? != Bool {
return None;
}
let ty = if_true.infer_type_ctx(ctx)?;
if ty == if_false.infer_type_ctx(ctx)? {
Some(ty)
} else {
None
}
}
_ => None,
}
}
Next, the typing rules for core STLC terms are implemented in the following methods.
Term::var_tysearches the context in reverse for the variable name to get its type.Term::app_tychecks that the type oft1is an arrow type and its domain matches the type oft2.Term::abs_tyis the only place where the context is directly mutated. It pushes the binding variable and its type to the context, infers the type of the body and returns the arrow type with the domain and codomain.
/// Typing rule for [`Term::Var`]
pub fn var_ty<'t>(&'t self, ctx: &mut Vec<(&'t str, &'t Type)>) -> Option<Type> {
match self {
Var(name) => ctx
.iter()
.rfind(|(n, _)| n == name)
.map(|(_, ty)| (*ty).clone()),
_ => None,
}
}
/// Typing rule for [`Term::App`]
pub fn app_ty<'t>(&'t self, ctx: &mut Vec<(&'t str, &'t Type)>) -> Option<Type> {
match self {
App(t1, t2) => {
let t1_ty = t1.infer_type_ctx(ctx)?;
let t2_ty = t2.infer_type_ctx(ctx)?;
match t1_ty {
Arrow(dom, cod) => {
if *dom == t2_ty {
Some(*cod)
} else {
None
}
}
_ => None,
}
}
_ => None,
}
}
/// Typing rule for [`Term::Abs`]
pub fn abs_ty<'t>(&'t self, ctx: &mut Vec<(&'t str, &'t Type)>) -> Option<Type> {
match self {
Abs { var, ty, body } => {
ctx.push((var, ty));
let body_ty = body.infer_type_ctx(ctx);
ctx.pop();
let body_ty = body_ty?;
Some(Arrow(Box::new(ty.clone()), Box::new(body_ty)))
}
_ => None,
}
}
The way these are used in practice is to try one, and if it fails try the next.
The method Term::infer_type_ctx implements the hypothetical typing rules as a function in exactly this way.
Using this, Term::infer_type can check if the term is well-typed in the empty context.
pub fn infer_type_ctx<'t>(&'t self, ctx: &mut Vec<(&'t str, &'t Type)>) -> Option<Type> {
self.var_ty(ctx)
.or_else(|| self.app_ty(ctx))
.or_else(|| self.abs_ty(ctx))
.or_else(|| self.bool_ty(ctx))
.or_else(|| self.ite_ty(ctx))
}
pub fn infer_type(&self) -> Option<Type> {
let mut ctx = Vec::new();
self.infer_type_ctx(&mut ctx)
}
The complete code for type checking part of STLC is in the following editor.
At the bottom of the file, two macros
ty!andt!are provided to help write terms and types using syntax.
Call-By-Value with NaΓ―ve Substitution
As mentioned in the previous chapter, as long as we check types before evaluating and donβt reduce under abstractions, substitution is always legal so the naΓ―ve algorithm is sufficient to implement dynamics for STLC.
Instead of defining -reduction for STLC, we define call-by-value (CBV) evaluation, denoted .
This is not as simple as just dropping the Abs rule and calling it a day β we need to fix the evaluation order starting from the CBV-reduction rules.
Another feature of call-by-value evaluation is the fact that we donβt need three Ite rules, but just the one Ite1 suffices.
When we implemented step for untyped lambda calculus, the order of the App1, App2 and AppAbs rules (as well as other rules) was an unimportant implementation detail, and had little impact on the result due to the Church-Rosser (confluence) property.
For example, the order could be as follows: first attempt to use the AppAbs rule (even if the argument hasnβt been reduced) and if it doesnβt apply reduce the function-position with App1, and if neither applies, reduce the argument position with App2:
pub fn step(&self) -> Option<Self> {
self.app_abs()
.or_else(|| self.app1())
.or_else(|| self.app2())
// ...
}
Because the order in which these rules are applied is (on purpose) not defined as part of the -reduction, terms like can take multiple reduction steps (App1 or App2).
This makes -reduction nondeterministic, i.e. a nonfunctional relation.
Determinism matters for various reasons: for example it makes the evaluation steps easier to analyze, as the sequence of steps a term can take is not a graph with branches, but a straight line instead. Also, if the language has side effects like printing or tracing, the order of messages is unique and well-defined. Determinism is a good design principle in general β having functional (i.e. deterministic) relations is a requirement to faithfully translate the relation into e.g. Rust code.
Canonical Forms and Values
In practice, the effects of dropping the Abs rule and switching to call-by-value semantics can only be seen when reducing a term which has an arrow type.
If we reduce a term whose type is Bool, the result is always true or false.
These are called the canonical forms or values of the boolean type.
Notably, an application or an if-then-else is never a value as it can be reduced with or call-by-value reduction.
Values play an important role in the dynamics of functional programming languages.
What it means for a term of arrow type to be in canonical form is more subtle. If we require -normal form, then the body of an abstraction must also be normalized, which requires capture-avoiding substitution. Therefore we consider all abstractions to be values regardless of whether the body is normalized or not. We prefer to use the word value from now on to avoid confusing the concept with normal forms.
By adding βside conditionsβ, we can define the call-by-value semantics so that there is at most one CBV-reduction rule that can be applied for any given term. We use the letter to denote a value, and write this as .
Values now coincide the set of well-typed terms that canβt be reduced further using the call-by-value reduction rules.
In the next example, we implement the is_value predicate thatβs used to check the side-conditions in app_abs and app2.
Because CBV is deterministic, self.app1(), self.app2() and self.app_abs() are mutually exclusive (at most one returns Some) β so the order in which they are attempted does not matter.
/// Returns whether the term is a value.
pub fn is_value(&self) -> bool {
match self {
True | False | Abs { .. } => true,
_ => false,
}
}
/// Applies the `AppAbs` rule returning None if it doesn't apply.
pub fn app_abs(&self) -> Option<Term> {
match self {
App(t1, t2) if t2.is_value() => match &**t1 {
Abs { var, body, .. } => {
// Check if t2 is well-typed to ensure it's closed.
debug_assert!(
t2.infer_type_ctx(&mut Vec::new()).is_some(),
"t2 is not well typed!"
);
Some(body.clone().subst(var, t2))
}
_ => None,
},
_ => None,
}
}
/// Applies the `App1` rule returning None if it doesn't apply.
pub fn app1(&self) -> Option<Term> {
match self {
App(t1, t2) => Some(App(Box::new(t1.step_cbv()?), t2.clone())),
_ => None,
}
}
/// Applies the `App2` rule returning None if it doesn't apply.
pub fn app2(&self) -> Option<Term> {
match self {
App(t1, t2) if t1.is_value() => Some(App(t1.clone(), Box::new(t2.step_cbv()?))),
_ => None,
}
}
/// Does a call-by-value reduction step returning None if no reduction rule applies.
pub fn step_cbv(&self) -> Option<Term> {
// The order in which the rules are attempted doesn't matter
self.app1()
.or_else(|| self.app2())
.or_else(|| self.app_abs())
// ...
}
Notice that also
Ite1,IteTrueandIteFalseare mutually exclusive (and exclusive withApp1,App2,AppAbs), so their order does not matter either. CBV reduction rules for most language features are in fact mutually exclusive with one another.
We can now state the progress theorem for call-by-value semantics.
STLC With CBV Implementation
The following code extends the type checking code with call-by-value semantics.
Substitution (Term::subst) is implemented without capture-avoidance.
The Term::app_abs method also contains a debug_assert! to ensure t2 is closed, which is the pre-condition to call subst.