Programming Language Design 2

The Simply Typed Lambda Calculus


Learning Objectives

  • You know what base types are.
  • You know of the hypothetical typing judgment.
  • You understand the typing rules of simply typed lambda calculus.
  • You know how the simply typed lambda calculus behaves with only one base type: Boolean.

Base Types

So far, we haven’t seen any base types, such as integers or booleans in the lambda calculus. That is on purpose, the theory of lambda calculus is mostly independent of base types, and even if we had integers or booleans, those could always be translated to abstractions with Church encodings. One way to say this is that integers and booleans are admissible, i.e. all properties of lambda calculus still hold even if we add more base types.

Not having integers or booleans is not a property…

The Simply Typed Lambda Calculus

The simply typed lambda calculus (STLC) is an extension of the untyped lambda calculus where each valid term has a type1 and abstractions must specify the type of their quantified variable.

Term≔VarString|App(Term,Term)|Abs(String,Type,Term)Type≔Arrow(Type,Type)

The Type highlighted in red is the only difference between simply typed and untyped lambda calculus.

However, in order for the simply typed lambda calculus to be useful it needs to have at least one base type, such as integers or booleans.

Booleans

Next, we will introduce booleans as part of the untyped lambda calculus. Then we can begin to analyze and define behavior for booleans from an untyped, and later from the typed point of view.

Term≔Var(String)|Abs(var:String,body:Term)|App(Term,Term)|Let(var:String,val_t:Term,body:Term)|True|False|Ite(cond:Term,if_true:Term,if_false:Term)

The syntax for writing an Ite(cond:𝑑𝑐,if_true:𝑑1,if_false:𝑑2) is if𝑑𝑐then𝑑1else𝑑2.

Because the set of terms have changed, we have to extend the 𝛽-reduction, otherwise all the if-then-else term is stuck, i.e. a non-reducible non-value. One of the benefits of expressing relations using inference rules is that we can just extend the rules ad-hoc. So let’s add the following evaluation rules to the new boolean lambda calculus.

IfTrue(ifTruethen𝑑1else𝑑2)βŸΆπ‘‘1IfFalse(ifFalsethen𝑑1else𝑑2)βŸΆπ‘‘2π‘‘π‘βŸΆπ‘‘β€²π‘If(if𝑑𝑐then𝑑1else𝑑2)⟢(if𝑑′𝑐then𝑑1else𝑑2)

Note that 𝑑1 and 𝑑2 can be non-value terms. Now, the complete set of evaluation rules for the untyped lambda calculus with let expressions and booleans is App1,App2,AppAbs,Let1,Let,IfTrue,IfFalse,If.

Make sure you know the details of how evaluation of if-then-else works by asking these questions from yourself:

  1. What happens if the condition of an if-then-else is not a value?
  2. If the condition evaluates to false, is the true branch reduced?
  3. If the condition is a value but not a boolean, what happens?
  4. What happens if the branches are the same term?
Loading Exercise...
Loading Exercise...
Loading Exercise...
Loading Exercise...

Church Booleans

For example, the Church encoding of booleans is done by representing them with abstractions of abstractions:

  • True is πœ†π‘₯.πœ†π‘¦.π‘₯
  • False with πœ†π‘₯.πœ†π‘¦.𝑦

Then, we can represent the if-then-else construction for three church booleans 𝑑𝑐,𝑑if,𝑑else with

𝑑𝑐 𝑑if 𝑑else

To analyze how this works, let’s assume 𝑑𝑐 is a Church encoded boolean and analyze both values:

  1. 𝑑𝑐=πœ†π‘₯.πœ†π‘¦.π‘₯. Now, we can see that the term 𝑑𝑐 𝑑if 𝑑else reduces to 𝑑if:

    𝑑𝑐 𝑑if 𝑑else=(πœ†π‘₯.πœ†π‘¦.π‘₯) 𝑑if 𝑑elseβŸΆπ‘‘if
  2. 𝑑𝑐=πœ†π‘₯.πœ†π‘¦.𝑦. Now, we can see that the term 𝑑𝑐 𝑑if 𝑑else reduces to 𝑑else:

    𝑑𝑐 𝑑if 𝑑else=(πœ†π‘₯.πœ†π‘¦.𝑦) 𝑑if 𝑑elseβŸΆπ‘‘else

Notice that the 𝑑if and 𝑑else don’t even need to be booleans.

It would be easy to define the following type for a lambda calculus with booleans

#[derive(Debug, Clone)]
pub enum TermSugar {
    // Booleans
    True,
    False,
    Ite(Box<TermSugar>, Box<TermSugar>, Box<TermSugar>),

    Var(String),
    Abs { var: String, body: Box<TermSugar> },
    App(Box<TermSugar>, Box<TermSugar>),
}

Then we could simply translate the sugared terms back to plain lambda terms using the Church encoding.

With understanding of booleans in the untyped calculus, we are ready to add types on top. The reason we added a base type is that plain STLC has no finite types1, so it would be impossible to write a well-typed term with a finite type.

Infinite terms and types are outside the scope of this course.

πœ†π‘₯:Boolean.π‘₯

This lambda term has type Boolean→Boolean (or formally Arrow(Boolean,Boolean)). The arrow (→) type is used to denote the type of abstractions.

The following function would also be the identity function, but for Boolean→Boolean functions:

πœ†π‘₯:Booleanβ†’Boolean.π‘₯

The type of this identity function is (Boolean→Boolean)→Boolean→Boolean

Definition 6.5.1: Terms and types of STLC+𝟚The set of terms STLC+𝟚, which is referred to as Term during the rest ofthis chapter, is defined as follows:Term≔VarString|App(Term,Term)|Abs(String,Type,Term)|Let(var:String,val_t:Term,body:Term)|True|False|Ite(Term,Term,Term)The set of types of STLC+𝟚, referred to as Type, is defined as follows:Type≔Arrow(Type,Type)|BooleanIn addition, we use the notation πœ†π‘₯:𝑇.𝑑 to denote the term Abs("x",𝑇,𝑑)and 𝟚 to denote the type Boolean. Other notation for Terms (let-expressions and booleans) is inherited from the untyped lambda calculus.

We also refer to free and bound variables in STLC by considering the definitions 6.3.1-3 after erasing types from STLC terms.

Type Erasure (6.5.2-3)

Types play no significance during evaluation of statically typed programming languages, like those based on STLC, so we can lower e.g. STLC+𝟚 to its untyped variant. This step can be done to ensure that types are not used during evaluation.

Definition 6.5.2: πœ†+𝟚The terms of πœ†+𝟚 are defined as follows.Term≔VarString|App(Term,Term)|Abs|Let(var:String,val_t:Term,body:Term)|True|False|Ite(Term,Term,Term)Notation is defined in the usual style.Definition 6.5.3: Lowering STLC+𝟚 to πœ†+𝟚The type erasure of a term 𝑑:STLC+𝟚, denoted βŸ¦π‘‘βŸ§:πœ†+𝟚, is defined asfollows.⟦π‘₯⟧=π‘₯,π‘₯is a variableβŸ¦π‘‘1 𝑑2⟧=βŸ¦π‘‘1⟧ βŸ¦π‘‘2βŸ§βŸ¦πœ†π‘₯:𝑇.π‘‘βŸ§=πœ†π‘₯.βŸ¦π‘‘βŸ§,Note: forgets/erasesπ‘‡βŸ¦letπ‘₯=𝑣inπ‘‘βŸ§=letπ‘₯=𝑣inπ‘‘βŸ¦True⟧=True⟦False⟧=False⟦if𝑑𝑐then𝑑1else𝑑2⟧=ifβŸ¦π‘‘π‘βŸ§thenβŸ¦π‘‘1⟧elseβŸ¦π‘‘2⟧

One way to enforce the use of the type checker before, and only before evaluating a term would be to have the following architecture:

type_check:STLC+πŸšβ†’Result(πœ†+𝟚,TypeError)multistep:πœ†+πŸšβ†’Valueinterpreter:STLC+πŸšβ†’Result(Value,TypeError)interpreter=mapmultistep∘type_check

Where the type_check is defined as

type_check(𝑑)=caseinfer_type(𝑑,βˆ…)of|Ok(_)β‡’Ok(βŸ¦π‘‘βŸ§)|Err(err)β‡’Err(err)

And the type πœ†+𝟚 is used as an existential/opaque type, which is not visible to the user.

This way evaluation can not externally be used before type checking and reduction doesn’t have access to type information.

Here are some example terms of STLC+𝟚 (STLC with booleans):

  1. πœ†π‘“:πŸšβ†’πŸš.𝑓

    The identity function for boolean endofunctions (endo meaning the domain and codomain have the same type.)

    The type of this term is (πŸšβ†’πŸš)β†’(πŸšβ†’πŸš).

  2. πœ†π‘₯:𝟚.ifπ‘₯thenFalseelseTrue

    The negation function mapping each boolean to its negation. The type of this term is πŸšβ†’πŸš.

    Let’s refer to it as not.

  3. (πœ†π‘“:πŸšβ†’πŸš.𝑓)not

    The identity function for boolean endofunctions applied to the not endofunction.

    The type of this term is πŸšβ†’πŸš.

  4. (πœ†π‘“:πŸšβ†’πŸš.𝑓)notTrue

    The identity function for boolean endofunctions applied to the not endofunction which is then applied to the boolean term True.

    The type of this term is 𝟚.

  5. πœ†π‘“:πŸšβ†’πŸš.𝑓True

    This function applies its argument to the True value.

    The type of this term is (πŸšβ†’πŸš)β†’πŸš2.

Loading Exercise...
Loading Exercise...
Loading Exercise...

Typing

In part 4 chapter 7 type checking was simply a process of determining the type of subterms until they hit integer or boolean values. However, in STLC, type checking is not as simple because we have to deal with variables.

The term πœ†π‘₯:𝟚,πœ†π‘¦:𝟚,π‘₯ has type πŸšβ†’πŸšβ†’πŸš as applying the abstraction to some boolean argument 𝑑::𝟚 reduces to the term πœ†π‘¦:𝟚,𝑑 which has type πŸšβ†’πŸš.

What we have just stated certainly makes sense, but how would we turn that into an algorithm? Especially the β€œsome boolean 𝑑” part…

The logic for type checking terms in STLC is the most interesting property of the simply typed lambda calculus. The whole idea is that abstractions know the type of their parameter, so type checking the body of the abstraction should use that knowledge of the variable’s type.

The typing rules should therefore look something like this:

𝑑𝑦𝑝𝑒 π‘β„Žπ‘’π‘π‘˜ 𝑑1::πŸšπ‘˜π‘›π‘œπ‘€π‘–π‘›π‘”π‘₯::πŸšπœ†π‘₯:𝟚. 𝑑1::𝟚

This β€œknowledge” is stored in a context.

Context

Typing context, typing environment, or just context, refers to the data structure which associates variable names to their types. Symbols Ξ“,Ξ£ are used to denote contexts.

Here are a couple examples of contexts:

βˆ…Ξ“π‘₯::IntegerΞ“,π‘₯::IntegerΞ“,Ξ£

The properties we need from a context, are that we can look up types by variable name, and extend them with new variables.

Hypothetical Typing Judgment

Unfortunately the typing relation (::), which has gotten us this far, is no longer suitable for type checking, as it’s not aware of the context. We need to define a new relation, the hypothetical typing judgment which also has the context in it.

Definition 6.5.4: Hypothetical Typing JudgmentGiven a context Ξ“, a term 𝑑 and a type 𝑇, the hypothetical typing judgmentis the smallest ternary relation closed under the typing rules (to be defined).The relation, denoted byΞ“βŠ’π‘‘::𝑇reads β€œin the context Ξ“, 𝑑 has type 𝑇”.Example 6.5.5: Hypothetical Typing Judgmentπ‘₯::𝟚⊒π‘₯::πŸšβˆ…βŠ’(πœ†π‘₯:𝟚. π‘₯)::πŸšβ†’πŸš

Note: When reading about lambda calculus and type theory, try to glance over the context and focus on the terms and types on the first reading.

Let’s start spelling out the typing rules for STLC+𝟚. First, for simplicity, let’s consider just the boolean fragment of the language.

Reminder, here are the boolean typing rules from part 4 of the course:

False-TFalse::𝟚True-TTrue::πŸšπ‘‘π‘::πŸšπ‘‘2::𝑇𝑑1::𝑇Ite-TIte(if:𝑑𝑐,then:𝑑1,else:𝑑2)::𝑇

The rules True,False,Ite for type checking terms named respectively, are as follows.

TrueΞ“βŠ’True::𝟚FalseΞ“βŠ’False::πŸšβˆ€Ξ“,𝑑𝑐,𝑑1,𝑑2,𝑇,Ξ“βŠ’π‘‘π‘::πŸšΞ“βŠ’π‘‘1::π‘‡Ξ“βŠ’π‘‘2::𝑇IteΞ“βŠ’Ite(𝑑𝑐,𝑑1,𝑑2)::𝑇

Note: the context Ξ“ is completely arbitrary, as are the terms 𝑑𝑐,𝑑1,𝑑2 and type 𝑇. Notably, this rule correctly type checks function types: 𝑇=𝑇1→𝑇2.

Now, let’s write down the rules for type checking the core or STLC.

Variables

Type checking a variable π‘₯ happens with the following logic:

  • If the typing association π‘₯::𝑇 appears in the context, then we know that π‘₯::𝑇.

This can be written in a style without hypotheses as follows:

βˆ€Ξ“,π‘₯,𝑇,VarΞ“,π‘₯::π‘‡βŠ’π‘₯::𝑇

The rule is taken to mean that if π‘₯::𝑇 appears in the context3, then π‘₯::𝑇.

Here’s an example of a typing derivation involving a variable inside an if-then-else:

VarΞ“,π‘₯::𝟚⊒π‘₯::𝟚FalseΞ“,π‘₯::𝟚⊒False::𝟚TrueΞ“,π‘₯::𝟚⊒True::πŸšΞ“,π‘₯::𝟚⊒ifπ‘₯thenFalseelseTrue::𝟚

Abstractions

The rule for type checking abstractions is the part where the context is extended with a new variable before recursively continuing. Type checking an abstraction πœ†π‘₯:𝑇1,𝑑1 happens with the following logic:

  • If 𝑑1::𝑇2 under the additional hypothesis that π‘₯::𝑇1, then πœ†π‘₯:𝑇1,𝑑1::𝑇1→𝑇2.

Extending the context into the hypothesis is crucial for the potentially free parameter inside the abstractions to type check.

βˆ€Ξ“,π‘₯,𝑑1,𝑇1,𝑇2,Ξ“,π‘₯::𝑇1βŠ’π‘‘1::𝑇2AbsΞ“βŠ’πœ†π‘₯:𝑇1. 𝑑1::𝑇1→𝑇2

Example typing derivation:

VarΞ“,π‘₯::𝟚⊒π‘₯::𝟚AbsΞ“βŠ’πœ†π‘₯:𝟚. π‘₯::πŸšβ†’πŸš

Let’s name this derivation Ξ”.

Applications

Type checking applications simply involves ensuring that the argument is in the domain of the arrow type.

βˆ€Ξ“,𝑑1,𝑑2,𝑇1,𝑇2,Ξ“βŠ’π‘‘1::𝑇1→𝑇2Ξ“βŠ’π‘‘2::𝑇1AppΞ“βŠ’π‘‘1 𝑑2::𝑇2

Example typing derivation:

Ξ”TrueΞ“βŠ’True::𝟚AppΞ“βŠ’(πœ†π‘₯:𝟚,π‘₯)True::𝟚

Here, the typing derivation Ξ” is substituted in from the previous example as it precisely matches the expected format: 𝑑1=πœ†π‘₯:𝟚,π‘₯ and 𝑇1=𝑇2=𝟚.

Let-expressions

The rule for type let-expressions is very similar to abstractions:

  • If 𝑑2::𝑇2 under the additional hypothesis that π‘₯::𝑇1, and 𝑑1::𝑇1 then letπ‘₯=𝑑1in𝑑2::𝑇2.

Even though the let expression knows the π‘₯, we avoid using substitution in the rules. Substitution is an implementation detail of evaluation, and evaluation details should not be mixed with type checking, as type checking is a prerequisite for correctness of evaluation.

If we used substitution directly, we could write both of the assumptions as [π‘₯↦𝑑1]𝑑2::𝑇2, but this makes reasoning about the rule much more difficult.

βˆ€Ξ“,π‘₯,𝑑1,𝑑2,𝑇1,𝑇2,Ξ“,π‘₯::𝑇1βŠ’π‘‘2::𝑇2Ξ“βŠ’π‘₯::𝑇1LetΞ“βŠ’letπ‘₯=𝑑1in𝑑2::𝑇2
Loading Exercise...
Loading Exercise...
Loading Exercise...
Loading Exercise...

Evaluation

The evaluation rules are inherited from the untyped lambda calculus, just with the added type in abstractions.

𝑑1βŸΆπ‘‘β€²1App1𝑑1 𝑑2βŸΆπ‘‘β€²1 𝑑2𝑑2βŸΆπ‘‘β€²2is_value(𝑣1)App2𝑣1 𝑑2βŸΆπ‘£1 𝑑′2is_value(𝑣2)AppAbs(πœ†π‘₯:𝑇.𝑑1) 𝑣2⟢[π‘₯↦𝑣2]𝑑1IfTrue(ifTruethen𝑑1else𝑑2)βŸΆπ‘‘1IfFalse(ifFalsethen𝑑1else𝑑2)βŸΆπ‘‘2π‘‘π‘βŸΆπ‘‘β€²π‘If(if𝑑𝑐then𝑑1else𝑑2)⟢(if𝑑′𝑐then𝑑1else𝑑2)

Implementation

Before diving into the exercise, here are some notes (and hints) for the implementation:

  • We define the context as a HashMap of variable names to types.

    pub type Context = std::collections::HashMap<String, Ty>;
  • infer_type results in a Result<Ty, TypeError> where TypeError encodes the following set of possible type errors.

    pub enum TypeError {
        UndefinedVariable(String),
        WrongAppTypeRight(Ty),
        WrongAppTypeLeft(Ty),
        Fail,
    }
  • It is highly recommended to write unit tests for the implementations, although they are not graded.


Loading Exercise...

Properties of the Simply Typed Lambda Calculus

Theorem 6.5.6: ProgressLet 𝑑 be a well-typed term, i.e. βˆ…βŠ’π‘‘::𝑇 for some type 𝑇. Now, eitheris_value𝑑, or there exists 𝑑′, such that π‘‘βŸΆπ‘‘β€².The proof is considered too complex for the scope of this course. Theorem 6.5.7: PreservationLet 𝑑 be a well-typed term, i.e. βˆ…βŠ’π‘‘::𝑇 for some type 𝑇. Let 𝑑′ be a termsuch that π‘‘βŸΆπ‘‘β€². Then βˆ…βŠ’π‘‘β€²::𝑇 for some type 𝑇.The proof is considered too complex for the scope of this course.

For proofs of these theorems, refer to the optional chapter 10 which covers formalization of programming language theory.

Footnotes

Footnotes

  1. Well, the lack of base types makes the system less expressive and only interesting from a theory point of view. Without base types, the simply typed lambda calculus is β€œdegenerate”. That is, there are no (finite) well-typed terms. This is why we need base types to make the system useful. However, the typing rules and other properties are still valid in this degenerate system. ↩ ↩2

  2. When writing arrow types, it is useful to associate them to the right so that we don’t need parentheses to indicate that the return type is a function. This means that the type of β€œmulti-argument” functions is just a chain of arrows without parentheses (unless the parameters have types with arrows).

    πŸšβ†’πŸšβ†’πŸšβ‰‘πŸšβ†’(πŸšβ†’πŸš)

    which is not the same type as (πŸšβ†’πŸš)β†’πŸš. ↩

  3. Depending on the formal definition of the context, we might use a function get:Contextβ†’Stringβ†’Option(Type) and write get(Ξ“,π‘₯)=Some(𝑇) in the assumptions and Ξ“βŠ’π‘₯::𝑇 in the conclusion. This is closer to the Rust implementation where a HashMap is used as the context. ↩