Simple and Inductive Types

Big Integer Optimization


Learning Objectives

  • You know how natural numbers can be implemented more efficiently with big integers.

Introduction

The Peano-style natural number implementation is horrendously inefficient — it’s practically using long linked lists for basic arithmetic. In this chapter we optimize this by extending STLC directly with big integers instead.

When evaluating a term whose type is Nat, we get the term’s canonical form. The set of canonical forms, or values, of natural numbers is

{𝑍,𝑆(𝑍),𝑆(𝑆(𝑍)),}

The set is really an encoding (embedding) of the set in to the language of STLC with natural numbers. Our goal is to redefine the semantics to directly work with natural numbers, and provide an efficient “runtime representation” for it.

In the implementation we use a big integer type from the num_bigint library instead of a machine integer u64 to keep the language type safe. If we used u64, then our implementation would not be a faithful representation of the dynamics because e.g. overflowing addition causes a panic (or wraps around).

First, let’s extend STLC (with booleans) with the terms for “big unsigned integers” or “external natural numbers”, and also add basic arithmetic and comparison operators which are provided by the library.

DefinitionSTLC with External Natural NumbersWe extend the language of STLC with booleans with the numeralterm Uint(𝑛) where 𝑛. We also have the following termsrepresenting arithmetic operators and comparisons: 𝑡1+𝑡2,𝑡1𝑡2,𝑡1𝑡2,𝑡1/𝑡2,𝑡1𝑡2,𝑡1==𝑡2.NotationWe use 0,1, to denote the numeral terms Uint(0),Uint(1), etc.ExampleThe following are terms of STLC with external natural numbers0+0(2+3)/4𝑥+1𝜆𝑥:Nat.10𝑥

In Rust, the Term enum would have the following new constructors

Uint(BigUint),
RecNat {
    scrutinee: Box<Term>,
    if_zero: Box<Term>,
    if_succ: Box<Term>,
},
Add(Box<Term>, Box<Term>),
Mul(Box<Term>, Box<Term>),
Pow(Box<Term>, Box<Term>),
Sub(Box<Term>, Box<Term>),
Div(Box<Term>, Box<Term>),
Le(Box<Term>, Box<Term>),
Eq(Box<Term>, Box<Term>),

Type Checking

Type checking the natural number terms is straight-forward. Any Uint(𝑛) term has type Nat.

UintΓUint(𝑛)::Nat

The arithmetic and comparison operators are typed as follows

Γ𝑡1::NatΓ𝑡2::NatAddΓ𝑡1+𝑡2::NatΓ𝑡1::NatΓ𝑡2::NatLeΓ𝑡1𝑡2::Bool

The typing rule for RecNat remains the same.

Γ𝑡1::NatΓ𝑡𝑧::𝑇Γ𝑡𝑠::Nat𝑇𝑇RecNatΓnrec(𝑡1,𝑡𝑧,𝑡𝑠)::𝑇

CBV-Reduction

The call-by-value reduction rules for arithmetic and comparison operators reduce subterms. To keep the rules deterministic, the left subterm is reduced until it’s a value, only then can the right subterm reduce.

𝑡1CBV𝑡1Add1𝑡1+𝑡2CBV𝑡1+𝑡2𝑡2CBV𝑡2𝑣1valueAdd2𝑣1+𝑡2CBV𝑣1+𝑡2AddUint(𝑛1)+Uint(𝑛2)CBVUint(𝑛1+𝑛2)𝑡1CBV𝑡1Le1𝑡1𝑡2CBV𝑡1𝑡2𝑡2CBV𝑡2𝑣1valueLe2𝑣1𝑡2CBV𝑣1𝑡2𝑛1𝑛2LeTrueUint(𝑛1)Uint(𝑛2)CBVtrue¬(𝑛1𝑛2)LeFalseUint(𝑛1)Uint(𝑛2)CBVfalse

The rules for the other operators are similar. Notably underflowing subtraction saturates at zero and division returns the integer quotient.

The RecNat rules require some changes now that we don’t have 𝑍 and 𝑆 terms anymore. Notably the RecNatS rule has the assumption that 𝑝 is the predecessor of 𝑛, i.e. 𝑝+1=𝑛, where 𝑝,𝑛.

𝑡1CBV𝑡1RecNat1nrec(𝑡1,𝑡𝑧,𝑡𝑠)CBVnrec(𝑡1,𝑡𝑧,𝑡𝑠)RecNatZnrec(0,𝑡𝑧,𝑡𝑠)CBV𝑡𝑧𝑝+1=𝑛RecNatSnrec(Uint(𝑛),𝑡𝑧,𝑡𝑠)CBV𝑡𝑠Uint(𝑝)nrec(Uint(𝑝),𝑡𝑧,𝑡𝑠)
Loading Exercise...

Integers

It would be easy to extend the language with also an integer type () using the signed BigInt type from the num_bigint library. Doing this is a good exercise, but instead of doing that, we continue straight into the next topic which is pairs and the product type.