0003

The Hindley-Milner Algorithm

The Hindley-Milner Algorithm is an algorithm for determining the types of expressions. Basically it’s a formalisation of this earlier post. There’s an article on Wikipedia which is frustratingly terse and mathematical. This is my attempt to explain some of that article to myself, and to anyone else who may be interested.

Background

The Hindley-Milner algorithm is concerned with type checking the lambda calculus, not any arbitrary programming language. However most (all?) programming language constructs can be transformed into lambda calculus. For example the lambda calculus only allows variables as formal arguments to functions, but the declaration of a temp variable:

can be replaced by an anonymous function call with argument:

Similarily the lambda calculus only treats on functions of one argument, but a function of more than one argument can be curried, etc.

Expressions

We start by defining the expressions (e) we will be type-checking:

e::=EA primitive expression, i.e. 3.
|sA symbol.
|λs.eA function definition. s is the formal argument symbol and e is the function body (expression).
|(e e)The application of an expression to an expression (a function call).

Types

Next we define our types (τ):

τ::=TA primitive type, i.e. int.
|τ0 → τ1A function of one argument taking type τ0 and returning type τ1

Requirement

We need a function:

[1]f(ε, e)=τ

where:

εA type environment.
eAn expression.
τA type

Assumptions

We assume we already have:

[2]f(ε, E)=TA set of mappings from primitive expressions to their primitive types (from 3 to int, for example.)

The following equations are logic equations. They are easy enough to read, Everything above the line are assumptions. The statement below the line should follow if the assumptions are true.

Our second assumption is:

[3](s, τ)εIf (s, τ) is in ε (i.e. if ε has a mapping from s to τ)
f(ε, s)=τThen in the context of ε, s is a τ

Informally symbols are looked up in the type environment.

Deductions

[4]f(ε, g)=τ1 → τIf g is a function mapping a τ1 to a τ
f(ε, e)=τ1and e is a τ1
f(ε, (g e))=τThen the application of g to e is a τ

That is just common sense.

[5]ε1=ε ∪ (s, τ)If ε1 is ε extended by (s, τ), e.g. if s is a τ
f(ε, λs.e)=τ → f(ε1, e)Then the output type of a function with argument s of type τ, and body e, is the type of the body e in the context of ε1

This is just a bit tricky. We don’t necessarily know the value of τ when evaluating this expression, but that’s what logic variables are for.

Algorithm

  • We extend the set T of primitive types with an infinite set of type variables α1, α2 etc.
  • We have a function new which returns a fresh type variable each time it is called.
  • We have a function eq which unifies two equations.

We modify our function, part [4] (function application) as follows:

[6]τ0=f(ε, e0)If τ0 is the type of e0
τ1=f(ε, e1)and τ1 is the type of e1
τ=newand τ is a fresh type variable
f(ε, (e0 e1))=eq(τ0, τ1 → τ); τThen after unifying τ0 with τ1 → τ, the type of (e0 e1) is τ.

That deserves a bit of discussion. We know e0 is a function, so it must have a type τa → τb for some types τa and τb. We calculate τ0 as the provisional type of e0 and τ1 as the type of e1, then create a new type variable τ to hold the type of (e0 e1).

Problem

Suppose e0 is the function length (the length of a list of some unspecified type τ2), then τ0 should come out as [τ2] → int (using [x] to mean list of x.)

Suppose further that τ1 is char.

We therefore unify:

2]int
[char]τ

Which correctly infers that the type of (length [‘c’]) is int. Unfortunately, in doing so, we permanently unify τ2 with char, forcing length to have permanent type [char] → int so this algorithm does not cope with polymorphic functions such as length.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">