Tag Archives: lambda

The Y-Combinator

I’ve struggled a bit in the past to explain why letrec was necessary to allow recursion in a language with first-class functions. All we’re trying to achieve is:
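For the sake of argument, take a recursive factorial (a stand-in example in Perl):

    # recursion the ordinary way, via a global subroutine name
    sub factorial {
        my ($n) = @_;
        return $n == 0 ? 1 : $n * factorial($n - 1);
    }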

But without the use of a global subroutine name, or in fact any environment assignments. If you remember, letrec created a recursive function by first creating a symbol naming the function, with a dummy value; it then evaluated the function in the environment where its name was already present, and finally assigned the resulting closure to the symbol so that the function could “see itself”. But in a purely functional setting, assignment is bad, right?

There is a little bit of programming language magic called the “Y-Combinator” that does the job. It’s very succinctly expressed in the λ calculus as:
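    (λx.x x)(λx.x x)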

That is to say, a function that takes a function as argument, applies that function to itself, and is given (a copy of) itself as argument.

In case this seems all a bit too esoteric, here it is in F♮:
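    // the syntax here is illustrative: apply an anonymous function to itself
    fn (f) { f(f) } (fn (f) { f(f) })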

And if that’s still too esoteric, here it is in Perl:
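    # a minimal sketch: an anonymous sub that applies its argument to
    # itself, immediately called with a copy of itself as that argument
    sub { $_[0]->($_[0]) }->(sub { $_[0]->($_[0]) });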

Notice that we haven’t named any subroutine, so on the face of it recursion is impossible; but nonetheless, if you give the above code to perl it will very slowly rattle your discs until an out-of-memory exception, without even a deep recursion warning, because there’s no function name for perl to attribute the recursion to.

Before going any further I should point out that none of this is of any value to you whatsoever, other than to assuage your curiosity. Almost all modern languages allow recursion, if not support or encourage it (supporting as opposed to merely allowing recursion is a fine but important point: Scheme supports recursion; Perl and its ilk merely allow it). Anyway, we can use the Y-combinator to calculate a factorial:
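Something along these lines (the shape follows the description below):

    print sub {
        my $factorial = sub {
            my ($factorial, $n) = @_;
            return $n == 0 ? 1 : $n * $factorial->($factorial, $n - 1);
        };
        return $factorial->($factorial, 5);
    }->(), "\n";    # prints 120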

Once the inner sub has got hold of itself in $factorial, it can call $factorial as a subref. The outer anonymous sub bootstraps the whole thing by:

  1. Capturing the inner sub in its $factorial
  2. Both calling $factorial and passing $factorial to it
  3. Passing an extra argument, 5, the number we require the factorial of.

Evaluating Partial Function Application

I’ve mentioned Currying and partial function application already. The idea is that given a function with more than one argument:
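    // an F♮-style definition (the fn syntax here is illustrative)
    fn add(x, y) {
        x + y
    }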

if we call it with fewer arguments than it expects, then it will return a function that accepts the rest:
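    add(2,)    // note the trailing comma: returns a function expecting y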

(The trailing comma is just a syntactic convention that I’ve come up with that lets the compiler know that we know what we are doing, and lets the reader know that there is Currying going on.) Now setting aside how we might type-check that, it turns out that it’s actually pretty easy to evaluate.

Normal application of a closure looks something like this (Java-ish pseudocode):
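    // a sketch; Dictionary and the list operations are illustrative
    Value apply(List<Value> actual_args) {
        Dictionary<Symbol, Value> dict = new Dictionary<Symbol, Value>();
        List<Symbol> formal_args = this.formal_args;
        while (!actual_args.isEmpty()) {
            // bind each formal argument to its actual argument
            dict.put(formal_args.head(), actual_args.head());
            formal_args = formal_args.tail();
            actual_args = actual_args.tail();
        }
        // evaluate the body in the captured environment extended
        // with the argument bindings
        return body.eval(env.extend(dict));
    }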

For those of you that don’t know Java, List&lt;Symbol&gt; means List of Symbol. And yes, we’re ignoring the possibility that we’re passed the wrong number of arguments; the type checker should deal with that.

Now if we are expecting that we might get fewer than the full set of arguments, we can instead create a new closure that expects the rest:
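    Value apply(List<Value> actual_args) {
        Dictionary<Symbol, Value> dict = new Dictionary<Symbol, Value>();
        List<Symbol> formal_args = this.formal_args;
        while (!actual_args.isEmpty()) {
            dict.put(formal_args.head(), actual_args.head());
            formal_args = formal_args.tail();
            actual_args = actual_args.tail();
        }
        if (!formal_args.isEmpty()) {
            // too few arguments: return a closure expecting the rest
            return new Closure(formal_args, body, env.extend(dict));
        }
        return body.eval(env.extend(dict));
    }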

Note that the dictionary that we have been building is used to extend the environment of the new closure with the values we know already, and that the formal_args we’ve been chaining down is now precisely the remaining arguments that we haven’t seen yet.

Of course this allows for silly definitions like:

But presumably our type checker (if not our grammar) would disallow that sort of thing, because there’s nothing to put a trailing comma after.

[Edit] You could alternatively add a guard clause to  apply() that says if this closure is expecting arguments and doesn’t get any, just return the original closure. That way, something like:

while still silly, would at least not be unnecessarily inefficient.

Addendum – over-complete function application

So I got the above working in F♮ easily enough, then I noticed an anomaly. The type of:
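    // explicitly Curried: adder takes x and returns a function taking y
    // (the fn syntax is illustrative)
    fn adder(x) {
        fn (y) { x + y }
    }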

is definitely int → int → int, which means that the type checker is allowing it to be called like adder(2, 3). Why can’t I call it like that? It turns out I can:

Assuming the type checker has done its job, then if we have any actual arguments left over then they must be destined for the function that must be the result of evaluating the body. So instead of just evaluating the body in the new env, we additionally call  apply()  on the result, passing in the left-over arguments.
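Folding that into the Java-ish sketch from before (names still illustrative):

    Value apply(List<Value> actual_args) {
        Dictionary<Symbol, Value> dict = new Dictionary<Symbol, Value>();
        List<Symbol> formal_args = this.formal_args;
        while (!actual_args.isEmpty() && !formal_args.isEmpty()) {
            dict.put(formal_args.head(), actual_args.head());
            formal_args = formal_args.tail();
            actual_args = actual_args.tail();
        }
        if (!formal_args.isEmpty()) {
            // too few arguments: Curry
            return new Closure(formal_args, body, env.extend(dict));
        }
        Value result = body.eval(env.extend(dict));
        if (!actual_args.isEmpty()) {
            // too many arguments: the result must itself be a function,
            // so apply it to the left-overs
            return ((Operation) result).apply(actual_args);
        }
        return result;
    }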

This is pretty cool. We can have:
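    fn adder(x) {
        fn (y) { x + y }
    }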

and call it like  adder(2, 3) or  adder(2)(3), and we can have:
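    fn add(x, y) {
        x + y
    }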

and call it like  add(2, 3) or  add(2)(3).

One or the Other, or Both?

The question arises: if we have implicit Currying (partial function application), then do we need explicit Currying (explicitly returning a function from a function)? The answer is a resounding yes! Consider:
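    // a sketch (fn/if syntax illustrative): the expensive work happens
    // once, and an explicit closure over its result is returned
    fn bigfn(x) {
        if (expensive_calculation(x)) {
            fn (y) { cheap_op_1(y) }
        } else {
            fn (y) { cheap_op_2(y) }
        }
    }

    map(bigfn(x), long_list)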

We’ve only called bigfn once, when evaluating the first argument to map, so expensive_calculation only got called once, and the explicit closure calling either cheap_op_1 or cheap_op_2 gets called on each element of the list.

If instead we had written:
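    fn bigfn(x, y) {
        if (expensive_calculation(x)) {
            cheap_op_1(y)
        } else {
            cheap_op_2(y)
        }
    }

    map(bigfn(x,), long_list)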

Then the call to expensive_calculation would get deferred until the map actually called its argument function, repeatedly, for each element of the long_list.

The Hindley-Milner Algorithm

The Hindley-Milner Algorithm is an algorithm for determining the types of expressions. Basically it’s a formalisation of this earlier post. There’s an article on Wikipedia which is frustratingly terse and mathematical. This is my attempt to explain some of that article to myself, and to anyone else who may be interested.

Background

The Hindley-Milner algorithm is concerned with type checking the lambda calculus, not any arbitrary programming language. However most (all?) programming language constructs can be transformed into lambda calculus. For example the lambda calculus only allows variables as formal arguments to functions, but the declaration of a temp variable:
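    let x = 5 in x + x        (a representative example)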

can be replaced by an anonymous function call with argument:
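    (λx. x + x) 5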

Similarly, the lambda calculus only deals with functions of one argument, but a function of more than one argument can be curried, etc.

Expressions

We start by defining the expressions (e) we will be type-checking:

    e ::= E        (a primitive expression, e.g. 3)
        | s        (a symbol)
        | λs.e     (a function definition: s is the formal argument symbol and e is the function body)
        | (e e)    (the application of an expression to an expression, i.e. a function call)

Types

Next we define our types (τ):

    τ ::= T          (a primitive type, e.g. int)
        | τ0 → τ1    (a function of one argument, taking type τ0 and returning type τ1)

Requirement

We need a function:

[1]    f(ε, e) = τ

where:

ε: a type environment.
e: an expression.
τ: a type.

Assumptions

We assume we already have:

[2]    f(ε, E) = T

(a set of mappings from primitive expressions to their primitive types; from 3 to int, for example)

The following equations are logic equations. They are easy enough to read: everything above the line is an assumption, and the statement below the line should follow if the assumptions are true.

Our second assumption is:

[3]

    (s, τ) ∈ ε
    ───────────
    f(ε, s) = τ

If (s, τ) is in ε (i.e. if ε has a mapping from s to τ), then in the context of ε, s is a τ.

Informally, symbols are looked up in the type environment.

Deductions

[4]

    f(ε, g) = τ1 → τ
    f(ε, e) = τ1
    ────────────────
    f(ε, (g e)) = τ

If g is a function mapping a τ1 to a τ, and e is a τ1, then the application of g to e is a τ.

That is just common sense.

[5]

    ε1 = ε ∪ {(s, τ)}
    ─────────────────────────
    f(ε, λs.e) = τ → f(ε1, e)

If ε1 is ε extended by (s, τ) (i.e. if s is a τ), then the type of a function with argument s of type τ and body e is τ → the type of e in the context of ε1.

This is just a bit tricky. We don’t necessarily know the value of τ when evaluating this expression, but that’s what logic variables are for.

Algorithm

  • We extend the set T of primitive types with an infinite set of type variables α1, α2 etc.
  • We have a function new which returns a fresh type variable each time it is called.
  • We have a function eq which unifies two type expressions.

We modify our function, part [4] (function application) as follows:

[6]

    τ0 = f(ε, e0)
    τ1 = f(ε, e1)
    τ  = new
    ─────────────────────────────────
    f(ε, (e0 e1)) = eq(τ0, τ1 → τ); τ

If τ0 is the type of e0, τ1 is the type of e1, and τ is a fresh type variable, then, after unifying τ0 with τ1 → τ, the type of (e0 e1) is τ.

That deserves a bit of discussion. We know e0 is a function, so it must have a type τa → τb for some types τa and τb. We calculate τ0 as the provisional type of e0 and τ1 as the type of e1, then create a new type variable τ to hold the type of (e0 e1). Unifying τ0 with τ1 → τ forces the argument type of e0 to agree with τ1, and binds τ to e0’s result type.

Problem

Suppose e0 is the function length (the length of a list of some unspecified type τ2), then τ0 should come out as [τ2] → int (using [x] to mean list of x.)

Suppose further that τ1 is [char] (our actual argument is a list of char).

We therefore unify:

    [τ2] → int
    [char] → τ

Which correctly infers that the type of (length [‘c’]) is int. Unfortunately, in doing so, we permanently unify τ2 with char, forcing length to have permanent type [char] → int so this algorithm does not cope with polymorphic functions such as length.

A Bit on the Lambda Calculus

In my book I briefly mention (in a footnote I think) the lambda calculus. It’s probably time to expand on that. As I said, the lambda calculus is a mathematical discipline that predates computers, but is amazingly relevant to computer language design. Basically it assumes that there is nothing but functions, each taking only one argument, and it derives all possible computations from them.

I won’t bother describing the specific syntax right now, rather let’s take a look at a translation of some of it into Perl.

First off, functions can only take one argument. So how do we imitate functions, such as addition, that require more than one argument? The answer is a technique called “Currying” (after the mathematician Haskell Curry, who gave the better part of his name to a programming language). To summarise, a “Curried” function of two arguments is a function that takes one argument and returns a function that takes the second argument and returns the result. So for example, Currying a perl function add:
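    # an ordinary two-argument add
    my $add = sub { my ($x, $y) = @_; $x + $y };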

(using anonymous subs throughout for consistency) produces:
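    # the Curried version: take x, return a sub that takes y
    my $add = sub { my ($x) = @_; sub { my ($y) = @_; $x + $y } };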

which you can call like:
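    $add->(3)->(4);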

to get 7.

More interestingly, we can represent truth and falsehood as functions. Assuming implicit Currying from now on, true can be represented as a function that takes two arguments and returns its first one:
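    # true takes two arguments and returns the first
    my $true = sub { my ($a, $b) = @_; $a };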

and false similarly as a function that returns its second argument:
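    # false takes two arguments and returns the second
    my $false = sub { my ($a, $b) = @_; $b };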

Now we can represent the conditional if as a function of three arguments:
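    # if applies the truth value to the consequent and alternative
    my $if = sub {
        my ($test, $consequent, $alternative) = @_;
        $test->($consequent, $alternative);
    };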

It just applies the truth value to the consequent and alternative arguments, so if is just syntactic sugar.

What about data structures? The simplest composite data structure is a pair, and we can do that with functions too:
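    # pair (the constructor) captures car and cdr and returns a function
    # that hands them to a truth value; car and cdr pass in true or false
    my $pair = sub {
        my ($car, $cdr) = @_;
        sub { my ($truth) = @_; $truth->($car, $cdr) };
    };
    my $car = sub { my ($p) = @_; $p->($true) };
    my $cdr = sub { my ($p) = @_; $p->($false) };

    $car->($pair->(1, 2));    # 1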

so a pair is a function that takes a truth value and returns the car if it is true and the cdr if it is false, and car and cdr are functions that pass the appropriate truth value to the pair.

(I need to talk here about Church Numerals, when I can figure them out.)

Currying all of the above is left as an exercise for the reader.

So what does the lambda calculus actually look like? Well, it’s very terse. Functions are created by λx.body, where x is the argument, and are called like f a, where f is the function and a the argument. Of course you can use brackets to disambiguate, but that’s it, nothing more to it. Here’s my attempt at a definition for if:
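    λt.λc.λa.((t c) a)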

I hope I got that right: a function taking argument t (test) returning a function taking argument c (consequent) returning a function taking argument a (alternative) which applies t to c, and applies the result to a (remember t is Curried). Phew!

Currying is quite a useful thing to be able to do given support from a language, but it can be difficult to read. For F♮ I hit upon the idea of using a trailing comma to indicate a partial function call:
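    add(3,)    // the trailing comma marks a partial call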

Which provides a visual clue to the reader that add() in this example is not being called with all of its arguments and therefore returns a function that expects the rest. Perhaps more practically, something like this is more visually appealing, once you get what the trailing comma means:
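    map(eval(env,), lst)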

where we map eval with a supplied first argument env over the items of lst. This avoids having to create a closure:
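    map(fn (item) { eval(env, item) }, lst)    // fn syntax illustrative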


An Abstract Machine

Another recent idea I’ve had is to build a compiler for a PScheme-like language (different syntax, extended functionality) that would target an abstract machine, much as Java does (I know Java calls it a “Virtual Machine”, but that’s an accident of history, since Java originally targeted a real CPU and the replacement machine was a “virtualisation” of that). I started looking at Parrot, but Parrot has built-in support for a single continuation only, and would make a second “failure” continuation doubly difficult to implement.

So I started thinking about building my own. The real joy of this is that I can prototype it in a high-level language like Perl, then re-write it into something fast like C or C++ later. I actually have a (mostly) working version in Perl for PScheme that runs about seven times faster than the interpreter, but it has issues.

Anyway, I think the basic ideas are sound. In an interpreter, a closure is a function plus an environment. In a compiled language it’s even simpler: a “function” is just a machine address, so a closure is a pair of addresses: the address of an environment on the heap and the address of a machine instruction in the core. A Success Continuation, on the other hand, has a lot more context. It contains the return address and the old environment, but must also restore any temporary variables, along with the previous continuation that will be returned to later. Finally, a Failure continuation is a complete reset: it does not preserve any data when it is invoked, but completely restores the abstract machine to a previous state.

So without further ado, here’s my abstract machine:

My Abstract Machine

PC: just an integer index into the machine code table.
ENV: the address of the current run-time environment (the compiler has pre-determined the locations of the data in the environment; the run-time environment contains only the data).
ARGS: points to an array or linked list of actual arguments to a closure, before they are bound into an environment. They constitute the nearest lexical scope once inside the closure.
CONT: points at the most recent Continuation.
TEMPS: an arbitrarily sized array of temporary registers; their utility is questionable, as there is no efficiency saving as there would be for a real CPU, but it makes the model closer to a standard compilation target.
FAIL: points at the most recent Failure continuation.
RESULT: where a return value is passed to a Continuation.

Usefully, the different subcomponents (Closure, Continuation and Failure) are all contiguous, so invoking any of them is just a single memcpy() from the stored object (on the heap) to the “cpu”. I don’t know if this is just blind luck or something deeper.
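In C, the register file might look something like this (a sketch; the field types are placeholders):

    /* a closure is just the first two registers */
    struct closure { int pc; struct env *env; };

    struct am {
        int           pc;        /* PC: index into the machine code table   */
        struct env   *env;       /* ENV: current run-time environment       */
        struct value *args;      /* ARGS: actual arguments awaiting binding */
        struct cont  *cont;      /* CONT: most recent continuation          */
        void         *temps[8];  /* TEMPS: temporary registers              */
        struct fail  *fail;      /* FAIL: most recent failure continuation  */
        struct value *result;    /* RESULT: return value register           */
    };

    /* each stored subcomponent mirrors a prefix of this layout, so
       invoking one overwrites the front of the machine in one go */
    memcpy(&AM, AM.temps[n], sizeof(struct closure));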

For each subcomponent there would be two abstract machine instructions: one to set up a new structure and one to invoke it. Specifically:

Closure
  • create_closure(PC, ENV): create a new closure object (and put it in TEMPS somewhere).
  • call_closure(): memcpy(&AM, AM.TEMPS[n], sizeof(struct closure))

Continuation
  • create_continuation(PC): allocate a new continuation struct, copy the relevant components of the AM into it, assign the PC to it directly, and point AM.CONT at the new continuation.
  • continue(): memcpy(&AM, AM.CONT, sizeof(struct continuation))

Failure
  • completely analogous to Continuation above.

So the plan is:

  1. Write the Abstract Machine in C or C++.
  2. Write an interpreter for the new language in Perl.
  3. Write a compiler for the new language in itself.
  4. Run the interpreted compiler on its own source code to produce bytecode for the compiler.
  5. Compile the compiler again using the bytecode compiler and diff the bytecodes: any difference is a problem.
  6. World domination.

So far, I have a parser and a partial interpreter for a new language called “F-Natural”: if you think of ML but with real unification, backtracking and a JavaScript-like syntax, you will get the basic idea. I’m still a long way from world domination, but we live in hope. I may post some code here later.

A Programming Language in a Picture

Recently I’ve been thinking about how best to describe a really simple language so that the reader can just “get it” in one go. As you probably know I’m quite fond of UML and this is what I’ve come up with:

Language in UML

I’d hope that this diagram stands alone, but for the impatient I’ll describe it.

You can see that the implementation breaks down into three primary groups of classes: Syntax structures, Environment and Operations. The environment is the simplest and the best thing to look at first. You can see an abstract base class Env with concrete Frame and Empty subclasses. The Env class provides one concrete method, extend(), which takes a dictionary (hash, or whatever) as argument and creates and returns a new Frame containing that dictionary and a parent pointer to the original environment. The Frame and Empty subclasses supply a lookup() method which, for the empty env, just errors; for a non-empty env it either looks up the symbol in the dictionary or calls lookup() on its parent, passing it the symbol.
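In the same Java-ish pseudocode as earlier posts (Dictionary and the error class are illustrative):

    abstract class Env {
        // wrap a dictionary of new bindings around this environment
        Env extend(Dictionary<Symbol, Value> dict) {
            return new Frame(dict, this);
        }
        abstract Value lookup(Symbol symbol);
    }

    class Empty extends Env {
        Value lookup(Symbol symbol) {
            throw new SymbolNotFoundError(symbol);
        }
    }

    class Frame extends Env {
        Dictionary<Symbol, Value> dict;
        Env parent;
        Frame(Dictionary<Symbol, Value> dict, Env parent) {
            this.dict = dict;
            this.parent = parent;
        }
        Value lookup(Symbol symbol) {
            if (dict.containsKey(symbol)) return dict.get(symbol);
            return parent.lookup(symbol);
        }
    }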

The Syntax group of classes all inherit from a common abstract Exp base class that declares one abstract method, eval(env), which they all must implement. A Constant evaluates to itself. A Symbol looks itself up in the argument environment. A Conditional evaluates the test, then evaluates either the consequent or the alternative. Function declarations evaluate by creating a new Closure object, passing it the body and formal arguments of the function, along with the current environment. Application of an operation involves evaluating the operation expression in the current environment, evaluating the argument expressions in the current environment, then calling the resulting operation’s apply() method on the resulting evaluated arguments. Note that in operands.map(eval(env)), eval is sort of Curried: the argument env is explicit, but the map supplies the objects whose eval methods are to be called.

Lastly, Operations themselves share an abstract Op base class with an abstract apply() method. An Op is either a Primitive or a Closure. The implementation of the primitive apply() is just an implementation detail; however, the implementation of the Closure’s apply() ties the whole thing up:

  1. Extend the captured environment (the one passed to new()) binding the formal arguments (passed to new()) to the actual arguments (passed to apply()).
  2. Evaluate the body of the closure in that new environment.
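In the same pseudocode, those two steps amount to something like this (bindings() is an illustrative helper that pairs formals with actuals):

    class Closure extends Op {
        List<Symbol> formal_args;
        Exp body;
        Env env;    // the environment captured at creation time

        Closure(List<Symbol> formal_args, Exp body, Env env) {
            this.formal_args = formal_args;
            this.body = body;
            this.env = env;
        }

        Value apply(List<Value> actual_args) {
            // bind formals to actuals, extend the captured environment,
            // and evaluate the body in the result
            Dictionary<Symbol, Value> dict = bindings(formal_args, actual_args);
            return body.eval(env.extend(dict));
        }
    }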

That’s about as simple as I can make it.