Tag Archives: featured

F Natural is Back

After a long hiatus I started teaching myself Python, and as a warm-up I wrote a little lambda interpreter in it. I was blown away by how convenient and quick the job was in Python, and the project has kind of snowballed to the point where I have most of my previously vapourware “F♮” functional language up and running (albeit only under a test harness).

The source is GPL and on GitHub:  https://github.com/billhails/PyScheme

Blender Tip – Perspective Clouds

The problem: how to texture a sky sphere with procedural clouds so that the clouds appear to recede into the distance at the horizon, rather than just being wrapped around the sphere. I searched around without luck for a way to do this, before realising that it’s actually easy. The maths isn’t going to be obvious to everyone, though, so I thought I’d share this quick tip.

For those who just want the node setup, here it is.

[Image: node setup for Perspective Clouds]

However if you’re interested, here’s a brief explanation. I’m assuming you’re familiar with the standard trick of using a noise texture, through a colour ramp, as the factor to mix between a sky texture and a clouds texture (or in this case just a plain colour), so I won’t bother to explain that bit. What we want is to control the scale of that cloud texture so that the clouds get smaller (the scale value increases) towards the horizon.

It turns out that the value we want is just the distance from our point of view to each point on an imaginary flat, horizontal cloud layer somewhere above our heads. If we take the texture co-ordinate of any point on the sky sphere as a normalised vector, then the z component of that vector is the height above ground of the end of that unit-length vector. All we need to work out, then, is how long we would have to make that vector for it to touch our imaginary cloud layer.

If we say the cloud layer is 1 unit above the ground, then when z is 1, the vector is pointing straight up and the required vector length is 1. If z is 1/2, then the vector would have to be length 2 to touch the clouds, if z is 1/3, then the required length is 3 and so on. So the length of that vector is just 1/z. Plug 1/z into the scale of our noise texture and bingo, perspective clouds.
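If you want to sanity-check the numbers, the arithmetic is easy to play with outside Blender. A throwaway Python snippet (nothing Blender-specific here, and the function name is mine):

    import math

    # distance along a unit direction vector, whose vertical component
    # is z, to a flat cloud layer 1 unit above the ground
    def cloud_distance(z):
        return 1.0 / z

    for degrees_above_horizon in (90, 30, 10):
        z = math.sin(math.radians(degrees_above_horizon))
        print(degrees_above_horizon, cloud_distance(z))
        # 90 -> 1.0, 30 -> 2.0, 10 -> 5.76...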

So You Feel Lucky?

I’ve just finished reading Stephen Jay Gould’s excellent book Wonderful Life (again) and it got me thinking about random trees.

In case you haven’t read it, Wonderful Life is about the fossil bed known as the Burgess Shale, which contains extraordinarily well preserved fossils of soft- and hard-bodied animals from a period just after the so-called Cambrian Explosion. The Cambrian Explosion marked the period when the seas first “exploded” with an enormous range of large, multicellular animals with hard shells that preserve easily. In the 1980s a detailed re-evaluation of the fossils found in the Burgess Shale provoked a scientific revolution in palaeontology, because it turns out that only a small percentage of those fossils have any direct living descendants, and many of them represent previously unknown phyla (basic types of animals).

This did not fit comfortably with the established notion of evolution as ordered progress, with the basic groups of animals established early on and forming a predictable lineage all the way from microbe to man at the pinnacle. Rather it paints a picture of extinction being the norm, and the survival of one group or another very much in the hands of chance and historical contingency. The book is not an argument against Darwinism but rather a re-evaluation of some of its finer points. Crudely put, it’s not arguing against the existence of a Tree of Life, just questioning what shape the tree is.

Anyway, with that in mind, and with the somewhat vague hand-drawn trees in the book having piqued my curiosity, I started wondering what a real evolutionary tree might look like. Of course it’s impossible to produce an algorithm that accurately represents a real evolutionary sequence, so I decided to keep it very simple.

We start with a “first progenitor”. It has two choices: form two new species or die out.

Each new species has the same option at the next toss of the coin. That’s it. In Perl it would look something like this:
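Something along these lines (a sketch, where evolve returns the greatest depth a lineage reaches):

    sub evolve {
        my ($depth) = @_;
        return $depth if int(rand(2)) == 0;    # tails: the species dies out
        my $a = evolve($depth + 1);            # heads: it splits in two,
        my $b = evolve($depth + 1);            # and each child tosses again
        return $a > $b ? $a : $b;
    }

    print evolve(0), "\n";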

 

So there’s a 1/2 probability that the thing will never get started, and you’re left with a stump rather than a tree. But with 2 children, there’s only a 1/4 chance that they will both die out, and if they both survive then there are 4 grandchildren, and so on. Perhaps surprisingly, a “critical” branching process like this one halts with probability 1, but the expected size of the tree is infinite, so there is no useful bound on how long the code might run.

It turns out that if you run this a large number of times and tally the depth reached by each tree, you get a curve that asymptotically approaches zero at infinity:

[Image: treegraph, the normalised distribution of tree depths]

The graph is normalised so that the trees of depth zero come out at 0.5. The little kick at the right comes from trees that reached the maximum depth allowed in my test.

So what do these trees look like? I’ve given the game away by using a picture of one of them as the featured image for this post. As for generating the images, the excellent GraphViz comes to our rescue. With a little jiggery-pokery we can get the above Perl code to produce a .dot file that we can feed to GraphViz and get a picture. I’ve extended the code to colour nodes and edges red if they are “survivors” (have descendants at the limiting depth) and black if they represent a species with no descendants. I’ve also changed the code to try again repeatedly until it generates a tree that reaches the limiting depth. Here’s a representative:

[Image: tree2, a representative tree with surviving lineages in red]

The limit was set at 60, so assuming 2 million years to create a species (I remember that figure from somewhere; I have a bad habit of throwing up unverified facts) this represents about 120,000,000 years of evolution from a single common ancestor. The interesting thing here, I think, is that the majority of branches don’t make it. Extinction is the norm, even for apparently large and flourishing branches. Apparently insignificant branches can suddenly flourish, and equally suddenly die out. I think this is close to Gould’s vision in general, if not in detail.

The other interesting thing is the huge variety of shapes. Some trees are wide, others are narrow, for example:

[Image: tree5, a narrow tree]

In this case all of the survivors share a common ancestor only a few generations (speciations) ago. This could easily be a model for the very earliest life, since the common ancestor of all current life on earth, whose closest living relative is likely one of the Archaea, is far too complex to be a “first progenitor”.

I don’t know where I’m going with this from here, probably nowhere, but I think it’s interesting.

To finish off, here’s the full implementation of the tree-generating code in case you want to try it yourself. You can pick up GraphViz from www.graphviz.org and run it from the command line (the commands are called dot, neato, circo etc.) or via a GUI.
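A version along the lines described above (a sketch rather than the exact listing):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $LIMIT = 60;    # the limiting depth
    my @survivors;     # per node id: true if it has descendants at $LIMIT
    my @edges;         # [parent, child] pairs
    my $next_id;

    # Grow a subtree from $depth; returns (node id, true if it reached $LIMIT).
    sub grow {
        my ($depth) = @_;
        my $id = $next_id++;
        my $survives = 0;
        if ($depth >= $LIMIT) {
            $survives = 1;                   # made it all the way
        } elsif (int(rand(2))) {             # heads: split into two species
            for (1 .. 2) {
                my ($child, $child_survives) = grow($depth + 1);
                push @edges, [$id, $child];
                $survives ||= $child_survives;
            }
        }                                    # tails: die out
        $survivors[$id] = $survives;
        return ($id, $survives);
    }

    # try again repeatedly until the tree reaches the limiting depth
    my $ok = 0;
    until ($ok) {
        @survivors = ();
        @edges     = ();
        $next_id   = 0;
        (undef, $ok) = grow(0);
    }

    # emit GraphViz input: surviving nodes and their edges in red
    print "digraph tree {\n";
    print "  node [shape=point];\n";
    for my $id (0 .. $next_id - 1) {
        printf qq{  n%d [color="%s"];\n}, $id, $survivors[$id] ? "red" : "black";
    }
    for my $edge (@edges) {
        my ($parent, $child) = @$edge;
        printf qq{  n%d -> n%d [color="%s"];\n},
            $parent, $child, $survivors[$child] ? "red" : "black";
    }
    print "}\n";

Something like perl tree.pl | dot -Tpng -o tree.png will then produce an image.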

 

Evaluating Partial Function Application

I’ve mentioned Currying and partial function application already. The idea is that given a function with more than one argument:
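    fn add(x, y) { x + y }

(I’m sketching the F♮ syntax here and below; don’t take the details too literally.)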

if we call it with fewer arguments than it expects, then it will return a function that accepts the rest:
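    add(2,)

which returns a function equivalent to fn (y) { 2 + y }.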

(The trailing comma is just a syntactic convention that I’ve come up with that lets the compiler know that we know what we are doing, and lets the reader know that there is Currying going on.) Now setting aside how we might type-check that, it turns out that it’s actually pretty easy to evaluate.

Normal application of a closure looks something like this (Java-ish pseudocode):
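    // Closure, Dictionary, Symbol and Value are stand-ins for
    // whatever the interpreter actually uses
    Value apply(Closure closure, List<Value> actual_args) {
        Dictionary dictionary = new Dictionary();
        List<Symbol> formal_args = closure.formal_args;
        while (!formal_args.isEmpty()) {
            // pair off each formal argument with its actual value
            dictionary.put(formal_args.head(), actual_args.head());
            formal_args = formal_args.tail();
            actual_args = actual_args.tail();
        }
        // evaluate the body in the closure's captured environment,
        // extended with the argument bindings
        return closure.body.eval(closure.env.extend(dictionary));
    }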

For those of you that don’t know Java, List<Symbol> means List of Symbol. And yes, we’re ignoring the possibility that we’re passed the wrong number of arguments; the type checker should deal with that.

Now if we are expecting that we might get fewer than the full set of arguments, we can instead create a new closure that expects the rest:
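    Value apply(Closure closure, List<Value> actual_args) {
        Dictionary dictionary = new Dictionary();
        List<Symbol> formal_args = closure.formal_args;
        while (!actual_args.isEmpty()) {
            dictionary.put(formal_args.head(), actual_args.head());
            formal_args = formal_args.tail();
            actual_args = actual_args.tail();
        }
        if (formal_args.isEmpty()) {
            // got the full set of arguments: business as usual
            return closure.body.eval(closure.env.extend(dictionary));
        } else {
            // partial application: capture the arguments seen so far
            // and return a closure that expects the rest
            return new Closure(formal_args,
                               closure.body,
                               closure.env.extend(dictionary));
        }
    }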

Note that the dictionary that we have been building is used to extend the environment of the new closure with the values we know already, and that the formal_args we’ve been chaining down is now precisely the remaining arguments that we haven’t seen yet.

Of course this allows for silly definitions like:
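    add()

where a call supplies none of the arguments at all.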

But presumably our type checker (if not our grammar) would disallow that sort of thing, because there’s nothing to put a trailing comma after.

[Edit] You could alternatively add a guard clause to apply() that says if this closure is expecting arguments and doesn’t get any, just return the original closure. That way, something like:
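    add()(2, 3)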

while still silly, would at least not be unnecessarily inefficient.

Addendum – over-complete function application

So I got the above working in F♮ easily enough, then I noticed an anomaly. The type of:
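    fn adder(x) {
        fn (y) { x + y }
    }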

is definitely int → int → int, which means that the type checker is allowing it to be called like adder(2, 3). Why can’t I call it like that? It turns out I can:
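    adder(2, 3)

which evaluates to 5, just as adder(2)(3) does.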

Assuming the type checker has done its job, if we have any actual arguments left over then they must be destined for the function that results from evaluating the body. So instead of just evaluating the body in the new env, we additionally call apply() on the result, passing in the left-over arguments.
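In the pseudocode, the business-as-usual branch becomes:

    if (formal_args.isEmpty()) {
        Value result = closure.body.eval(closure.env.extend(dictionary));
        if (actual_args.isEmpty()) {
            return result;
        } else {
            // over-complete application: feed the left-over arguments
            // to the function that the body evaluated to
            return apply((Closure) result, actual_args);
        }
    }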

This is pretty cool. We can have:
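    fn adder(x) {
        fn (y) { x + y }
    }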

and call it like adder(2, 3) or adder(2)(3), and we can have:
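    fn add(x, y) { x + y }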

and call it like add(2, 3) or add(2)(3).

One or the Other, or Both?

The question arises: if we have implicit Currying (partial function application), do we need explicit Currying (explicitly returning a function from a function)? The answer is a resounding yes! Consider:
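    fn bigfn(x) {
        if (expensive_calculation(x)) {
            fn (y) { cheap_op_1(y) }
        } else {
            fn (y) { cheap_op_2(y) }
        }
    }

    map(bigfn(arg), long_list)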

We’ve only called bigfn once, when evaluating the first argument to map, so expensive_calculation only got called once, and the explicit closure calling either cheap_op_1 or cheap_op_2 gets called on each element of the list.

If instead we had written:
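    fn bigfn(x, y) {
        if (expensive_calculation(x)) {
            cheap_op_1(y)
        } else {
            cheap_op_2(y)
        }
    }

    map(bigfn(arg,), long_list)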

Then the call to expensive_calculation would get deferred until the map actually called its argument function, repeatedly, for each element of the long_list.

The Hindley-Milner Algorithm

The Hindley-Milner Algorithm is an algorithm for determining the types of expressions. Basically it’s a formalisation of this earlier post. There’s an article on Wikipedia which is frustratingly terse and mathematical. This is my attempt to explain some of that article to myself, and to anyone else who may be interested.

Background

The Hindley-Milner algorithm is concerned with type checking the lambda calculus, not any arbitrary programming language. However, most (all?) programming language constructs can be transformed into lambda calculus. For example, the lambda calculus only allows variables as formal arguments to functions, but the declaration of a temp variable:
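    {
        int x = 5;
        use(x);
    }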

can be replaced by an anonymous function call with argument:
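    (λx.use(x)) 5

(with use standing in for whatever the code does with x).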

Similarly, the lambda calculus only deals with functions of one argument, but a function of more than one argument can be curried: λx.λy.e in place of a two-argument function, and so on.

Expressions

We start by defining the expressions (e) we will be type-checking:

    e ::= E        A primitive expression, e.g. 3.
        | s        A symbol.
        | λs.e     A function definition: s is the formal argument symbol and e is the function body (an expression).
        | (e e)    The application of an expression to an expression (a function call).

Types

Next we define our types (τ):

    τ ::= T          A primitive type, e.g. int.
        | τ0 → τ1    A function of one argument, taking type τ0 and returning type τ1.

Requirement

We need a function:

    [1]    f(ε, e) = τ

where:

    ε    A type environment.
    e    An expression.
    τ    A type.

Assumptions

We assume we already have:

    [2]    f(ε, E) = T

That is, a set of mappings from primitive expressions to their primitive types (from 3 to int, for example).

The following equations are logic equations. They are easy enough to read: everything above the line is an assumption, and the statement below the line should follow if the assumptions are true.

Our second assumption is:

    [3]    (s, τ) ∈ ε
           -----------
           f(ε, s) = τ

That is, if (s, τ) is in ε (i.e. if ε has a mapping from s to τ), then in the context of ε, s is a τ.

Informally, symbols are looked up in the type environment.

Deductions

    [4]    f(ε, g) = τ1 → τ
           f(ε, e) = τ1
           -----------------
           f(ε, (g e)) = τ

That is, if g is a function mapping a τ1 to a τ, and e is a τ1, then the application of g to e is a τ.

That is just common sense.

    [5]    ε1 = ε ∪ {(s, τ)}
           -------------------------
           f(ε, λs.e) = τ → f(ε1, e)

That is, if ε1 is ε extended by the mapping (s, τ) (i.e. assuming s is a τ), then the type of λs.e is τ → the type of the body e in the context of ε1.

This is just a bit tricky. We don’t necessarily know the value of τ when evaluating this expression, but that’s what logic variables are for.

Algorithm

  • We extend the set T of primitive types with an infinite set of type variables α1, α2 etc.
  • We have a function new which returns a fresh type variable each time it is called.
  • We have a function eq which unifies two type expressions.

We modify our function, part [4] (function application) as follows:

    [6]    τ0 = f(ε, e0)
           τ1 = f(ε, e1)
           τ = new
           ---------------------------------
           f(ε, (e0 e1)) = eq(τ0, τ1 → τ); τ

That is, if τ0 is the type of e0, τ1 is the type of e1, and τ is a fresh type variable, then, after unifying τ0 with τ1 → τ, the type of (e0 e1) is τ.

That deserves a bit of discussion. We know e0 is a function, so it must have a type τa → τb for some types τa and τb. We calculate τ0 as the provisional type of e0 and τ1 as the type of e1, then create a fresh type variable τ to hold the type of (e0 e1). Unifying τ0 with τ1 → τ checks that the argument type τa agrees with τ1, and binds τ to the result type τb.

Problem

Suppose e0 is the function length (which computes the length of a list of some unspecified element type τ2); then τ0 should come out as [τ2] → int (using [x] to mean list of x).

Suppose further that e1 is [‘c’], so that τ1 is [char].

We therefore unify:

    [τ2] → int

with

    [char] → τ

This correctly infers that the type of (length [‘c’]) is int. Unfortunately, in doing so, we have permanently unified τ2 with char, forcing length to have the permanent type [char] → int, so this algorithm does not cope with polymorphic functions such as length.
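To see the problem concretely, here is a tiny Python sketch of the unification step in [6]. All the names here are mine, and a real implementation needs more (an occurs check, for a start):

    class TypeVar:
        def __init__(self):
            self.instance = None          # set when unified with something

    class Fun:
        def __init__(self, arg, result):  # the type arg → result
            self.arg, self.result = arg, result

    class ListOf:
        def __init__(self, element):      # the type [element]
            self.element = element

    def prune(t):
        # follow bound type variables to whatever they were unified with
        while isinstance(t, TypeVar) and t.instance is not None:
            t = t.instance
        return t

    def eq(a, b):
        # unify two types; primitive types are plain strings here
        a, b = prune(a), prune(b)
        if isinstance(a, TypeVar):
            a.instance = b                # permanent, hence the problem
        elif isinstance(b, TypeVar):
            b.instance = a
        elif isinstance(a, Fun) and isinstance(b, Fun):
            eq(a.arg, b.arg)
            eq(a.result, b.result)
        elif isinstance(a, ListOf) and isinstance(b, ListOf):
            eq(a.element, b.element)
        elif a != b:
            raise TypeError("type mismatch")

    tau2 = TypeVar()
    length = Fun(ListOf(tau2), "int")     # [τ2] → int
    tau = TypeVar()                       # τ = new
    eq(length, Fun(ListOf("char"), tau))  # unify, for (length ['c'])
    print(prune(tau))                     # int: the correct answer
    print(prune(tau2))                    # char: length is now stuck at [char] → int

Run it and the second print shows τ2 irrevocably bound to char.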