All posts by Bill

An Abstract Machine

Another recent idea I’ve had is to build a compiler for a pscheme-like language (different syntax, extended functionality) that would target an abstract machine much as Java does (I know Java calls it a “Virtual Machine” but that’s an accident of history since Java originally targeted a real CPU and the replacement machine was a “virtualisation” of that.) I started looking at Parrot, but Parrot has built-in support for a single continuation only, and would make a second “failure” continuation doubly difficult to implement.

So I started thinking about building my own. The real joy of this is that I can prototype it in a high-level language like Perl, then re-write it into something fast like C or C++ later. I actually have a (mostly) working version in Perl for PScheme that runs about seven times faster than the interpreter, but it has issues.

Anyway I think the basic ideas are sound. In an interpreter a closure is a function plus an environment. In a compiled language that’s even simpler, a “function” is just a machine address, so a closure is a pair of addresses: address of environment on the heap and address of machine instruction in the core. A Success Continuation on the other hand has a lot more context. It contains the return address and the old environment, but also must restore any temporary variables, along with the previous continuation that will be returned to later. Finally a Failure continuation is a complete reset. It does not preserve any data when it is invoked, but completely restores the Abstract machine to a previous state.

So without further ado, here’s my abstract machine:

My Abstract Machine

PC
is just an integer index into the machine code table.
ENV
is the address of the current run-time environment (the compiler has pre-determined the locations of the data in the environment, the run time environment contains only the data.)
ARGS
would point to an array or linked list of actual arguments to a closure, before they are bound into an environment. They constitute the nearest lexical scope once inside the closure.
CONT
points at the most recent Continuation.
TEMPS
is just an arbitrarily sized array of temporary registers, their utility is questionable as there is no efficiency saving as there would be for a real CPU, but it makes the model closer to a standard compilation target.
FAIL
points at the most recent Failure continuation.
RESULT
is where a return value is passed to a Continuation.

Usefully, the different subcomponents (Closure, Continuation and Failure) are all contiguous, so invoking any of them is just a singlememcpy() from the stored object (on the heap) to the “cpu”. I don’t know if this is just blind luck or something deeper.

For each subcomponent there would be two abstract machine instructions: one to set up a new structure and one to invoke it. Specifically:

Closure
create_closure(PC, ENV)
create a new closure object (and put it in TEMPS somewhere)
call_closure()
memcpy(&AM, AM.TEMPS[n], sizeof(struct closure))
Continuation
create_continuation(PC)
allocate a new continuation struct, copy the relevant components of the AM into it, assign the PC to it directly, pointAM.CONT at the new continuation.
continue()
memcpy(&AM, AM.CONT, sizeof(struct continuation))
Failure
completely analagous to Continuation above.

So the plan is:

  1. Write the Abstract Machine in C or C++.
  2. Write an interpreter for the new language in Perl.
  3. Write a compiler for the new language in itself.
  4. Run the interpreted compiler on its own source code to produce bytecode for the compiler.
  5. Compile the compiler again using the bytecode compiler and diff the bytecodes: any difference is a problem.
  6. World domination.

So far, I have a parser and a partial interpreter for a new language called “F-Natural”: If you think of ML but with real Unification, backtracking and a javasript-like syntax you will get the basic idea. I’m still a long way from world domination but we live in hope. I may post some code here later.

A Programming Language in a Picture

Recently I’ve been thinking about how best to describe a really simple language so that the reader can just “get it” in one go. As you probably know I’m quite fond of UML and this is what I’ve come up with:

Language in UML

I’d hope that this diagram stands alone, but for the impatient I’ll describe it.

You can see that the implementation breaks down into three primary groups of classes: Syntax structures, Environment and Operations. The environment is the simplest and the best thing to look at first. You can see an abstract base class Env with concrete Frame and Empty subclasses. The Env class provides one concrete method: extend() which takes a dictionary (hash, or whatever) as argument and creates and returns a new Frame containing that dictionary and a parent pointer to the original environment. The Frame and Empty subclasses supply a lookup() method which, for the empty env just errors. For a non empty env it either looks up the symbol in the dictionary or calls lookup() on its parent, passing it the symbol.

The Syntax group of classes all inherit from a common abstract Exp base class that declares one abstract method: eval(env), that they all must implement. A Constantevaluates to itself. A Symbol looks itself up in the argument environment. A Conditional evaluates the test then either evaluates the consequent or the alternative.Function declarations evaluate by creating a new Closure object passing it the body and formal arguments to the function, along with the current environment.Application of an operation involves evaluating the operation expression in the current environment, evaluating the argument expressions in the current environment, then calling the resulting operations apply() method on the resulting evaluated arguments. Note that in operands.map(eval(env)) that eval is sort of Curried: the argument env is explicit, but the map supplies the objects whose eval methods are to be called.

Lastly, Operations themselves share an abstract Op base class with an abstract apply() method. An Op is either a Primitive or a Closure. The implementation of the primitive apply() is just an implementation detail, however the implementation of the Closure‘s apply() ties the whole thing up:

  1. Extend the captured environment (the one passed to new()) binding the formal arguments (passed to new()) to the actual arguments (passed to apply()).
  2. Evaluate the body of the closure in that new environment.

That’s about as simple as I can make it.

A Novice’s Impression of Archery

I feel at liberty to write about my impressions of archery in the secure knowledge that any decent archer will quite rightly not bother to read them, you have been warned.

I do love the sport though. It’s a perfect combination of the mental and the physical. When you watch world class archers repeatedly hitting a target about the size and colour of a large grapefruit at seventy meters it makes you realize how far you have to go. It’s quite a solitary sport in that you cannot (or at least should not try) to influence the performance of your competitors. The only way you can win is to focus on your own performance.

I shoot a compound bow. Currently it’s wound up to its maximum 52lb draw weight and I’m thinking I need to replace it soon to work up to 59lb (60lb is the maximum weight allowed in FITA competitions.)

So much of archery is about repetition: learn to shoot a 10, repeat. But in order to achieve that repetition you need to build up a routine that is much more mental than physical. The ability to repeat is keyed on minituae of hand and body position, and how do you remember from shot to shot, let alone from week to week, exactly where your hand touches your face or exactly how the bow sits in your hand?

A lot of that comes down to shooting form. Where your hand touches your face is called your anchor position and it is very important to be able to reproduce that. For a compound shooter with a release aid it is recommended that you straddle your jaw bone between the first and second fingers of your hand, then the peep sight can fine tune that position. But it’s very important that the peep sight is set to where your anchor position naturally falls, otherwise your hand will be wandering around your face trying to make the peep sight fit.

On that note, today I reduced the draw length of my bow by a whole inch, and what a difference! It’s like taking an ill-fitting suit to the tailors and having it fitted. I’d alredy adjusted my peep as discussed above, but something was still not right. basically the angle (back corner) of my jaw was between the tips of my second knuckles. Taking an inch off the draw length means my first and second fingers straddle the ridge of my jaw and the angle of my jaw sits neatly just in front of my first knuckles (nearest my palm.) It feels so much more stable that I can’t wait to try it out. Unfortunately we couldn’t shoot today because of high wind.