HSIE

The main part of a lazy functional program (and the thing that really drives evaluation) is the constant need for pattern matching.

Although functions are written mathematically as a set of separate cases, they obviously can't be evaluated like that: in a real machine, actual instructions have to be executed in an actual order.

In order to make this a reality, there is a "virtual machine" of sorts that has essentially four instructions: HEAD, SWITCH, IF and EVAL. The core algorithm in generating functional code is to convert systems of equations into this abstract code.

The intuitive overview

Consider this definition of take:
take n [] = []
take 0 Cons = []
take n (a:b) = a:(take (n-1) b)
How to proceed? It's possible to just work your way down the equations, trying each one in turn, but that's not actually sound. Apart from anything else, it's possible to end up evaluating an argument that doesn't terminate but was not needed to resolve which equation to use. While you can never be sure (any argument you choose to evaluate could fail to terminate), it's best to analyze the equations and choose a variable to evaluate which gives the greatest diversity of outcome.

Ideally, the system would check either that you have given complete equations or that any overlap is consistent. While planned, this is not yet implemented.

As humans, we find it intuitive that the second argument offers better payback than the first: if it evaluates to Nil, we can immediately select the first equation; if Cons, either of the other two may apply; and if it is neither of these, the code is in error.

If we were to select the first argument, on the other hand, even if the value is a number we gain no information: we need a second test (that it is zero) in order to conclude that we could choose the second equation. Otherwise (i.e. a non-zero number or some other type), either of the other equations may apply equally.

(As an aside, it's worth pointing out that, even before we get on to discussions of how types can be handled, the HSIE transformation is done BEFORE typechecking, so this is basically a syntactic transformation - we know that Nil and Cons are constructors and/or types because of their literal names (i.e. beginning with a capital) and not because we have any information about their types. The knowledge about constants such as 0 is built into the compiler and the language.)

Continuing with the example, we decide to first HEAD evaluate the second argument and then switch on the result. If it is Nil, we return the first equation (Nil), because that equation holds regardless of the value of the first argument. If it is Cons, then we HEAD evaluate the first argument and, if it is an integer, test it for zero. If true, we return the second equation (again Nil). In all other cases, we return the third equation. If the second argument is neither Nil nor Cons, then no equations match and we can only conclude that the definition of take is incomplete and return an error. (It would, of course, be possible to override this behavior by adding an extra case take n l = ...; it is left as an exercise to the reader to prove that, using the algorithm described here, that case would only be selected if none of the others matched.)
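The decision procedure just described can be transcribed directly into strict code. Here is a sketch in Python; the encoding of Cons cells as ('Cons', head, tail) tuples and Nil as ('Nil',) is my own illustrative assumption, not anything the compiler uses:

```python
# Hand-compiled `take`, testing arguments in the order chosen above:
# second argument first, then (only inside the Cons case) the first.
# Cons cells are encoded as ('Cons', head, tail) and Nil as ('Nil',);
# this encoding is an illustrative assumption.
def take(n, lst):
    if lst[0] == 'Nil':                    # first equation
        return ('Nil',)
    if lst[0] == 'Cons':
        if isinstance(n, int) and n == 0:  # second equation
            return ('Nil',)
        _, a, b = lst                      # third equation
        return ('Cons', a, take(n - 1, b))
    raise ValueError("take: incomplete definition")
```

Note that the Cons tests are nested exactly as in the prose: the zero test on n is only ever reached once the list is known to be a Cons.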

The outcome of this process would then be some virtual HSIE code not dissimilar to the following, noting that I have used v0 for the first argument and v1 for the second, and have likewise numbered the equations from 0:

HEAD v1
SWITCH v1 Nil
  EXPR 0
SWITCH v1 Cons
  HEAD v0
  SWITCH v0 Number
    IF v0 = 0
      EXPR 1
  EXPR 2
ERROR

In interpreting this code, the instructions are executed sequentially until an EXPR line is reached. That expression is then evaluated and execution ceases. Nested blocks are only examined if the leading statement is true, otherwise they are skipped and the next statement is tested. Note that while a set of SWITCH statements could, logically, be considered as a single block, there is no need for them to be listed in that way. At the end the ERROR statement stands in for a default EXPR case.

The actual algorithm

Since the key to functional evaluation is head evaluation, we need to know which variables we can evaluate and what possible values we expect them to have. We will work with a set of possible States, where each state contains at least a mapping from variables to a list of pattern/expression pairs (we will come back to other elements of the State as we need them).

Continuing our example of take, we can construct the following initial state. We know that we have two arguments, thus have two possible variables v0 and v1. They can each take three possible patterns (one for each equation).

v0 => 
   n => E0
   0 => E1
   n => E2
v1 =>
   Nil => E0
   (Cons) => E1
   (Cons { head: a, tail: b}) => E2
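As a concrete sketch, this initial state could be represented as a plain mapping; the encodings here (a string for a binding variable, an int for a literal, a tuple for a constructor pattern) are my own assumptions for illustration:

```python
# The initial state for take: each argument variable maps to an ordered
# list of (pattern, equation) pairs, one per equation. The pattern
# encodings are illustrative assumptions, not the compiler's format.
initial_state = {
    'v0': [('n', 'E0'), (0, 'E1'), ('n', 'E2')],
    'v1': [(('Nil',), 'E0'),
           (('Cons',), 'E1'),
           (('Cons', ('a', 'b')), 'E2')],
}
```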
Now, we also need to be aware that while we have used the same variable (n) in both equations E0 and E2, there's no real need for the user to do that; it could be anything. But as we unify these equations into a single piece of code (possibly reducing shared terms), we need to be sure we use the same set of variables everywhere. In order to do that, we substitute these arbitrary vN variables for the declared variables, being very clear where they came from, and introducing new "temporary" variables for nested patterns we may encounter along the way.

Since the first two expressions are free of variables, they do not need to be rewritten in this way. The third equation, however, depends on both v0 (as n) and the inner-bound variables a and b (for which we will introduce new variables v2 and v3). It thus becomes:

Cons v2 (take (- v0 1) v3)
Having built all of this initial state, we are now ready to start recursing. We will keep a list of such states, and as we process each one in turn it may produce more states that need processing. The algorithm will continue until all the states are processed. That the algorithm terminates is clear from the fact that new states are generated by matching constructors with arguments, and there must be a finite number of these.
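The recursion over states is the usual worklist pattern. A minimal sketch, with the process function standing in for everything described below (names are my own):

```python
# Driver loop: keep processing states until none remain. Processing one
# state may produce child states (one per constructor case), which are
# queued in turn. Termination follows because children only arise from
# matching constructors, of which there are finitely many.
def drive(initial, process):
    pending = [initial]
    handled = 0
    while pending:
        state = pending.pop()
        pending.extend(process(state))  # process returns any child states
        handled += 1
    return handled
```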

The first step is to build a "decision table" which looks at each variable in the current State and analyzes what rewards would come from switching on that variable. We can visualize this table as something like this:

v0 ->
  Number ->
    0 -> E1
    E0
    E2
  * ->
    E0
    E2
v1 ->
  Nil ->
    {} -> E0
  Cons ->
    { head: v2 tail: v3 } ->
      E1
      E2
In other words, we can see that if we select v0, we can divide the world into two cases, and if we select v1, it is divided into three (Nil, Cons and the default). Moreover, we can see that in the v0 case no choice immediately narrows things to a single equation, while for v1 the Nil case selects E0 outright.

We then apply a "scoring" algorithm to this that returns the average number of cases that still apply based on our constructor switching options, and then choose the variable that scores "lowest" (i.e. has least amount of confusion remaining).
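One plausible reading of that scoring step, in Python; the table encoding and the exact metric are my assumptions, and the real implementation may weigh things differently:

```python
# Score each variable by the average number of equations still live
# across its constructor cases; pick the lowest (least remaining
# confusion). Ties go to whichever variable is seen first.
def choose_variable(table):
    def score(cases):
        return sum(len(eqs) for eqs in cases.values()) / len(cases)
    return min(table, key=lambda var: score(table[var]))

# The decision table for take, flattened from the text above.
table = {
    'v0': {'Number': ['E1', 'E0', 'E2'], '*': ['E0', 'E2']},
    'v1': {'Nil': ['E0'], 'Cons': ['E1', 'E2']},
}
```

Here v0 scores (3 + 2) / 2 = 2.5 and v1 scores (1 + 2) / 2 = 1.5, so v1 wins, matching the intuition above.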

In this case, that choice is v1. In choosing this, we need to write the HEAD v1 instruction somewhere. Each State also needs to carry with it a location in which to write its instructions. The initial state writes to the start of the body.

We then generate new State objects, one for each of the constructor cases, and write a SWITCH instruction for each of the actual constructors. The new State objects have their "instruction pointer" set to the inside of the corresponding block. The "default" case (i.e. no matching constructor) takes the "leftover" instruction pointer after writing the SWITCH statements.

Finally, before moving on, we need to process any variables that were matched in the patterns for the cases that they matched. In this case, that only applies to the Cons case. We need to do two things. First, we need to issue additional instructions inside the SWITCH block to bind the new variables at runtime, like so:

SWITCH Cons
  BIND v2 v1.head
  BIND v3 v1.tail
(this was deliberately omitted earlier for clarity). Secondly, we need to add these variables to the state that is handling the Cons case to make sure they are later resolved.
v2 -> 
  a -> E2
v3 -> 
  b -> E2
Because these are simple variables rather than patterns, they are trivially resolved: the algorithm simply dismisses them, having already done the most important thing, which is to bind them.
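The bookkeeping for a constructor case might look like this sketch; the field names and the BIND spelling follow the example above, but the function itself is my own:

```python
# Emit BIND instructions for a matched constructor's fields, inventing
# fresh temporaries (v2, v3, ...) and reporting them so the child state
# can record the newly introduced variables.
def bind_fields(var, fields, next_temp):
    instrs, new_vars = [], []
    for field in fields:
        temp = f"v{next_temp}"
        next_temp += 1
        instrs.append(f"BIND {temp} {var}.{field}")
        new_vars.append(temp)
    return instrs, new_vars, next_temp
```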

Expressions

That just leaves us with the thorny issue of what those "EXPR" statements actually mean.

Well, by the time the code gets to the point where it wants to execute one of those statements, it must have bound all the variables in the patterns on the left hand side. Thus the only free variables in the expression are the ones that are "globally" scoped (i.e. defined in some scope, not in a lambda form). Assuming that we have some way of resolving those around about now, we can see that everything on the right hand side is one of:

a constant (such as 0 or []);
a variable bound during pattern matching;
a reference to a globally scoped name;
a function application.

This being a lazy functional language, all we need to do is return the concept of doing this calculation - it will be evaluated when needed.
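"Returning the concept of doing a calculation" is exactly what a thunk is. A generic sketch in Python, memoising as a lazy language would (this is an illustration, not the compiler's actual runtime representation):

```python
# A thunk: an unevaluated application, forced on first demand and then
# cached so that repeated forcing does no repeated work.
class Thunk:
    def __init__(self, fn, *args):
        self._fn, self._args = fn, args
        self._forced = False

    def force(self):
        if not self._forced:
            self._value = self._fn(*self._args)
            self._forced = True
            self._fn = self._args = None  # drop references once evaluated
        return self._value
```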

We can thus create (at the end of the code block) a set of closures which represent each of the function applications, introducing yet more locally bound variables to represent the values of these closures.

Finally, we can replace the "EXPR" statements with "RETURN" and an expression to return, which may be any of the first three forms above - any expression that wants to return an application needs to return the variable associated with the closure for that expression. For ease of additional steps, the RETURN expression should include all the nested closures that need to be created to support the closure being returned.

This leads to the following final code:

HEAD v1
SWITCH v1 Nil
  RETURN []
SWITCH v1 Cons
  BIND v2 v1.head
  BIND v3 v1.tail
  HEAD v0
  SWITCH v0 Number
    IF v0 = 0
      RETURN []
  RETURN v6 [v4, v5]
ERROR
CLOSURE v4
  PUSH -
  PUSH v0
  PUSH 1
CLOSURE v5
  PUSH take
  PUSH v4
  PUSH v3
CLOSURE v6
  PUSH :
  PUSH v2
  PUSH v5
Note that one of the features of a side-effect free language is that any two expressions which are the same will always have the same value. We can therefore at this stage reduce both code size and execution time by throwing away any closures which have exactly the same arguments in the same order. This doesn't occur in this case, but it is not hard to see that there are cases where it could occur either in a single equation or across a system of equations.
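That sharing step might be sketched as follows; the closure encoding (a name mapped to its pushed items, in order) is an assumption based on the code above:

```python
# Collapse closures whose bodies are identical, returning the surviving
# closures plus a rename map from discarded names to their survivors.
def dedupe_closures(closures):
    seen, keep, renames = {}, {}, {}
    for name, body in closures.items():
        key = tuple(body)
        if key in seen:
            renames[name] = seen[key]   # reuse the earlier identical closure
        else:
            seen[key] = name
            keep[name] = body
    return keep, renames
```

Any RETURN statement mentioning a discarded name would then be rewritten via the rename map.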