Monday, February 28, 2011

No stack? No problem.

“We have no badges. In fact, we don't need badges. I don't have to show you any stinking badges...”
                 — B. Traven, The Treasure of the Sierra Madre

I really blew it with that last post. Mea maxima culpa.

The IBM 704 didn't have a stack.
This was not unusual at the time; most computer languages then had no need of a stack at all. Algol, however, was a different story. Friedrich L. (Fritz) Bauer had, a few years earlier, invented the stack in order to support recursive-descent evaluation of expressions. This appeared to be the best way to support the recursion and block structure that ALGOL required. As Alan Perlis notes:
It was ALGOL 60 that really pointed a finger at, focused on the absolute necessity of learning how to define, control and manipulate stacks. ALGOL 60 would have been impossible to adequately process in a reasonable way without the concept of stacks. Though we had stacks before, only in ALGOL 60 did stacks come to take a central place in the design of processors.
         — Alan J. Perlis, Transcripts of Presentation


Not having a stack can make it a bit of a challenge to program, but Steve “Slug” Russell invented a different technique. The assembly code of LISP I was written in continuation-passing style. (Russell did not name this technique; the phrase “continuation-passing style” seems to have been coined by Sussman and Steele in 1975.)

When a program is converted to continuation-passing style, all control transfer can be expressed with jump and indirect jump operations. This does not eliminate the need for temporary storage — any piece of re-entrant or recursive code will need to save and restore values in use as it calls subroutines — but it allows you to separate the control flow from the temporary storage management.
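To make this concrete, here is APPEND rewritten in continuation-passing style, sketched in modern Common Lisp rather than 704 assembly (append-cps and its continuation argument k are illustrative names, not anything from LISP I):
(defun append-cps (l1 l2 k)
  ;; Nothing ever "returns" in the usual sense; each step hands its
  ;; result to the continuation K, so all control transfer is a jump
  ;; with arguments.
  (if (null l1)
      (funcall k l2)
      (append-cps (cdr l1) l2
                  (lambda (tail)
                    (funcall k (cons (car l1) tail))))))

;; (append-cps '(a b) '(c d) #'identity) => (A B C D)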
Closed subroutines in LISP are given a standard calling sequence. The arguments of the subroutines are placed in order in the AC, the MQ, and locations called ARG3 and ARG4.
After the arguments have been provided, the subroutine is entered by a TSX SUBR, 4 where SUBR is the location of the first instruction of the subroutine. The subroutine ends with its value in the AC and the instruction TRA 1,4 which returns control to the instruction after the TSX that set IR4. Note that any other program can be executed between this TSX and the TRA 1,4 so long as the contents of IR4 are restored to the value set by the TSX before the TRA is executed. A subroutine restores all index registers it uses to the value they had on entering the subroutine.
                 — S. Russell, Artificial Intelligence Project — RLE and MIT Computation Center, Writing and Debugging Programs, Memo 6
We can see this in the LISP 1.5 assembly code:
       REM
       REM APPEND(L1,L2)=
       REM (L1=0 YIELDS L2,1 YIELDS CONS(CAR(L1),APPEND(CDR(L1),L2))
A      HED
APPEND TNZ APNP1
       XCA
       TRA 1,4       ** Invoke continuation (return)

 APNP1 SXD AS1,4     ** Store continuation in memory
       TSX $SAVE,4   ** Call SAVE subroutine
       TXL     $END2,,CWR1+2      SAVE 2 ITEMS
       PDX 0,4
       CLA 0,4
       STO CWR1
       ANA DECM
       TSX APPEND,4  ** Recursive call to APPEND
       XCA
       LXA CWR1,4
       PXD 0,4
       TSX UNSAVE,4  ** Call UNSAVE to restore temps
       LXD AS1,4     ** Restore continuation to register
       TRA $CONS     ** Tail call CONS

Thursday, February 24, 2011

Where did the stack go?

[This post doesn't describe things the way I'd like to. Please see the subsequent post.]

The IBM 704 didn't have a stack.

This was not unusual at the time. Although stacks had been known for years, low-level languages made little use of them. Early FORTRAN certainly didn't need a stack, so IBM didn't bother with a hardware implementation.

Not having a stack does make it a bit of a challenge to program. For example, when you call a subroutine, where does the return address go?

The IBM 704 had an instruction TSX (Transfer and Set Index) that was designed for subroutine linkage. TSX would do two things. First, it loaded a register with the two's-complement value of the program counter. (The IBM 704 was much better at subtraction than addition. The two's complement of the program counter made it easier to dispatch because subtracting a two's complement number gives the same result as adding the uncomplemented number.) Second, it loaded the program counter with the subroutine address. Loading the program counter would cause an unconditional jump to the subroutine.
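A quick sanity check of that arithmetic, sketched in Common Lisp (an illustration, not 704 code): in n-bit arithmetic, subtracting the two's complement of a value gives the same result as adding the value itself.
(defun twos-complement (x n)
  ;; The two's complement of X in an N-bit word.
  (mod (- x) (expt 2 n)))

;; With 15-bit addresses (a 32K-word address space):
(let ((pc 1234) (addr 567) (n 15))
  (= (mod (+ pc addr) (expt 2 n))
     (mod (- pc (twos-complement addr n)) (expt 2 n))))
;; => T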

There was obviously a data path from the instruction register to the program counter — that's how the target address got from the instruction stream to the program counter. Likewise, there was a data path from the program counter to the register set. However, there doesn't appear to be a data path from the register set back in the other direction. So we can use the TSX instruction to call a subroutine, but once there we need to do a nasty trick to get back. That trick is to end the subroutine with an unconditional jump (opcode TRA). When we enter the subroutine, we expect the return address to be in one of the registers (by convention, LISP used register 4). The first thing the subroutine does is to save the return address in the target field of the TRA instruction. This modifies the code so that the unconditional jump at the end takes us back to where the subroutine is called.

But what if the subroutine needs to be re-entrant?

Ultimately, you need a stack (or a heap), and there are several options for implementing one. The obvious idea of allocating a block of memory and keeping a pointer to the top of the stack was not clearly the best one. Without dedicated hardware, operations like push and pop become quite expensive: you have to fetch the stack pointer from memory, perform an addition or subtraction, do an indirect memory reference, and write the new stack pointer back to memory. (And if you thought the x86 didn't have enough registers, the IBM 704 had only three index registers.)

Another method is to use a linked list as a stack. This is not obviously a bad idea. Once the linked list is created at program initialization time, pushing and popping can be done via pointer chasing. This avoids having to use the ALU for adding and subtracting the stack pointer. Furthermore, you didn't have to guess how much stack space to reserve: with only 32,768 words of memory, you didn't want to waste any on unused stack reserve. For these reasons, both IPL and LISP I used a ‘push-down list’ to implement re-entrant subroutines.
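To sketch the idea in modern Common Lisp (nothing like the actual IPL or LISP I code; the names are invented): the cells are allocated once up front, and push and pop merely splice cells between the stack and a free list, with no stack-pointer arithmetic at all.
(defvar *free-cells* (make-list 100))  ; preallocated at program init
(defvar *pdl* '())                     ; the push-down list itself

(defun pdl-push (item)
  (let ((cell *free-cells*))
    (when (null cell) (error "push-down list overflow"))
    (setq *free-cells* (cdr cell))
    (setf (car cell) item
          (cdr cell) *pdl*)
    (setq *pdl* cell)))

(defun pdl-pop ()
  (let ((cell *pdl*))
    (when (null cell) (error "push-down list underflow"))
    (setq *pdl* (cdr cell))
    (prog1 (car cell)
      (setf (car cell) nil
            (cdr cell) *free-cells*)
      (setq *free-cells* cell))))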

Wednesday, February 23, 2011

A harder puzzle

Suppose I have a procedure that needs to make temporary use of some expensive resource. The usual way to code this up is something like this:
(defun foo (x)
  (let ((resource nil))   ; or #f or whatever
    (unwind-protect
        (progn (setq resource (obtain-resource pool))
               (compute x resource))
      (if (not (null resource))
          (return-resource pool resource)))))
There are two problems with this. The first is a minor race condition where the resource can be lost if an asynchronous interrupt is taken right after obtain-resource returns but just before the value is assigned to the variable. We'll ignore that problem for now. The bigger problem is that we cannot easily spot the separation between the parts of foo concerned with resource management and the parts needed for computing the return value.

What we want to do is abstract out the resource management:
(defun foo (x)
  (call-with-resource resource-pool
    (lambda (resource)
      (compute x resource))))

(defun call-with-resource (pool receiver)
  (let ((resource nil))
    (unwind-protect
        (progn (setq resource (obtain-resource pool))
               (funcall receiver resource))
      (if (not (null resource))
          (return-resource pool resource))))) 

(One could use macros here, too.)
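For instance, a macro layered over the call-with-resource above might look like this (with-resource is an invented name):
(defmacro with-resource ((var pool) &body body)
  `(call-with-resource ,pool (lambda (,var) ,@body)))

;; foo then reads:
;; (defun foo (x)
;;   (with-resource (resource resource-pool)
;;     (compute x resource)))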

Now we can see that foo just supplies an additional argument to a call to compute. As an additional bonus, we have a simple idiom that correctly encapsulates the acquisition and release of our resource. Other code that has the ‘boilerplate’ needed to manage the resource can be simplified to use our wrapper, and resources are less likely to be leaked by an incorrectly copied version of that boilerplate. This sort of wrapper code is frequently found in Lisp or Scheme.

The puzzle is “How do you do this in Java?” To be more specific, suppose I have this code:
class Foo {
    ...
    public Answer method1 (int x) throws BadRequest {
        Resource resource = null;
        try {
            resource = pool.obtainResource();
            return computeAnswer (x, resource);
        } finally {
            if (resource != null)
                pool.returnResource(resource);
        }
    }

    private Answer computeAnswer (int x, Resource r) throws BadRequest {
        ....
    }
}

class Bar {
   ...
   public void method2 (Boolean flag, String name) throws NoSuchEntry {
        ExpensiveObject resource = null;
        try {
            resource = expensiveObjectPool.obtainResource();
            if (flag)
               resource.addName(name);
            doSomething(resource); 
        } finally {
            if (resource != null)
                expensiveObjectPool.returnResource(resource);
        }
    }

    private void doSomething (ExpensiveObject expo) throws NoSuchEntry {
        ....
    }      
}
and I want to write something more like this:
class Foo {
    ...
    public Answer method1 (int x) throws BadRequest {
        // WRAPPER CODE HERE
        return computeAnswer(x, resource);
        // POSSIBLE WRAPPER EPILOGUE
    }
 
    private Answer computeAnswer (int x, Resource r) throws BadRequest {
        ....
    }
}

class Bar {
   ...
   public void method2 (Boolean flag, String name) throws NoSuchEntry {
        // WRAPPER CODE HERE;
            if (flag)
               resource.addName(name);
            doSomething(resource); 
        // POSSIBLE WRAPPER EPILOGUE HERE
    }

    private void doSomething (ExpensiveObject expo) throws NoSuchEntry {
        ....
    }      
}
Where the wrapper and possible epilogue (if needed) are some relatively simple, easy-to-use constructs like a method call or inner class declaration or whatever.

An easy puzzle

Suppose I have some code like this:
(defun foo (x)
  (wrapper-macro
    (bar 42 x)))
Where wrapper-macro simply prints ‘Normal exit’ or ‘Throw’ as control returns from bar through foo. (If bar returns normally, all the return values are returned from foo after wrapper-macro prints. If bar exits via a throw, then the throw continues after wrapper-macro prints.)

The challenge: Implement wrapper-macro.

(If you wish, imagine that wrapper-macro simply expands to this:
(defun foo (x)
  (wrapper-function
    (lambda ()
      (bar 42 x))))
and the challenge is to implement wrapper-function.)
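For readers who want to check an answer: one way to implement wrapper-function in Common Lisp is with unwind-protect and a flag that is set only on normal return (stop here if you'd rather work it out yourself).
(defun wrapper-function (thunk)
  (let ((normal-exit nil))
    (unwind-protect
        (multiple-value-prog1 (funcall thunk)
          ;; Reached only if THUNK returned normally; all the
          ;; return values pass through untouched.
          (setq normal-exit t))
      (format t "~&~:[Throw~;Normal exit~]~%" normal-exit))))

;; WRAPPER-MACRO is then a thin wrapper:
(defmacro wrapper-macro (&body body)
  `(wrapper-function (lambda () ,@body)))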

Friday, February 18, 2011

An early LISP program (circa Feb. 1960)

       TST M948-371-P. FOX-EXERCISE 1
DEFINE
(((COLLAPSE,(LAMBDA,(L),(COND,
   ((ATOM,L), (CONS,L,NIL))
   ((NULL,(CDR,L)),
      (COND,((ATOM,(CAR,L)),L),(T,(COLLAPSE,(CAR,L)))))
   (T,(APPEND,(COLLAPSE,(CAR,L)),(COLLAPSE,(CDR,L))))
))))) ()
COLLAPSE ((((A,B),((C))),((D,(E,F)),(G),((H))))) ()
COLLAPSE ((A,(B,(C,(D,(E))),F,(G,(H,J))))) ()
COLLAPSE ((((((A),B),C),D),E)) ()
   STOP))))))))))STOP
        FIN M948-371-P. FOX-EXERCISE 1
Notes:
  1. This example and the quotes below were taken from the LISP I manual by Phyllis Fox. The output from the program contains these two lines:
    APPLY OPERATOR AS OF SEPTEMBER 1, 1959
    THE TIME IS NOW 2/16 1139.5
    so we can date the sample run.
  2. Each line above was punched on a separate card.
  3. The TST (or SET) card indicates the beginning of the program. The FIN card indicates the end.
  4. This is followed by triplets of the form <function>, <argument-list>, <environment>. Instead of read, eval, print, the main loop is more like read, read, read, apply, print.
  5. The STOP card has lots of extra close parens in order to keep the reader from consuming the entire card deck if the programmer didn't correctly close all the parens.
  6. “The read program reads until it finds the correct number of final parenthesis.”
  7. S-expressions were originally supposed to have commas between the elements. The LISP I manual explains: “The commas in writing S-expressions may be omitted. This is an accident.” The author of the reader took a shortcut and treated whitespace as commas. “The current version of the read program interprets a blank (space) as a comma, so that no blanks are allowed within atomic symbols.”
  8. “LISP II will be able to handle both integers and floating point numbers, but in LISP I integers are not allowed. ... Numbers are brought into a LISP program by being defined within S-expressions, and they must always be quoted. For example, the following expression is correct, (SUM,(QUOTE,0.6),X)”
  9. “Values of compiled functions are computed about 60 times faster than the S-expressions for the functions could be interpreted... After a function has been compiled, it can be used as if it had been defined, but of course it will run much faster than it would have as an interpreted expression.”

Thursday, February 17, 2011

Observations on the Feeling of the Ugly and Insignificant

vsm001 said...
I neglected to add my biggest objection to Louis's project, the lack of a strictly (or lazily) functional subset.

Louis told me he is planning on writing a book about interpreter implementation. I have offered to write prolegomena for any future meta-circularity, but Louis doesn't seem too keen on the idea.

Anyway...
vsm001 went directly to the meta-point of many of my posts: This is no way to go about designing a language.

BUT, I wanted to explore a particular issue in the design space of languages, not whether we should be in the space in the first place. In particular, let us assume some plausible rationale for inventing a language and then ask the questions that vsm001 raised.

vsm001 asked,
What are the characteristics of problems that Reason is well-suited to solve?

I'll rephrase this. What are the characteristics of problems that would be better solved by a language that has two complementary ‘modes’ of operation as opposed to a language in which a single mode suffices? In other words, neither of the two modes of Reason taken separately is quite powerful enough to be Turing complete; you really need both of them. Under what circumstances would this provide an advantage over using a single, Turing-complete mode?

Tuesday, February 15, 2011

Critique of Reason

A language will not succeed without a good name. I have recently invented a very good name and now I am looking for a suitable language.
                — Donald Knuth


Louis Reasoner is developing a new computer language. I'm letting him write this post.

“I have already done the most important part: the name of my language is Reason.

“The second most important thing is to come up with a trope. This will enable me to defend my language decisions with a pithy reply and call it a design philosophy. My trope is ‘reasonableness’. Reason will adhere to the philosophy of reasonableness. Anything in my language is, by definition, reasonable. Anything omitted is naturally unreasonable. Here is some more design philosophy:
  • Small is more reasonable than big.
  • Fast is more reasonable than slow.
  • Useful is more reasonable than useless.
  • Good is more reasonable than bad.
  • etc., etc., etc....
I'll generate more later.

“After the name and the self-styled philosophy, the next important thing is the syntax. The syntax of Reason is complicated, but reasonable. Bear with me. Although it might be a bit difficult to explain, it is, after sufficient practice, natural to use.

“Reason borrows heavily from Lisp and Scheme. Most of the programming in Reason is done in ‘Lisp’ mode. Lisp mode is exactly what you think it would be: you write programs just like you would in Lisp, but with a few modifications to make things more reasonable.
  • Prefix notation for everything by default, with these exceptions:
    • Infix notation for arithmetic. — Let's just capitulate on this one. Operator precedence will be fleshed out later.
    • Excess parentheses are ignored where they are unambiguous. You can pretty much always add parentheses to make something clearer or to override operator precedence.
    • Function calls have the operator on the outside of the parentheses. Not only is this more natural, it avoids weird interactions with operator precedence. Arguments are separated by commas.
    • Postfix notation for post-increment and post-decrement. — I just like these.
  • No ‘special’ forms. We get rid of
    • define — Lisp code tends to creep towards the right margin. Removing internal definitions makes the code ‘flatter’ and therefore easier to read. (Besides, it's hard to efficiently compile internal definitions.)
    • begin — No good reason to include it.
    • cond — Too many parentheses.
    • let — Again, too many parentheses.
    • lambda — This is just too advanced. Mere mortals can define a named procedure like anyone else would.
    • letrec — As if lambda wasn't confusing enough! Who comes up with this stuff?
    • quote — Too confusing. Actually, I wanted to leave quote in, but with infix and postfix in the mix I couldn't figure out where the quotation ended. I decided to punt.
  • Assignment is allowed. This is technically a ‘special form’, but it is so useful that I had to leave it in.
“Lisp mode is distinguished by its use of nesting as the primary means of combination. In fact, sequential combinations are disallowed. Lisp mode constructs are created compositionally from other Lisp mode constructs or primitives. Lisp mode is where most of the primitive procedures of the language are available.

“In order to radically simplify Lisp mode I had to introduce ‘Program’ mode. Program mode is more imperative than Lisp mode, so I tried to reduce the functionality and power of Program mode so that people would favor Lisp mode. Program mode has these features:
  • Special forms — Flow control, variable declaration and binding, conditionals, etc. are available in Program mode.
  • No artificially rigid syntax — Program mode is not really infix, prefix, or postfix. Each construct defines its own syntax. This improves readability for the built-in special forms. For example, the if construct can have a traditional else clause rather than just having two or three subconstructs like in Lisp.
  • No user-defined special forms — The user doesn't have to memorize any additional special forms beyond the built-in ones.
  • No arbitrary nesting — In Program mode, you can only nest a construct if the enclosing construct permits it. There is nesting, it is just controlled.
  • Concatenation as a means of combination — Lisp mode is obviously recursively evaluated. This isn't needed in Program mode because arbitrary nesting is disallowed. Instead, user constructs in Program mode are built by concatenation of Program mode primitives.
  • Easy switching to Lisp mode — You can always concatenate a Lisp mode construct with a Program mode construct. Originally, I didn't want to do this, but I ended up duplicating a lot of the Lisp mode functionality. By allowing the programmer to switch to Lisp mode pretty much whenever he wants, I avoided having to duplicate a lot of code.
“When you write a procedure in Reason, you start in Program mode. As I mentioned, you can switch to Lisp mode whenever you feel like it, but there is no way to switch back. (This isn't strictly true: you can mimic the ability to switch back with a function call, since a function body starts in Program mode.) You can define a procedure in either Program mode or Lisp mode. That is, you can explicitly mark a procedure as available in Program mode only or in Lisp mode only. Of course, since you can always switch to Lisp mode, it makes no discernible difference if you define a procedure as Lisp mode only; you can use it in Program mode. But if you define a procedure in Program mode only, you simply cannot use it from Lisp mode. (It will be visible, you just can't call it.)

“I could go on for hours about how variables are looked up, but I don't want to bore everyone. I would like some feedback about my ideas so far, though.”

How about it? Would any readers care to critique the design of Reason?

Monday, February 14, 2011

I whine too much.

Mea culpa.
   ...
   for (Foo foo : permute(fooList)) {
        operateOnFoo (foo);
    }
   ...

  private static final Random rng = new Random();

  // Return a shuffled copy; Guava's Lists.newArrayList copies its
  // argument, so the original list is left untouched.
  private static <E> List<E> permute (List<E> list) {
    List<E> answer = Lists.newArrayList (list);
    Collections.shuffle (answer, rng);
    return answer;
  }

Tuesday, February 8, 2011

More about shuffle.

In my last post, I asked why
public static void shuffle(List<?> list) was given a ‘B’ grade. A few readers didn't see the problem, so I thought I'd clarify.

I was writing code to perform some operations on the elements of a collection. Multiple copies of the code will be running. I want to avoid having each copy synchronize with the others, so it makes sense to iterate over the list elements in a non-deterministic order. This will not prevent multiple instances of the code from occasionally ‘colliding’ on a single object, but it will make this case the exception rather than the rule.

I originally wrote the code using a Set<Foo> rather than a List<Foo> because there is no natural ordering of Foo objects. The code looked like this:
    for (Foo foo : fooSet) {
        operateOnFoo (foo);
    }
where fooSet was an instance variable declared as Set<Foo>.

A Set is an unordered abstraction. Obviously, an iterator on a Set will return the objects in some order, but abstractly, there is no meaningful way to ‘change’ the order of the Set itself. A List, on the other hand, is an ordered abstraction that imposes an external order on the elements. (An array imposes the additional, unnecessary constraints of compact indexing and random access.) Changing the instance variable from Set<Foo> to a List<Foo> was trivial.

The problem is now to ‘iterate pseudorandomly’. Pseudorandom iteration is not an uncommon operation, so I assumed there was a way to do it. Pseudorandom iteration is not common enough that a special iteration construct would be defined for it, so I assumed I would simply use a normal iterator and iterate over a permutation of the list. What immediately came to mind was something like:
    for (Foo foo : fooList.permute()) {
        operateOnFoo (foo);
    }
That doesn't work.

So Java doesn't provide a ‘permute’ method on a list. I vaguely recalled that Java had something called ‘shuffle’ rather than ‘permute’. However, the List class doesn't have a ‘shuffle’ method either. None of the methods on a List will permute it. That is strange because permuting a list is a fairly common operation.

I was sure that I had seen ‘shuffle’ somewhere, so I hauled out grep to search for some code that called it. I found some and chased down the documentation. The shuffle method is a static method in the Collections class. This means that I have to use a different syntax:
    for (Foo foo : Collections.shuffle(fooList)) {
        operateOnFoo (foo);
    }
That doesn't work.

The shuffle method mutates its argument but does not return a value. This means I cannot simply call the method as an expression, but I have to hoist the call to the nearest ‘Command’ level:
    Collections.shuffle(fooList);
    for (Foo foo : fooList) {
        operateOnFoo (foo);
    }
That works, but now I have to revisit my original design: is it OK to directly shuffle the fooList, or should I copy it first? Fortunately, I don't need a copy.

To recap, what I wanted to write was this:
    for (Foo foo : fooList.permute()) {
        operateOnFoo (foo);
    }
what I had to write was this:
    Collections.shuffle(fooList);
    for (Foo foo : fooList) {
        operateOnFoo (foo);
    }
By making shuffle ‘return’ void, the author has relegated the method to command contexts only. I happened to want to use it in an expression context.

I'm sure a lot of readers are wondering “So what?” It is in fact a trivial point, and now that I've discovered how to permute a list I doubt I'll encounter this particular problem in the future. But the point is that Java distinguishes between commands and expressions. You are prohibited from using a command where an expression is called for. However, you are permitted to use an expression where a command is called for. The programmer has to keep track of which things are commands because they require a command context. The programmer doesn't have to remember which things are expressions because they can be used in either context. By making shuffle a command, the author has added yet another item to my mental “remember these aren't expressions” list. If shuffle returned the shuffled list, I wouldn't have to think about it at all.
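For contrast, the Lisp convention is for even a destructive operation to return something useful, so it can appear in expression position. A sketch (shuffle! and operate-on-foo are made-up names; the algorithm is an ordinary Fisher-Yates shuffle):
(defun shuffle! (sequence)
  ;; Destructively permute SEQUENCE, but return it anyway so the
  ;; call composes as an expression.
  (loop for i from (length sequence) downto 2
        do (rotatef (elt sequence (random i))
                    (elt sequence (1- i))))
  sequence)

;; (dolist (foo (shuffle! foo-list)) (operate-on-foo foo))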

Arcane Sentiment noted that “Java uses the return type as a way of declaring that the operation has side effects...”. I guess I'm weird because I use the return type to declare the type of object that will be returned.

Friday, February 4, 2011

What's wrong with this?

The java.util.Collections library has this method:
public static void shuffle(List<?> list)

Were this homework, I'd give it a solid ‘B’. Readers are encouraged to suggest improvements.