Driving the Scsh disassembler from Geiser

(Note: This post refers to functionality provided by geiser-scsh.)

Most Scheme 48 and scsh programmers are familiar with the \,dis command. It allows you to disassemble Scheme code (a procedure or continuation) into the “assembly” understood by the Scheme 48 virtual machine. Here’s an example using scsh’s fork/pipe procedure:

> ,dis fork/pipe
fork/pipe
  0 (protocol 0 +)
  4 (make-env 1)
  7 (global fork)
 10 (push-local0 1)
 13 (push)
 14 (global really-fork/pipe)
 17 (call 2)

This is pretty cool. However, I thought it would be even better if Geiser could run the disassembler on a region of Scheme code, and pop open the disassembler output in a new buffer. Geiser already supports similar functionality for macro-expansion via GEISER-EXPAND-REGION. There’s no reason why it shouldn’t do the same for the disassembler. So that’s what I taught it to do – in this screenshot I’m disassembling the LETREC expression:

../img/scsh-disassembler.png

(I have to admit I was inspired to work on this by my jealousy of the disassembler functionality available to Common Lispers via the SLIME-DISASSEMBLE-{DEFINITION,SYMBOL} commands in SLIME.)

How it Works

Here’s how it works:

  • Highlight a region of Scheme (scsh) code in a buffer
  • Call ‘M-x geiser-scsh-disassemble-region‘ (a.k.a., ‘C-c RET C-d‘).
  • Emacs sends the Scheme code to the running scsh process for disassembly via a newly exported Geiser macro, GE:DISASSEMBLE. This macro wraps the code in a LAMBDA before sending it to the disassembler to work around the fact that the S48 DISASSEMBLE procedure won’t accept arbitrary Scheme expressions such as (+ 1 1). (Wrapping expressions in LAMBDA like this is not ideal for reasons I’ll explain below.)

Future Work

This implementation is a proof of concept – don’t believe in it. Think of it as a prototype of how this functionality could work if built into Geiser proper. Here are some of the issues with it from an implementation perspective:

  • It doesn’t use the correct Geiser protocols for sending Scheme code to evaluate, instead just piping in raw strings. This was expedient because of the way s48/scsh use non-reader-friendly strings like {#Uncategorized} as the final lines of output for procedures whose outputs are not defined by the standard. I think this can be fixed by coming up with a better implementation of the Geiser code evaluation protocols (in scsh/geiser/evaluation.scm) so that they handle all of the weird cases in S48 output.
  • Related to the previous point, I’m doing some ugly regex stuff on the stringified output of the disassembler to make it nicer before piping it into the temp buffer.
  • This functionality should really be added to Geiser itself, via the GEISER-DEBUG-* namespace. Then it would be forced to address both of the above points. Right now it’s just an ugly little hack in geiser-scsh.el. In principle, with the right infrastructure in GEISER-DEBUG-*, there’s nothing preventing a Guile or Racket implementation (here’s the Guile disassembler in action – you can see that it’s not so different from S48):
    scheme@(guile-user)> (define (foo n) (expt n n))
    scheme@(guile-user)> ,x foo
    Disassembly of #<procedure foo (n)>:
    
       0    (assert-nargs-ee/locals 1)      ;; 1 arg, 0 locals
       2    (toplevel-ref 1)                ;; `expt'
       4    (local-ref 0)                   ;; `n'
       6    (local-ref 0)                   ;; `n'
       8    (tail-call 2)                                         at (unknown file):5:16
    
    • The disassembler, like the macro-expansion functionality, should be able to work recursively. That is, in places where the S48 assembly makes procedure calls like (global ge:really-fork/pipe) (as it does in our first example way at the top of this post), you should be able to ask for the disassembled output of GE:REALLY-FORK/PIPE as well, all the way down to the core primitives. Currently, there are cases where you still need to call \,dis <PROCEDURE> at the command prompt. I think this limitation is created by the way we wrap the expression in a LAMBDA before sending it to the disassembler. A better design is needed, one that (for example) calls PROCEDURE? on the proposed input, and then decides whether the expression needs to be wrapped in a LAMBDA before going to the disassembler.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s