Tag Archives: scheme

The Sentinel File Pattern

In this short essay I’ll describe the “sentinel file” pattern, which I recently used when writing a command-line tool to use at $WORK for interacting with our web API.

The essence of the sentinel file pattern is that you use a certain file’s last-modified time as a record against which you compare other time-based values.

It is useful in many contexts, such as software builds; in the context of web APIs, it can be used to track whether you will need to reauthenticate with the API before you fire off a bunch of API calls.

The recipe is essentially this:

  • Update a sentinel file F’s timestamp at time T.
  • When you are about to take an action such as making an API call, check whether the current time, T’, is greater than your web API’s timeout value, V, plus the sentinel file’s existing timestamp, T.

We can translate this into Scheme as follows (this is scsh, to be exact), where:

;; F = sentinel-file
;; T = (file-last-mod sentinel-file)
;; T' = (time)
;; V = api-timeout-value

(define (sentinel-expired? sentinel-file)
   (> (time)
      (+ (file-last-mod sentinel-file) api-timeout-value)))

Note that TIME and FILE-LAST-MOD are part of the scsh POSIX API.
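To make the recipe concrete, here is a hedged usage sketch. LOGIN! and API-CALL are hypothetical stand-ins for whatever your tool actually does; the sentinel’s timestamp is refreshed by shelling out to touch(1) with scsh’s RUN form.

(define sentinel-file "/tmp/my-api-sentinel")

(define (ensure-logged-in!)
  (if (sentinel-expired? sentinel-file)
      (begin
        (login!)                        ;reauthenticate with the web API
        (run (touch ,sentinel-file))))) ;update F's timestamp to "now"

(define (call-api . args)
  (ensure-logged-in!)
  (apply api-call args))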

This pattern is much more efficient than storing some kind of “am I logged in?” value in a JSON/YAML/XML/s-expression config file that has to be read in and parsed on every invocation and written out from time to time.

I debated whether to write about this simple technique at all because it seems like an old trick that many people know. However, I’m going to assume that I am not unique, and that there are lots of people out there who could benefit from using this technique when the right situation arises.

The Debugger is a Notebook

penrose-tiling-based-modular-origami.jpg

My style of programming has changed since the ODB. I now write insanely fast, making numerous mistakes. This gives me something to search for with the ODB. It’s fun.

– Bil Lewis, creator of the Omniscient Debugger

I. Programmers and Poets

In this short essay I will explore some similarities between the way (some) programmers work when doing exploratory programming and the way poets write poems. I will also sprinkle in some attempts to make the case that a good debugger is core to the experience of composing a new program that you don’t understand very well yet, and compare that with the experience of writing a poem. I know a lot about the former because I am not a very good programmer, so many programs that explore computer science concepts are “exploratory” for me; I know a lot about the latter because I am a reasonably good poet who has written many poems (which is to say that I have edited many poems, which is really more important).

This work is largely inspired by:

  • The experience of working on programs for SICP exercises and getting popped into the MIT Scheme debugger a lot! 1
  • Using the Scheme 48/scsh inspector a bunch while working on geiser-scsh
  • Writing a lot of poems

II. Generating {Program,Poem} Text

Computer program texts are compressed descriptions of computational processes designed to be experienced by computers and humans. Similarly, poems are compressed descriptions of cognitive and emotional processes designed to be experienced by humans.

Both artifacts strive to encapsulate something that was understood by the author(s) at one point in time and convey that understanding to a reader at another point in time (human or machine). In poetry world, there are a number of different ways to work. There are ostensibly some writers who think really hard for a moment and write a fully-formed sentence. Then they think for a few moments more and write down another fully-formed sentence. And so on.

In reality, there are very few poets who work this way. Most people work using an approximation of what Sussman beautifully describes as “problem-solving by debugging almost-right plans”. 2 This is actually how human beings create new things! As my professor told our writing workshop, “I can’t teach you how to write. I can only teach you how to edit your own work”. Few people write well, and fewer edit well. But in the end, writing and editing are actually the same thing. When you first edit a poem, you may correct minor errors in the text. The more important work is “running the program” of the poem in your head, reading it over and over, reciting it aloud, testing whether it achieves the aesthetic purpose you have set for it. You will add a pronoun in one place, and replace an adjective in another. You might remove the last line, or add another stanza entirely. Do this for long enough, and you may find the poem has changed substantially over the course of having been “debugged”. It may also achieve a goal that you didn’t know existed when you began writing it. I suspect there is something very similar at work when people are doing exploratory programming sessions.

III. Debugger as Crutch/Enabler

Debuggers are seen by some as a crutch. I agree that debuggers are a crutch. There’s a reason crutches were invented: without them, you would have to crawl along, dragging your broken leg behind you in the dirt. And we all have a “broken leg” of sorts when we’re working on a problem we don’t understand very well.

I’d like to propose a better metaphor for debuggers. The debugger is a notebook where you can sketch out different versions of your program. You may correct minor errors in a variable declaration, or change a parameter to a procedure. You might redefine an existing procedure as a higher-order procedure that replaces two or three more verbose ones. And so on. All inside the debugger!

A sufficiently powerful debugger will give you the freedom to sketch out an idea quickly, watch it break, and play around in the environment where the breakage occurred, reading variable bindings, entering new definitions, etc.

I think of this style of programming as “sketching in your notebook” because you don’t write poems by staring at a blank sheet of paper for two minutes and then carefully writing a fully-formed sentence. You have an initial idea or impulse, and then you express it as fast as you can! You write down whatever parts of it you can manage to catch hold of, since your brain is working and associating much faster than your pen can really capture. What you end up with is a pile of things, some of which you will discard, some of which are worth keeping just as they are, and some of which are worth keeping but non-optimal and will need to be rewritten. If you actually have an idea worth expressing, you are in much greater danger of not capturing something than you are of making a mess. You will always start by making a mess and then cleaning it up 3.

I submit that a sufficiently powerful, expressive debugging environment is as necessary to the programmer as a pocket notebook to the poet.

Interesting Reads

These essays explore writing, debugging, and thinking in more depth:

(Image courtesy fdecomite under Creative Commons License.)

Footnotes:

1

For more information about how cool the MIT Scheme debugger is, see Joe Marshall’s informative blog post.

2

This technique is mentioned on his CSAIL page here. For more information, see the link to his paper elsewhere on this page.

3

You may enjoy an interesting essay with this title: Make a Mess, Clean it Up!

Make your own iterators in Scheme using closures

idle-machines

Introduction

I recently bought a copy of Higher Order Perl (also known as HOP) by Mark Dominus. In chapter 4, he introduces iterators. At a high level, an iterator is a black box with an input funnel, an output funnel, and a big red “NEXT” button. You put something into the funnel at time T; then, at some later time T+n, you press the “NEXT” button and something useful comes out.

Iterators are useful in cases when you want to access a potentially huge list of things one at a time, but you don’t have access to a computer with infinite memory.

In this post we’ll look at a moderately useful iterator example: a directory walker that recurses down a directory tree on your file system. The implementation is given in Scheme (specifically, the scsh dialect); the code and the descriptions thereof are heavily inspired by Dominus’ excellent book. However, this post contains the work of an amateur. Any errors or omissions in the following code and explanations are mine.

Iterators are made with closures

In order to build a useful iterator, we need to have a basic understanding of closures. A closure is just a function packaged up with an environment. The environment allows us to store up state between function calls. As noted above, it’s easiest just to think of it as a little black box that we can pass around in our program.

In Scheme and Common Lisp, you build functions with LAMBDA. Here’s a very simple closure example in Scheme, translated from the first example given in HOP:

(define (upto m n)
  (set! m (- m 1)) ;needed since we don't have post-increment ($i++)
  (lambda ()
    (if (< m n)
        (begin (set! m (+ m 1)) m)
        #f)))

This is a function that returns a function (a closure). To use the closure returned by UPTO, assign it to a variable and call it whenever you want the next value. When the iterator is exhausted, it will return #f.

> (define 1to10 (upto 1 10))
> (1to10)
1
> (1to10)
2
...
> (1to10)
9
> (1to10)
10
> (1to10)
#f

A trivial file tree iterator

A more interesting and practical use of this technique is to walk a directory tree on your computer. In the Scheme procedure below, we take a list of directories and then build and return a closure that walks the directory tree breadth-first, handing back the name of the next file or directory each time it’s called.

(define (dir-walk queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files)))
                          file))
                       ((file-regular? file) file)
                       (else #f)))))))

The important part to notice is the inner LAMBDA. This is where we create a closure by packaging up a procedure with an environment. The closure’s environment remembers the contents of the variable QUEUE in memory, so we can ask for the elements of QUEUE at our leisure.

Here it is running on my machine:

> (define dir-iter1 (dir-walk '("/Users/rloveland/Desktop/current/")))
> (dir-iter1)
"/Users/rloveland/Desktop/current/XProlog.jar"
> (dir-iter1)
"/Users/rloveland/Desktop/current/all-jira-issues.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/automate-campaign-setup.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/buflocal.scm"
> (dir-iter1)
"/Users/rloveland/Desktop/current/bundle.js"

Just for fun, here’s a version of the same function in Common Lisp. It’s essentially the same as the scsh version, save that it uses the nice CL pathname machinery instead of appending raw strings, and also makes use of the CL-FAD convenience library.

(defun dir-walk (queue)
  (let ((file nil))
    (lambda ()
      (if (not (null queue))
          (progn
            (setf file (car queue))
            (setf queue (cdr queue))
            (cond ((cl-fad:directory-pathname-p file)
                   (let ((new-files (cl-fad:list-directory file)))
                     (setf queue (append queue (mapcar #'(lambda (filename)
                                                           (merge-pathnames filename file))
                                                       new-files)))
                     file))
                  ((cl-fad:file-exists-p file) file)
                  (t nil)))))))

Usage is also essentially the same:

CL-USER> (defvar *tree-walker* (dir-walk '(#P"/Users/rloveland/Desktop/current/")))
*TREE-WALKER*
CL-USER> (type-of *tree-walker*)
FUNCTION
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Desktop/current/"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.debuggerDefaults"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.DS_Store"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hi"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hs"

A slightly less trivial file tree iterator

Listing out files and directories is nice, but not that useful. We’d like a way to see only those files and directories that have some interesting property.

This can be accomplished by passing in another argument to our DIR-WALK function: this argument will be yet another function that will test the current file to see whether it’s interesting to us. It’s pretty easy to change DIR-WALK to accept a function argument, INTERESTING?. This arbitrary function is used to check the file to see if we care about it.

This time around, when we build our internal queue, we use a call to FILTER to make sure that only interesting files get added.

(define (dir-walk* interesting? queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue
                                              (filter interesting?
                                                      (map (lambda (filename)
                                                             (string-append file "/" filename))
                                                           new-files))))
                          (if (interesting? file) file)))
                       ((interesting? file) file)
                       (else #f)))
          #f))))

And here it is in use; in this example, we pass an INTERESTING? function that asks for only files that are not marked as executable:

> (define dir-iter2 (dir-walk* (lambda (f) (file-not-executable? f)) '("/home/rml/Desktop/current")))
> (dir-iter2)
> (dir-iter2)
"/home/rml/Desktop/current/A.hi"
> (dir-iter2)
"/home/rml/Desktop/current/A.hs"
> (dir-iter2)
"/home/rml/Desktop/current/A.o"
> (dir-iter2)
"/home/rml/Desktop/current/ABBREV.xlsx"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.class"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.java"

Conclusion

There are still bugs and unexpected behavior in DIR-WALK*. For example, on the first call to the resulting iterator there is no result due to the one-armed IF. There are also strange things that happen if we want to filter out directories from our results, since we mix together the FILE-DIRECTORY? check and the INTERESTING? check inside the iterator. However, despite these small nits, it can still do useful work on my machine 1, and it’s a good enough example of using closures to build iterators.
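If those nits bother you, here’s one hedged way to smooth them over (a sketch, not the only possible design): loop internally until we find something interesting or exhaust the queue, so the caller never sees an unspecified value mid-iteration, and let FILE-DIRECTORY? alone decide what gets traversed while INTERESTING? only filters what gets returned.

(define (dir-walk** interesting? queue)
  (lambda ()
    (let loop ()
      (if (null? queue)
          #f
          (let ((file (car queue)))
            (set! queue (cdr queue))
            (cond ((file-directory? file)
                   ;; Always traverse directories, but only return one
                   ;; as a result if it passes INTERESTING?.
                   (set! queue (append queue
                                       (map (lambda (f)
                                              (string-append file "/" f))
                                            (directory-files file))))
                   (if (interesting? file) file (loop)))
                  ((interesting? file) file)
                  (else (loop))))))))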

Here’s hoping you’ll enjoy playing with closures and iterators in your own code!

(Image courtesy zeitfanger.at under a Creative Commons License.)

Footnotes:

1 In the grandest internet tradition, “It works on my machine” ™.

Scsh is UNIX, Part 1: Sockets

gorge

Inspired by words from Ryan Tomayko and Jacob Kaplan-Moss, I set out to see how difficult it would be to implement a tiny UNIX echo server in scsh. It turns out to be fairly easy. In this post, we’ll cover:

  • How the UNIX socket programming model works at a high level
  • How to write a basic echo server in scsh

Unlike the echo servers of Tomayko and Kaplan-Moss, this one doesn’t use fork (yet). We will add forking to the server in a follow-up post. Hopefully this will make it easier for less experienced UNIX programmers (like me) to get started.

The UNIX socket programming model for dummies

At a high level, the steps to open a server-side socket are:

  • create a socket “thing”
  • toggle some settings on it so the OS knows it’s for interweb use
  • bind it to an address
  • listen for connections on it
  • start accepting those connections

scsh does its part to make this easy by providing a high-level Scheme procedure, BIND-LISTEN-ACCEPT-LOOP, which handles this stuff for you, in addition to some nifty error handling and other bookkeeping chores. But we’re going to ignore that for now and write everything by hand to see how it’s done.

You should be able to enter all of the code in the next section directly in the scsh top-level. You don’t need to open any additional packages; this is just plain old scsh.

Our echo server

Our server takes two arguments. PROC is a procedure that makes the server actually do something; we’ll write it in a minute. PORT is the port you want to serve on, e.g., 49995:

(define (server proc port)
  (let* ((sock (create-socket protocol-family/internet socket-type/stream))
         (addr (internet-address->socket-address internet-address/any port)))
    (set-socket-option sock level/socket socket/reuse-address #t)
    (bind-socket sock addr)
    (listen-socket sock 5)
    (let loop ()
      ;; ACCEPT-CONNECTION returns the connection socket and its
      ;; address; hand both to PROC.
      (call-with-values
          (lambda () (accept-connection sock))
        proc)
      (loop))))

The first thing you’ll notice is that it’s pretty sequential and quite simple really. We just go through the socket opening dance we described earlier: open, configure, bind, listen, and accept.

Since this is an echo server, we’ll just, like, echo stuff:

(define (echo socket address)
  (let* ((in (socket:inport socket))
         (out (socket:outport socket))
         (message (read-line in)))
      (format out "~A~%" message)
      (force-output out)))

As you can see, our ECHO procedure takes a socket and an address. (We don’t do anything with the address in this example, but our procedure needs to “implement this interface”, as they say, in order to work. You can see this for yourself in the scsh 0.6.7 tarball; it’s in scsh/network.scm.)

In our LET*-binding we create some convenient locally bound names for the socket structure’s input and output ports, and then we read a line from the socket’s input port.

Since this is an echo server, we just write the same data back out with FORMAT. We call FORCE-OUTPUT to flush output to the terminal immediately. Otherwise Scheme will wait for the operating system’s output buffer to fill before writing out, and it will appear to the user that nothing is happening.

Trying it out

Let’s start it up and see if it works. Assuming you’ve loaded the procedures above in your scsh image somehow, enter this at the REPL:

> (server echo 49995)

The REPL should appear to “hang” while the server is running. Now go to your terminal and connect to the echo server. There are several ways to do it; here’s what I used:

~ $ telnet localhost 49995
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
HELLO
HELLO
who's there?
who's there?
i don't know
i don't know
could you stop that?
could you stop that?
fine then
fine then
goodbye
goodbye

Hopefully that was enough to get you started playing with sockets in scsh. I’ll write a followup post where we talk about the UNIX forking process model and update this example to make it a “preforking” server.
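As an aside: once you have ECHO, the convenience procedure mentioned earlier, BIND-LISTEN-ACCEPT-LOOP, can replace the whole hand-rolled SERVER loop. I’m quoting the signature from memory, so double-check it against the scsh manual before relying on it:

;; Hedged sketch: BIND-LISTEN-ACCEPT-LOOP handles the create/configure/
;; bind/listen/accept dance itself and calls ECHO with each connection's
;; socket and address. (Signature from memory; verify in the manual.)
(bind-listen-accept-loop protocol-family/internet echo 49995)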

(Image courtesy Hope for Gorilla under a Creative Commons license.)

Driving the Scsh disassembler from Geiser

(Note: This post refers to functionality provided by geiser-scsh.)

Most Scheme 48 and scsh programmers are familiar with the ,dis command. It allows you to disassemble Scheme code (a procedure or continuation) into the “assembly” understood by the Scheme 48 virtual machine. Here’s an example using scsh’s fork/pipe procedure:

> ,dis fork/pipe
fork/pipe
  0 (protocol 0 +)
  4 (make-env 1)
  7 (global fork)
 10 (push-local0 1)
 13 (push)
 14 (global really-fork/pipe)
 17 (call 2)

This is pretty cool. However, I thought it would be even better if Geiser could run the disassembler on a region of Scheme code, and pop open the disassembler output in a new buffer. Geiser already supports similar functionality for macro-expansion via GEISER-EXPAND-REGION. There’s no reason why it shouldn’t do the same for the disassembler. So that’s what I taught it to do – in this screenshot I’m disassembling the LETREC expression:

../img/scsh-disassembler.png

(I have to admit I was inspired to work on this by my jealousy of the disassembler functionality available to Common Lispers via the SLIME-DISASSEMBLE-{DEFINITION,SYMBOL} commands in SLIME.)

How it Works

Here’s how it works:

  • Highlight a region of Scheme (scsh) code in a buffer
  • Call `M-x geiser-scsh-disassemble-region' (a.k.a. `C-c RET C-d').
  • Emacs sends the Scheme code to the running scsh process for disassembly via a newly exported Geiser macro, GE:DISASSEMBLE. This macro wraps the code in a LAMBDA before sending it to the disassembler to work around the fact that the S48 DISASSEMBLE procedure won’t accept arbitrary Scheme expressions such as (+ 1 1). (Wrapping expressions in LAMBDA like this is not ideal for reasons I’ll explain below.)
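The wrapping trick in that last step is simple enough to sketch. This is a guess at the shape of the idea described above, not the actual geiser-scsh source:

;; Hypothetical sketch: give the S48 DISASSEMBLE procedure something it
;; can always accept by closing the expression over a thunk.
(define-syntax ge:disassemble
  (syntax-rules ()
    ((_ expr)
     (disassemble (lambda () expr)))))

;; e.g. (ge:disassemble (+ 1 1)) disassembles the thunk (lambda () (+ 1 1)).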

Future Work

This implementation is a proof of concept – don’t believe in it. Think of it as a prototype of how this functionality could work if built into Geiser proper. Here are some of the issues with it from an implementation perspective:

  • It doesn’t use the correct Geiser protocols for sending Scheme code to evaluate, instead just piping in raw strings. This was expedient because of the way s48/scsh use non-reader-friendly strings like {#Uncategorized} as the final lines of output for procedures whose outputs are not defined by the standard. I think this can be fixed by coming up with a better implementation of the Geiser code evaluation protocols (in scsh/geiser/evaluation.scm) so that they handle all of the weird cases in S48 output.
  • Related to the previous point, I’m doing some ugly regex stuff on the stringified output of the disassembler to make it nicer before piping it into the temp buffer.
  • This functionality should really be added to Geiser itself, via the GEISER-DEBUG-* namespace. Then it would be forced to address both of the above points. Right now it’s just an ugly little hack in geiser-scsh.el. In principle, with the right infrastructure in GEISER-DEBUG-*, there’s nothing preventing a Guile or Racket implementation (here’s the Guile disassembler in action – you can see that it’s not so different from S48):
    scheme@(guile-user)> (define (foo n) (expt n n))
    scheme@(guile-user)> ,x foo
    Disassembly of #<procedure foo (n)>:
    
       0    (assert-nargs-ee/locals 1)      ;; 1 arg, 0 locals
       2    (toplevel-ref 1)                ;; `expt'
       4    (local-ref 0)                   ;; `n'
       6    (local-ref 0)                   ;; `n'
       8    (tail-call 2)                                         at (unknown file):5:16
    
  • The disassembler, like the macro-expansion functionality, should be able to work recursively. That is, in places where the S48 assembly makes procedure calls like (global ge:really-fork/pipe) (as it does in our first example way at the top of this post), you should be able to ask for the disassembled output of GE:REALLY-FORK/PIPE as well, all the way down to the core primitives. Currently, there are cases where you still need to call ,dis <PROCEDURE> at the command prompt. I think this limitation is created by the way we wrap the expression in a LAMBDA before sending it to the disassembler. A better design is needed, one that (for example) calls PROCEDURE? on the proposed input, and then decides whether the expression needs to be wrapped in a LAMBDA before going to the disassembler.

The Josephus Problem

../img/origami-wreath.jpg

The Josephus Problem is deceptively simple. A number of people stand in a circle. (Let’s call this number N.) They agree to count around the circle, and every Mth person will be killed. As they go around the circle, in what order do they die? Another way of looking at it is: who is the last person left alive?

For an interesting introduction, I recommend checking out the Wikipedia article.

I recently found this problem described in the 2nd edition of Sedgwick’s Algorithms, and thought it would be fun to implement in Scheme. Because I’m not a real programmer (and I had to translate to Scheme from the Pascal in the book), it took me a couple of hours instead of a couple of minutes. But what an enjoyable couple of hours!

I started, as usual for algorithmic programming tasks, with a simulation. In the book, Sedgwick provides an example: If there are 9 people, and every 5th person in the circle is killed, the order in which they’ll die is: 5, 1, 7, 4, 3, 6, 9, 2, 8.

It’s usually easier to start work on a problem by thinking about a simpler case, so that’s what I did. I chose N = 5, and M = 2. Below I’ll show the results of my “paper” simulation (OK, the paper was actually an Emacs buffer).

First, let me show you how to read my “notation”. It should be pretty easy to figure out. Below is an example of one “state snapshot”. I’ve labelled each line with its meaning. If you are a Scheme programmer, you will surmise that each named item corresponds to a “named-let” loop variable. In addition, the vertical bar character in the XS row represents the state of the “cursor” as it goes around the circle of 5 people.

;; I.                           <-- State Number
;; ------------
;; |1 2 3 4 5                   <--    XS
;; '()                          <--    YS
;; 0                            <-- COUNT

Above, we are in state 1, at the start of the program. The cursor is at the beginning of the list XS, which represents the circle of people. YS is our results list, where we will store the “dead”. As we travel around the circle and people are killed, they are added to YS. We could think of it as a “death toll”, but really it’s just a list of integers. As we go around the circle of people, we keep incrementing COUNT. When COUNT reaches the value of M, that person will be “killed”, that is, be added to YS.

When this happens, we reset the COUNT to 0 and keep going around the circle. There’s a catch, however. As more dead people are added to our result list YS, we need to ignore the spaces those people used to occupy in our count as we build COUNT back up towards M. In other words, the circle will have 4 people in it after the first person is killed. This means that as the circle gets smaller and smaller, people are killed more frequently.

I’ll show you how I’ve solved that problem in Scheme in a moment; first, the simulation:

;; I.
;; ------------
;; |1 2 3 4 5
;; '()
;; 0

;; II.
;; ------------
;; 1| 2 3 4 5
;; '()
;; 1

;; III.
;; ------------
;; 1 2| 3 4 5
;; '()
;; 1

;; IV.
;; ------------
;; 1 _ 3| 4 5
;; '(2)
;; 1

;; V.
;; ------------
;; 1 _ 3 4| 5
;; '(2)
;; 2

;; VI.
;; ------------
;; 1 _ 3 _ 5|
;; '(4 2)
;; 1

;; VII.
;; ------------
;; 1| _ 3 _ 5
;; '(4 2)
;; 2

;; VIII.
;; ------------
;; _ _ 3| _ 5
;; '(1 4 2)
;; 1

;; IX.
;; ------------
;; _ _ 3 _ 5|
;; '(1 4 2)
;; 2

;; X.
;; ------------
;; _ _ 3| _ _
;; '(5 1 4 2)
;; 1

;; XI.
;; ------------
;; _ _ _| _ _
;; '(5 1 4 2)
;; 2

;; XII.
;; ------------
;; _ _ _| _ _
;; '(3 5 1 4 2)
;; 2

Now that you’ve seen how the algorithm should work on paper, let’s write some Scheme! I should state for the record that I did not read the Wikipedia article linked above until after I wrote this code. (This will be pretty obvious when you look at the code.) Here it is:

(define (josephus xs m)
  (let ((orig-xs (list-copy xs))
        (true-m (- m 1)))
    (let ((len (length xs)))
      (let loop ((xs xs)
                 (ys '())
                 (count 0))
        (cond 
         ;; If the dead list grows to the same length as our original
         ;; list, we know we're finished.
         ((= len (length ys))
          (reverse ys))
         ;; If XS is null, we have gone once around the circle.  We
         ;; start around again, leaving the count unchanged.
         ((null? xs)
          (loop orig-xs ys count))
         ;; If the current person we're looking at is already dead
         ;; (a ghost), move on.  And don't bother bumping the
         ;; count.
         ((member (car xs) ys)
          (loop (cdr xs) ys count))
         ;; If we're here, it's also the case that the current person
         ;; we're looking at is *not* in the dead list.  Therefore we
         ;; check if the count is equal to M.  If so, they must die.
         ((= count true-m)
          (loop (cdr xs) (cons (car xs) ys) 0))
         ;; If we get here, the current person we're looking at is
         ;; alive.  We check if the count is *not* equal to M.  If it
         ;; is not, we skip this person and bump the count.
         ((not (= count true-m))
          (loop (cdr xs) ys (+ count 1)))
         ;; We should never get here.
         (else #f))))))

How does it compare to the output described in Sedgwick? It seems to work!

> (josephus '(1 2 3 4 5 6 7 8 9) 5)
'(5 1 7 4 3 6 9 2 8)

There are several inefficiencies we could tackle, though. The first and most obvious is that we should try to calculate the solution mathematically (as shown in some of the Wikipedia examples) instead of building up lists using CONS. Another is that we make a “canonical” copy of the input list for restarting the loop. A third is the use of MEMBER to determine whether the person we’re currently visiting is already in the “dead” list.

For example, here’s the trace output showing 19 calls to MEMBER for a pretty small input:

(josephus '(1 2 3 4 5) 2)
[Enter (member 1 '())
 Leave member #f]
... 17 more calls to member ...
[Enter (member 3 '(5 1 4 2))
 Leave member #f]
'(2 4 1 5 3)

That said, in this case I was happy just to get something working. If this were going to be used for anything “real”, I would probably rewrite it statefully using vectors. Most of the fun was in coming up with this first draft without resorting to any stateful idioms.
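For the record, here’s a hedged sketch of the arithmetic approach alluded to above. The standard recurrence (the survivor of a circle of one is position 0; g(n) = (g(n-1) + m) mod n) finds the survivor directly, with no list-building at all:

(define (josephus-survivor n m)
  ;; Returns the 1-based number of the last person left alive.
  (let loop ((k 2) (pos 0))
    (if (> k n)
        (+ pos 1)                             ;convert to 1-based numbering
        (loop (+ k 1) (modulo (+ pos m) k)))))

As a sanity check, (josephus-survivor 9 5) evaluates to 8, which agrees with the last element of the Sedgwick example above, and (josephus-survivor 5 2) evaluates to 3, matching the trace output.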

(Image courtesy Chris Lott via CC-BY license.)

Scsh Manual Available in Texinfo Format

../img/shells.jpg

I am pleased to announce that, many years later, the famous Scsh Reference Manual has been converted to Texinfo. You can get your very own copy here: https://github.com/rmloveland/scsh-manual-texinfo.

This conversion represents quite a bit of work. Although I would have liked to do the conversion with a program, there was just too much of an impedance mismatch between the original LaTeX sources and Texinfo. Something tells me that’s why this conversion hasn’t happened until now. Whatever the reason, I’m glad to finally collect some internet cool points, even if it is almost twenty years later.

In the course of the conversion, I uncovered a number of small spelling, grammatical, and usage issues, which have now been fixed. To be clear: this is no criticism of the original authors. Several of them appear to be non-native English speakers. Their English is miles better than my German!

The resulting manual builds very nicely into HTML, PDF, or Info files. The PDF in particular is nice, since you get beautiful, clickable links everywhere! And the Info files are quite easy to install in your local Emacs.

Speaking of Emacs, if you like hacking Scsh from Emacs, try my geiser-scsh setup. Better yet, help me hack on it!

(Image courtesy steeljam under CC-BY-NC-ND license.)

SICP Exercises 1.5 through 1.7

../img/origami-pockets.jpg

In this installment, we’ll look at exercises 1.5 through 1.7. Note that these exercises are taken from the first edition.

Exercise 1.5

The GOOD-ENOUGH? test used in computing square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers. Explain these statements, with tests showing how the test fails for small and large numbers. An alternative strategy for implementing GOOD-ENOUGH? is to watch how GUESS changes from one iteration to the next and to stop when the change is a very small fraction of the guess. Design a square-root procedure that uses this kind of test. Does this work better for small and large numbers?

First, here are the original definitions of GOOD-ENOUGH? and friends from the text:

(define (square x)
  (* x x))

(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess
      (sqrt-iter (improve guess x)
                 x)))

(define (improve guess x)
  (average guess (/ x guess)))

(define (average x y)
  (/ (+ x y) 2))

(define (good-enough? guess x)
  (< (abs (- (square guess) x)) 0.001))

(define (new-sqrt x)
  (sqrt-iter 1.0 x))

Let’s test the operation of NEW-SQRT. In the results below, NEW-SQRT is off by a factor of ten when taking the root of .00001 – a small number.

(sqrt .00001)
0.00316228 ;Three one thousandths.
(new-sqrt .00001)
0.0313565 ;Three one hundredths. Wat?

Why are we so far off? Let’s trace GOOD-ENOUGH?, since it’s used to decide when our result is, er, good enough.

> ,trace good-enough?
> (new-sqrt .00001)
[Enter (good-enough? 1. 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.500005 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.250012 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.125026 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.0625531 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.0313565 1.e-05)
 Leave good-enough? #t]
0.0313565

In the last call to GOOD-ENOUGH?, we’re still off by a factor of ten! This is because inside GOOD-ENOUGH?, we’re checking our results against a fixed absolute tolerance (a “magic number”), which is dumb. As suggested by the text, let’s implement a better GOOD-ENOUGH? that watches how GUESS changes from one iteration to the next and returns true when the value of GUESS stops changing much. This means we’ll have to update SQRT-ITER, too.

(define (good-enough? last-guess this-guess x)
  (if (< (abs (- this-guess last-guess)) .0001) ;Watch how GUESS changes.
      #t
      #f))

(define (sqrt-iter last-guess this-guess x)
  (if (good-enough? last-guess this-guess x)
      this-guess
      (sqrt-iter this-guess (improve this-guess x) x)))

(define (new-sqrt x)
  (sqrt-iter 0 1.0 x))

Now we’ll test our new GOOD-ENOUGH?:

> (sqrt .0001)
0.01
> (new-sqrt .0001)
0.01
> (sqrt .1)
0.316228
> (new-sqrt .1)
0.316228
> (sqrt .0000001)
0.000316228
> (new-sqrt .0000001)
0.000319804 ;Pretty darn close!

As we can see from the tests, this version of GOOD-ENOUGH? starts to lose accuracy when we work with millionths. I’m happy with that accuracy for this application. :-)
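If that loss of accuracy bothers you, note that the exercise actually suggests stopping when the change is a small fraction of the guess, rather than comparing against a fixed constant. Here’s a hedged sketch of that variant (I haven’t re-run the traces above against it):

(define (good-enough? last-guess this-guess x)
  ;; Relative test: stop when the change between iterations is a tiny
  ;; fraction of the current guess, so very small and very large inputs
  ;; are treated uniformly.
  (< (abs (- this-guess last-guess))
     (* (abs this-guess) 0.0001)))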

Exercise 1.6

Newton’s method for cube roots is based on the fact that if y is an approximation for the cube root of x, then a better approximation is given by the value

(x/y**2 + 2y) / 3

Use this formula to implement a cube-root procedure analogous to the square root procedure.

After doing the last problem, this is pretty much a “color by numbers” implementation:

(define (cube-iter last-guess this-guess x)
  (if (good-enough? last-guess this-guess x)
      this-guess
      (cube-iter this-guess (cube-improve this-guess x) x)))

(define (cube-improve guess x)
  (/ (+ (/ x (square guess)) (* 2 guess)) 3))

(define (cube x)                        ;note: computes the cube *root* of X
  (cube-iter 0 1.0 x))

Testing it is also straightforward:

  • Take the cube root Y of a number, X
  • Use EXPT to cube Y
  • See if the number you get back is X

The tests look good. Note that GOOD-ENOUGH? is still being traced (just for fun).

> (cube 16)
[Enter (good-enough? 0 1. 16)
 Leave good-enough? #f]
[Enter (good-enough? 1. 6. 16)
 Leave good-enough? #f]
[Enter (good-enough? 6. 4.14815 16)
 Leave good-enough? #f]
[Enter (good-enough? 4.14815 3.07538 16)
 Leave good-enough? #f]
[Enter (good-enough? 3.07538 2.61415 16)
 Leave good-enough? #f]
[Enter (good-enough? 2.61415 2.5232 16)
 Leave good-enough? #f]
[Enter (good-enough? 2.5232 2.51985 16)
 Leave good-enough? #f]
[Enter (good-enough? 2.51985 2.51984 16)
 Leave good-enough? #t]
2.51984
> ##
2.51984
> (expt ## 3)
16.
> (expt 16 3)
4096
> (cube ##)
[Enter (good-enough? 0 1. 4096)
 Leave good-enough? #f]
[Enter (good-enough? 1. 1366. 4096)
 Leave good-enough? #f]
[Enter (good-enough? 1366. 910.667 4096)
 Leave good-enough? #f]
[Enter (good-enough? 910.667 607.113 4096)
 Leave good-enough? #f]
[Enter (good-enough? 607.113 404.746 4096)
 Leave good-enough? #f]
[Enter (good-enough? 404.746 269.839 4096)
 Leave good-enough? #f]
[Enter (good-enough? 269.839 179.911 4096)
 Leave good-enough? #f]
[Enter (good-enough? 179.911 119.983 4096)
 Leave good-enough? #f]
[Enter (good-enough? 119.983 80.0836 4096)
 Leave good-enough? #f]
[Enter (good-enough? 80.0836 53.6019 4096)
 Leave good-enough? #f]
[Enter (good-enough? 53.6019 36.2098 4096)
 Leave good-enough? #f]
[Enter (good-enough? 36.2098 25.1812 4096)
 Leave good-enough? #f]
[Enter (good-enough? 25.1812 18.9407 4096)
 Leave good-enough? #f]
[Enter (good-enough? 18.9407 16.4329 4096)
 Leave good-enough? #f]
[Enter (good-enough? 16.4329 16.0113 4096)
 Leave good-enough? #f]
[Enter (good-enough? 16.0113 16. 4096)
 Leave good-enough? #f]
[Enter (good-enough? 16. 16. 4096)
 Leave good-enough? #t]
16.

Exercise 1.7

Each of the following two procedures defines a method for adding two positive integers in terms of the more primitive operators 1+, which increments its argument by 1, and -1+, which decrements its argument by 1.

(define (incr n) (+ n 1))               ;Modern 1+
(define (decr n) (- n 1))               ;Modern -1+

;; The FOO and BAR designators are just used to differentiate these
;; procedures from our built-in addition procedure (and from each
;; other).
(define (foo+ a b)
  (if (= a 0)
      b
      (incr (foo+ (decr a) b))))

(define (bar+ a b)
  (if (= a 0)
      b
      (bar+ (decr a)
            (incr b))))

Using the substitution model, illustrate the process generated by each procedure in evaluating (+ 4 5). Are these processes iterative or recursive?

Let’s start by tracing FOO+. This wasn’t strictly necessary, but I like tracing things.

> (foo+ 3 4)
[Enter (foo+ 3 4)
[Enter (foo+ 2 4)
[Enter (foo+ 1 4)
[Enter (foo+ 0 4)
 Leave foo+ 4]
 Leave foo+ 5]
 Leave foo+ 6]
 Leave foo+ 7]
7

As you can see, we keep deferring operations until we reach our base case. This is shown in the way we enter the recursive FOO+ procedure call four times, and each time we decrement the variable A, but B remains the same. Then, when we hit our base case (when A is equal to 0), we make the addition calls we’ve been storing up all this time. You can tell this is happening because every time we `Leave’ the call to FOO+, we’re still adding to the value.

Here’s another, possibly better way to visualize it:

(foo+ 3 4)
(incr (foo+ (decr 3) 4))
(incr (incr (foo+ (decr 2) 4)))
(incr (incr (incr (foo+ (decr 1) 4))))
(incr (incr (incr 4)))

Therefore this procedure uses O(n) time and O(n) space, where n is the value of A (assuming it’s called as (FOO+ A B)).

Now let’s look at BAR+. Here’s the text of the procedure again:

(define (bar+ a b)
  (if (= a 0)
      b
      (bar+ (decr a)
            (incr b))))

BAR+ looks like it will generate an iterative process. Its self-call appears to be in tail position. Let’s see how it looks in the REPL:

> (bar+ 3 4)
[Enter (bar+ 3 4)
[Enter (bar+ 2 5)
[Enter (bar+ 1 6)
[Enter (bar+ 0 7)
 Leave bar+ 7]
 Leave bar+ 7]
 Leave bar+ 7]
 Leave bar+ 7]
7

In this case, we can be pretty sure we’re generating an iterative process (with no deferred computations being built up) because when we `Leave’ the call to BAR+ we’re not doing anything additional to the result, it’s just being passed back up to the top-level unchanged.

Here’s a trace written out by hand:

(BAR+ 4 3)
(BAR+ 3 4)
(BAR+ 2 5)
(BAR+ 1 6)
(BAR+ 0 7)

As you can see, we don’t defer operations that take up additional space. Therefore we’re O(1) in space, and O(n) in time for this implementation.

(Image courtesy SallySetsForth under Creative Commons license.)

Simple Network Search in Scheme

../img/winston-horn-network.png

I’ve been having fun translating some of the code in Winston and Horn’s Lisp into Scheme. This book is amazing – clearly written, with lots of motivating examples and applications. As SICP is to language implementation, Lisp is to application development, with chapters covering constraint propagation, forward and backward chaining, simulation, object-oriented programming, and so on. And it does include the obligatory Lisp interpreter in one chapter, if you’re into that sort of thing.

In this installment, based on Chapter 19, we will look at some simple strategies for searching for a path between two nodes on a network (a graph). The network we’ll be using is shown in the diagram above.

Here’s the same network, represented as an alist where each CAR:CDR pair represents a NODE:NEIGHBORS relationship:

'((f e)
  (e b d f)
  (d s a e)
  (c b)
  (b a c e)
  (a s b d)
  (s a d))

The high-level strategy the authors use is to traverse the network, building up a list of partial paths. If a partial path ever reaches the point where it describes a full path between the two network nodes we’re after, we’ve been successful.

As with trees, we can do either a breadth-first or depth-first traversal. Here’s what the intermediate partial paths will look like for a breadth-first traversal that builds a path between nodes S and F:

(s)
(s a)
(s d)
(s a b)
(s a d)
(s d a)
(s d e)
(s a b c)
(s a b e)
(s a d e)
(s d a b)
(s d e b)
'(s d e f)

Based on that output, we can deduce that every time we visit a node, we want to extend our partial paths list with that node. Here’s one option – its only problem is that it will happily build circular paths that keep us from ever finding the node we want:

(define (%buggy-extend path)             ;builds circular paths
  (map (lambda (new-node) 
         (cons new-node path))
       (%get-neighbor (first path))))

(Incidentally, I’ve become fond of the convention whereby internal procedures that aren’t part of a public-facing API are prefixed with the `%’ character. This can be found in some parts of the MIT Scheme sources, and I believe it’s used in Racket as well. I’ve started writing lots of my procedures using this notation to remind me that the code I’m writing is not the real `API’, that the design will need more work, and that the current code is just a first draft. I’m using that convention here.)

Here’s a better version that checks if we’ve already visited the node before adding it to the partial paths list – as a debugging aid it prints out the current path before extending it:

(define (%extend path)
  (display (reverse path))
  (newline)
  (map (lambda (new-node)
         (cons new-node path))
       (filter (lambda (neighbor)
                 (not (member neighbor path)))
               (%get-neighbor (first path)))))

You may have noticed the %GET-NEIGHBOR procedure; it’s just part of some silly data structure bookkeeping code. Please feel free to deride me in the comments for my use of a global variable. What can I say? I’m Scheming like it’s 1988 over here! Here’s the boilerplate:

(define *neighbors* '())

(define (%add-neighbor! k v)
  (let ((new-neighbor (cons k v)))
    (set! *neighbors*
          (cons new-neighbor *neighbors*))))

(define (%get-neighbor k)
  (let ((val (assoc k *neighbors*)))
    (if val
        (cdr val)
        '())))

(%add-neighbor! 's '(a d))
(%add-neighbor! 'a '(s b d))
(%add-neighbor! 'b '(a c e))
(%add-neighbor! 'c '(b))
(%add-neighbor! 'd '(s a e))
(%add-neighbor! 'e '(b d f))
(%add-neighbor! 'f '(e))
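With the table populated, a quick sanity check of %EXTEND at the REPL should look something like this (hedged: I’m reading the expected result off the definitions above rather than pasting a live session):

> (%extend '(s))
(s)
'((a s) (d s))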

Now that we have our data structure and a way to extend our partial path list (non-circularly), we can write the main search procedure, %BREADTH-FIRST. The authors have a lovely way of explaining its operation:

BREADTH-FIRST is said to do a “breadth-first” search because it extends all partial paths out to uniform length before extending any to a greater length.

Here’s the code, translated to use a more Schemely, iterative named LET instead of the linear-recursive definition from the book 1:

(define (%breadth-first start finish network)
  (let ((queue (list (list start))))
    (let loop ((start start)
               (finish finish)
               (network network)
               (queue queue))
      (cond ((null? queue) '())                    ;Queue empty?
            ((equal? finish (first (first queue))) ;Finish found?
             (reverse (first queue)))              ;Return path.
            (else
             (loop start
                   finish               ;Try again.
                   network
                   (append
                    (rest queue)
                    (%extend (first queue))))))))) ;New paths go on the back.

(A better way to write this procedure would be to implement a generic internal search procedure that takes its `breadthiness’ or `depthiness’ as a parameter. We could then wrap it in nicer public-facing search procedures with specific names.)
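In that spirit, here’s a hedged sketch of the depth-first variant mentioned earlier (like %BREADTH-FIRST, it leans on the global *NEIGHBORS* table via %EXTEND). The only change is that newly extended paths go on the front of the queue, so we follow one path as far as it will go before backing up:

(define (%depth-first start finish network)
  (let loop ((queue (list (list start))))
    (cond ((null? queue) '())                    ;Queue empty?
          ((equal? finish (first (first queue))) ;Finish found?
           (reverse (first queue)))              ;Return path.
          (else
           (loop (append (%extend (first queue)) ;New paths go in front.
                         (rest queue)))))))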

Meanwhile, back at the REPL… we remind ourselves of what *neighbors* actually looks like, and then we search for a path between the nodes S and F.

> *neighbors*
'((f e) (e b d f) (d s a e) (c b) (b a c e) (a s b d) (s a d))
> (%breadth-first 's 'f *neighbors*)
(s)
(s a)
(s d)
(s a b)
(s a d)
(s d a)
(s d e)
(s a b c)
(s a b e)
(s a d e)
(s d a b)
(s d e b)
'(s d e f)

What fun! I can almost imagine using a three-dimensional variant of these searches for a space wargame with wormhole travel! Except, you know, they’d need to be much faster and more skillfully implemented. There’s also the tiny requirement to write the surrounding game…

Footnotes:

1 It shouldn’t need to be said, but: Of course the authors knew better; they were trying to hide that unnecessary complexity from the reader until later.

Geiser and Scsh are Talking

../img/fractal-cheetah.jpg

In this post I’d like to announce an early version of scsh 1 support for Geiser 2. There’s plenty left to do, but enough of Geiser is working that you can write scsh with real completions provided by scsh itself! What’s more, if you want to you can hack on the Geiser support from the inside. :-}

Working features include:

  • Symbol and module completion
  • Module imports (mostly)
  • View macroexpansions in buffer
  • Talking to Geiser over its async protocol (the foundation of everything else)

Some features aren’t there yet:

  • Jumping to procedure definition in source
  • Show callers/callees of a procedure
  • Jumping to module source
  • Jumping to documentation for the thing at point (I’ve only gotten this to work sporadically for some reason)

Getting the Code

You can get the code here:

https://github.com/rmloveland/geiser-scsh

See the README and TODO.org files in the repo for more detailed information about project goals and what works and what doesn’t. Feel free to open issues on Github for any broken things (there are plenty).

You can also get a copy of the scsh manual in Texinfo here – please let me know if you can get the Geiser doc commands working with it; so far I haven’t had much success:

https://github.com/rmloveland/scsh-manual-texinfo

Getting Started

There are a few setup quirks:

  • First, set Emacs’ geiser-scheme-dir variable to wherever you’re working from; here’s what I do:
    (setq geiser-scheme-dir (expand-file-name "~/Code/geiser-scsh/scheme/"))  
    
  • Second, go ahead and load up geiser-scsh.el from the git repo.
  • Next, hop into Customize via M-x customize-group ENTER geiser ENTER, and set the following:
    Custom Variable                   Value  Note
    Geiser Repl Skip Version Check P  on     You must set this, or else Emacs throws you into the debugger. This is a bug I need to fix.
    Geiser Active Implementations     scsh
    Geiser Default Implementation     scsh   I do this since I’m mostly working in scsh, but it shouldn’t be necessary.

    (Note that you’ll need to set geiser-scheme-dir back to its original value if you want to go back to using Guile or Racket.)

At this point, you should be able to do M-x run-scsh and enjoy your scsh hacking with nice completions and whatnot. If you’re going to hack on the Geiser support, do a C-u M-x geiser-show-logs; a buffer will open where you can listen to scsh and Emacs talking. And of course, the source is your friend. :-}

(Image courtesy RayMorris1 under CC-BY-NC-ND license.)