A full implementation of breadth-first search of a graph

In this post I’ll share a full, standalone implementation of breadth-first search of a graph. Breadth-first search (hereafter BFS) is used in many graph algorithms. In particular it is useful for finding the shortest path (in terms of fewest vertices, or “hops”) between two vertices A and B.

There are many pseudocode examples showing breadth-first search of a graph on the interwebs, and lots of textual descriptions. There are also many videos on YouTube about this algorithm.

Unfortunately, despite all of the above, I had a really hard time finding the complete source code of a graph traversal that I could read and study — at least, one that was at the right level of complexity for me to actually learn how it works. A lot of the code I found online or in books was too abstract and hid implementation details from me by relying on “helper” libraries which were not really explained. For example, this seemed to be the case with the Java code from Sedgewick. The underlying data structures of most algorithms’ implementations were hidden away in library code which I would have needed to study to understand what was really happening.

On the other hand, the couple of production-quality graph implementations I looked at were very hard to read for someone not already familiar with both the code base and the algorithm.

I knew that in order to really learn how this algorithm works, I needed to aim somewhere in the middle. In the end, I decided to implement it myself to make sure I understood it. I’m sure that’s pretty common!

So what do I mean by a full, standalone implementation?

I mean that you can download just the code on this page, and run it on any machine where Perl is installed, and it will work. Other than the language it’s written in, there are no external libraries or other dependencies.

Without further ado, here is my take on a full implementation of breadth-first search of a graph. Batteries included, as they say.

(Incidentally, the statement above is also true of the Scheme code from Simple Network Search in Scheme, if you’d rather look at an implementation in a non-ALGOL language. However, that code is not really “my own creation”, it’s merely ported to Scheme from a Lisp book.)

Algorithm prose description

At a high level, the way BFS works is:

  1. Push the starting vertex A onto a queue.
  2. While the queue is not empty, get the next vertex from it (in this case A, since we just added it).
  3. Look at each neighboring vertex N of the current vertex and take the following steps:
     1. If N has already been marked as seen, skip it and go on to the next neighbor.
     2. Add the pair (N, current vertex) to a spanning tree (really a reverse linked list). If we reach the end vertex B, we will walk this spanning tree backward to find the path from B back to the start vertex A.
     3. If N is the end vertex B we’re looking for, walk the spanning tree backward to build the path taken from B back to A.
     4. If N is not the end vertex, push N onto the queue to be processed.
     5. Finally, mark N as seen.

Pseudocode

To supplement the prose description above, here is some pseudocode for BFS. It maps pretty directly onto the concrete implementation we will get into in the next section. I did complain a little above about pseudocode, but of course I’ll subject you to mine just the same. :-} I do think this is a reasonably “complete” pseudocode that you could actually use to implement BFS, unlike many examples that I found.

    function bfs(start, end, graph) {

        push start onto queue
        mark start as seen
        make a spanning tree for storing the path from start to end

        while queue is not empty {
            vertex = get next item from queue
            for each neighbor of vertex in the graph {
                if neighbor has been seen {
                    next
                }
                add (neighbor, vertex) to the spanning tree
                if this neighbor is the end node we want {
                    walk the spanning tree, printing path from start to end
                    exit
                }
                else {
                    push neighbor onto the queue
                }
                mark this neighbor as seen
            }
        }
        print "No path found"
        exit
    }
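As a cross-check on the pseudocode, here is a quick Python sketch of the same steps. It is my own illustration, not part of the Perl program below: it uses a dict-of-lists graph rather than the array-of-arrays structure we will see next, and it returns the path instead of printing it.

```python
from collections import deque

def bfs_path(start, end, graph):
    """Breadth-first search; returns the shortest path from start
    to end as a list of vertices, or None if there is no path."""
    if start == end:
        return [start]
    queue = deque([start])   # vertices still to visit
    seen = {start}           # vertices already seen
    prev = {}                # spanning tree: vertex -> predecessor
    while queue:
        vertex = queue.popleft()
        for neighbor in graph.get(vertex, []):
            if neighbor in seen:
                continue
            prev[neighbor] = vertex
            if neighbor == end:
                # Walk the spanning tree backward from end to start.
                path = [end]
                while path[-1] != start:
                    path.append(prev[path[-1]])
                path.reverse()
                return path
            queue.append(neighbor)
            seen.add(neighbor)
    return None
```

Each numbered step of the prose description corresponds to one or two lines here; the `prev` dict plays the role of the spanning tree.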

Implementation

The code below shows a direct implementation of BFS in Perl. The graph it will search is the one shown in the diagram at the top of this post. In code, the data structure for the graph is as shown below. It’s an array of arrays: the first element of each inner array is a vertex, followed by an array of that vertex’s neighbors. This is the classic “adjacency list” representation of a graph.

    my $graph = [['s', ['a', 'd']],
                 ['a', ['s', 'b', 'd']],
                 ['b', ['a', 'c', 'e']],
                 ['c', ['b']],
                 ['d', ['s', 'a', 'e']],
                 ['e', ['b', 'd', 'f']],
                 ['f', ['e']]];   # 'f' needs its own entry, or get_neighbors('f') returns undef

Next, here is the main loop of BFS. If you know Perl, you can see that it maps pretty much 1:1 to the description in the above pseudocode. The shape of this code would be pretty much the same in another language such as Python or Ruby, with a few small syntactic differences, but hardly any semantic differences.

    sub find_path_between {
        my ( $start, $end, $graph ) = @_;

        return () unless defined $start && defined $end;

        my @path;     # Path so far
        my @queue;    # Vertices still to visit.
        my %seen;     # Vertices already seen.
        my $found;    # Whether we have found the wanted vertex.
        my $st = {};  # Spanning tree, used to find paths.

        if ( $start eq $end ) {
            push @path, $start;
            return @path;
        }

        push @queue, $start;
        $seen{$start}++;

        while (@queue) {
            my $v         = shift @queue;
            my $neighbors = get_neighbors( $v, $graph );

            for my $neighbor (@$neighbors) {
                next if $seen{$neighbor};
                _st_add( $v, $neighbor, $st );
                if ( $neighbor eq $end ) {
                    $found++;
                    @path = _st_walk( $start, $end, $st );
                    return @path;
                }
                else {
                    push @queue, $neighbor;
                }
                $seen{$neighbor}++;
            }
        }
        return $found ? @path : ();
    }

Once the main loop structure is written so that you are walking the vertices of the graph in the correct (breadth-first) order, one part I found slightly tricky was keeping a “trail of bread crumbs” from the end node back to the start. You may have noticed that the code above uses two helper functions — _st_add and _st_walk — that hide that complexity away a little bit. We will now look at them in more detail.

_st_add is trivial, and could have been written directly as a hash table access. The idea is to keep a logical “reverse linked list” structure in a hash table, where each vertex added to the list has a link pointing back to the previous vertex. I hid it from myself in this function so I could read the logic in the main search loop more easily.

    sub _st_add {
        my ( $vertex, $neighbor, $st ) = @_;
        $st->{$neighbor}->{prev} = $vertex;
    }

_st_walk is a little more interesting. As noted above, we kept a trail of bread crumbs from our end vertex (if such exists) back to the start vertex. _st_walk walks that trail of crumbs backward and builds an array holding all of the vertices visited along the way, which it returns to the caller.

    sub _st_walk {
        my ( $start, $end, $st ) = @_;

        my @path;
        push @path, $end;

        my $prev = $st->{$end}->{prev};
        while (1) {
            if ( $prev eq $start ) {
                push @path, $start;
                last;
            }
            push @path, $prev;
            $prev = $st->{$prev}->{prev};
        }
        return reverse @path;
    }

You may have noticed that in order to visit all of the neighbors of some vertex A we have to be able to list them. The function get_neighbors handles that task, along with its helper function _find_index. Let’s look at them in turn.

get_neighbors looks up some node N in the graph, and returns its list of neighbors, if there are any.

    sub get_neighbors {
        my ( $k, $graph ) = @_;

        my $index = _find_index( $k, $graph );

        if ( defined $index ) {
            return $graph->[$index]->[1];
        }
        else {
            return;
        }
    }

_find_index looks up node N’s index in our array-based graph representation. This is not a very efficient way to do it, since it performs a linear search over an array. There are ways to speed this up, such as using a hash of arrays for faster vertex lookup, or caching the indices. However, I felt it was better to keep the graph’s data representation as simple as possible for this example. (Incidentally, the below is exactly the sort of somewhat uninteresting but very necessary bookkeeping code I was having a hard time finding in much of the example code in books or online.)

    sub _find_index {
        my ( $wanted, $graph ) = @_;

        # Naive linear search, for now.
        my $i = 0;
        for my $elem (@$graph) {

            # Definedness check here is necessary because we delete
            # elements from the graph by setting the element's index to
            # undef.  In other words, some graph indices can be undef.
            if ( defined $elem->[0] && $elem->[0] eq $wanted ) {
                return $i;
            }
            $i++;
        }
        return;
    }
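To make the hash-of-arrays speedup mentioned above concrete, here is a small Python sketch (my own illustration, not part of the Perl program): build a map keyed by vertex once, and every neighbor lookup afterwards is O(1) instead of a linear scan.

```python
def to_adjacency_map(graph):
    """Convert the array-of-arrays graph into a map keyed by vertex.
    Skips deleted entries (None), mirroring the undef check in
    _find_index."""
    return {entry[0]: entry[1]
            for entry in graph
            if entry is not None and entry[0] is not None}

# The same graph as in the Perl code above.
graph = [['s', ['a', 'd']],
         ['a', ['s', 'b', 'd']],
         ['b', ['a', 'c', 'e']],
         ['c', ['b']],
         ['d', ['s', 'a', 'e']],
         ['e', ['b', 'd', 'f']]]

adjacency = to_adjacency_map(graph)

def get_neighbors(k):
    # O(1) hash lookup instead of a linear scan.
    return adjacency.get(k, [])
```

The Perl equivalent would be a hash of array references built in one pass over `$graph`.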

Finally, we have a main function that drives everything. First, we find the shortest path (by number of nodes) from ‘s’ to ‘c’. Then, we find the shortest path from ‘s’ to ‘f’:

    sub main {

        my $graph = [
            [ 's', [ 'a', 'd' ] ],
            [ 'a', [ 's', 'b', 'd' ] ],
            [ 'b', [ 'a', 'c', 'e' ] ],
            [ 'c', ['b'] ],
            [ 'd', [ 's', 'a', 'e' ] ],
            [ 'e', [ 'b', 'd', 'f' ] ],
            [ 'f', ['e'] ]    # 'f' needs its own entry, or get_neighbors('f') returns undef
        ];

        my $start = 's';
        my $end   = 'c';

        my @path = find_path_between( $start, $end, $graph );

        print qq[Path from '$start' to '$end' is: @path\n];

        # Find a second path.
        $end  = 'f';
        @path = find_path_between( $start, $end, $graph );
        print qq[Path from '$start' to '$end' is: @path\n];
    }

Putting everything above together and running it will print the following output:

    Path from 's' to 'c' is: s a b c
    Path from 's' to 'f' is: s d e f

References

1. Aho, Ullman, Hopcroft – Data Structures and Algorithms.

2. Sedgewick – Algorithms in Java, Part 5: Graph Algorithms.

3. Orwant, Hietaniemi, Macdonald – Mastering Algorithms with Perl.

Appendix: The complete program listing

    #!perl

    use strict;
    use warnings;

    sub find_path_between {
        my ( $start, $end, $graph ) = @_;

        return () unless defined $start && defined $end;

        my @path;     # Path so far
        my @queue;    # Vertices still to visit.
        my %seen;     # Vertices already seen.
        my $found;    # Whether we have found the wanted vertex.
        my $st = {};  # Spanning tree, used to find paths.

        if ( $start eq $end ) {
            push @path, $start;
            return @path;
        }

        push @queue, $start;
        $seen{$start}++;

        while (@queue) {
            my $v         = shift @queue;
            my $neighbors = get_neighbors( $v, $graph );

            for my $neighbor (@$neighbors) {
                next if $seen{$neighbor};
                _st_add( $v, $neighbor, $st );
                if ( $neighbor eq $end ) {
                    $found++;
                    @path = _st_walk( $start, $end, $st );
                    return @path;
                }
                else {
                    push @queue, $neighbor;
                }
                $seen{$neighbor}++;
            }
        }
        return $found ? @path : ();
    }

    sub _st_walk {
        my ( $start, $end, $st ) = @_;

        my @path;

        push @path, $end;
        my $prev = $st->{$end}->{prev};
        while (1) {
            if ( $prev eq $start ) {
                push @path, $start;
                last;
            }
            push @path, $prev;
            $prev = $st->{$prev}->{prev};
        }
        return reverse @path;
    }

    sub _st_add {
        my ( $vertex, $neighbor, $st ) = @_;
        $st->{$neighbor}->{prev} = $vertex;
    }

    sub get_neighbors {
        my ( $k, $graph ) = @_;

        my $index = _find_index( $k, $graph );

        if ( defined $index ) {
            return $graph->[$index]->[1];
        }
        else {
            return;
        }
    }

    sub _find_index {
        my ( $wanted, $graph ) = @_;

        # Naive linear search, for now.
        my $i = 0;
        for my $elem (@$graph) {

            # Definedness check here is necessary because we delete
            # elements from the graph by setting the element's index to
            # undef.  In other words, some graph indices can be undef.
            if ( defined $elem->[0] && $elem->[0] eq $wanted ) {
                return $i;
            }
            $i++;
        }
        return;
    }

    sub main {
        my $graph = [
            [ 's', [ 'a', 'd' ] ],
            [ 'a', [ 's', 'b', 'd' ] ],
            [ 'b', [ 'a', 'c', 'e' ] ],
            [ 'c', ['b'] ],
            [ 'd', [ 's', 'a', 'e' ] ],
            [ 'e', [ 'b', 'd', 'f' ] ],
            [ 'f', ['e'] ]    # 'f' needs its own entry, or get_neighbors('f') returns undef
        ];

        my $start = 's';
        my $end   = 'c';

        my @path = find_path_between( $start, $end, $graph );

        print qq[Path from '$start' to '$end' is: @path\n];

        # Find a second path.
        $end  = 'f';
        @path = find_path_between( $start, $end, $graph );
        print qq[Path from '$start' to '$end' is: @path\n];
    }

    main();

Three views on a loop

Recently I needed to translate a function I had written in Scheme to Common Lisp for … reasons. As it turns out, this was not entirely straightforward. The function was originally written in a very recursive style. In order to have something that would run portably across Common Lisp implementations I needed something that didn’t rely so heavily on recursion.

This was not easy for me. I have written much, much more Scheme than Common Lisp in my life, and by now it feels quite natural to design loops using recursion. So that bridge had to be crossed somehow. This is the story of how I crossed it.

The procedure in question is merge, which is used to implement the core of a merge sort.

Let’s start with the Scheme. The code below shows a straightforward implementation of merge from a Scheme programmer’s perspective (or at least this one’s). As mentioned above, this is the merging procedure that is used to implement the core of a merge sort. It’s very similar to the code presented in Bottom Up Merge Sort in Scheme.

(define (merge l r pred)
  (letrec ((merge-aux
            (lambda (left right pred result)
              (cond ((and (null? left)
                          (null? right))
                     (reverse result))
                    ((and (not (null? left))
                          (not (null? right)))
                     (if (pred (car left) (car right))
                         (merge-aux (cdr left)
                                    right pred (cons (car left) result))
                       (merge-aux left
                                  (cdr right) pred (cons (car right) result))))
                    ((not (null? left))
                     (merge-aux (cdr left) right pred (cons (car left) result)))
                    ((not (null? right))
                     (merge-aux left (cdr right) pred (cons (car right) result)))
                    (else #f)))))
    (merge-aux l r pred '())))

As you can see this code makes heavy use of recursive procedure calls, the default iteration construct in Scheme. As noted above, we can’t expect this sort of code to be supported in portable Common Lisp code, so it needs to be translated.

(Side note: If you are interested in some notes on which Lisp implementations support tail calls, recursive self-calls, etc., see this interesting article.)

Below is my first attempt at translating the Scheme code directly into Common Lisp. It still relies heavily on recursive function calls. If you compile it in a recent CMUCL or Clozure CL, the recursion doesn’t seem to bother them, since their compilers optimize tail calls. And the performance will be pretty good, too. But that won’t be the case in every Lisp.

(Note that we call this function rml/merge because Common Lisp already has a built-in merge — about which more later.)

(defun rml/merge (l r pred)
  (labels ((merge-aux (pred left right result)
             (cond ((and (null left)
                         (null right))
                    (reverse result))
                   ((and (not (null left))
                         (not (null right)))
                    (if (funcall pred (car left) (car right))
                        (merge-aux pred (cdr left)
                                   right (cons (car left) result))
                        (merge-aux pred left
                                   (cdr right) (cons (car right) result))))
                   ((and (not (null left)) (null right))
                    (merge-aux pred (cdr left) right (cons (car left) result)))
                   ((and (null left) (not (null right)))
                    (merge-aux pred left (cdr right) (cons (car right) result)))
                   (t 'FAIL))))
    (merge-aux pred l r '())))

As we discussed, some Common Lisp implementations will not handle this level of recursion. One Lisp in particular was pretty salty about it. :-}

[Image: Symbolics Genera overflowing its stack]

CLISP and ABCL will also explode pretty much immediately.

It took a bit of thinking, but I remembered reading about something called BLOCK in good old CLtL2. It lets you create a named block that you can later jump out of at your convenience, e.g.,

(block outer
   ;; ... do things
   ;; ok, i'm done!
   (return-from outer my-result))

This reminds me of a similar pattern I’ve used in Perl (or Java for that matter) for early returns.

LOOP: for my $elem (@list) {
    # do things ....
    next LOOP if check($elem);
    # do other things ...
}

Using my newfound weapon (hammer?) block, I was able to slay the recursion dragon as shown below in rml/merge2. This code is speaking Common Lisp, but it still has a heavy Scheme accent! The shape of the computation feels pretty similar, at least.

(defun rml/merge2 (left right pred)
  (let ((left left)
        (right right)
        (result '()))
    (block outer
      (loop do
           (block inner
             (cond ((and (null left) (null right))
                    (return-from outer))
                   ((and (not (null left))
                         (not (null right)))
                    (if (funcall pred (car left) (car right))
                        (progn
                          (push (car left) result)
                          (setf left (cdr left))
                          (return-from inner))
                        (progn
                          (push (car right) result)
                          (setf right (cdr right))
                          (return-from inner))))
                   ((not (null left))
                    (progn
                      (push (car left) result)
                      (setf left (cdr left))
                      (return-from inner)))
                   ((not (null right))
                    (progn
                      (push (car right) result)
                      (setf right (cdr right))
                      (return-from inner)))
                   (t                   ; Should never get here (tm)
                    (error "ERROR: RML/MERGE2: ~A ~A ~A~%" left right result))))))
    (nreverse result)))

At a high level, the code works as follows:

  1. let-bind some variables to the arguments passed in, so we can bash on them at will later on.
  2. Open up an outer block to capture the state of the loop, with a name we can use for an early return. This is what we will jump out to when we’re done.
  3. Start a bare loop with an inner block. The loop itself would run forever; we rely on jumping out of the outer block once we’re done.
  4. Inside the loop’s inner block, we model the iteration created by recursive function calls using jumps. In every step of the cond, we jump out to the inner block’s label just as we would in Scheme by making a procedure call, except here we need to be explicit using return-from, whereas Scheme lets us be implicit.
  5. Once left and right are both empty, our work is done, so we jump back out to the outer block. Then, we destructively reverse the result list in place with nreverse, just because we can. (Hey, it was temporary anyway.)

This isn’t very satisfying, though. It felt icky using block, which feels like a building block you’d need to build a looping construct (aka compiler/macro output), but not something “user-level” code should need to use. So I did some more reading of Common Lisp code. In particular I discovered a couple of things:

  • A bare loop with no arguments will happily go on forever; you don’t need to say loop repeat 5 ....
  • Inside a loop, you don’t need to say do (something else) unless, prior to that point, you’ve used loop‘s “English” syntax, e.g., loop from 1 to 10 do ....
  • Because this isn’t Scheme, we don’t need to do the 2-step dance of “put car of A here, set A to cdr of A there”. We can push the return value of (pop A) directly onto result, and rely on the fact that pop mutates its argument list to handle setting A to the cdr of A. Mmmmm, yummy state-bashing.

Thanks to the above, we now have code for rml/merge3. It got a lot shorter and simpler along the way, while saying the same things:

(defun rml/merge3 (left right pred)
  (let ((result '()))
    (loop
       (cond ((and (null left) (null right))
              (return (nreverse result)))
             ((and (not (null left))
                   (not (null right)))
              (if (funcall pred (car left) (car right))
                  (push (pop left) result)
                  (push (pop right) result)))
             ((not (null left))
              (push (pop left) result))
             ((not (null right))
              (push (pop right) result))
             (t                         ; Should never get here (tm)
              (error "ERROR: RML/MERGE3: ~A ~A ~A~%" left right result))))))

Now that I had some code, I needed to make sure it actually gave the right answers (aka the hard part). Because Common Lisp has a merge function built in, I was able to write a bit of test code to check my merge function’s implementation against what the Lisp wizards did. There’s no feeling like standing on the shoulders of giants!

Here is the test code, which you can paste into your REPL and try for yourself. It runs our merge function against the system’s merge 1000 times on the same randomly generated input lists. For each test case, if the outputs match, we output a t; otherwise nil.

(defun make-random-list (size &key max)
  (loop repeat (random size) collecting (random max)))

(defun check-merge-output (function size)
  (let* ((left (make-random-list size :max size))
         (right (make-random-list size :max size))
         (left-copy (copy-list left))
         (right-copy (copy-list right))
         (expected (merge 'list left-copy right-copy #'<))
         (got (funcall function left right #'<)))
    (if (equalp expected got)
        t
        (values expected got))))

(defun run-merge-tests (function)
  (loop repeat 1000 collecting
       (if (check-merge-output function 1000) t nil)))

Here’s what the output of a successful run looks like (luckily, I have not yet found any unsuccessful ones):

CL-USER> (run-merge-tests #'rml/merge3)
(T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T ...)

Based on the testing shown here, I think this implementation is Probably Fine ™. I’m happy with it, and it performs well enough for my needs.

In a future post, we’ll use the merge function we’ve shared here to write a merge sort.
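To show where merge fits, here is a rough sketch of a top-down merge sort built on an iterative merge, in Python rather than Lisp. The names and the recursive split are my own choices; the merge loop mirrors the shape of rml/merge3, including the fact that, like the Lisp push/pop version, it consumes its input lists.

```python
def merge(left, right, pred):
    """Iterative merge with the same loop shape as rml/merge3.
    Destructively consumes its input lists via pop(0)."""
    result = []
    while left or right:
        if left and right:
            if pred(left[0], right[0]):
                result.append(left.pop(0))
            else:
                result.append(right.pop(0))
        elif left:
            result.append(left.pop(0))
        else:
            result.append(right.pop(0))
    return result

def merge_sort(items, pred=lambda a, b: a < b):
    """Top-down merge sort: split, sort each half, merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid], pred),
                 merge_sort(items[mid:], pred),
                 pred)
```

The slices passed to merge are fresh lists, so the destructive merge never clobbers the caller’s data.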

Performance Postamble

But what about the performance? It’s not bad, actually! As usual, it can depend on the implementation. Below I’ll list out how it compares in a few implementations. However I should state right now for the record that:

  1. I’m not a performance or benchmarking expert. You should probably just stop reading here, because what follows is likely to be nonsense.
  2. My implementation works only on lists (not also on vectors) and does not accept a sort key, both of which the standard requires.
  3. My implementation is not battle-hardened from living inside a Common Lisp implementation for decades (the most important point of all).

That said, what follows is what I saw when I ran the test code below on a few implementations:

(defun check-merge-performance (function)
  (time
   (let* ((len 1000000)
          (max 1000000)
          (xs (make-random-list len :max max))
          (ys (make-random-list len :max max)))
     (progn
       (funcall function xs ys #'<)
       (values)))))

(defun check-system-merge-performance ()
  (time
   (let* ((len 1000000)
          (max 1000000)
          (xs (make-random-list len :max max))
          (ys (make-random-list len :max max)))
     (progn
       (merge 'list xs ys #'<)
       (values)))))

Note that the following tests were all run on a several-years-old Windows laptop with spinning disks, slow RAM, and a bunch of other applications running. Did I mention I’m not a performance and benchmarking expert? :-}

Armed Bear Common Lisp

ABCL’s merge implementation did not do so well against mine, but I think I must be missing something. For example, there’s probably a lot of subtlety around JVM performance on Windows that I just have no idea about.

Mine ran in about 17 seconds.

CL-USER> (check-merge-performance #'rml/merge3)
17.452 seconds real time
4000684 cons cells
NIL

Unfortunately, theirs took about 5 minutes to do about the same (*) computation. I’m not sure why. It’s possible I’m making some kind of mistake here, but I don’t know what it is. I used the timing code below because at first I thought the ABCL merge check had just hung altogether.

CL-USER> (time
          (let* ((len 1000000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))

312.803 seconds real time
412497 cons cells
NIL

(*) It’s not exactly the same, because remember that CL’s merge works on lists or vectors, so there will be some dispatching.

Out of curiosity, I went back and tried a few smaller inputs. ABCL seems to do OK with two 100,000-element lists of random numbers, as shown below. Further, if you look at the number of cons cells, the function does appear to be O(n), which is what you would expect. This makes me think the problem is simply that two 1,000,000-element lists on the JVM are just too big a load for my little laptop, somehow.

CL-USER> (time
          (let* ((len 100000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))
0.028 seconds real time
4025 cons cells
NIL
CL-USER> (time
          (let* ((len 10000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))
2.24 seconds real time
40113 cons cells
NIL

CLISP

CLISP’s implementation is about twice as fast as mine, and uses about half as much memory, as is right and proper.

CL-USER> (check-merge-performance #'rml/merge3)
Real time: 9.634908 sec.
Run time: 9.578125 sec.
Space: 56000000 Bytes
GC: 16, GC time: 0.578125 sec.
; No value
CL-USER> (check-system-merge-performance)
Real time: 4.6764607 sec.
Run time: 4.671875 sec.
Space: 32000000 Bytes
GC: 6, GC time: 0.40625 sec.
; No value

Corman Lisp

Corman Lisp is a Common Lisp implementation for Windows. It’s fast and has really good Win32 integration. I enjoy the IDE a lot.

If you like Lisp, you may enjoy this interesting talk by its creator Roger Corman where he discusses its development.

The system merge is about twice as fast as mine. It seems to come down to my implementation creating a lot of garbage, which is expected given my naive use of push and pop when I should be doing more direct mutation and structure sharing using e.g. rplacd.

(check-merge-performance #'rml/merge3)
Total Execution time: 0.360289 seconds
Time spent garbage collecting: 0.093065 seconds

(check-system-merge-performance)
Total Execution time: 0.180082 seconds
Time spent garbage collecting: 0.0 seconds

LispWorks (32-bit Hobbyist Edition for Windows)

LispWorks is a proprietary implementation put out by a company in England. Not only is their merge faster, they’re using about half as much memory. All as expected.

P.S. I just bought the “Hobbyist” version of the software, and I can definitely recommend it. Fast compiler, and the IDE, class browser, debugger, etc., are a lot of fun to use.

CL-USER 40 > (check-merge-performance #'rml/merge3)
Timing the evaluation of (LET* ((LEN 100000) (MAX 100000) (XS (MAKE-RANDOM-LIST LEN :MAX MAX)) (YS (MAKE-RANDOM-LIST LEN :MAX MAX))) (PROGN (FUNCALL FUNCTION XS YS (FUNCTION <)) (VALUES)))

User time    =        0.062
System time  =        0.015
Elapsed time =        0.063
Allocation   = 2895872 bytes
0 Page faults
CL-USER 41 > (check-system-merge-performance)
Timing the evaluation of (LET* ((LEN 100000) (MAX 100000) (XS (MAKE-RANDOM-LIST LEN :MAX MAX)) (YS (MAKE-RANDOM-LIST LEN :MAX MAX))) (PROGN (MERGE (QUOTE LIST) XS YS (FUNCTION <)) (VALUES)))

User time    =        0.015
System time  =        0.000
Elapsed time =        0.015
Allocation   = 1434380 bytes
0 Page faults

Anyway that’s all for now. As I said, you should take the “results” above with a grain of salt. Nonetheless it was interesting seeing how the different implementations behaved in an unscientific comparison.

Thinking about software documentation as the output of a lossy compression algorithm

It’s useful to think about software docs as the output of a lossy compression algorithm.


How many times have you heard or read the following comments?

  1. “The docs are always out of date”
  2. “I don’t bother reading the docs, I just read the source code”
  3. “If you write self-documenting code, you don’t need to write docs”

If you work in software, I bet the answer is: a lot. They range in truth value from #1, which is a tautology, to #3, which is a fantasy. I can relate to #2, since I had to do it just yesterday when using a semi-undocumented Ruby library.

I think all of these points of view (and more) are explained when you think about software documentation as the output of a lossy compression algorithm. Many (most? all?) of the things that you love and/or hate about the documentation for a particular piece of software are explained by this.

Quoth wiki:

Well-designed lossy compression technology often reduces file sizes significantly before degradation is noticed by the end-user. Even when noticeable by the user, further data reduction may be desirable (e.g., for real-time communication, to reduce transmission times, or to reduce storage needs).

As such, the features of software documentation are similar to the features of other products of lossy compression algorithms.

Take mp3s for example. The goal of an mp3 is not to be the highest-fidelity replication of the audio experience it represents. The goal of an mp3 is to provide a “good enough” audio experience given the necessary tradeoffs that had to be made because of constraints such as:

  • Time: How expensive in CPU time is it to run the decompression algorithm? Can the CPU on the device handle it?
  • Space: How much disk space will the file take on the device?

Similarly, we might say that the goal of software documentation is not to be the highest-fidelity replication of the “understanding” experience it (theoretically) represents. The goal of a piece of documentation is to provide a “good enough” learning experience given the necessary tradeoffs that had to be made because of constraints such as:

  • Time: How expensive in person time is it to “run the decompression algorithm”, i.e., learn how the system works well enough to write something coherent? And then to actually write it? How many technical writers does the organization employ per engineer? (In many companies it’s ~1 per 40-50 engineers) How many concurrent projects is that writer working on, across how many teams?
  • Space: How much information does the user need to use the system? How little information can you get away with providing before users give up in disgust?

Remember that fewer, more effective materials cost more to produce. This is similar to the way better compression algorithms may cost more than worse ones along various axes you care about (dollar cost for proprietary algorithms, CPU, memory, etc.)

It takes longer to write more concise documentation, draw clear system diagrams, etc., since those are signs that you actually understand the system better, and have thus compressed the relevant information about it into fewer bytes.

And oh by the way, in practice in an “agile” (lol) environment you don’t have enough time to write the “best” docs for any given feature X. Just like the programmer who wrote feature X would likely admit that she didn’t have enough time to write the “best” implementation according to her standards.

Quoth Pascal:

I would have written a shorter letter, but I did not have the time.

So the next time you are frustrated by the docs for some piece of software (if any docs exist at all), instead of some platitude about docs sucking, think “oh, lossy compression”.

(Image courtesy Jonathan Sureau under Creative Commons license.)

The Josephus Problem

../img/origami-wreath.jpg

The Josephus Problem is deceptively simple. A number of people stand in a circle. (Let’s call this number N.) They agree to count around the circle, and every Mth person will be killed. As they go around the circle, in what order do they die? Another way of looking at it is: who is the last person left alive?

For an interesting introduction, I recommend checking out the Wikipedia article.

I recently found this problem described in the 2nd edition of Sedgewick’s Algorithms, and thought it would be fun to implement in Scheme. Because I’m not a real programmer (and I had to translate to Scheme from the Pascal in the book), it took me a couple of hours instead of a couple of minutes. But what an enjoyable couple of hours!

I started, as usual for algorithmic programming tasks, with a simulation. In the book, Sedgewick provides an example: If there are 9 people, and every 5th person in the circle is killed, the order in which they’ll die is: 5, 1, 7, 4, 3, 6, 9, 2, 8.

It’s usually easier to start work on a problem by thinking about a simpler case, so that’s what I did. I chose N = 5, and M = 2. Below I’ll show the results of my “paper” simulation (OK, the paper was actually an Emacs buffer).

First, let me show you how to read my “notation”. It should be pretty easy to figure out. Below is an example of one “state snapshot”. I’ve labelled each line with its meaning. If you are a Scheme programmer, you will surmise that each named item corresponds to a “named-let” loop variable. In addition, the vertical bar character in the XS row represents the state of the “cursor” as it goes around the circle of 5 people.

;; I.                           <-- State Number
;; ------------
;; |1 2 3 4 5                   <--    XS
;; '()                          <--    YS
;; 0                            <-- COUNT

Above, we are in state 1, at the start of the program. The cursor is at the beginning of the list XS, which represents the circle of people. YS is our results list, where we will store the “dead”. As we travel around the circle and people are killed, they are added to YS. We could think of it as a “death toll”, but really it’s just a list of integers. As we go around the circle of people, we keep incrementing COUNT. When COUNT reaches the value of M, that person will be “killed”, that is, be added to YS.

When this happens, we reset the COUNT to 0 and keep going around the circle. There’s a catch, however. As more dead people are added to our result list YS, we need to ignore the spaces those people used to occupy in our count as we build COUNT back up towards M. In other words, the circle will have 4 people in it after the first person is killed. This means that as the circle gets smaller and smaller, people are killed more frequently.

I’ll show you how I’ve solved that problem in Scheme in a moment; first, the simulation:

;; I.
;; ------------
;; |1 2 3 4 5
;; '()
;; 0

;; II.
;; ------------
;; 1| 2 3 4 5
;; '()
;; 1

;; III.
;; ------------
;; 1 2| 3 4 5
;; '()
;; 1

;; IV.
;; ------------
;; 1 _ 3| 4 5
;; '(2)
;; 1

;; V.
;; ------------
;; 1 _ 3 4| 5
;; '(2)
;; 2

;; VI.
;; ------------
;; 1 _ 3 _ 5|
;; '(4 2)
;; 1

;; VII.
;; ------------
;; 1| _ 3 _ 5
;; '(4 2)
;; 2

;; VIII.
;; ------------
;; _ _ 3| _ 5
;; '(1 4 2)
;; 1

;; IX.
;; ------------
;; _ _ 3 _ 5|
;; '(1 4 2)
;; 2

;; X.
;; ------------
;; _ _ 3| _ _
;; '(5 1 4 2)
;; 1

;; XI.
;; ------------
;; _ _ 3| _ _
;; '(5 1 4 2)
;; 2

;; XII.
;; ------------
;; _ _ _| _ _
;; '(3 5 1 4 2)
;; 2

Now that you’ve seen how the algorithm should work on paper, let’s write some Scheme! I should state for the record that I did not read the Wikipedia article linked above until after I wrote this code. (This will be pretty obvious when you look at the code.) Here it is:

(define (josephus xs m)
  (let ((orig-xs (list-copy xs))
        (true-m (- m 1)))
    (let ((len (length xs)))
      (let loop ((xs xs)
                 (ys '())
                 (count 0))
        (cond 
         ;; If the dead list grows to the same length as our original
         ;; list, we know we're finished.
         ((= len (length ys))
          (reverse ys))
         ;; If XS is null, we have gone once around the circle.  We
         ;; start around again, leaving the count unchanged.
         ((null? xs)
          (loop orig-xs ys count))
         ;; If the current person we're looking at is already dead
         ;; (a ghost), move on.  And don't bother bumping the
         ;; count.
         ((member (car xs) ys)
          (loop (cdr xs) ys count))
         ;; If we're here, it's also the case that the current person
         ;; we're looking at is *not* in the dead list.  Therefore we
         ;; check if the count is equal to M.  If so, they must die.
         ((= count true-m)
          (loop (cdr xs) (cons (car xs) ys) 0))
         ;; If we get here, the current person we're looking at is
         ;; alive.  We check if the count is *not* equal to M.  If it
         ;; is not, we skip this person and bump the count.
         ((not (= count true-m))
          (loop (cdr xs) ys (+ count 1)))
         ;; We should never get here.
         (else #f))))))

How does it compare to the output described in Sedgewick? It seems to work!

> (josephus '(1 2 3 4 5 6 7 8 9) 5)
'(5 1 7 4 3 6 9 2 8)

There are several inefficiencies we could tackle, though. The first and most obvious is that we should try to calculate the solution mathematically (as shown in some of the Wikipedia examples) instead of building up lists using CONS. Another is that we make a “canonical” copy of the input list for restarting the loop. A third is the use of MEMBER to determine whether the person we’re currently visiting is already in the “dead” list.

For example, here’s the trace output showing 19 calls to MEMBER for a pretty small input:

(josephus '(1 2 3 4 5) 2)
[Enter (member 1 '())
 Leave member #f]
... 17 more calls to member ...
[Enter (member 3 '(5 1 4 2))
 Leave member #f]
'(2 4 1 5 3)
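As an aside on calculating the answer mathematically: the classic Josephus recurrence finds just the survivor in O(n) time, with no lists at all. Here’s a sketch in Python rather than Scheme; this is my illustration, not code from the post:

```python
def survivor(n, m):
    """Position (1-based) of the last survivor among n people, killing every m-th.

    Uses the recurrence J(1) = 0; J(k) = (J(k-1) + m) mod k.
    """
    pos = 0  # 0-based survivor position for a circle of size 1
    for k in range(2, n + 1):
        pos = (pos + m) % k
    return pos + 1  # convert to 1-based numbering
```

For the book’s example (N = 9, M = 5) this returns 8, matching the last person to die in the kill order shown earlier.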

That said, in this case I was happy just to get something working. If this were going to be used for anything “real”, I would probably rewrite it statefully using vectors. Most of the fun was in coming up with this first draft without resorting to any stateful idioms.
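For what it’s worth, here is how the full kill order looks with a double-ended queue standing in for the circle, which avoids both the ghost bookkeeping and the repeated MEMBER scans. This is my own Python sketch of one such stateful rewrite, not a port of the Scheme:

```python
from collections import deque

def josephus(people, m):
    """Return the order in which people die, killing every m-th around the circle."""
    circle = deque(people)
    order = []
    while circle:
        circle.rotate(-(m - 1))         # walk past m-1 survivors
        order.append(circle.popleft())  # the m-th person is killed
    return order
```

Calling josephus(list(range(1, 10)), 5) produces [5, 1, 7, 4, 3, 6, 9, 2, 8], the same order as the Scheme version.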

(Image courtesy Chris Lott via CC-BY license.)

Bottom Up Merge Sort in Scheme

../img/37-tiles.jpg

Recently I’ve begun a project to implement a number of basic algorithms in Scheme, which I’d like to eventually grow into a free (as in freedom) ebook. Having just done a Binary Search in Scheme, I thought it would be fun to give merge sort a try.

According to the mighty interwebs, merge sort is a good choice for sorting linked lists (a.k.a., Lisp lists). Unfortunately the only Lisp merge sort implementation examples I’ve been able to find on the web have been recursive, not iterative.

The implementation described here is an iterative, bottom-up merge sort, written in a functional style. (I daren’t say the functional style, lest any real Scheme wizards show up and burn me to a crisp.)

First, generate a list of random numbers

In order to have something to sort, we need a procedure that generates a list of random numbers – note that the docstring is allowed by MIT/GNU Scheme; YMMV with other Schemes.

(define (make-list-of-random-numbers list-length max)
  ;; Int Int -> List
  "Make a list of random integers less than MAX that's LIST-LENGTH long."
  (let loop ((n list-length) (result '()))
    (if (= n 0)
        result
        (loop (- n 1) (cons (random max) result)))))

Then, write a merge procedure

This implementation of the merge procedure is a straight port of the one described on the Wikipedia Merge Sort page, with one minor difference to make the sort faster 1.

An English description of the merge operation is as follows:

  • If both items passed in are numbers (or strings), wrap them up in lists and recur. (In this example we only care about sorting numbers)
  • If both lists are empty, return the result.
  • If neither list is empty:
    • If the first item in the first list is “less than” the first item in the second list, cons it onto the result and recur.
    • Otherwise, cons the first item in the second list on the result and recur.
  • If the first list still has items in it, cons the first item onto the result and recur.
  • If the second list still has items in it, cons the first item onto the result and recur.
  • If none of the above conditions are true, return #f. I put this here for debugging purposes while writing this code; now that the procedure is debugged, it is never reached. (Note: “debugged” just means “I haven’t found another bug yet”.)

(define (rml/merge pred l r)
  (letrec ((merge-aux
            (lambda (pred left right result)
              (cond 
               ((and (number? left)
                     (number? right))
                (merge-aux pred 
                           (list left) 
                           (list right) 
                           result))
               ((and (string? left)
                     (string? right))
                (merge-aux pred
                           (list left) 
                           (list right) 
                           result))
               ((and (null? left)
                     (null? right))
                (reverse result))
               ((and (not (null? left))
                     (not (null? right)))
                (if (pred (car left)
                          (car right))
                    (merge-aux pred
                               (cdr left)
                               right
                               (cons (car left) result))
                    (merge-aux pred
                               left
                               (cdr right)
                               (cons (car right) result))))
               ((not (null? left))
                (merge-aux pred (cdr left) right (cons (car left) result)))
               ((not (null? right))
                (merge-aux pred left (cdr right) (cons (car right) result)))
               (else #f)))))
    (merge-aux pred l r '())))

We can run a few merges to get a feel for how it works. The comparison predicate we pass as the first argument will let us sort all kinds of things, but for the purposes of this example we’ll stick to numbers:

(rml/merge < '(360 388 577) '(10 811 875 995))
;Value 11: (10 360 388 577 811 875 995)

(rml/merge < '(8 173 227 463 528 817) '(10 360 388 577 811 875 995))
;Value 12: (8 10 173 227 360 388 463 528 577 811 817 875 995)

(rml/merge < 
           '(218 348 486 520 639 662 764 766 886 957 961 964)
           '(8 10 173 227 360 388 463 528 577 811 817 875 995))
;Value 14: (8 10 173 218 227 348 360 388 463 486 520 528 577 639 662 764 766 811 817 875 886 957 961 964 995)
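The same merge logic can be written with two index pointers instead of recursion and cons-ing. Here is my own Python translation of the merge step, just for comparison; it isn’t code from the post:

```python
def merge(pred, left, right):
    """Merge two pred-sorted lists into one pred-sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if pred(left[i], right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # whichever list has leftovers
    result.extend(right[j:])  # is already sorted, so append it
    return result
```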

Finally, do a bottom up iterative merge sort

It took me a while to figure out how to do the iterative merge sort in a Schemely fashion. As usual, it wasn’t until I took the time to model the procedure on paper that I got somewhere. Here’s what I wrote in my notebook:

;;  XS                   |      RESULT
;;---------------------------------------------

'(5 1 2 9 7 8 4 3 6)            '()
    '(2 9 7 8 4 3 6)            '((1 5))
        '(7 8 4 3 6)            '((2 9) (1 5))
            '(4 3 6)            '((7 8) (2 9) (1 5))
                '(6)            '((3 4) (7 8) (2 9) (1 5))
                 '()            '((6) (3 4) (7 8) (2 9) (1 5))

;; XS is null, and RESULT is not of length 1 (meaning it isn't sorted
;; yet), so we recur, swapping the two:

'((6) (3 4) (7 8) (2 9) (1 5))  '()
          '((7 8) (2 9) (1 5))  '((3 4 6))
                      '((1 5))  '((2 7 8 9) (3 4 6))
                           '()  '((1 5) (2 7 8 9) (3 4 6))

;; Once more XS is null, but RESULT is still not sorted, so we swap
;; and recur again

'((1 5) (2 7 8 9) (3 4 6))      '()
                  '(3 4 6)      '((1 2 5 7 8 9))
                       '()      '((3 4 6) (1 2 5 7 8 9))

;; Same story: swap and recur!

'((3 4 6) (1 2 5 7 8 9))        '()
                     '()        '((1 2 3 4 5 6 7 8 9))

;; Finally, we reach our base case: XS is null, and RESULT is of
;; length 1, meaning that it contains a sorted list

'(1 2 3 4 5 6 7 8 9)

This was a really fun little problem to think about and visualize. It just so happens that it fell out in a functional style; usually I don’t mind doing a bit of state-bashing, especially if it’s procedure-local. Here’s the code that does the sort shown above:

(define (rml/merge-sort xs pred)
  (let loop ((xs xs)
             (result '()))
    (cond ((and (null? xs)
                (null? (cdr result)))
           (car result))
          ((null? xs)
           (loop result
                 xs))
          ((null? (cdr xs))
           (loop (cdr xs)
                 (cons (car xs) result)))
          (else
           (loop (cddr xs)
                 (cons (rml/merge pred
                              (first xs)
                              (second xs))
                       result))))))
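The swap-and-recur passes traced above can also be sketched iteratively. This Python version is an illustration of the same bottom-up idea, not a port of the Scheme; unlike the Scheme version, it does start from singleton runs (the “exploded” form footnote 1 describes avoiding), purely for brevity:

```python
def merge(pred, left, right):
    """Merge two pred-sorted lists."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if pred(left[i], right[j]):
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    return result + left[i:] + right[j:]

def merge_sort(xs, pred):
    """Bottom-up merge sort: merge adjacent runs pairwise until one run remains."""
    runs = [[x] for x in xs]
    while len(runs) > 1:
        merged = [merge(pred, runs[i], runs[i + 1])
                  for i in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:            # odd run out: carry it to the next pass
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []
```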

That’s nice, but how does it perform?

A good test of our merge sort is to compare it to the system’s sort procedure. In the case of MIT/GNU Scheme, we’ll need to compile our code if we hope to get anywhere close to the system’s speed. If your Scheme is interpreted, you don’t have to bother of course.

To make the test realistic, we’ll create three lists of random numbers: one with 20,000 items, another with 200,000, and finally a giant list of 2,000,000 random numbers. This should give us a good idea of our sort’s performance. Here’s the output of timing the first two sorts, 20,000 and 200,000 2:

;;; Load compiled code

(load "mergesort")
;Loading "mergesort.so"... done
;Value: rml/insertion-sort2

;;; Define our lists

(define unsorted-20000 (make-list-of-random-numbers 20000 200000))
;Value: unsorted-20000

(define unsorted-200000 (make-list-of-random-numbers 200000 2000000))
;Value: unsorted-200000

;;; Sort the list with 20,000 items

(with-timing-output (rml/merge-sort unsorted-20000 <))
;Run time:      .03
;GC time:       0.
;Actual time:   .03

(with-timing-output (sort unsorted-20000 <))
;Run time:      .02
;GC time:       0.
;Actual time:   .021

;;; Sort the list with 200,000 items

(with-timing-output (rml/merge-sort unsorted-200000 <))
;Run time:      .23
;GC time:       0.
;Actual time:   .252

(with-timing-output (sort unsorted-200000 <))
;Run time:      .3
;GC time:       0.
;Actual time:   .3

As you can see, our sort procedure is on par with the system’s for these inputs. Now let’s turn up the heat. How about a list with 2,000,000 random numbers?

;;; Sort the list with 2,000,000 items

(define unsorted-2000000 (make-list-of-random-numbers 2000000 20000000))
;Value: unsorted-2000000

(with-timing-output (rml/merge-sort4 unsorted-2000000 <))
;Aborting!: out of memory
;GC #34: took:   0.80 (100%) CPU time,   0.10 (100%) real time; free: 11271137
;GC #35: took:   0.70 (100%) CPU time,   0.90  (81%) real time; free: 11271917
;GC #36: took:   0.60 (100%) CPU time,   0.90  (99%) real time; free: 11271917

(with-timing-output (sort unsorted-2000000 <))
;Run time:      2.48
;GC time:       0.
;Actual time:   2.474

No go. On a MacBook with 4GB of RAM, our merge sort runs out of memory, while the system sort procedure works just fine. It seems the wizards who implemented this Scheme system knew what they were doing after all! :-}

It should be pretty clear at this point why we’re running out of memory. In MIT/GNU Scheme, the system sort procedure uses vectors and mutation (and is no doubt highly tuned for the compiler), whereas we take a relatively brain-dead approach that uses lists and lots of cons-ing. I leave it as an exercise for the reader (or perhaps my future self) to rewrite this code so that it doesn’t run out of memory.

(Image courtesy mag3737 under Creative Commons license.)

Footnotes:

1 An earlier implementation started off the sort by “exploding” the list to be sorted so that '(1 2 3) became '((1) (2) (3)). This is convenient for testing purposes, but very expensive. It’s also unnecessary after the first round of merging. We avoid the need to explode the list altogether by teaching merge to accept numbers and listify them when they appear. We could also do the same for strings and other types as necessary.

2 For the definition of the with-timing-output macro, see here.

Binary Search in Scheme

../img/hex-twist-variation-2.jpg

Intro

Just for fun, I’ve begun translating some of the algorithms from Mastering Algorithms with Perl into Scheme. My hope is that I’ll get two things out of this: a better knowledge of algorithms, and of Scheme hacking.

Binary search is one of the first algorithms listed in the book; it’s tricky to write a correct binary search, but I had the Perl code to work from. Let’s see how I did.

What’s binary search?

Binary search is a method for finding a specific item in a sorted list. Here’s how it works:

  1. Take a guess that the item you want is in the middle of the current search “window” (when you start, the search window is the entire list).
  2. If the item is where you guessed it would be, return the index (the location of your guess).
  3. If your guess is “less than” the item you want (based on a comparison function you choose), recur, this time raising the “bottom” of the search window to the midway point.
  4. If your guess is “greater than” the item you want (based on your comparison function), recur, this time lowering the “top” of the search window to the midway point.

In other words, you cut the size of the search window in half every time through the loop. This gives you a worst-case running time of about (/ (log n) (log 2)) steps. This means you can find an item in a sorted list of 20,000,000,000 (twenty billion) items in about 34 steps.
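The four steps above can be sketched as an iterative loop; here’s a minimal Python version of the idea (the Scheme code later in this post recurs with a named let instead):

```python
def binary_search(word, xs):
    """Return the index of word in the sorted list xs, or None if absent."""
    low, high = 0, len(xs) - 1
    while low <= high:
        mid = (low + high) // 2  # guess the middle of the window
        if xs[mid] == word:
            return mid
        elif xs[mid] < word:
            low = mid + 1        # raise the bottom of the window
        else:
            high = mid - 1       # lower the top of the window
    return None
```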

Reading lines from a file

Before I could start writing a binary search, I needed a sorted list of items. I decided to work with a sorted list of words from /usr/share/dict/words, so I wrote a couple of little procedures to make a list of words from a subset of that file. (I didn’t want to read the entire large file into a list in memory.)

Note: Both format and the Lisp-inspired #!optional keyword are available in MIT Scheme; they made writing the re-matches? procedure more convenient.

  • re-matches? checks if a regular expression matches a string (in this case, a line from a file).
  • make-list-of-words-matching is used to loop over the lines of the words file and return a list of lines matching the provided regular expression.

Now I have the tools I need to make my word list.

(load-option 'format)

(define (re-matches? re line #!optional display-matches)
  ;; Regex String . Boolean -> Boolean
  "Attempt to match RE against LINE. Print the match if DISPLAY-MATCHES is set."
  (let ((match (re-string-match re line)))
    (if match
        (if (not (default-object? display-matches))
            (begin (format #t "|~A|~%" (re-match-extract line match 0))
                   #t)
            #t)
        #f)))

(define (make-list-of-words-matching re file)
  ;; Regex String -> List
  "Given a regular expression RE, loop over FILE, gathering matches."
  (call-with-input-file file
    (lambda (port)
      (let loop ((source (read-line port)) (sink '()))
        (if (eof-object? source)
            sink
            (loop (read-line port) (if (re-matches? re source)
                             (cons source sink)
                             sink)))))))

Writing tests

Since I am not one of the 10% of programmers who can implement a correct binary search on paper, I started out by writing a test procedure. The test procedure grew over time as I found bugs and read an interesting discussion about the various edge cases a binary search procedure should handle. These include:

  • Empty list
  • List has one word
  • List has two words
  • Word is not there and “less than” anything in the list
  • Word is not there and “greater than” anything in the list
  • Word is first item
  • Word is last item
  • List is all one word
  • If multiple copies of word are in list, return the first word found (this could be implemented to return the first or last duplicated word)

Furthermore, I added a few “sanity checks” that check the return values against known outputs. Here are the relevant procedures:

  • assert= checks two numbers for equality and prints a result
  • assert-equal? checks two Scheme objects against each other with equal? and prints a result
  • run-binary-search-tests reads in words from a file and runs all of our tests

(define (assert= expected got #!optional noise)
  ;; Int Int -> IO
  (if (= expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (assert-equal? expected got #!optional noise)
  ;; Thing Thing -> IO
  (if (equal? expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (run-binary-search-tests)
  ;; -> IO
  "Run our binary search tests using known words from the 'words' file.
This file should be in the current working directory."
  (with-working-directory-pathname (pwd)
    (lambda ()
      (if (file-exists? "words")
          (begin
            (format #t "file 'words' exists, making a list...~%")
            (let* ((unsorted (make-list-of-words-matching "acc" "words"))
                   (sorted (sort unsorted string<?)))
              (format #t "doing binary searches...~%")
              (assert-equal? #f (binary-search "test" '())) ; empty list
              (assert-equal? #f (binary-search "aardvark" sorted)) ; element absent and too small
              (assert-equal? #f (binary-search "zebra" sorted)) ; element absent and too large
              (assert= 0 (binary-search "accusive" '("accusive"))) ; list of length one
              (assert= 0 (binary-search "acca" sorted)) ; first element of list
              (assert= 1 (binary-search "aardvark" '("aardvark" "aardvark" "babylon"))) ; multiple copies of word in list
              (assert= 1 (binary-search "barbaric" '("accusive" "barbaric"))) ; list of length two
              (assert= 98 (binary-search "acclamator" sorted))
              (assert= 127 (binary-search "aardvark" (map (lambda (x) "aardvark") sorted))) ; list is all one value
              (assert= 143 (binary-search "accomplice" sorted))
              (assert= 254 (binary-search "accustomedly" sorted))
              (assert= 255 (binary-search "accustomedness" sorted)))))))) ; last element of list

The binary search procedure

Finally, here’s the binary search procedure; it uses a couple of helper procedures for clarity.

  • ->int is a helper procedure that does a quick and dirty integer conversion on its argument
  • split-difference takes a low and high number and returns the floor of the halfway point between the two
  • binary-search takes an optional debug-print argument that I used a lot while debugging. The format statements and the optional argument tests add a lot of bulk – now that the procedure is debugged, they can probably be removed. (Aside: I wonder how much “elegant” code started out like this and was revised after sufficient initial testing and debugging?)

(define (->int n)
  ;; Number -> Int
  "Given a number N, return its integer representation.
N can be an integer or flonum (yes, it's quick and dirty)."
  (flo:floor->exact (exact->inexact n)))

(define (split-difference low high)
  ;; Int Int -> Int
  "Given two numbers, return their rough average."
  (if (= (- high low) 1)
      1
      (->int (/ (- high low) 2))))

(define (binary-search word xs #!optional debug-print)
  ;; String List -> Int
  "Do binary search of list XS for WORD. Return the index found, or #f."
  (if (null? xs)
      #f
      (let loop ((low 0) (high (- (length xs) 1)))
        (let* ((try (+ low (split-difference low high)))
               (word-at-try (list-ref xs try)))
          (cond
           ((string=? word-at-try word) try)
           ((< (- high low) 1) #f)
           ((= (- high try) 1) 
            (if (string=? (list-ref xs low) word)
                low
                #f))
           ((string<? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #t "(string<? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop (+ 1 try) high)) ; raise the bottom of the window
                (loop (+ 1 try) high)))
           ((string>? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #t "(string>? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop low (+ 1 try))) ; lower the top of the window
                (loop low (+ 1 try))))
           (else #f))))))

Takeaways

This exercise has taught me a lot.

  1. Writing correct code is hard. (I’m confident that this code is not correct.) You need to figure out your invariants and edge cases first. I didn’t, and it made things a lot harder.
  2. It’s been said a million times, but tests are code. The tests required some debugging of their own.
  3. Once they worked, the tests were extremely helpful. Especially now that I’m at the point where (if this were “for real”) additional features would need to be added, the format calls removed, the procedure speeded up, and so on.

I hope this has been useful to some other aspiring Scheme wizards out there. Happy Hacking!

(Image courtesy Melisande under Creative Commons license.)

What is the Highest Sum of a Number Triangle?

https://logicgrimoire.files.wordpress.com/2012/02/wpid-paper-triangles.jpg

A Description of the Problem

We are given a triangle of numbers, and we are asked to write a program that computes the highest sum of numbers passed on a route that starts at the top and ends somewhere on the base.

Some additional restrictions:

  • Each step can go diagonally down to the left or the right.
  • The number of rows in the triangle will be between 1 and 100, inclusive.
  • The numbers that populate the triangle are integers between 0 and 99.

(Read the original description here.)

What are our Inputs and Outputs?

Our initial input data will live in a file called triangle-input.txt, which contains the following:

5
7
3 8
8 1 0 
2 7 4 4
4 5 2 6 5

Note that the first line of the file is not part of the triangle itself; it’s there to tell us how many levels deep the triangle goes.

We’ll place our output in a file called triangle-output.txt, which will contain a single integer; for the input above, that integer is 30. We’ll place the names of our input and output files in the program constants INPUT_FILE and OUTPUT_FILE, respectively.

Reading the Input, Writing the Output

First, we’ll need to get the values out of triangle-input.txt and store them in a convenient structure. In this case, an array of arrays should do.

We begin by creating an empty array, tmp, which will act as a temporary storage space for the lines of triangle-input.txt.

We’ll read the lines of the file one at a time. For each line, we split the string, e.g., “1 2 3 4 5\n”, into a list, and push that line onto our array tmp.

tmp = []

File.open(INPUT_FILE, "r") do |f|
  f.each_line do |line|
    tmp.push line.split
  end
end

Unfortunately, we’re pushing an array of strings onto tmp, since

"4 5 2 6 5\n".split

returns this:

["4", "5", "2", "6", "5"]

Therefore, we’ll take a second pass through and convert those arrays of strings into arrays of numbers. The resulting array will allow us to – finally! – begin our real work. Notice that this is a nested call to map, best read from the inside out. The value returned by the inner map is returned to the outer, with the results stored in variable tri, since the return value of map is always an array.

tri = tmp.map { |array| array.map { |elem| elem.to_i } }

We’ll wrap everything here up in a method, read_triangle_input, which will show up in the complete program listing below. We’ll also leave the write_triangle_output method for the listing; it requires little explanation.

Solving our Actual Problem

Now that the housekeeping chores are out of the way, we can jump in and begin the real work of finding the highest sum in our triangle.

We’d like to write a method, triangle_sum, which, given an array of arrays like the one we’ve just constructed, returns an integer representing the highest sum calculated on its imagined path “down through the triangle.”

Since our route through the triangle is restricted to “steps” that can “go diagonally down to the left or the right,” the most natural representation of this data is as a tree. We’ll simulate this in a crude way using an array of arrays.

The Inner Loop

Since our fundamental data structure is an array, we’ll need to choose a looping construct; we’ll stick with the “C-style” iterative constructs here since they map well onto this problem (no pun intended). We’ll use an index variable to keep track of where we are in the iteration.

Inside the loop, we want to know three things (among others):

  • Where am I?
  • Where is the element to my upper left?
  • Where is the element to my upper right?

We’ll denote the answers to these questions with the variables this, upleft, and upright in our program listing below.

Remember the problem description: Starting at the root of the tree, we’ll keep a running sum of all the numbers we’ve seen thus far on this particular path through the tree.

Visualizing Our Movement Down the Triangle

In order to solve this problem, we started with some hand-simulation of what would eventually become the final algorithm. The example below shows how it works: given where we are in the array of arrays, we look “up and to the left” and “up and to the right” of where we are. In the following diagrams, we denote our position in the algorithm with a lovely text arrow.

Index: 1

[7] <---
[3, 8]
[8, 1, 0]
[2, 7, 4, 4]
[4, 5, 2, 6, 5]

When index is 1 (the diagrams here count rows starting from 1), there isn’t anything to do, since we can’t look “up” at anything. Therefore, move on.

Index: 2

[7]
[10, 15] <---  Was: [3, 8]
[8, 1, 0]
[2, 7, 4, 4]
[4, 5, 2, 6, 5]

When index is 2, we look up and to the left (7), and up and to the right (also 7). Each time through the loop, we create a new temporary array called next_row, in which to store the running sums. In this case, we create a new array [10, 15] by adding 7 to each of [3, 8]. We then replace the old array with the new.
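That replacement is just an element-wise addition; as a quick sanity check of this one step:

```ruby
# Add the root's value (7) to each entry of the second row.
new_row = [3, 8].map { |v| v + 7 }
# new_row is [10, 15]
```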

Index: 3

[7]
[10, 15]
[18, 11, 16, 15] <--- Was: [8, 1, 0]
[2, 7, 4, 4]
[4, 5, 2, 6, 5]

index is now 3. We perform the same operations as above: first, create an empty temporary array next_row. Then, for each element of the old row [8, 1, 0], we add the element’s value (this in our program code) to the values of upleft and upright – our variable names for the “up and to the left” and “up and to the right” values mentioned earlier – and push each sum onto next_row. Once we’ve finished, we replace the existing row with the new one, as before.
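Hand-executing that step for the third row, using the same variable names as the program listing below (prev here stands in for the already-summed row above), might look like this:

```ruby
prev = [10, 15]   # the row above, already holding running sums
row  = [8, 1, 0]  # the original row we're processing
next_row = []

row.each_with_index do |this, i|
  upleft  = prev[i - 1]  # "up and to the left"
  upright = prev[i]      # "up and to the right"
  next_row.push(this + upleft)  unless i == 0               # leftmost has no upleft
  next_row.push(this + upright) unless i == row.length - 1  # rightmost has no upright
end

next_row  # => [18, 11, 16, 15], matching the diagram above
```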

Index: 4

[7]
[10, 15]
[18, 11, 16, 15]
[20, 25, 18, 15, 20, 20] <--- Was: [2, 7, 4, 4]
[4, 5, 2, 6, 5]

Index: 5

[7]
[10, 15]
[18, 11, 16, 15]
[20, 25, 18, 15, 20, 20]
[24, 25, 30, 27, 20, 24, 21, 20] <--- Was: [4, 5, 2, 6, 5]

Result: 30

Here we show two steps more of the process, and its completion. We can easily see that 30 is the largest sum in the last array, and our answer.

We notice that the “new” interior arrays we’re creating on each turn through the loop are longer than the originals they replace (a row of n values becomes 2n − 2 running sums), so we’re not being as efficient with memory as we’d like. At least pushing onto a Ruby Array is cheap – amortized constant time.

The Complete Program Listing of triangle.rb

This essay is over 1200 words long already according to wc -w. Therefore, since this algorithm can be described very succinctly in code, I’ll break the rules of literate programming and simply end with the program listing itself. Note the temporary variables next_row, this, upleft, and upright, which are described in the section “Visualizing Our Movement Down the Triangle” above.

As always, the contents of this literate program are available at Github.

(Update: better solutions and discussion over at the Ruby Reddit)

#!/usr/bin/env ruby

require 'test/unit'

INPUT_FILE  = "triangle-input.txt"
OUTPUT_FILE = "triangle-output.txt"

def read_triangle_input
  tmp = []

  File.open(INPUT_FILE, "r") do |f|
    f.each_line do |line|
      tmp.push line.split
    end
  end

  tri = tmp.map { |array| array.map { |elem| elem.to_i } }
end

def write_triangle_output(result)
  File.open(OUTPUT_FILE, "w") do |f|
    f.print result
  end
end

def triangle_sum(tri)
  a = Array.new(tri)      # shallow copy; we replace whole rows, never mutate them
  index = 1
  len = a.shift[0] - 1    # the first line of the input holds the row count; drop it
  while index <= len
    next_row = []
    for i in 0..index
      this = a[index][i]
      upleft = a[index - 1][i - 1]
      upright = a[index - 1][i]

      if i == 0              # leftmost element: only an "up and to the right"
        next_row.push this + upright
      elsif i == index       # rightmost element: only an "up and to the left"
        next_row.push this + upleft
      else                   # interior element: keep both running sums
        next_row.push this + upleft
        next_row.push this + upright
      end
    end
    a[index] = next_row      # replace the old row with its running sums
    index += 1
  end
  a[index - 1].max           # the answer is the largest sum in the last row
end

tri = read_triangle_input
highest_sum = triangle_sum(tri)
write_triangle_output(highest_sum)  

class TestTriangleSum < Test::Unit::TestCase
  def test_01
    tri = read_triangle_input
    actual = triangle_sum(tri)
    assert_equal 30, actual   # assert_equal takes (expected, actual)
  end
end

(Image courtesy Mélisande* under Creative Commons license.)

Finding the Equilibrium Index of an Array

The Problem Description

A zero-indexed array A consisting of N integers is given. An
equilibrium index of this array is any integer P such that 0 <=
P < N and the sum of elements of lower indices is equal to the sum
of elements of higher indices, i.e.,

A[0] + A[1] + ... + A[P-1] =
A[P+1] + ... + A[N-2] + A[N-1].

The sum of zero elements is assumed to be equal to 0.

Write a function that, given a zero-indexed array A consisting of
N integers, returns any of its equilibrium indices. The function
should return -1 if no equilibrium index exists.

  • Expected worst-case time complexity is O(N)
  • Expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments)

The above is taken from the problem description by Codility; they discuss one solution (written in C) here.

Notes on this Implementation

First things first: we’ll avoid the return -1 “C-ism,” as it’s more natural to return nil in Ruby. In fact, Ruby treats -1 as truthy (only nil and false are falsy), so returning it would slip past many common predicate checks, whereas nil won’t.
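A quick irb-style demonstration of why -1 makes a poor failure sentinel in Ruby:

```ruby
# Only nil and false are falsy in Ruby; every other value,
# including -1 and 0, counts as true in a conditional.
not_found = -1
puts "looks like a hit" if not_found   # prints, even though -1 meant failure

not_found = nil
puts "looks like a hit" if not_found   # prints nothing
```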

Experienced programmers will observe that this implementation does not meet the O(n) time and space complexity requirements as listed above. Absolute simplicity of implementation is the focus right now, but that may change in a future version.

Finally, notice that this short essay is itself a literate program, implemented using the wonderful Org-mode capabilities which are included with the Emacs text editor.

The Inner Loop

Since most of the program’s work is done inside a single while loop, we’ll begin there.

Let A be the Ruby array

[-7, 1, 5, 2, -4, 3, 0].

The equilibrium index of A can be computed by hand, so it’s a good place to start. It’s also our first test case (see below).
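Checking index 3 by hand, with the same kind of slicing we’ll use inside the loop:

```ruby
a = [-7, 1, 5, 2, -4, 3, 0]

left  = a[0..2].reduce(:+)  # -7 + 1 + 5  => -1
right = a[4..6].reduce(:+)  # -4 + 3 + 0  => -1

left == right  # => true, so 3 is an equilibrium index of A
```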

We’ve chosen to implement the iteration over A using a while loop rather than the more idiomatic Array#each method since we need to keep track of our current index into the array.

As we begin the loop, the variable index is initialized to the length of the array minus one. This is required because Ruby’s arrays are zero-indexed, so the last valid index is one less than the value returned by Array#length.

while index > 0
  left  = a[0..index-1].reduce(:+)
  right = a[index+1..len].reduce(:+)
  if left == right
    return index
  end
  index -= 1
end

We’ll iterate backwards through the array from the end. At each value of index, we’ll split A into two smaller arrays, left and right. This requires allocating two new arrays in memory, reading the values of the desired “slices” of A, and copying them into left and right, respectively. This operation provides for an implementation that is simple to visualize and understand, but it’s also the reason why we fail to meet the requirements for O(n) space and time complexity. A more efficient implementation would avoid these unnecessary allocation and copying steps, and we might change this program at some point to achieve that.
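For the curious, here is one sketch of how the prefix-sum trick could meet the O(n) bound. equi_linear is an invented name, the code assumes integer input, and it deliberately preserves this implementation’s observable conventions: it never returns index 0 or the last index, and it returns nil for arrays shorter than two elements:

```ruby
# Hypothetical O(n) variant: keep a running right-hand sum and derive
# the left-hand sum from the total, instead of re-summing slices.
def equi_linear(a)
  return nil if a.length < 2          # mirrors the single-element tests below
  total = a.reduce(:+)                # one O(n) pass over the array
  right = a[-1]                       # right-hand sum, seeded with the last element
  (a.length - 2).downto(1) do |index| # like the original, skip index 0 and the last index
    left = total - right - a[index]
    return index if left == right
    right += a[index]
  end
  nil
end

equi_linear([-7, 1, 5, 2, -4, 3, 0])  # => 3
```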

Visualizing the Iteration

Let’s look at left and right at each stage of the iteration:

https://logicgrimoire.files.wordpress.com/2012/02/wpid-equi.png

We can see that when we split A at index, we always leave a “gap” in between, and it’s this gap which will provide us with the answer we seek: the equilibrium index (provided that the equilibrium index is defined for the given array, that is). At every iteration in the diagram above, we sum the values within left and right using the Array#reduce method. If left and right are equal, index is defined as the equilibrium index, and we return index. Otherwise, we’ll end up escaping the loop and returning nil.
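One subtlety of Array#reduce is worth calling out: summing an empty slice yields nil, not 0. That’s why the very first iteration – where the right-hand slice is empty – can never compare equal to a numeric left-hand sum:

```ruby
[1, 2, 3].reduce(:+)  # => 6
[].reduce(:+)         # => nil, not 0
[].reduce(0, :+)      # => 0, if a numeric default is wanted
```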

The equi Function

Looking briefly at the entire equi function, we can see that it’s just a wrapper for the while loop. First we set index and len (both a.length - 1) as bookkeeping measures. We then enter the while loop as described above. If the loop completes without returning a value, control falls through to the method’s final expression, which returns nil, letting our caller know that this array doesn’t have an equilibrium index.

def equi(a)
  index = a.length-1
  len   = a.length-1
  while index > 0
    left  = a[0..index-1].reduce(:+)
    right = a[index+1..len].reduce(:+)
    if left == right
      return index
    end
    index -= 1
  end
  nil
end

Test Cases

Over the course of developing the program, a number of test cases have come to mind. We’ll begin with the given array A that we started with.

def test_random
  sample_array = [-7, 1, 5, 2, -4, 3, 0]
  actual = equi(sample_array)
  assert_equal 3, actual
end

Next we define a trivial “pyramid” array of values that ascend and descend symmetrically.

def test_trivial_pyramid
  sample_array = [1, 2, 3, 4, 3, 2, 1]
  actual = equi(sample_array)
  assert_equal 3, actual
end

This test checks the case where the first value is equal to the sum of all the rest (save the equilibrium index, of course).

def test_biggest_first
  sample_array = [99, 0, 66, 32, 1]
  actual = equi(sample_array)
  assert_equal 1, actual
end

Similarly, we check the case where the last value alone is equal to the sum of everything to the left of the equilibrium index.

def test_biggest_last
  sample_array = [66, 32, 1, 0, 99]
  actual = equi(sample_array)
  assert_equal 3, actual
end

We should return nil for an array with a single element, since the equilibrium index is not defined in that case.

def test_single_element
  sample_array = [0]
  actual = equi(sample_array)
  assert_nil actual
end

The same is true of an array containing a single nil.

def test_single_nil
  sample_array = [nil]
  actual = equi(sample_array)
  assert_nil actual
end

The Complete Program Listing

Finally, we have the complete program listing for the tangled file literate-equi.rb. Since we’ve included our tests by subclassing Test::Unit::TestCase, Ruby will run them for us when we invoke the program. Running ruby literate-equi.rb at the command shell should produce output like the following:

~/Desktop/Dropbox/current/logicgrimoire $ ruby literate-equi.rb 
Loaded suite literate-equi
Started
......
Finished in 0.002195 seconds.

6 tests, 6 assertions, 0 failures, 0 errors

The program itself:

#!/usr/bin/env ruby

require 'test/unit'

def equi(a)
  index = a.length-1
  len   = a.length-1
  while index > 0
    left  = a[0..index-1].reduce(:+)
    right = a[index+1..len].reduce(:+)
    if left == right
      return index
    end
    index -= 1
  end
  nil
end


class TestEqui < Test::Unit::TestCase
  def test_random
    sample_array = [-7, 1, 5, 2, -4, 3, 0]
    actual = equi(sample_array)
    assert_equal 3, actual
  end

  def test_trivial_pyramid
    sample_array = [1, 2, 3, 4, 3, 2, 1]
    actual = equi(sample_array)
    assert_equal 3, actual
  end

  def test_biggest_first
    sample_array = [99, 0, 66, 32, 1]
    actual = equi(sample_array)
    assert_equal 1, actual
  end

  def test_biggest_last
    sample_array = [66, 32, 1, 0, 99]
    actual = equi(sample_array)
    assert_equal 3, actual
  end

  def test_single_element
    sample_array = [0]
    actual = equi(sample_array)
    assert_nil actual
  end

  def test_single_nil
    sample_array = [nil]
    actual = equi(sample_array)
    assert_nil actual
  end
end