Category Archives: Lisp

Three views on a loop

Recently I needed to translate a function I had written in Scheme to Common Lisp for … reasons. As it turns out, this was not entirely straightforward. The function was originally written in a very recursive style. In order to have something that would run portably across Common Lisp implementations I needed something that didn’t rely so heavily on recursion.

This was not easy for me. I have written much much more Scheme than Common Lisp in my life, and it feels quite natural to me now to design loops using recursion. So that bridge had to be crossed somehow. This is the story of how I crossed it.

The procedure in question is merge, which is used to implement the core of a merge sort.

Let’s start with the Scheme. The code below shows a straightforward implementation of merge from a Scheme programmer’s perspective (or at least this one’s). As mentioned above, this is the merging procedure that is used to implement the core of a merge sort. It’s very similar to the code presented in Bottom Up Merge Sort in Scheme.

(define (merge l r pred)
  (letrec ((merge-aux
            (lambda (left right pred result)
              (cond ((and (null? left)
                          (null? right))
                     (reverse result))
                    ((and (not (null? left))
                          (not (null? right)))
                     (if (pred (car left) (car right))
                         (merge-aux (cdr left)
                                    right pred (cons (car left) result))
                       (merge-aux left
                                  (cdr right) pred (cons (car right) result))))
                    ((not (null? left))
                     (merge-aux (cdr left) right pred (cons (car left) result)))
                    ((not (null? right))
                     (merge-aux left (cdr right) pred (cons (car right) result)))
                    (else #f)))))
    (merge-aux l r pred '())))

As you can see this code makes heavy use of recursive procedure calls, the default iteration construct in Scheme. As noted above, we can’t expect this sort of code to be supported in portable Common Lisp code, so it needs to be translated.

(Side note: If you are interested in some notes on which Lisp implementations support tail calls, recursive self-calls, etc., see this interesting article.)

Below is my first attempt to translate the Scheme code directly into Common Lisp. It still relies heavily on recursive function calls. If you compile this in a recent CMUCL or Clozure CL the recursion doesn’t seem to bother them, since their compilers support recursion well. And the performance will be pretty good, too. But that won’t be the case in every Lisp.

(Note that we call this function rml/merge because Common Lisp already has a built-in merge — about which more later.)

(defun rml/merge (l r pred)
  (labels ((merge-aux (pred left right result)
             (cond ((and (null left)
                         (null right))
                    (reverse result))
                   ((and (not (null left))
                         (not (null right)))
                    (if (funcall pred (car left) (car right))
                        (merge-aux pred (cdr left)
                                   right (cons (car left) result))
                        (merge-aux pred left
                                   (cdr right) (cons (car right) result))))
                   ((and (not (null left)) (null right))
                    (merge-aux pred (cdr left) right (cons (car left) result)))
                   ((and (null left) (not (null right)))
                    (merge-aux pred left (cdr right) (cons (car right) result)))
                   (t 'FAIL))))
    (merge-aux pred l r '())))

As we discussed, some Common Lisp implementations will not handle this level of recursion. One Lisp in particular was pretty salty about it. :-}

IMAGE

genera-stack-overflow

CLISP and ABCL will also explode pretty much immediately.

It took a bit of thinking, but I remembered reading about something called BLOCK in good old CLtL2. It lets you create a named block that you can later jump out of at your convenience, e.g.,

(block outer
   ;; ... do things
   ;; ok, i'm done!
   (return-from outer my-result))

This reminds me of a similar pattern I’ve used in Perl (or Java for that matter) for early returns.

LOOP: for my $elem (@list) {
    # do things ....
    next LOOP if check($elem);
    # do other things ...
}

Using my newfound weapon (hammer?) block, I was able to slay the recursion dragon as shown below in rml/merge2. This code is speaking Common Lisp, but it still has a heavy Scheme accent! The shape of the computation feels pretty similar, at least.

(defun rml/merge2 (left right pred)
  (let ((left left)
        (right right)
        (result '()))
    (block outer
      (loop do
           (block inner
             (cond ((and (null left) (null right))
                    (return-from outer))
                   ((and (not (null left))
                         (not (null right)))
                    (if (funcall pred (car left) (car right))
                        (progn
                          (push (car left) result)
                          (setf left (cdr left))
                          (return-from inner))
                        (progn
                          (push (car right) result)
                          (setf right (cdr right))
                          (return-from inner))))
                   ((not (null left))
                    (progn
                      (push (car left) result)
                      (setf left (cdr left))
                      (return-from inner)))
                   ((not (null right))
                    (progn
                      (push (car right) result)
                      (setf right (cdr right))
                      (return-from inner)))
                   (t                   ; Should never get here (tm)
                    (error "ERROR: RML/MERGE2: ~A ~A ~A~%" left right result))))))
    (nreverse result)))

At a high level, the code works as follows:

  1. let-bind some variables to the arguments passed in, so we can bash on them at will later on.
  2. Open up an outer block to capture the state of the loop, with a name we can use for an early return. This is what we will jump out to when we’re done.
  3. Start a big loop with an inner block. I’m using a magic number here (one billion) as a shorthand for “just loop forever” since we are going to jump out when we’re done anyway. (“One-billion-element-long lists should be enough for anyone.”)
  4. Inside the loop’s inner block, we model the iteration created by recursive function calls using jumps. In every step of the cond, we jump out to the inner block’s label just as we would in Scheme by making a procedure call, except here we need to be explicit using return-from, whereas Scheme lets us be implicit.
  5. Once left and right are both empty, our work is done, so we jump back out to the outer block. Then, we destructively sort the result list in place with nreverse, just because we can. (Hey, it was temporary anyway.)

This isn’t very satisfying, though. It felt icky using block, which feels like a building block you’d need to build a looping construct (aka compiler/macro output), but not something “user-level” code should need to use. So I did some more reading of Common Lisp code. In particular I discovered a couple of things:

  • A bare loop with no arguments will happily go on forever; you don’t need to say loop repeat 5 ....
  • Inside a loop, you don’t need to say do (something else) unless, prior to that point, you’ve used loop‘s “English” syntax, e.g., loop from 1 to 10 do ....
  • Because this isn’t Scheme, we don’t need to do the 2-step dance of “put car of A here, set A to cdr of A there”. We can push the return value of pop onto Adirectly, and rely on the fact that pop mutates its argument list to handle setting A to cdr of A. Mmmmm, yummy state-bashing.

Thanks to the above, we now have code for rml/merge3. It got a lot shorter and simpler along the way, while saying the same things:

(defun rml/merge3 (left right pred)
  (let ((result '()))
    (loop
       (cond ((and (null left) (null right))
              (return (nreverse result)))
             ((and (not (null left))
                   (not (null right)))
              (if (funcall pred (car left) (car right))
                  (push (pop left) result)
                  (push (pop right) result)))
             ((not (null left))
              (push (pop left) result))
             ((not (null right))
              (push (pop right) result))
             (t                         ; Should never get here (tm)
              (error "ERROR: RML/MERGE2: ~A ~A ~A~%" left right result))))))

Now that I had some code, I needed to make sure it actually gave the right answers (aka the hard part). Because Common Lisp has a merge function built in, I was able to write a bit of test code to check my merge function’s implementation against what the Lisp wizards did. There’s no feeling like standing on the shoulders of giants!

Here is the test code, which you can paste into your REPL and try for yourself. It runs our merge function against the system’s merge 1000 times on the same randomly generated input lists. For each test case, if the outputs match, we output a t; otherwise nil.

(defun make-random-list (size &key max)
  (loop repeat (random size) collecting (random max)))

(defun check-merge-output (function size)
  (let* ((left (make-random-list size :max size))
         (right (make-random-list size :max size))
         (left-copy (copy-list left))
         (right-copy (copy-list right))
         (expected (merge 'list left-copy right-copy #'<))
         (got (funcall function left right #'<)))
    (if (equalp expected got)
        t
        (values expected got))))

(defun run-merge-tests (function)
  (loop repeat 1000 collecting
       (if (check-merge-output function 1000) t nil)))

Here’s what the output of a successful run looks like (Luckily, I have not yet found any unsuccessful ones):

CL-USER> (run-merge-tests #'rml/merge3)
(T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T ...)

Based on the testing shown here, I think this implementation is Probably Fine ™. I’m happy with it, and it performs well enough for my needs.

In a future post, we’ll use the merge function we’ve shared here to write a merge sort.

Performance Postamble

But what about the performance? It’s not bad, actually! As usual, it can depend on the implementation. Below I’ll list out how it compares in a few implementations. However I should state right now for the record that:

  1. I’m not a performance or benchmarking expert. You should probably just stop reading here, because what follows is likely to be nonsense.
  2. My implementation only works on lists (not vectors, too) and does not accept a sort key, as the standard dictates.
  3. My implementation is not battle-hardened from living inside a Common Lisp implementation for decades (the most important point of all).

That said, what follows is what I saw when I ran the test code below on a few implementations:

(defun check-merge-performance (function)
  (time
   (let* ((len 1000000)
          (max 1000000)
          (xs (make-random-list len :max max))
          (ys (make-random-list len :max max)))
     (progn
       (funcall function xs ys #'<)
       (values)))))

(defun check-system-merge-performance ()
  (time
   (let* ((len 1000000)
          (max 1000000)
          (xs (make-random-list len :max max))
          (ys (make-random-list len :max max)))
     (progn
       (merge 'list xs ys #'<)
       (values)))))

Note that the following tests were all run on a several-years-old Windows laptop with spinning disks, slow RAM, and a bunch of other applications running. Did I mention I’m not a performance and benchmarking expert? :-}

Armed Bear Common Lisp

ABCL‘s merge implementation did not do so well against mine, but I think I must be missing something. For example, there’s probably a lot of subtlety around JVM performance on Windows I just have no idea about.

Mine ran in about 17 seconds.

CL-USER> (check-merge-performance #'rml/merge3)
17.452 seconds real time
4000684 cons cells
NIL

Unfortunately, theirs took about 5 minutes to do about the same (*) computation. I’m not sure why. It’s possible I’m making some kind of mistake here, but I don’t know what it is. I used the timing code below because at first I thought the ABCL merge check had just hung altogether.

CL-USER> (time
          (let* ((len 1000000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))

312.803 seconds real time
412497 cons cells
NIL

(*) It’s not exactly the same, because remember that CL’s merge works on lists or vectors, so there will be some dispatching.

Out of curiosity, I went back and tried a few smaller inputs. ABCL seems to do OK with 2 100,000-element lists of random numbers, as shown below. Further, if you look at the number of cons cells, the function does appear to be O(n), which you would expect to see. This makes me think the problem is simply that 2 lists of 1,000,000 elements in the JVM is just too big a load for my little laptop, somehow.

CL-USER> (time
          (let* ((len 100000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)               (values)))) 0.028 seconds real time 4025 cons cells NIL CL-USER> (time
          (let* ((len 10000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))
2.24 seconds real time
40113 cons cells
NIL

CLISP

CLISP‘s implementation is about twice as fast as mine, and uses about half as much memory, as is right and proper.

CL-USER> (check-merge-performance #'rml/merge3)
Real time: 9.634908 sec.
Run time: 9.578125 sec.
Space: 56000000 Bytes
GC: 16, GC time: 0.578125 sec.
; No value
CL-USER> (check-system-merge-performance)
Real time: 4.6764607 sec.
Run time: 4.671875 sec.
Space: 32000000 Bytes
GC: 6, GC time: 0.40625 sec.
; No value

Corman Lisp

Corman Lisp is a Common Lisp implementation for Windows. It’s fast and has really good Win32 integration. I enjoy the IDE a lot.

If you like Lisp, you may enjoy this interesting talk by its creator Roger Corman where he discusses its development.

The system merge is about twice as fast as mine. It seems to come down to my implementation creating a lot of garbage, which is expected given my naive use of push and pop when I should be doing more direct mutation and structure sharing using e.g. rplacd.

(check-merge-performance #'rml/merge3)
Total Execution time: 0.360289 seconds
Time spent garbage collecting: 0.093065 seconds

(check-system-merge-performance)
Total Execution time: 0.180082 seconds
Time spent garbage collecting: 0.0 seconds

LispWorks (32-bit Hobbyist Edition for Windows)

LispWorks is a proprietary implementation put out by a company in England. Not only is their merge faster, they’re using about half as much memory. All as expected.

P.S. I just bought the “Hobbyist” version of the software, and I can definitely recommend it. Fast compiler, and the IDE and class browser, debugger, etc., are lot of fun to use.

CL-USER 40 > (check-merge-performance #'rml/merge3)
Timing the evaluation of (LET* ((LEN 100000) (MAX 100000) (XS (MAKE-RANDOM-LIST LEN :MAX MAX)) (YS (MAKE-RANDOM-LIST LEN :MAX MAX))) (PROGN (FUNCALL FUNCTION XS YS (FUNCTION <)) (VALUES))) User time    =        0.062 System time  =        0.015 Elapsed time =        0.063 Allocation   = 2895872 bytes 0 Page faults CL-USER 41 > (check-system-merge-performance)
Timing the evaluation of (LET* ((LEN 100000) (MAX 100000) (XS (MAKE-RANDOM-LIST LEN :MAX MAX)) (YS (MAKE-RANDOM-LIST LEN :MAX MAX))) (PROGN (MERGE (QUOTE LIST) XS YS (FUNCTION <)) (VALUES)))

User time    =        0.015
System time  =        0.000
Elapsed time =        0.015
Allocation   = 1434380 bytes
0 Page faults

Anyway that’s all for now. As I said, you should take the “results” above with a grain of salt. Nonetheless it was interesting seeing how the different implementations behaved in an unscientific comparison.

Genera Notes, Part 1/N

I think I’d like to start sharing some screenshots and notes taken while playing with my local Open Genera installation.  In part for historical capture reasons, and in part because I think it’s fun.

I set up the system using the instructions in this Youtube video, which can be tl;dr’d as:

  • install an old-ass Ubuntu in a Virtualbox
  • configure NFS and other random things
  • SSH in with Putty and fire up an Open Genera instance via X Windows

The old-ass Ubuntu is allegedly necessary due to some behavior in newer X that breaks Open Genera, but I haven’t verified that yet, only read it.

I’m planning to write up the (Virtualbox on Windows) installation process shown in that video soon for my own future reference.  At that point I’ll probably write a script to automate it as well.

If you don’t use Windows, there is already this excellent tutorial: Running Open Genera 2.0 on Linux.  I’ve exchanged mail with the author of this piece, he seems like quite a nice guy in addition to being pretty knowledgeable.  Apparently Open Genera runs more robustly on a Compaq AlphaServerDS10L (or similar machine) as was originally intended, though it’s much slower than modern systems.

13 November 2018

Currently reading the Genera User’s Guide section entitled Getting Acquainted with Dynamic Windows (link is to the exact page of the PDF on Internet Archive!).  There is a list of bookmarks on the right that I’d like to revisit and finish reading.

To add a document to the Bookmarks pane in Document Examiner, either:

  • Visit it (so it’s added automatically)
  • From somewhere else in the interface, when you see a link (AKA a hoverable title for a document or section thereof), press Shift and then click with the left mouse button (also denoted as Sh-Mouse-M by the system – you need a three-button mouse to use this system properly)

Note: the excessive (?) whitespace in the screenshot below is due to the fact that we’re running at 1920×1080, which is my laptop’s default resolution but is probably (?) larger than any physical Lisp Machine monitor that ever existed.  Based on some pictures of actual monitors I’ve seen, I wonder if this environment would profit from running on a vertically oriented monitor as well.  Something to play with.

As I read various docs, I’ve been taking notes in Zmacs.  The Zmacs buffer shown in the next screenshot is actually getting written to the Linux machine’s (virtual) disk, and can thus be backed up, edited from other text editors, etc.  It’s all happening over NFS!  And as you have probably deduced from the window borders, this Genera window is being served over the X Windows system (specifically XMing running on Windows).

Here’s the Zmacs window after being expanded using the System menu (shown, which can be accessed at any time via Sh-Mouse-R):

In addition to the Genera window, there is the “cold load” window that is also displayed via X Windows while Open Genera is running.  And lo!  As I began writing this, Genera crashed by trying to display an ellipse (an image from the ZMail documentation, specifically Conceptual Zmail Architecture), which caused it to try to reference an array out of bounds (I don’t know why, yet).  Here’s the backtrace as shown in the cold load window (the Genera window with Document Examiner just beeped and froze – when that happens it’s time to look at the cold load window):

In the cold load window’s debugger I was able to ascertain the following keys’ meanings, at least on my laptop (confusingly, the keys here do not map to the same keys as in the Genera X window):

  • Shift-E means “eval (program)”, dropping you into Lisp
  • * (asterisk) means Abort.  It popped me back up out of the cold load stream and into Genera (the Document Examiner in particular).  My document window state was nuked, but I was able to click the bookmark to return to the section I was reading.  (Now to go back and see if I can get Document Examiner to crash again with a bad array subscript by viewing that page again!).

Note: the above crash(es) happened while I was simultaneously loading CLIM in the Listener, which seems to put a lot of load on the (smallish, Virtualbox’d) system I’m running on.  So it may have something to do with that.  Here’s what some of the output of loading CLIM looks like:

Oh, and another thing: Zmail can read GNU Emacs RMAIL files according to the docs!  I’m not 100% sure what “UNIX” mail format means in this context, but perhaps it means good old mbox?

Anyway I’ve got a lot left to explore on this system.  As it is I’ve been reading the documentation and browsing around in my off hours for the last week or two, and it feels like I’m just getting started.

Advent of Code, Day 3

In this post I’ll describe my solution for Day 3 of the Advent of Code.

Problem Description

Day 3: Perfectly Spherical Houses in a Vacuum

Santa is delivering presents to an infinite two-dimensional grid of houses.

He begins by delivering a present to the house at his starting location, and then an elf at the North Pole calls him via radio and tells him where to move next. Moves are always exactly one house to the north (‘^’), south (‘v’), east (‘>’), or west (‘<‘). After each move, he delivers another present to the house at his new location.

However, the elf back at the north pole has had a little too much eggnog, and so his directions are a little off, and Santa ends up visiting some houses more than once. How many houses receive at least one present?

For example:

‘>’ delivers presents to 2 houses: one at the starting location, and one to the east.

‘^>v<‘ delivers presents to 4 houses in a square, including twice to the house at his starting/ending location.

‘^v^v^v^v^v’ delivers a bunch of presents to some very lucky children at only 2 houses.

Solution

Broadly speaking, my solution consisted of:

  • Reading the directions file to determine the largest x and y values of the grid
  • Making a shaped array using those dimensions (specifically, we double the array dimensions to allow for movement up, down, forward, and back)
  • Starting in the center of the shaped array, follow the instructions from the “map” and mark every house (called a “position” in the code) if it hasn’t already been visited
  • Every time we visit a house we haven’t already visited, we bump a counter

Here’s the Scheme code that accomplishes those steps:

;; read in the string
;; sum the ^ and v chars to get the height of the matrix (graph)
;; sum the < and > chars to get the width of the matrix (graph)

(define (north? ch) (char=? ch #\^))
(define (south? ch) (char=? ch #\v))
(define (east? ch) (char=? ch #\>))
(define (west? ch) (char=? ch #\<))

(define (make-shape width height)
  ;; Int Int Int -> Shape
  (shape 0 width 0 height))

(define (read-shape-file file)
  ;; Pathname -> Shape
  (with-input-from-file file
    (lambda ()
      (let loop ((width 0)
         (height 0)
         (min-width  0)
         (max-width 0)
         (min-height 0)
         (max-height 0)
         (ch (read-char)))
    (if (eof-object? ch)
        (make-shape
         (* 2 (- max-width min-width))
         (* 2 (-  max-height min-height)))
        (cond ((north? ch)
           (loop width (+ height 1)
             min-width max-width
             (min min-height height)
             (max max-height height)
             (read-char)))
          ((south? ch)
           (loop width (- height 1)
             min-width max-width
             (min min-height height)
             (max max-height height)
             (read-char)))
          ((east? ch)
           (loop (+ width 1) height
             (min min-width width)
             (max max-width width)
             min-height max-height
             (read-char)))
          ((west? ch)
           (loop (- width 1) height
             (min min-width width)
             (max max-width width)
             min-height max-height
             (read-char)))
          (else (error "WHOA"))))))))

;; We make a shaped, multi-dimensional array (SRFI-25) in the size
;; it's determined we need by our earlier check.

(define (make-grid shape)
  ;; Shape -> Array
  (make-array shape #f))

;; The POSITION data type

(define-record-type position
  (make-position x y)
  position?
  (x position-x set-position-x!)
  (y position-y set-position-y!))

(define (array-center arr)
  ;; Array -> Position
  (let ((len-x (array-length arr 0))
    (len-y (array-length arr 1)))
    (make-position (/ len-x 2)
           (/ len-y 2))))

(define (make-relative-position x y ch)
  ;; Int Int Char -> Position
  (let ((vals '()))
    (cond ((north? ch)
       (set! vals (list (+ x 1) y)))
      ((south? ch)
       (set! vals (list (- x 1) y)))
      ((east? ch)
       (set! vals (list x (- y 1))))
      ((west? ch)
       (set! vals (list x (+ y 1))))
      (else (set! vals (list x y))))
    (make-position (first vals)
           (second vals))))

(define (visited? arr x y)
  ;; Array Int Int -> Bool
  (array-ref arr x y))

(define (set-visited! arr x y)
  ;; Array Int Int -> Undefined
  (array-set! arr x y #t))

(define (visit-locations arr file)
  ;; Pathname -> Int
  (let ((visited-count 0))
    (with-input-from-file file
      (lambda ()
    (let loop ((ch (read-char))
           (current-position (array-center arr)))
      (if (eof-object? ch)
          visited-count
          (begin
        (let* ((current-x (position-x current-position))
               (current-y (position-y current-position))
               (next-position
            (make-relative-position current-x current-y ch)))
          (if (not (visited? arr current-x current-y))
              (begin
            (set! visited-count (+ visited-count 1))
            (set-visited! arr current-x current-y)
            (loop (read-char)
                  next-position))
              (loop (read-char) next-position))))))))
    visited-count))

;; eof

Once this code is loaded up in the REPL, you can use it as shown below. (Note that the answer shown at the end isn’t real to avoid a spoiler.)

(set! *the-array* (make-grid (read-shape-file (expand-file-name "~/Code/personal/advent-of-code/03.dat"))))
'#{Array:srfi-9-record-type-descriptor}

> (array-size *the-array*)
42588

> (array-center *the-array*)
'#{Position}

> (define *the-file* (expand-file-name "~/Code/personal/advent-of-code/03.dat"))

> (visit-locations *the-array* *the-file*)
12345

Related Posts

Advent of Code, Day 2

This post describes my solution for Day 2 of the Advent of Code.

Problem Description

First, the problem description (copied from the website):


Day 2: I Was Told There Would Be No Math

The elves are running low on wrapping paper, and so they need to submit an order for more. They have a list of the dimensions (length l, width w, and height h) of each present, and only want to order exactly as much as they need.

Fortunately, every present is a box (a perfect right rectangular prism), which makes calculating the required wrapping paper for each gift a little easier: find the surface area of the box, which is 2 x l x w + 2 x w x h + 2 x h x l. The elves also need a little extra paper for each present: the area of the smallest side.

For example:

  • A present with dimensions 2x3x4 requires 2 x 6 + 2 x 12 + 2 x 8 = 52 square feet of wrapping paper plus 6 square feet of slack, for a total of 58 square feet.
  • A present with dimensions 1x1x10 requires 2 x 1 + 2 x 10 + 2 x 10 = 42 square feet of wrapping paper plus 1 square foot of slack, for a total of 43 square feet.

All numbers in the elves’ list are in feet. How many total square feet of wrapping paper should they order?


Solution

Once again, we’ll be working in Scheme.

For this problem, I decided to create a “box” data type. In addition to the automatically generated accessors (thanks SRFI-9!), I wrote several procedures to perform calculations on boxes, namely:

  • SURFACE-AREA: Calculate the box’s surface area.
  • SMALLEST-SIDE: Determine which of the box’s sides has the smallest surface area (the extra material makes it easier to wrap).

WRAPPING-PAPER is just a “wrapper” (pun intended) around the first two.

LINE->BOX, READ-BOXES, and SUM-BOXES are all about parsing the input file contents and shuffling them into the box data type that we use to do the actual calculation. The only part that required a bit of thought was the line with STRING-TOKENIZE in LINE->BOX. In Perl I’d use my @params = split /x/, $line without even thinking, but I was less familiar with Scheme’s facility for solving this problem, so it took a few minutes to puzzle out the right part of Scheme’s “API”. (STRING-TOKENIZE was helpfully provided by SRFI-13.)

Abstract data types FTW! I’ll be using them more as the month’s challenges progress.

;; ,open srfi-9 srfi-13 sort

(define-record-type box
  (make-box l w h)
  box?
  (l box-length set-box-length!)
  (w box-width set-box-width!)
  (h box-height set-box-height!))

(define (surface-area box)
  ;; Box -> Int
  (let ((l (box-length box))
    (w (box-width box))
    (h (box-height box)))
    (+ (* 2 l w)
       (* 2 w h)
       (* 2 h l))))

(define (smallest-side box)
  ;; Box -> Int
  (define (smallest-two xs)
    ;; List -> List
    (let ((sorted (sort-list xs <)))
      (list (first sorted)
        (second sorted))))
  (let ((l (box-length box))
    (w (box-width box))
    (h (box-height box)))
    (apply * (smallest-two (list l w h)))))

(define (wrapping-paper box)
  ;; Box -> Int
  (let ((minimum (surface-area box))
    (extra (smallest-side box)))
    (+ minimum extra)))

(define (line->box line)
  ;; String -> Box
  (define (line->lon s)
    ;; String -> List<Number>
    (let ((xs (string-tokenize s (char-set-complement (char-set #\x)))))
      (map string->number xs)))
  (let* ((dims (line->lon line)))
    (let ((l (first dims))
      (w (second dims))
      (h (third dims)))
      (make-box l w h))))

(define (read-boxes file)
  ;; Pathname -> List<Box>
  (with-input-from-file file
    (lambda ()
      (let loop ((line (read-line))
         (ys '()))
    (if (eof-object? line)
        ys
        (loop
         (read-line)
         (cons (line->box line) ys)))))))

(define (sum-boxes boxes)
  ;; List<Box> -> Int
  (let ((xs (map wrapping-paper boxes)))
    (apply + xs)))

;; eof

Related Posts

Advent of Code, Day 1

origami-dragon

Advent of Code is a site that provides a programming problem for every day in December leading up to Christmas.

I’ve become a little obsessed with it over the last few days, and thought I’d write up my results. So far I’ve been working in Scheme.

Here’s the Day 1 problem description:


Day 1: Not Quite Lisp

Santa was hoping for a white Christmas, but his weather machine’s “snow” function is powered by stars, and he’s fresh out! To save Christmas, he needs you to collect fifty stars by December 25th.

Collect stars by helping Santa solve puzzles. Two puzzles will be made available on each day in the advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!

Here’s an easy puzzle to warm you up.

Santa is trying to deliver presents in a large apartment building, but he can’t find the right floor – the directions he got are a little confusing. He starts on the ground floor (floor 0) and then follows the instructions one character at a time.

An opening parenthesis, (, means he should go up one floor, and a closing parenthesis, ), means he should go down one floor.

The apartment building is very tall, and the basement is very deep; he will never find the top or bottom floors.

For example:

  • (()) and ()() both result in floor 0.
  • ((( and (()(()( both result in floor 3.
  • ))((((( also results in floor 3.
  • ()) and ))( both result in floor -1 (the first basement level).
  • ))) and )())()) both result in floor -3.

To what floor do the instructions take Santa?


The file of instructions looks something like this (but much larger):

()()((((()(((())(()(()((((((()(()(((())))((()(((()))((())(()((()

Here’s the (reasonably straight-forward) Scheme code. It basically just iterates through the input file, bumping a counter up or down based on the type of paren read from the input port. The definitions of UP-FLOOR? and DOWN-FLOOR? weren’t really necessary, but they made the main procedure a little easier to read.

(define (up-floor? x)
  (char=? x #\())

(define (down-floor? x)
  (char=? x #\)))

(define (find-floor file)
  (with-input-from-file file
    (lambda ()
      (let loop ((floor-number 0)
         (char (read-char)))
    (if (eof-object? char)
        floor-number
        (loop (cond ((up-floor? char)
             (+ floor-number 1))
            ((down-floor? char)
             (- floor-number 1))
            (else floor-number))
          (read-char)))))))

(Image courtesy Strongpaper under a Creative Commons license.)

A Solution to the Five Weekends Problem in Common Lisp

neuschwanstein

I’ve been enjoying playing around with Rosetta Code problems lately. Here’s a solution I posted last week for the Five Weekends problem in Common Lisp:

  ;;; http://rosettacode.org/wiki/Five_weekends

  ;; Given a date, get the day of the week.  Adapted from
  ;; http://lispcookbook.github.io/cl-cookbook/dates_and_times.html

  (defun day-of-week (day month year)
    (nth-value
     6
     (decode-universal-time
          (encode-universal-time 0 0 0 day month year 0)
          0)))

  (defparameter *long-months* '(1 3 5 7 8 10 12))

  (defun sundayp (day month year)
    (= (day-of-week day month year) 6))

  (defun ends-on-sunday-p (month year)
    (sundayp 31 month year))

  ;; We use the "long month that ends on Sunday" rule.
  (defun has-five-weekends-p (month year)
    (and (member month *long-months*)
             (ends-on-sunday-p month year)))

  ;; For the extra credit problem.
  (defun has-at-least-one-five-weekend-month-p (year)
    (let ((flag nil))
          (loop for month in *long-months* do
                   (if (has-five-weekends-p month year)
                           (setf flag t)))
          flag))

  (defun solve-it ()
    (let ((good-months '())
                  (bad-years 0))
          (loop for year from 1900 to 2100 do
             ;; First form in the PROGN is for the extra credit.
                   (progn (unless (has-at-least-one-five-weekend-month-p year)
                                    (incf bad-years))
                                  (loop for month in *long-months* do
                                           (when (has-five-weekends-p month year)
                                             (push (list month year) good-months)))))
          (let ((len (length good-months)))
            (format t "~A months have five weekends.~%" len)
            (format t "First 5 months: ~A~%" (subseq good-months (- len 5) len))
            (format t "Last 5 months: ~A~%" (subseq good-months 0 5))
            (format t "Years without a five-weekend month: ~A~%" bad-years))))

(Image copyright gacabo under Creative Commons license.)

Reasons to Recommend Common Lisp

On Reddit, /u/rhabarba asks: Should I learn (Common) Lisp? Here’s my reply:

I am a technical writer and “hobbyist” programmer who does some unofficial programming at work to automate documentation things. To give you a sense of my language background, I enjoy programming in Perl, Python, Scheme, and Common Lisp.

Reasons to recommend Common Lisp include:

  • In many ways, it’s a superset of the other languages I mentioned: you can “script” with regular expression and filesystem support, and you can also write larger programs using built-in OO that’s still more advanced than the built-in Perl or Python stuff.
  • Multi-paradigm: you can write code in whatever style you want to solve your problem. You only use OO if you want to, you’re not forced by the language. Functional style is there for you, especially via libraries. You can also be imperative and bash as much state as you want. Almost any type of code you might want to write can be (probably has been?) written in Common Lisp. There’s a lot of room to grow as you learn more.
  • Dynamic, interactive, and inspectable: redefine almost anything on the fly at the REPL, watch those changes get picked up by the rest of your system as it runs. This is true for running web servers, whatever. The debugger and inspector available via SLIME in Emacs is also better than anything else I’ve used in other languages.
  • Multiple good implementations to choose from; there are several fast “native” compilers, several portable interpreter-based implementations. If you write portable code you can run on many different implementations, so you can mix and match depending on what’s easiest to install or use for your needs.
  • Quicklisp lets you install any of 1200* libraries, very easily. You’ve got all the basics: HTTP clients and servers, JSON, XML, and Markdown parsers, and lots more advanced stuff too.
  • 25 year old code that does advanced, weird, or just cool things still works fine. It just runs way faster now.
  • Good books: ‘Paradigms of Artificial Intelligence Programming’ by Peter Norvig, ‘Object-Oriented Programming in Common Lisp’ by Sonya Keene, and many others.

Finally, I have been able to write programs to do “harder” things in Lisp than I have in other languages, even though I’m not really a programmer or “engineer” or whatever title you prefer. I think the simple list data structure is a great way to bootstrap a prototype of your program, and it encourages data-structure thinking rather than “stringy” thinking as some languages do. That said, I use a lot of languages to get my work done, so it’s not a religion or anything.

Good luck!

Make your own iterators in Scheme using closures

idle-machines

Introduction

I recently bought a copy of Higher Order Perl (also known as HOP) by Mark Dominus. In chapter 4, he introduces iterators. At a high level, an iterator is a black box with an input funnel, an output funnel, and a big red “NEXT” button. You put something into the funnel at time T; then, at some later time T+n, you press the “NEXT” button and something useful comes out.

Iterators are useful in cases when you want to access a potentially huge list of things one at a time, but you don’t have access to a computer with infinite memory.

In this post we’ll look at a moderately useful iterator example; a directory walker that recurses down a directory tree on your file system. The implementation is given in Scheme (specifically, the scsh dialect); the code and the descriptions thereof are heavily inspired by Dominus’ excellent book. However, this post contains the work of an amateur. Any errors or omissions in the following code and explanations are mine.

Iterators are made with closures

In order to build a useful iterator, we need to have a basic understanding of closures. A closure is just a function packaged up with an environment. The environment allows us to store up state between function calls. As noted above, it’s easiest just to think of it as a little black box that we can pass around in our program.

In Scheme and Common Lisp, you build functions with LAMBDA. Here’s a very simple closure example in Scheme, translated from the first example given in HOP:

(define (upto m n)
  (set! m (- m 1)) ;needed since we don't have post-increment ($i++)
  (lambda ()
    (if (< m n)
        (begin (set! m (+ m 1)) m)
        #f)))

This is a function that returns a function (a closure). To use the closure returned by UPTO, assign it to a variable and call it whenever you want the next value. When the iterator is exhausted, it will return #f.

> (define 1to10 (upto 1 10))
> (1to10)
1
> (1to10)
2
...
> (1to10)
9
> (1to10)
10
> (1to10)
#f

A trivial file tree iterator

A more interesting and practical use of this technique is to walk a directory tree on your computer. In the Scheme procedure below, we take a list of directories and then build and return a closure that, when called, will walk the directory tree breadth-first, printing out the names of files as we go.

(define (dir-walk queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files)))
                          file))
                       ((file-regular? file) file)
                       (else #f)))))))

The important part to notice is the inner LAMBDA. This is where we create a closure by packaging up a procedure with an environment. The closure’s environment remembers the contents of the variable QUEUE in memory, so we can ask for the elements of QUEUE at our leisure.

Here it is running on my machine:

> (define dir-iter1 (dir-walk '("/Users/rloveland/Desktop/current/")))
> (dir-iter1)
"/Users/rloveland/Desktop/current/XProlog.jar"
> (dir-iter1)
"/Users/rloveland/Desktop/current/all-jira-issues.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/automate-campaign-setup.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/buflocal.scm"
> (dir-iter1)
"/Users/rloveland/Desktop/current/bundle.js"

Just for fun, here’s a version of the same function in Common Lisp. It’s essentially the same as the scsh version, save that it uses the nice CL pathname machinery instead of appending raw strings, and also makes use of the CL-FAD convenience library.

(defun dir-walk (queue)
  (let ((file nil))
    (lambda ()
      (if (not (null queue))
          (progn
            (setf file (car queue))
            (setf queue (cdr queue))
            (cond ((cl-fad:directory-pathname-p file)
                   (let ((new-files (cl-fad:list-directory file)))
                     (setf queue (append queue (mapcar #'(lambda (filename)
                                                           (merge-pathnames filename file))
                                                       new-files)))
                     file))
                  ((cl-fad:file-exists-p file) file)
                  (t nil)))))))

Usage is also essentially the same:

CL-USER> (defvar *tree-walker* (dir-walk '(#P"/Users/rloveland/Desktop/current/")))
*TREE-WALKER*
CL-USER> (type-of *tree-walker*)
FUNCTION
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Desktop/current/"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.debuggerDefaults"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.DS_Store"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hi"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hs"

A slightly less trivial file tree iterator

Listing out files and directories is nice, but not that useful. We’d like a way to see only those files and directories that have some interesting property.

This can be accomplished by passing in another argument to our DIR-WALK function: this argument will be yet another function that will test the current file to see whether it’s interesting to us. It’s pretty easy to change DIR-WALK to accept a function argument, INTERESTING?. This arbitrary function is used to check the file to see if we care about it.

This time around, when we build our internal queue, we use a call to FILTER to make sure that only interesting files get added.

(define (dir-walk* interesting? queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond
                  ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (filter interesting? (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files))))
                          (if (interesting? file) file)))
                  ((interesting? file) file)
                  (else #f)))
          #f))))

And here it is in use; in this example, we pass an INTERESTING? function that asks for only files that are not marked as executable:

> (define dir-iter2 (dir-walk* (lambda (f) (file-not-executable? f)) '("/home/rml/Desktop/current")))
> (dir-iter2)
> (dir-iter2)
"/home/rml/Desktop/current/A.hi"
> (dir-iter2)
"/home/rml/Desktop/current/A.hs"
> (dir-iter2)
"/home/rml/Desktop/current/A.o"
> (dir-iter2)
"/home/rml/Desktop/current/ABBREV.xlsx"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.class"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.java"

Conclusion

There are still bugs and unexpected behavior in DIR-WALK*. For example, on the first call to the resulting iterator there is no result due to the one-armed IF. There are also strange things that happen if we want to filter out directories from our results, since we mix together the FILE-DIRECTORY? check and the INTERESTING? check inside the iterator. However, despite these small nits, it can still do useful work on my machine 1, and it’s a good enough example of using closures to build iterators.

Here’s hoping you’ll enjoy playing with closures and iterators in your own code!

(Image courtesy zeitfanger.at under a Creative Commons License.)

Footnotes:

1 In the grandest internet tradition, “It works on my machine” ™.

Simple Network Search in Scheme

../img/winston-horn-network.png

I’ve been having fun translating some of the code in Winston and Horn’s Lisp into Scheme. This book is amazing – clearly written, with lots of motivating examples and applications. As SICP is to language implementation, Lisp is to application development, with chapters covering constraint propagation, forward and backward chaining, simulation, object-oriented programming, and so on. And it does include the obligatory Lisp interpreter in one chapter, if you’re into that sort of thing.

In this installment, based on Chapter 19, we will look at some simple strategies for searching for a path between two nodes on a network (a graph). The network we’ll be using is shown in the diagram above.

Here’s the same network, represented as an alist where each CAR:CDR pair represents a NODE:NEIGHBORS relationship:

'((f e)
  (e b d f)
  (d s a e)
  (c b)
  (b a c e)
  (a s b d)
  (s a d))

The high-level strategy the authors use is to traverse the network, building up a list of partial paths. If a partial path ever reaches the point where it describes a full path between the two network nodes we’re after, we’ve been successful.

As with trees, we can do either a breadth-first or depth-first traversal. Here’s what the intermediate partial paths will look like for a breadth-first traversal that builds a path between nodes S and F:

(s)
(s a)
(s d)
(s a b)
(s a d)
(s d a)
(s d e)
(s a b c)
(s a b e)
(s a d e)
(s d a b)
(s d e b)
'(s d e f)

Based on that output, we can deduce that every time we visit a node, we want to extend our partial paths list with that node. Here’s one option – its only problem is that it will happily build circular paths that keep us from ever finding the node we want:

(define (%buggy-extend path)             ;builds circular paths
  (map (lambda (new-node) 
         (cons new-node path))
       (%get-neighbor (first path))))

(Incidentally, I’ve become fond of the convention whereby internal procedures that aren’t part of a public-facing API are prefixed with the `%’ character. This can be found in some parts of the MIT Scheme sources, and I believe it’s used in Racket as well. I’ve started writing lots of my procedures using this notation to remind me that the code I’m writing is not the real `API’, that the design will need more work, and that the current code is just a first draft. I’m using that convention here.)

Here’s a better version that checks if we’ve already visited the node before adding it to the partial paths list – as a debugging aid it prints out the current path before extending it:

(define (%extend path)
  (display (reverse path))
  (newline)
  (map (lambda (new-node)
         (cons new-node path))
       (filter (lambda (neighbor)
                 (not (member neighbor path)))
               (%get-neighbor (first path)))))

You may have noticed the %GET-NEIGHBOR procedure; it’s just part of some silly data structure bookkeeping code. Please feel free to deride me in the comments for my use of a global variable. What can I say? I’m Scheming like it’s 1988 over here! Here’s the boilerplate:

(define *neighbors* '())

(define (%add-neighbor! k v)
  (let ((new-neighbor (cons k v)))
    (set! *neighbors*
          (cons new-neighbor *neighbors*))))

(define (%get-neighbor k)
  (let ((val (assoc k *neighbors*)))
    (if val
        (cdr val)
        '())))

(%add-neighbor! 's '(a d))
(%add-neighbor! 'a '(s b d))
(%add-neighbor! 'b '(a c e))
(%add-neighbor! 'c '(b))
(%add-neighbor! 'd '(s a e))
(%add-neighbor! 'e '(b d f))
(%add-neighbor! 'f '(e))

Now that we have our data structure and a way to extend our partial path list (non-circularly), we can write the main search procedure, %BREADTH-FIRST. The authors have a lovely way of explaining its operation:

BREADTH-FIRST is said to do a “breadth-first” search because it extends all partial paths out to uniform length before extending any to a greater length.

Here’s the code, translated to use a more Schemely, iterative named LET instead of the linear-recursive definition from the book 1:

(define (%breadth-first start finish network)
  (let ((queue (list (list start))))
    (let loop ((start start)
               (finish finish)
               (network network)
               (queue queue))
      (cond ((null? queue) '())                    ;Queue empty?
            ((equal? finish (first (first queue))) ;Finish found?
             (reverse (first queue)))              ;Return path.
            (else
             (loop start
                   finish               ;Try again.
                   network
                   (append
                    (rest queue)
                    (extend (first queue))))))))) ;New paths in front.

(A better way to write this procedure would be to implement a generic internal search procedure that takes its `breadthiness’ or `depthiness’ as a parameter. We could then wrap it with nicer public-facing search procedures specific names.)

Meanwhile, back at the REPL… we remind ourselves of what *neighbors* actually looks like, and then we search for a path between the nodes S and F.

> *neighbors*
'((f e) (e b d f) (d s a e) (c b) (b a c e) (a s b d) (s a d))
> (%breadth-first 's 'f *neighbors*)
(s)
(s a)
(s d)
(s a b)
(s a d)
(s d a)
(s d e)
(s a b c)
(s a b e)
(s a d e)
(s d a b)
(s d e b)
'(s d e f)

What fun! I can almost imagine using a three-dimensional variant of these searches for a space wargame with wormhole travel! Except, you know, they’d need to be much faster and more skillfully implemented. There’s also the tiny requirement to write the surrounding game…

Footnotes:

1 It shouldn’t need to be said, but: Of course the authors knew better; they were trying to hide that unnecessary complexity from the reader until later.

Edwin `dired-do-shell-command’ on files

Evaluate this code in your Edwin REPL and you’re one step closer to being able to use Edwin as your primary file manager. I’ve reimplemented the Emacs `dired-do-shell-command’ function as an Edwin command (note that it puts these definitions in the `edwin dired’ environment’):

(ge '(edwin dired))

(define (string-join xs)
  (list->string (concatenate
    (map (lambda (s) (string->list s)) xs))))

(define (shell-command-prompt prompt)
  (prompt-for-string prompt #f
                            'DEFAULT-TYPE 'INSERTED-DEFAULT
                            'HISTORY 'SHELL-COMMAND))

(define (pathname->string pathname)
  (uri->string (pathname->uri pathname)))

(define-command dired-do-shell-command
  "Run a shell command on the file or directory at the current point."
  (lambda ()
    (list (shell-command-prompt "Shell command on file: ")
    (command-argument)))
  (lambda (command pathname)
    ((ref-command shell-command)
     (string-join
      (list command " "
            (pathname->string (dired-current-pathname)) " &")) #f)))

(define-key 'dired #\! 'dired-do-shell-command)