Three views on a loop

Recently I needed to translate a function I had written in Scheme to Common Lisp for … reasons. As it turns out, this was not entirely straightforward. The function was originally written in a very recursive style. In order to have something that would run portably across Common Lisp implementations I needed something that didn’t rely so heavily on recursion.

This was not easy for me. I have written much much more Scheme than Common Lisp in my life, and it feels quite natural to me now to design loops using recursion. So that bridge had to be crossed somehow. This is the story of how I crossed it.

The procedure in question is merge, which is used to implement the core of a merge sort.

Let’s start with the Scheme. The code below shows a straightforward implementation of merge from a Scheme programmer’s perspective (or at least this one’s). As mentioned above, this is the merging procedure that is used to implement the core of a merge sort. It’s very similar to the code presented in Bottom Up Merge Sort in Scheme.

(define (merge l r pred)
  (letrec ((merge-aux
            (lambda (left right pred result)
              (cond ((and (null? left)
                          (null? right))
                     (reverse result))
                    ((and (not (null? left))
                          (not (null? right)))
                     (if (pred (car left) (car right))
                         (merge-aux (cdr left)
                                    right pred (cons (car left) result))
                       (merge-aux left
                                  (cdr right) pred (cons (car right) result))))
                    ((not (null? left))
                     (merge-aux (cdr left) right pred (cons (car left) result)))
                    ((not (null? right))
                     (merge-aux left (cdr right) pred (cons (car right) result)))
                    (else #f)))))
    (merge-aux l r pred '())))

As you can see this code makes heavy use of recursive procedure calls, the default iteration construct in Scheme. As noted above, we can’t expect this sort of code to be supported in portable Common Lisp code, so it needs to be translated.

(Side note: If you are interested in some notes on which Lisp implementations support tail calls, recursive self-calls, etc., see this interesting article.)

Below is my first attempt to translate the Scheme code directly into Common Lisp. It still relies heavily on recursive function calls. If you compile this in a recent CMUCL or Clozure CL the recursion doesn’t seem to bother them, since their compilers support recursion well. And the performance will be pretty good, too. But that won’t be the case in every Lisp.

(Note that we call this function rml/merge because Common Lisp already has a built-in merge — about which more later.)

(defun rml/merge (l r pred)
  (labels ((merge-aux (pred left right result)
             (cond ((and (null left)
                         (null right))
                    (reverse result))
                   ((and (not (null left))
                         (not (null right)))
                    (if (funcall pred (car left) (car right))
                        (merge-aux pred (cdr left)
                                   right (cons (car left) result))
                        (merge-aux pred left
                                   (cdr right) (cons (car right) result))))
                   ((and (not (null left)) (null right))
                    (merge-aux pred (cdr left) right (cons (car left) result)))
                   ((and (null left) (not (null right)))
                    (merge-aux pred left (cdr right) (cons (car right) result)))
                   (t 'FAIL))))
    (merge-aux pred l r '())))

As we discussed, some Common Lisp implementations will not handle this level of recursion. One Lisp in particular was pretty salty about it. :-}

IMAGE

genera-stack-overflow

CLISP and ABCL will also explode pretty much immediately.

It took a bit of thinking, but I remembered reading about something called BLOCK in good old CLtL2. It lets you create a named block that you can later jump out of at your convenience, e.g.,

(block outer
   ;; ... do things
   ;; ok, i'm done!
   (return-from outer my-result))

This reminds me of a similar pattern I’ve used in Perl (or Java for that matter) for early returns.

LOOP: for my $elem (@list) {
    # do things ....
    next LOOP if check($elem);
    # do other things ...
}

Using my newfound weapon (hammer?) block, I was able to slay the recursion dragon as shown below in rml/merge2. This code is speaking Common Lisp, but it still has a heavy Scheme accent! The shape of the computation feels pretty similar, at least.

(defun rml/merge2 (left right pred)
  (let ((left left)
        (right right)
        (result '()))
    (block outer
      (loop do
           (block inner
             (cond ((and (null left) (null right))
                    (return-from outer))
                   ((and (not (null left))
                         (not (null right)))
                    (if (funcall pred (car left) (car right))
                        (progn
                          (push (car left) result)
                          (setf left (cdr left))
                          (return-from inner))
                        (progn
                          (push (car right) result)
                          (setf right (cdr right))
                          (return-from inner))))
                   ((not (null left))
                    (progn
                      (push (car left) result)
                      (setf left (cdr left))
                      (return-from inner)))
                   ((not (null right))
                    (progn
                      (push (car right) result)
                      (setf right (cdr right))
                      (return-from inner)))
                   (t                   ; Should never get here (tm)
                    (error "ERROR: RML/MERGE2: ~A ~A ~A~%" left right result))))))
    (nreverse result)))

At a high level, the code works as follows:

  1. let-bind some variables to the arguments passed in, so we can bash on them at will later on.
  2. Open up an outer block to capture the state of the loop, with a name we can use for an early return. This is what we will jump out to when we’re done.
  3. Start a big loop with an inner block. I’m using a magic number here (one billion) as a shorthand for “just loop forever” since we are going to jump out when we’re done anyway. (“One-billion-element-long lists should be enough for anyone.”)
  4. Inside the loop’s inner block, we model the iteration created by recursive function calls using jumps. In every step of the cond, we jump out to the inner block’s label just as we would in Scheme by making a procedure call, except here we need to be explicit using return-from, whereas Scheme lets us be implicit.
  5. Once left and right are both empty, our work is done, so we jump back out to the outer block. Then, we destructively sort the result list in place with nreverse, just because we can. (Hey, it was temporary anyway.)

This isn’t very satisfying, though. It felt icky using block, which feels like a building block you’d need to build a looping construct (aka compiler/macro output), but not something “user-level” code should need to use. So I did some more reading of Common Lisp code. In particular I discovered a couple of things:

  • A bare loop with no arguments will happily go on forever; you don’t need to say loop repeat 5 ....
  • Inside a loop, you don’t need to say do (something else) unless, prior to that point, you’ve used loop‘s “English” syntax, e.g., loop from 1 to 10 do ....
  • Because this isn’t Scheme, we don’t need to do the 2-step dance of “put car of A here, set A to cdr of A there”. We can push the return value of pop onto Adirectly, and rely on the fact that pop mutates its argument list to handle setting A to cdr of A. Mmmmm, yummy state-bashing.

Thanks to the above, we now have code for rml/merge3. It got a lot shorter and simpler along the way, while saying the same things:

(defun rml/merge3 (left right pred)
  (let ((result '()))
    (loop
       (cond ((and (null left) (null right))
              (return (nreverse result)))
             ((and (not (null left))
                   (not (null right)))
              (if (funcall pred (car left) (car right))
                  (push (pop left) result)
                  (push (pop right) result)))
             ((not (null left))
              (push (pop left) result))
             ((not (null right))
              (push (pop right) result))
             (t                         ; Should never get here (tm)
              (error "ERROR: RML/MERGE2: ~A ~A ~A~%" left right result))))))

Now that I had some code, I needed to make sure it actually gave the right answers (aka the hard part). Because Common Lisp has a merge function built in, I was able to write a bit of test code to check my merge function’s implementation against what the Lisp wizards did. There’s no feeling like standing on the shoulders of giants!

Here is the test code, which you can paste into your REPL and try for yourself. It runs our merge function against the system’s merge 1000 times on the same randomly generated input lists. For each test case, if the outputs match, we output a t; otherwise nil.

(defun make-random-list (size &key max)
  (loop repeat (random size) collecting (random max)))

(defun check-merge-output (function size)
  (let* ((left (make-random-list size :max size))
         (right (make-random-list size :max size))
         (left-copy (copy-list left))
         (right-copy (copy-list right))
         (expected (merge 'list left-copy right-copy #'<))
         (got (funcall function left right #'<)))
    (if (equalp expected got)
        t
        (values expected got))))

(defun run-merge-tests (function)
  (loop repeat 1000 collecting
       (if (check-merge-output function 1000) t nil)))

Here’s what the output of a successful run looks like (Luckily, I have not yet found any unsuccessful ones):

CL-USER> (run-merge-tests #'rml/merge3)
(T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T ...)

Based on the testing shown here, I think this implementation is Probably Fine ™. I’m happy with it, and it performs well enough for my needs.

In a future post, we’ll use the merge function we’ve shared here to write a merge sort.

Performance Postamble

But what about the performance? It’s not bad, actually! As usual, it can depend on the implementation. Below I’ll list out how it compares in a few implementations. However I should state right now for the record that:

  1. I’m not a performance or benchmarking expert. You should probably just stop reading here, because what follows is likely to be nonsense.
  2. My implementation only works on lists (not vectors, too) and does not accept a sort key, as the standard dictates.
  3. My implementation is not battle-hardened from living inside a Common Lisp implementation for decades (the most important point of all).

That said, what follows is what I saw when I ran the test code below on a few implementations:

(defun check-merge-performance (function)
  (time
   (let* ((len 1000000)
          (max 1000000)
          (xs (make-random-list len :max max))
          (ys (make-random-list len :max max)))
     (progn
       (funcall function xs ys #'<)
       (values)))))

(defun check-system-merge-performance ()
  (time
   (let* ((len 1000000)
          (max 1000000)
          (xs (make-random-list len :max max))
          (ys (make-random-list len :max max)))
     (progn
       (merge 'list xs ys #'<)
       (values)))))

Note that the following tests were all run on a several-years-old Windows laptop with spinning disks, slow RAM, and a bunch of other applications running. Did I mention I’m not a performance and benchmarking expert? :-}

Armed Bear Common Lisp

ABCL‘s merge implementation did not do so well against mine, but I think I must be missing something. For example, there’s probably a lot of subtlety around JVM performance on Windows I just have no idea about.

Mine ran in about 17 seconds.

CL-USER> (check-merge-performance #'rml/merge3)
17.452 seconds real time
4000684 cons cells
NIL

Unfortunately, theirs took about 5 minutes to do about the same (*) computation. I’m not sure why. It’s possible I’m making some kind of mistake here, but I don’t know what it is. I used the timing code below because at first I thought the ABCL merge check had just hung altogether.

CL-USER> (time
          (let* ((len 1000000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))

312.803 seconds real time
412497 cons cells
NIL

(*) It’s not exactly the same, because remember that CL’s merge works on lists or vectors, so there will be some dispatching.

Out of curiosity, I went back and tried a few smaller inputs. ABCL seems to do OK with 2 100,000-element lists of random numbers, as shown below. Further, if you look at the number of cons cells, the function does appear to be O(n), which you would expect to see. This makes me think the problem is simply that 2 lists of 1,000,000 elements in the JVM is just too big a load for my little laptop, somehow.

CL-USER> (time
          (let* ((len 100000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)               (values)))) 0.028 seconds real time 4025 cons cells NIL CL-USER> (time
          (let* ((len 10000)
                 (max 1000000)
                 (xs (make-random-list len :max max))
                 (ys (make-random-list len :max max)))
            (progn
              (merge 'list xs ys #'<)
              (values))))
2.24 seconds real time
40113 cons cells
NIL

CLISP

CLISP‘s implementation is about twice as fast as mine, and uses about half as much memory, as is right and proper.

CL-USER> (check-merge-performance #'rml/merge3)
Real time: 9.634908 sec.
Run time: 9.578125 sec.
Space: 56000000 Bytes
GC: 16, GC time: 0.578125 sec.
; No value
CL-USER> (check-system-merge-performance)
Real time: 4.6764607 sec.
Run time: 4.671875 sec.
Space: 32000000 Bytes
GC: 6, GC time: 0.40625 sec.
; No value

Corman Lisp

Corman Lisp is a Common Lisp implementation for Windows. It’s fast and has really good Win32 integration. I enjoy the IDE a lot.

If you like Lisp, you may enjoy this interesting talk by its creator Roger Corman where he discusses its development.

The system merge is about twice as fast as mine. It seems to come down to my implementation creating a lot of garbage, which is expected given my naive use of push and pop when I should be doing more direct mutation and structure sharing using e.g. rplacd.

(check-merge-performance #'rml/merge3)
Total Execution time: 0.360289 seconds
Time spent garbage collecting: 0.093065 seconds

(check-system-merge-performance)
Total Execution time: 0.180082 seconds
Time spent garbage collecting: 0.0 seconds

LispWorks (32-bit Hobbyist Edition for Windows)

LispWorks is a proprietary implementation put out by a company in England. Not only is their merge faster, they’re using about half as much memory. All as expected.

P.S. I just bought the “Hobbyist” version of the software, and I can definitely recommend it. Fast compiler, and the IDE and class browser, debugger, etc., are lot of fun to use.

CL-USER 40 > (check-merge-performance #'rml/merge3)
Timing the evaluation of (LET* ((LEN 100000) (MAX 100000) (XS (MAKE-RANDOM-LIST LEN :MAX MAX)) (YS (MAKE-RANDOM-LIST LEN :MAX MAX))) (PROGN (FUNCALL FUNCTION XS YS (FUNCTION <)) (VALUES))) User time    =        0.062 System time  =        0.015 Elapsed time =        0.063 Allocation   = 2895872 bytes 0 Page faults CL-USER 41 > (check-system-merge-performance)
Timing the evaluation of (LET* ((LEN 100000) (MAX 100000) (XS (MAKE-RANDOM-LIST LEN :MAX MAX)) (YS (MAKE-RANDOM-LIST LEN :MAX MAX))) (PROGN (MERGE (QUOTE LIST) XS YS (FUNCTION <)) (VALUES)))

User time    =        0.015
System time  =        0.000
Elapsed time =        0.015
Allocation   = 1434380 bytes
0 Page faults

Anyway that’s all for now. As I said, you should take the “results” above with a grain of salt. Nonetheless it was interesting seeing how the different implementations behaved in an unscientific comparison.

Advent of Code 2017, Day 2

This is my solution for Day 2 of this year’s Advent of Code.

You may also enjoy browsing the Day 2 solutions megathread on Reddit.

PROBLEM

The spreadsheet consists of rows of apparently-random numbers. To make sure the recovery process is on the right track, they need you to calculate the spreadsheet’s checksum. For each row, determine the difference between the largest value and the smallest value; the checksum is the sum of all of these differences.

For example, given the following spreadsheet:

5 1 9 5
7 5 3
2 4 6 8

The first row’s largest and smallest values are 9 and 1, and their difference is 8.

The second row’s largest and smallest values are 7 and 3, and their difference is 4.

The third row’s difference is 6.

In this example, the spreadsheet’s checksum would be 8 + 4 + 6 = 18.

SOLUTION

(define (line->list line)
  ;; String -> List
  (let ((read-ln (field-reader (infix-splitter (rx (+ whitespace)))))
        (in-port (make-string-input-port line)))
    (receive (record fields)
        (read-ln in-port)
      (map string->number fields))))

(define (read-spreadsheet file)
  ;; File -> List[List[Number]]
  (call-with-input-file file
    (lambda (port)
      (let loop ((line (read-line port))
                 (results '()))
        (if (eof-object? line)
            results
            (loop (read-line port) (cons line results)))))))

(define (main prog+args)
  (let ((rows (read-spreadsheet "/Users/rloveland/Code/personal/advent-of-code/2017/02/02.dat")))
    (write (apply + (map
                     (lambda (row)
                       (let* ((xs (line->list row))
                              (min (apply min xs))
                              (max (apply max xs)))
                         (- max min)))
                     rows)))
    (newline)))

Advent of Code 2017, Day 1

This is my solution for Day 1 of this year’s Advent of Code.

You may also enjoy browsing the Day 1 solutions megathread on Reddit.

PROBLEM

The captcha requires you to review a sequence of digits (your puzzle input) and find the sum of all digits that match the next digit in the list. The list is circular, so the digit after the last digit is the first digit in the list.

For example:

  • 1122 produces a sum of 3 (1 + 2) because the first digit (1) matches the second digit and the third digit (2) matches the fourth digit.

  • 1111 produces 4 because each digit (all 1) matches the next.

  • 1234 produces 0 because no digit matches the next.

  • 91212129 produces 9 because the only digit that matches the next one is the last digit, 9.

SOLUTION

(define captcha-input "5994521226795838")

'(set! captcha-input "1111")

'(set! captcha-input "1122")

'(set! captcha-input "1234")

'(set! captcha-input "91212129")

(define (gather-matches s)
  ;; String -> List
  (let ((in-port (make-string-input-port s)) (count 0) (head #f) (vals '()))
    (let loop ((cur (read-char in-port)) (next (peek-char in-port)) (count count) (vals vals))
      (if (eof-object? next)
          (if (char=? cur head)
              (cons cur vals)
              vals)
          (cond ((= count 0)
                 (begin
                   (set! head cur)
                   (loop cur next (+ 1 count) vals)))
                 ((char=? cur next)
                 (loop (read-char in-port) (peek-char in-port) (+ 1 count) (cons cur vals)))
                (else (loop (read-char in-port) (peek-char in-port) (+ 1 count) vals)))))))

(define (main prog+args)
  (let* ((matches (gather-matches captcha-input))
         (matches* (map (lambda (c) (string->number (string c))) matches))
         (sum (apply + matches*)))
    (begin
      (format #t "MATCHES*: ~A~%" matches*)
      (format #t "SUM: ~A~%" sum))))

A Portable Scheme Module System

collins-mustang

 

In this post I’d like to introduce load-module, a portable Scheme module system.

Why did I write a module system?

  • Simplicity: A single-file module system in about 200 lines of code
  • Understandability: The implementation avoids wizardry and should be accessible to anyone who knows the language
  • Portability: One system that can be used across multiple implementations

The way it works is this:

  1. You have a file (say, utils.scm) with Scheme code in it that implements stuff that you want to live in the same module.
  2. You create another file (utils.mod, but that extension is easy to change) which lists the procedures and syntax you want to export.
  3. The load-module procedure reads utils.scm, rewriting unexported procedure names such that only the procedures you want exported show up at the top-level. Everything else gets rewritten during load-time as an ignorable “gensym” of the form %--gensym-utils-random-integer-8190504171, where “utils” is the module name, and “random-integer” is the procedure internal to your module.

The module file format is very simple:

(define-module utils
  (exports random-integer atom? take))

The module system exports one procedure: load-module. Run it like so to get the procedures from the aforementioned hypothetical utils package into your environment:

> (load "load-module.scm")
> (load-module 'utils)
#t
> (random-integer 199)
76
> (atom? 199)
#t

If you care, there’s more information about over at the project README.

(Image courtesy Geoff Collins under Creative Commons license.)

Editing Chrome Textareas with Edwin

edwin-editing-textarea

In this post, I’ll describe how to edit Chrome textareas with the Edwin text editor that comes built-in with MIT/GNU Scheme.

If you just want to see the end result, see the screenshot and video at the end of this post.

These instructions will also work with recent releases of the Opera browser (since the newer Chromium-based versions can run Chrome plugins). They may also work at some point with Firefox, when Mozilla implements the new WebExtensions API.

At a high level, the steps to edit Chrome textareas with Edwin are:

  1. Install a browser add-on
  2. Customize Edwin with a few hacks
  3. Write a shell script to make it easy to launch Edwin from the command line
  4. Run a local “edit server” that interacts with the browser add-on and launches Edwin

On This Page

Install the ‘Edit with Emacs’ add-on

Install the Edit with Emacs add-on from the Chrome Web Store.

Load some Edwin hacks

The default way to open Edwin is to run

$ mit-scheme --edit

This just launches an Edwin editor window. From there, you need to manually open files and edit them.

What we need is a way to launch Edwin and open a specific file automatically. Most editors you are familiar with already do this, e.g.,

$ vim /tmp/foo.txt
$ emacsclient /tmp/bar.txt

To be able to launch Edwin in this way, we need to hack a few procedures in the file editor.scm in the MIT/GNU Scheme source and load them from the Edwin init file. We’ll tackle each of these tasks separately below.

Hacking editor.scm

To get Edwin to open a file on startup, we need to tweak three procedures in editor.scm to accept and/or pass around filename arguments:

  • CREATE-EDITOR
  • STANDARD-EDITOR-INITIALIZATION
  • EDIT

Here’s the code; you can just paste it into a file somewhere. For the purposes of this post we’ll call it open-edwin-on-file.scm:

;;;; open-edwin-on-file.scm -- Rich's hacks to open Edwin on a specific file.

;;; These (minor) changes are all to the file `editor.scm'. They are
;;; all that is needed to allow Edwin to be opened on a specific file
;;; by adding a `filename' argument to the EDIT procedure.

(define (create-editor file . args)
  (let ((args
     (if (null? args)
         create-editor-args
         (begin
           (set! create-editor-args args)
           args)))
        (filename (if (file-exists? file)
                      file
                      #f)))
    (reset-editor)
    (event-distributor/invoke! editor-initializations)
    (set! edwin-editor
      (make-editor "Edwin"
               (let ((name (and (not (null? args)) (car args))))
             (if name
                 (let ((type (name->display-type name)))
                   (if (not type)
                   (error "Unknown display type name:" name))
                   (if (not (display-type/available? type))
                   (error "Requested display type unavailable:"
                      type))
                   type)
                 (default-display-type '())))
               (if (null? args) '() (cdr args))))
    (set! edwin-initialization
      (lambda ()
        (set! edwin-initialization #f)
        (if filename
                (standard-editor-initialization filename)
                (standard-editor-initialization))
    (set! edwin-continuation #f)
    unspecific))))

(define (standard-editor-initialization #!optional filename)
  (with-editor-interrupts-disabled
   (lambda ()
     (if (and (not init-file-loaded?)
          (not inhibit-editor-init-file?))
     (begin
       (let ((filename (os/init-file-name)))
         (if (file-exists? filename)
         (load-edwin-file filename '(EDWIN) #t)))
       (set! init-file-loaded? #t)
       unspecific))))
  (let ((buffer (find-buffer initial-buffer-name))
        (filename (if (not (default-object? filename))
                      ((ref-command find-file) filename)
                      #f)))
    (if (and buffer
         (not inhibit-initial-inferior-repl?))
    (start-inferior-repl!
     buffer
     (nearest-repl/environment)
     (and (not (ref-variable inhibit-startup-message))
          (cmdl-message/append
           (cmdl-message/active
        (lambda (port)
          (identify-world port)
          (newline port)))
           (cmdl-message/strings
        "You are in an interaction window of the Edwin editor."
                "Type `C-h' for help, or `C-h t' for a tutorial."
                "`C-h m' will describe some commands."
                "`C-h' means: hold down the Ctrl key and type `h'.")))))))

(define (edit file . args)
  (call-with-current-continuation
   (lambda (continuation)
     (cond (within-editor?
        (error "edwin: Editor already running"))
       ((not edwin-editor)
        (apply create-editor file args))
       ((not (null? args))
        (error "edwin: Arguments ignored when re-entering editor" args))
       (edwin-continuation
        => (lambda (restart)
         (set! edwin-continuation #f)
         (within-continuation restart
           (lambda ()
             (set! editor-abort continuation)
             unspecific)))))
     (fluid-let ((editor-abort continuation)
         (current-editor edwin-editor)
         (within-editor? #t)
         (editor-thread (current-thread))
         (editor-thread-root-continuation)
         (editor-initial-threads '())
         (inferior-thread-changes? #f)
         (inferior-threads '())
         (recursive-edit-continuation #f)
         (recursive-edit-level 0))
       (editor-grab-display edwin-editor
     (lambda (with-editor-ungrabbed operations)
       (let ((message (cmdl-message/null)))
         (cmdl/start
          (make-cmdl
           (nearest-cmdl)
           dummy-i/o-port
           (lambda (cmdl)
         cmdl       ;ignore
         (bind-condition-handler (list condition-type:error)
             internal-error-handler
           (lambda ()
             (call-with-current-continuation
              (lambda (root-continuation)
            (set! editor-thread-root-continuation
                  root-continuation)
            (with-notification-output-port null-output-port
              (lambda ()
                (do ((thunks (let ((thunks editor-initial-threads))
                       (set! editor-initial-threads '())
                       thunks)
                     (cdr thunks)))
                ((null? thunks))
                  (create-thread root-continuation (car thunks)))
                (top-level-command-reader
                 edwin-initialization)))))))
         message)
           #f
           `((START-CHILD ,(editor-start-child-cmdl with-editor-ungrabbed))
         (CHILD-PORT ,(editor-child-cmdl-port (nearest-cmdl/port)))
         ,@operations))
          message))))))))

Update your Edwin init file

Then, you’ll need to tweak your Edwin init file (also known as ~/.edwin) to load this file into Edwin’s environment on startup:

(load "/path/to/open-edwin-on-file.scm" '(edwin))

Write a shell script to make it easier launch Edwin from the command line

Now that the EDIT procedure takes a filename argument, we can wrap this all up in a shell script that calls Edwin with the right arguments. There may be other ways to accomplish this than in the code shown below, but it works.

Note that the path to my local installation of MIT/GNU Scheme on Mac OS X is slightly tweaked from the official install location. What’s important is that Scheme is invoked using the right “band”, or image file. For more information, see the fine manual.

Take the code below and stick it somewhere on your $PATH; on my machine it lives at ~/bin/edwin.

#!/usr/bin/env sh

EDIT_FILE=$1
SCHEME_CODE="(edit \"$EDIT_FILE\")"

if [[ $(uname) == 'Darwin' ]]; then
  _SCHEME_DIR=/Applications/MIT-Scheme.app/Contents/Resources
  SCHEME=$_SCHEME_DIR/mit-scheme
  MITSCHEME_BAND=$SCHEME_DIR/all.com
  CMD=$SCHEME
fi

if [[ $(uname) == 'Linux' ]]; then
  CMD=scheme
fi

N=$RANDOM
F=/tmp/edit-$N.scm

touch $F
echo $SCHEME_CODE > $F

$CMD --load $F

Install an edit server

Although the extension is called ‘Edit with Emacs’, it can be used with any text editor. You just need to be able to run a local “edit server” that generates the right inputs and outputs. Since Chrome extensions can’t launch apps directly, the extension running in the browser needs to act as a client to a locally running server, which will launch the app.

Since we want to launch Edwin, we’ll need to run a local edit server. Here’s the one that I use:

https://gist.github.com/frodwith/367752

To get the server to launch Edwin, I save the gist somewhere as editserver.psgi and run the following script (for more information on the environment variables and what they mean, see the comments in the gist):

#!/usr/bin/env sh
EDITSERVER_CMD='edwin %s' \
EDITSERVER_BLOCKING=1 \
screen -d -m `which plackup` -s Starman -p 9292 -a ~/Code/mathoms/editserver.psgi

The relevant bit for running Edwin is the EDITSERVER_CMD environment variable, which we’ve set to run the edwin script shown above.

Note that this server is written in Perl and requires you to install the Starman and Plack modules. If you don’t like Perl or don’t know how to install Perl modules, there are other servers out there that should work for you, such as this one written in Python.

Edit text!

Once you’ve done everything above and gotten it working together, you should be able to click the “edit” button next to your browser textarea and start Edwin. It will look something like the following screenshot (which you saw at the beginning of this post):

edwin-editing-textarea

If you prefer video, check out this short demo on YouTube.

Advent of Code, Day 3

In this post I’ll describe my solution for Day 3 of the Advent of Code.

Problem Description

Day 3: Perfectly Spherical Houses in a Vacuum

Santa is delivering presents to an infinite two-dimensional grid of houses.

He begins by delivering a present to the house at his starting location, and then an elf at the North Pole calls him via radio and tells him where to move next. Moves are always exactly one house to the north (‘^’), south (‘v’), east (‘>’), or west (‘<‘). After each move, he delivers another present to the house at his new location.

However, the elf back at the north pole has had a little too much eggnog, and so his directions are a little off, and Santa ends up visiting some houses more than once. How many houses receive at least one present?

For example:

‘>’ delivers presents to 2 houses: one at the starting location, and one to the east.

‘^>v<‘ delivers presents to 4 houses in a square, including twice to the house at his starting/ending location.

‘^v^v^v^v^v’ delivers a bunch of presents to some very lucky children at only 2 houses.

Solution

Broadly speaking, my solution consisted of:

  • Reading the directions file to determine the largest x and y values of the grid
  • Making a shaped array using those dimensions (specifically, we double the array dimensions to allow for movement up, down, forward, and back)
  • Starting in the center of the shaped array, follow the instructions from the “map” and mark every house (called a “position” in the code) if it hasn’t already been visited
  • Every time we visit a house we haven’t already visited, we bump a counter

Here’s the Scheme code that accomplishes those steps:

;; read in the string
;; sum the ^ and v chars to get the height of the matrix (graph)
;; sum the < and > chars to get the width of the matrix (graph)

(define (north? ch) (char=? ch #\^))
(define (south? ch) (char=? ch #\v))
(define (east? ch) (char=? ch #\>))
(define (west? ch) (char=? ch #\<))

(define (make-shape width height)
  ;; Int Int Int -> Shape
  (shape 0 width 0 height))

(define (read-shape-file file)
  ;; Pathname -> Shape
  (with-input-from-file file
    (lambda ()
      (let loop ((width 0)
         (height 0)
         (min-width  0)
         (max-width 0)
         (min-height 0)
         (max-height 0)
         (ch (read-char)))
    (if (eof-object? ch)
        (make-shape
         (* 2 (- max-width min-width))
         (* 2 (-  max-height min-height)))
        (cond ((north? ch)
           (loop width (+ height 1)
             min-width max-width
             (min min-height height)
             (max max-height height)
             (read-char)))
          ((south? ch)
           (loop width (- height 1)
             min-width max-width
             (min min-height height)
             (max max-height height)
             (read-char)))
          ((east? ch)
           (loop (+ width 1) height
             (min min-width width)
             (max max-width width)
             min-height max-height
             (read-char)))
          ((west? ch)
           (loop (- width 1) height
             (min min-width width)
             (max max-width width)
             min-height max-height
             (read-char)))
          (else (error "WHOA"))))))))

;; We make a shaped, multi-dimensional array (SRFI-25) in the size
;; it's determined we need by our earlier check.

(define (make-grid shape)
  ;; Shape -> Array
  (make-array shape #f))

;; The POSITION data type

(define-record-type position
  (make-position x y)
  position?
  (x position-x set-position-x!)
  (y position-y set-position-y!))

(define (array-center arr)
  ;; Array -> Position
  (let ((len-x (array-length arr 0))
    (len-y (array-length arr 1)))
    (make-position (/ len-x 2)
           (/ len-y 2))))

(define (make-relative-position x y ch)
  ;; Int Int Char -> Position
  (let ((vals '()))
    (cond ((north? ch)
       (set! vals (list (+ x 1) y)))
      ((south? ch)
       (set! vals (list (- x 1) y)))
      ((east? ch)
       (set! vals (list x (- y 1))))
      ((west? ch)
       (set! vals (list x (+ y 1))))
      (else (set! vals (list x y))))
    (make-position (first vals)
           (second vals))))

(define (visited? arr x y)
  ;; Array Int Int -> Bool
  (array-ref arr x y))

(define (set-visited! arr x y)
  ;; Array Int Int -> Undefined
  (array-set! arr x y #t))

(define (visit-locations arr file)
  ;; Pathname -> Int
  (let ((visited-count 0))
    (with-input-from-file file
      (lambda ()
    (let loop ((ch (read-char))
           (current-position (array-center arr)))
      (if (eof-object? ch)
          visited-count
          (begin
        (let* ((current-x (position-x current-position))
               (current-y (position-y current-position))
               (next-position
            (make-relative-position current-x current-y ch)))
          (if (not (visited? arr current-x current-y))
              (begin
            (set! visited-count (+ visited-count 1))
            (set-visited! arr current-x current-y)
            (loop (read-char)
                  next-position))
              (loop (read-char) next-position))))))))
    visited-count))

;; eof

Once this code is loaded up in the REPL, you can use it as shown below. (Note that the answer shown at the end isn’t real to avoid a spoiler.)

(set! *the-array* (make-grid (read-shape-file (expand-file-name "~/Code/personal/advent-of-code/03.dat"))))
'#{Array:srfi-9-record-type-descriptor}

> (array-size *the-array*)
42588

> (array-center *the-array*)
'#{Position}

> (define *the-file* (expand-file-name "~/Code/personal/advent-of-code/03.dat"))

> (visit-locations *the-array* *the-file*)
12345

Related Posts

Advent of Code, Day 2

This post describes my solution for Day 2 of the Advent of Code.

Problem Description

First, the problem description (copied from the website):


Day 2: I Was Told There Would Be No Math

The elves are running low on wrapping paper, and so they need to submit an order for more. They have a list of the dimensions (length l, width w, and height h) of each present, and only want to order exactly as much as they need.

Fortunately, every present is a box (a perfect right rectangular prism), which makes calculating the required wrapping paper for each gift a little easier: find the surface area of the box, which is 2 x l x w + 2 x w x h + 2 x h x l. The elves also need a little extra paper for each present: the area of the smallest side.

For example:

  • A present with dimensions 2x3x4 requires 2 x 6 + 2 x 12 + 2 x 8 = 52 square feet of wrapping paper plus 6 square feet of slack, for a total of 58 square feet.
  • A present with dimensions 1x1x10 requires 2 x 1 + 2 x 10 + 2 x 10 = 42 square feet of wrapping paper plus 1 square foot of slack, for a total of 43 square feet.

All numbers in the elves’ list are in feet. How many total square feet of wrapping paper should they order?


Solution

Once again, we’ll be working in Scheme.

For this problem, I decided to create a “box” data type. In addition to the automatically generated accessors (thanks SRFI-9!), I wrote several procedures to perform calculations on boxes, namely:

  • SURFACE-AREA: Calculate the box’s surface area.
  • SMALLEST-SIDE: Determine which of the box’s sides has the smallest surface area (the extra material makes it easier to wrap).

WRAPPING-PAPER is just a “wrapper” (pun intended) around the first two.

LINE->BOX, READ-BOXES, and SUM-BOXES are all about parsing the input file contents and shuffling them into the box data type that we use to do the actual calculation. The only part that required a bit of thought was the line with STRING-TOKENIZE in LINE->BOX. In Perl I’d use my @params = split /x/, $line without even thinking, but I was less familiar with Scheme’s facility for solving this problem, so it took a few minutes to puzzle out the right part of Scheme’s “API”. (STRING-TOKENIZE was helpfully provided by SRFI-13.)

Abstract data types FTW! I’ll be using them more as the month’s challenges progress.

;; ,open srfi-9 srfi-13 sort

(define-record-type box
  (make-box l w h)
  box?
  (l box-length set-box-length!)
  (w box-width set-box-width!)
  (h box-height set-box-height!))

(define (surface-area box)
  ;; Box -> Int
  (let ((l (box-length box))
    (w (box-width box))
    (h (box-height box)))
    (+ (* 2 l w)
       (* 2 w h)
       (* 2 h l))))

(define (smallest-side box)
  ;; Box -> Int
  (define (smallest-two xs)
    ;; List -> List
    (let ((sorted (sort-list xs <)))
      (list (first sorted)
        (second sorted))))
  (let ((l (box-length box))
    (w (box-width box))
    (h (box-height box)))
    (apply * (smallest-two (list l w h)))))

(define (wrapping-paper box)
  ;; Box -> Int
  (let ((minimum (surface-area box))
    (extra (smallest-side box)))
    (+ minimum extra)))

(define (line->box line)
  ;; String -> Box
  (define (line->lon s)
    ;; String -> List<Number>
    (let ((xs (string-tokenize s (char-set-complement (char-set #\x)))))
      (map string->number xs)))
  (let* ((dims (line->lon line)))
    (let ((l (first dims))
      (w (second dims))
      (h (third dims)))
      (make-box l w h))))

(define (read-boxes file)
  ;; Pathname -> List<Box>
  (with-input-from-file file
    (lambda ()
      (let loop ((line (read-line))
         (ys '()))
    (if (eof-object? line)
        ys
        (loop
         (read-line)
         (cons (line->box line) ys)))))))

(define (sum-boxes boxes)
  ;; List<Box> -> Int
  (let ((xs (map wrapping-paper boxes)))
    (apply + xs)))

;; eof

Related Posts

Advent of Code, Day 1

origami-dragon

Advent of Code is a site that provides a programming problem for every day in December leading up to Christmas.

I’ve become a little obsessed with it over the last few days, and thought I’d write up my results. So far I’ve been working in Scheme.

Here’s the Day 1 problem description:


Day 1: Not Quite Lisp

Santa was hoping for a white Christmas, but his weather machine’s “snow” function is powered by stars, and he’s fresh out! To save Christmas, he needs you to collect fifty stars by December 25th.

Collect stars by helping Santa solve puzzles. Two puzzles will be made available on each day in the advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!

Here’s an easy puzzle to warm you up.

Santa is trying to deliver presents in a large apartment building, but he can’t find the right floor – the directions he got are a little confusing. He starts on the ground floor (floor 0) and then follows the instructions one character at a time.

An opening parenthesis, (, means he should go up one floor, and a closing parenthesis, ), means he should go down one floor.

The apartment building is very tall, and the basement is very deep; he will never find the top or bottom floors.

For example:

  • (()) and ()() both result in floor 0.
  • ((( and (()(()( both result in floor 3.
  • ))((((( also results in floor 3.
  • ()) and ))( both result in floor -1 (the first basement level).
  • ))) and )())()) both result in floor -3.

To what floor do the instructions take Santa?


The file of instructions looks something like this (but much larger):

()()((((()(((())(()(()((((((()(()(((())))((()(((()))((())(()((()

Here’s the (reasonably straight-forward) Scheme code. It basically just iterates through the input file, bumping a counter up or down based on the type of paren read from the input port. The definitions of UP-FLOOR? and DOWN-FLOOR? weren’t really necessary, but they made the main procedure a little easier to read.

(define (up-floor? x)
  (char=? x #\())

(define (down-floor? x)
  (char=? x #\)))

(define (find-floor file)
  (with-input-from-file file
    (lambda ()
      (let loop ((floor-number 0)
         (char (read-char)))
    (if (eof-object? char)
        floor-number
        (loop (cond ((up-floor? char)
             (+ floor-number 1))
            ((down-floor? char)
             (- floor-number 1))
            (else floor-number))
          (read-char)))))))

(Image courtesy Strongpaper under a Creative Commons license.)

Statistics over Git Repositories with Scsh

stream-small.jpg

Figure 1: A forest stream outside New Paltz, NY.

In this post I’ll share a scsh port of a nice shell script from Gary Bernhardt’s Destroy All Software screencasts. This is taken from season 1, episode 1 of the series 1.

The script is used to gather statistics on a git repository. You pass it a regex matching a filename, and it outputs a table showing how many lines of that type of file were included in each commit.

For example, I might want to see how the number of lines of documentation in Markdown files changed across commits:

$ repo-stats ".md$"
... snip! ...
52      c36cc6d First version of diff-checking code.
52      9ed53c3 Tweaks.
52      9e17d7e Add new service.
64      b293c3d Describe how to use the diffing code.
64      1886164 Update comments and documentation.
64      4a7ba26 Bump TODO prio.

The scsh code to do this is below; it’s a nearly 1:1 translation of Mr. Bernhardt’s bash code into scsh. It does differ in a few ways:

  • No dynamic/global variables: In the bash code there are variables being used inside functions that weren’t passed in as arguments to those functions. This is fine for small programs, but is probably not a Good Thing ™.
  • Since scsh is based on Scheme 48, we get a nice inspector/debugger for free.
  • At this program size, we don’t need to break out the Scheme 48 module system. However, if we wanted to integrate this scsh code cleanly with a larger system, we could do so fairly easily.
  • Something about how scsh is calling git and piping its output isn’t turning off git’s dumb (IMO) “I will behave differently depending on what kind of output I think I’m writing to” behavior. Therefore, unlike in Mr. Bernhardt’s example, we need to unset the GIT_PAGER environment variable.
  • Mr. Bernhardt used bash in his video due to its ubiquity. Scsh fails utterly in this regard, since almost no one uses it. However, that doesn’t really matter unless you need to distribute your code to a wider audience.2
  • Subjectively, Scheme is an immeasurably nicer language than whatever weird flavor of POSIXy sh is available.

Enough rambling, let’s have some code:

#!/usr/local/bin/scsh \
-e main -s
!#

(setenv "GIT_PAGER" "")

(define (revisions)
  (run/strings (git rev-list --reverse HEAD)))

(define (commit-description rev)
  (run/string (git log --oneline -1 ,rev)))

(define (number-of-lines file-pattern rev)
  (run/string
   (| (git ls-tree -r ,rev)
      (grep ,file-pattern)
      (awk "{print $3}")
      (xargs git show)
      (wc -l))))

(define (main prog+args)
  (let ((pat (second prog+args))
        (revs (revisions)))
    (for-each
     (lambda (rev)
       (let ((column-1 (string-trim-both (number-of-lines pat rev)))
             (column-2 (string-trim-both (commit-description rev))))
         (format #t "~A\t~A~%" column-1 column-2)))
     revs)))

Footnotes:

1

I feel like I should note for the record that:

  1. This is a legitimate, paid copy of Mr. Bernhardt’s videos that we’re working from.
  2. Although I’m only a few episodes into season 1, I am really enjoying the series and would recommend.
2

And if you do need to distribute your code to a wider audience, there is an easy way to dump a heap image that should be runnable by any other scsh VM of the same version. I’ve done this myself to distribute reasonably large/complex scripts to coworkers. I’m written a little scsh library to automate the process of installing an “app” in a heap image. I hope to write about it here soon.

The Sentinel File Pattern

In this short essay I’ll describe the “sentinel file” pattern, which I recently used when writing a command-line tool to use at $WORK for interacting with our web API.

The essence of the sentinel file pattern is that you use a certain file’s last-modified time as a record against which you compare other time-based values.

It is useful in many contexts, such as software builds; in the context of web APIs, it can be used to track whether you will need to reauthenticate with the API before you fire off a bunch of API calls.

The recipe is essentially this:

  • Update a sentinel file F‘s timestamp at time T.
  • When you are about to take an action such as make an API call, see if the current time, T’, is greater than the timeout value of your web API, V, plus the sentinel file’s existing timestamp T.

We can translate this into Scheme as follows (this is scsh, to be exact), where:

;; F = sentinel-file
;; T = (file-last-mod sentinel-file)
;; T' = (time)
;; V = api-timeout-value

(define (sentinel-expired? sentinel-file)
   (> (time)
      (+ (file-last-mod sentinel-file) api-timeout-value)))

Note that TIME and FILE-LAST-MOD are part of the scsh POSIX API.

This pattern is much more efficient than storing some kind of “am I logged in?” value in a JSON/YAML/XML/s-expression config file that has to be read in and parsed on every invocation and written out from time to time.

I debated whether to write about this simple technique at all because it seems like an old trick that many people know. However, I’m going to assume that I am not unique, and that there are lots of people out there who could benefit from using this technique when the right situation arises.