## Advent of Code 2017, Day 2

This is my solution for Day 2 of this year’s Advent of Code.

You may also enjoy browsing the Day 2 solutions megathread on Reddit.

## PROBLEM

The spreadsheet consists of rows of apparently-random numbers. To make sure the recovery process is on the right track, they need you to calculate the spreadsheet’s checksum. For each row, determine the difference between the largest value and the smallest value; the checksum is the sum of all of these differences.

For example, given the following spreadsheet:

```5 1 9 5
7 5 3
2 4 6 8
```

The first row’s largest and smallest values are 9 and 1, and their difference is 8.

The second row’s largest and smallest values are 7 and 3, and their difference is 4.

The third row’s difference is 6.

In this example, the spreadsheet’s checksum would be 8 + 4 + 6 = 18.

## SOLUTION

```(define (line->list line)
;; String -> List
(in-port (make-string-input-port line)))
(map string->number fields))))

;; File -> List[List[Number]]
(call-with-input-file file
(lambda (port)
(results '()))
(if (eof-object? line)
results
(loop (read-line port) (cons line results)))))))

(define (main prog+args)
(write (apply + (map
(lambda (row)
(let* ((xs (line->list row))
(min (apply min xs))
(max (apply max xs)))
(- max min)))
rows)))
(newline)))
```

## Advent of Code 2017, Day 1

This is my solution for Day 1 of this year’s Advent of Code.

You may also enjoy browsing the Day 1 solutions megathread on Reddit.

## PROBLEM

The captcha requires you to review a sequence of digits (your puzzle input) and find the sum of all digits that match the next digit in the list. The list is circular, so the digit after the last digit is the first digit in the list.

For example:

• 1122 produces a sum of 3 (1 + 2) because the first digit (1) matches the second digit and the third digit (2) matches the fourth digit.

• 1111 produces 4 because each digit (all 1) matches the next.

• 1234 produces 0 because no digit matches the next.

• 91212129 produces 9 because the only digit that matches the next one is the last digit, 9.

## SOLUTION

```(define captcha-input "5994521226795838")

(define (gather-matches s)
;; String -> List
(let ((in-port (make-string-input-port s)) (count 0) (head #f) (vals '()))
(let loop ((cur (read-char in-port)) (next (peek-char in-port)) (count count) (vals vals))
(if (eof-object? next)
(cons cur vals)
vals)
(cond ((= count 0)
(begin
(loop cur next (+ 1 count) vals)))
((char=? cur next)
(loop (read-char in-port) (peek-char in-port) (+ 1 count) (cons cur vals)))
(else (loop (read-char in-port) (peek-char in-port) (+ 1 count) vals)))))))

(define (main prog+args)
(matches* (map (lambda (c) (string->number (string c))) matches))
(sum (apply + matches*)))
(begin
(format #t "MATCHES*: ~A~%" matches*)
(format #t "SUM: ~A~%" sum))))
```

## Introduction

I recently bought a copy of Higher Order Perl (also known as HOP) by Mark Dominus. In chapter 4, he introduces iterators. At a high level, an iterator is a black box with an input funnel, an output funnel, and a big red “NEXT” button. You put something into the funnel at time `T`; then, at some later time `T+n`, you press the “NEXT” button and something useful comes out.

Iterators are useful in cases when you want to access a potentially huge list of things one at a time, but you don’t have access to a computer with infinite memory.

In this post we’ll look at a moderately useful iterator example; a directory walker that recurses down a directory tree on your file system. The implementation is given in Scheme (specifically, the scsh dialect); the code and the descriptions thereof are heavily inspired by Dominus’ excellent book. However, this post contains the work of an amateur. Any errors or omissions in the following code and explanations are mine.

## Iterators are made with closures

In order to build a useful iterator, we need to have a basic understanding of closures. A closure is just a function packaged up with an environment. The environment allows us to store up state between function calls. As noted above, it’s easiest just to think of it as a little black box that we can pass around in our program.

In Scheme and Common Lisp, you build functions with `LAMBDA`. Here’s a very simple closure example in Scheme, translated from the first example given in HOP:

```(define (upto m n)
(set! m (- m 1)) ;needed since we don't have post-increment (\$i++)
(lambda ()
(if (< m n)
(begin (set! m (+ m 1)) m)
#f)))
```

This is a function that returns a function (a closure). To use the closure returned by `UPTO`, assign it to a variable and call it whenever you want the next value. When the iterator is exhausted, it will return `#f`.

```> (define 1to10 (upto 1 10))
> (1to10)
1
> (1to10)
2
...
> (1to10)
9
> (1to10)
10
> (1to10)
#f
```

## A trivial file tree iterator

A more interesting and practical use of this technique is to walk a directory tree on your computer. In the Scheme procedure below, we take a list of directories and then build and return a closure that, when called, will walk the directory tree breadth-first, printing out the names of files as we go.

```(define (dir-walk queue)
(let ((file #f))
(lambda ()
(if (not (null? queue))
(begin (set! file (car queue))
(set! queue (cdr queue))
(cond ((file-directory? file)
(let ((new-files (directory-files file)))
(set! queue (append queue (map (lambda (filename)
(string-append file "/" filename))
new-files)))
file))
((file-regular? file) file)
(else #f)))))))
```

The important part to notice is the inner `LAMBDA`. This is where we create a closure by packaging up a procedure with an environment. The closure’s environment remembers the contents of the variable `QUEUE` in memory, so we can ask for the elements of `QUEUE` at our leisure.

Here it is running on my machine:

```> (define dir-iter1 (dir-walk '("/Users/rloveland/Desktop/current/")))
> (dir-iter1)
"/Users/rloveland/Desktop/current/XProlog.jar"
> (dir-iter1)
"/Users/rloveland/Desktop/current/all-jira-issues.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/automate-campaign-setup.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/buflocal.scm"
> (dir-iter1)
"/Users/rloveland/Desktop/current/bundle.js"
```

Just for fun, here’s a version of the same function in Common Lisp. It’s essentially the same as the scsh version, save that it uses the nice CL pathname machinery instead of appending raw strings, and also makes use of the CL-FAD convenience library.

```(defun dir-walk (queue)
(let ((file nil))
(lambda ()
(if (not (null queue))
(progn
(setf file (car queue))
(setf queue (cdr queue))
(setf queue (append queue (mapcar #'(lambda (filename)
(merge-pathnames filename file))
new-files)))
file))
(t nil)))))))
```

Usage is also essentially the same:

```CL-USER> (defvar *tree-walker* (dir-walk '(#P"/Users/rloveland/Desktop/current/")))
*TREE-WALKER*
CL-USER> (type-of *tree-walker*)
FUNCTION
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Desktop/current/"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.debuggerDefaults"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.DS_Store"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hi"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hs"
```

## A slightly less trivial file tree iterator

Listing out files and directories is nice, but not that useful. We’d like a way to see only those files and directories that have some interesting property.

This can be accomplished by passing in another argument to our `DIR-WALK` function: this argument will be yet another function that will test the current file to see whether it’s interesting to us. It’s pretty easy to change `DIR-WALK` to accept a function argument, `INTERESTING?`. This arbitrary function is used to check the file to see if we care about it.

This time around, when we build our internal queue, we use a call to `FILTER` to make sure that only interesting files get added.

```(define (dir-walk* interesting? queue)
(let ((file #f))
(lambda ()
(if (not (null? queue))
(begin (set! file (car queue))
(set! queue (cdr queue))
(cond
((file-directory? file)
(let ((new-files (directory-files file)))
(set! queue (append queue (filter interesting? (map (lambda (filename)
(string-append file "/" filename))
new-files))))
(if (interesting? file) file)))
((interesting? file) file)
(else #f)))
#f))))
```

And here it is in use; in this example, we pass an `INTERESTING?` function that asks for only files that are not marked as executable:

```> (define dir-iter2 (dir-walk* (lambda (f) (file-not-executable? f)) '("/home/rml/Desktop/current")))
> (dir-iter2)
> (dir-iter2)
"/home/rml/Desktop/current/A.hi"
> (dir-iter2)
"/home/rml/Desktop/current/A.hs"
> (dir-iter2)
"/home/rml/Desktop/current/A.o"
> (dir-iter2)
"/home/rml/Desktop/current/ABBREV.xlsx"
> (dir-iter2)
> (dir-iter2)
```

## Conclusion

There are still bugs and unexpected behavior in `DIR-WALK*`. For example, on the first call to the resulting iterator there is no result due to the one-armed `IF`. There are also strange things that happen if we want to filter out directories from our results, since we mix together the `FILE-DIRECTORY?` check and the `INTERESTING?` check inside the iterator. However, despite these small nits, it can still do useful work on my machine 1, and it’s a good enough example of using closures to build iterators.

Here’s hoping you’ll enjoy playing with closures and iterators in your own code!

(Image courtesy zeitfanger.at under a Creative Commons License.)

## Footnotes:

1 In the grandest internet tradition, “It works on my machine” ™.

## Introduction

Just for fun, i’ve implemented a command line “word count” program in `scsh`. Along the way i’ve learned about a few really neat features of `scsh`, which I’ll discuss more below.

What is a word count program? In my perfect world, a word counter does only one thing: count the number of words it sees on standard input, and print that number to standard output. Let me give some examples of what should qualify as “words” for our purposes:

below
that’s
schleswig-holstein (one word!)
autonomy
friday’s
putrescence

## Basic Design

If I were specifying a word count program in natural language, I might think of this series of steps:

1. Read standard input (fd 0) into some kind of input buffer.
2. Iterate over that buffer, checking for breaks (a.k.a. whitespace) between words.

A string like the following might prove tricky enough (feel free to replace the reference to “Satan” with your preferred \$EVIL_DEITY):

Every good dog goes to heaven – and if not? – well, I hear
Satan has de-
licious bones to chew!

By inspection, the sentence above has eighteen words. (Note that the implementation of `wc` in scsh described here (eventually) says 19, and GNU’s `wc` says there are 21–both are incorrect, but more on that below.) There are several edge cases to note in this sentence:

1. A space at the beginning of the sentence.
2. A clause inside em-dashes.
3. A hyphenated word occurring at the end of a line.

Some things we should probably do, given the above:

1. Ignore spaces at the beginnings of lines.
2. Ignore punctuation in general, such as periods, exclamation points…
3. ..except hyphens. Hyphens join two or more words into one. This should work across newlines; that would take care of the “hyphenated word at line’s end” case.

Of the above 3 items, the last point sounds the trickiest. I think we need to take care of the hyphen case eventually, but for now let’s punt on it and worry about getting something basic working. Starting at a high level, here’s my take on the program flow:

1. Loop over the lines of standard input.
2. For each line, split the line into words using an appropriately “clever” regular expression.
3. Keep a running total of how many words we’ve seen.

## Implementation Notes

Sounds simple enough, right? Well, after some hacking around, here’s what I came up with. Items 1 and 3 from the above program flow list are pretty standard parts of any programming language, and scsh has them covered, as we’ll see. That leaves it up to me to implement the above-mentioned “clever” regular expression (which, as you’ll see shortly, could stand to be a little, um, “clever-er”).

In the course of describing this implementation, i’ll introduce several interesting features of scsh,

• The `rx’ regular expression notation
• The `awk’ macro
• The `regexp-fold-right’ procedure

and briefly mention another:

• Scheme 48/scsh “records”.

### The Not-so-Clever Regular Expression

Rather than do it here, I’ve written the description of the regular expression using comments in the code itself. One of the strengths of shivers’ `rx’ regular expression package (standard in scsh) is that it allows the programmer to do matching notation using a syntax that, while not Scheme proper, is s-expression based rather than the traditional string-based syntax (Don’t worry: the string syntax is also supported). Because of the s-expressions, we can format these regexps easily in Emacs, as well as add descriptive inline comments.

The Perl aficionados among you will note that this is similar to Perl’s `-x’ regular expression modifier, which allows you to “Extend your pattern’s legibility by permitting whitespace and comments.” Of course, when we use scsh, we get all of that, and the rest of Scheme for free!

In any event, here is the expression that we match against every line of input:

```(define wc-rx (rx (:                   ; start rx matching sequence
(* whitespace)      ; 0 or more spaces to start the line
(+ alphanumeric)    ; match 1 or more chars in [0-9A-Za-z]
(?                  ; then, optionally match the following:
(or #\' #\`)       ; - apostrophe or backtick
(* alphanumeric))  ; - 0 or more [0-9a-zA-Z]
(* whitespace))))   ; - 0 or more spaces to end the line
```

Let’s see how it does against our word list from above:

Text Count
below 1
that’s 1
schleswig-holstein 2
autonomy 1
friday’s 1
putrescence 1

Looks like we’re failing to match correctly against “schleswig-holstein”, a hyphenated word. This is something of a corner case, since relatively few words in English use hyphens. Therefore, I’ll punt on it for now (as I did on the “hyphenated line break” case above), with the caveat that it may need to be implemented in future.

### Looping with awk

I tried to avoid learning too much about scsh’s `awk’ macro at first. Unfortunately, I just didn’t take the time to read the manual in this case. My thinking was: Can’t I just write a `for-each` style Scheme loop? Well, yes. scsh actually comes with `regexp-for-each` procedure, which will happily loop across a string, gobbling up one match after another. The difficulty arose when i tried to update a state variable from inside my `awk’ macro loop. I just couldn’t seem to get it to work. Dramatization (This is not the actual code–I’ve since deleted it):

```(let ((count 0))
... stuff here ...
(set! count (+ 1 count)))
... etc. ...)
```

It turns out that this was extremely dumb, since `awk’ provides a nice state variable abstraction already – here’s the definition of the form from the excellent manual:

```(awk NEXT-RECORD RECORD+FIELD-VARS
[COUNTER] STATE-VAR-DECLS
CLAUSE_1 ...)
```

Using the block above (and the manual) as a reference, let’s look at the actual code in the section below – you’ll have to substitute the existing code below for the terms `NEXT-RECORD’ and the like:

1. Each time through the loop, `awk’ gets its next set of values using the procedure NEXT-RECORD; in this case, it uses a simple `read-line`.
2. The variable `line` (standing in for RECORD+FIELD-VARS) holds the line of input returned by `read-line`.
3. `awk’ provides an optional COUNTER variable, which is omitted since isn’t needed in this case.
4. We set the value of the state variable `words` to 0 (This is `STATE-VAR-DECLS’).
5. Finally, we get to `CLAUSE_1′. In the `awk’ macro, there are several types of clauses, which are all evaluated in sequence (this is unlike the way `cond` works, for example). In the most basic case, the clause is of the form `(/test/ /body/)`. As you can see here, test is simply `#t`, Scheme’s boolean truth value, so its corresponding body is evaluated every time through the loop. This causes the value of our state variable `words` to be incremented by the `length` of the list returned by the `regexp-fold-right’ block. For now we’ll just treat that block as a black box which looks at a string and returns a list of matching substrings.
```(awk (read-line) (line) ((words 0))
(#t (+ words
(length
(regexp-fold-right wc-rx (lambda (m i lis)
(cons (match:substring m 0) lis))
'() line)))))
```

### Gathering Matches with `regexp-fold-right`

As we’ve just deduced by reading through the `awk’ loop, `regexp-fold-right` returns a list of substrings matching our regular expression `wc-rx`. We’re going to take the length of that list and add it to our running total. But how does this crazy `regexp-fold-right` work, anyway? It took me a while to figure out, as I hadn’t used any of the functional-style `fold` procedures before.

First, the code:

```(regexp-fold-right wc-rx (lambda (m i lis)
(cons (match:substring m 0) lis))
'() line)
```

My initial impressions: I can see that we’re using our regular expression `wc-rx` and doing something with a `lambda`-expression that `cons`-es part of the resulting `match` record onto an empty list. Somehow this is operating on the `line` of input read in by `awk’.

Looking at the code above with the manual open, we see the following:

• Again, `m` is a `match’ record returned by matching `wc-rx` against this line of input.
• `i` is the index of the end of the match described by `m`
• `lis` is the list of substring matches we’ll be building up by repeatedly `cons`-ing the substring in the match record `m` onto `lis` – in this way we build up a list of matching substrings of our regular expression

To make sure we really understand this, let’s step through an example. We’ll create a simple match against 2 lines of input, and diagram the states of variables `m`, `i`, `lis`, and `line` during each step. By watching the ways these variables change as we loop over each line, we’ll get a fuller understanding of the way `regexp-fold-right` works. Four iterations should be enough:

STEP 1
String “let’s go to the Boston aquarium!”
i 0
lis ‘()
(match:substring m 0) “”
STEP 2
String “go to the Boston aquarium!”
i 6
lis ‘(“let’s “)
(match:substring m 0) “let’s “
STEP 3
String “to the Boston aquarium!”
i 9
lis ‘(“go ” “let’s “)
(match:substring m 0) “go “
STEP 4
String “the Boston aquarium!”
i 12
lis ‘(“to ” “go ” “let’s “)
(match:substring m 0) “to “

## Does it Work?

It’s been fun writing this program, but how does it stack up against GNU `wc`? According to my quick and dirty testing, quite well. Depending upon the type of file and how much weird formatting and markup it contains, our scsh word count will report slightly more or fewer words than GNU’s `wc`. Here are a few examples from readily available text files (The first file is Neal Stephenson’s In the Beginning was the Command Line, the second Homer’s Iliad):

File GNU wc wc.scm Discrepancy (%)
command-line.txt 36331 37262 0.025

Those numbers look pretty good to me. And remember that we still have room for improvement, since we don’t handle hyphenated words correctly (easy to implement), nor do we handle hyphenated line breaks (slightly harder). Remember from Basic Design that GNU word count doesn’t handle the hyphenated line break correctly either, and also appears to be incorrectly treating `–’ as a word, at least according to our single test.

Just for fun, I ran both implementations against this post – not surprisingly, they’re pretty close:

 Homegrown scsh version GNU version 1957 1988

## Summary

To sum up, we’ve implemented a relatively sane word count program and introduced a few interesting areas of Scheme and scsh:

• `rx’ regular expression notation
• `awk’ loop macro
• `fold’-style functional iteration
• Scheme 48/scsh records

You can read more about these and other features on the scsh website. Below you’ll find the full version of the program described in this post.

Finally, thanks for reading, and happy scsh hacking!

## Full Program Listing

```#!/usr/local/bin/scsh \
-e main -s

Display the number of words read from standard input
to standard output.
!#

(define wc-rx (rx (:                   ; start rx matching sequence
(* whitespace)      ; 0 or more spaces to start the line
(+ alphanumeric)    ; match 1 or more chars in [0-9A-Za-z]
(?                  ; then, optionally match the following:
(or #\' #\`)       ; - apostrophe or backtick
(* alphanumeric))  ; - 0 or more [0-9a-zA-Z]
(* whitespace))))   ; - 0 or more spaces to end the line

(define (main prog+args)
(display
(#t (+ words
(length
(regexp-fold-right wc-rx (lambda (m i lis)
(cons (match:substring m 0) lis))
'() line))))))
(newline))
```

## Footnotes:

1 For more information about records, as well as the other features introduced here, consult the excellent manual at http://www.scsh.net/docu/man.html.

(Image courtesy Melisande under Creative Commons license.)

## How I Built scsh 0.7 from Github (for newbies)

The following is a short message I sent to the scsh-users mailing list. It’s archived at Gmane, but I thought I’d add it here (with minor edits) as well for the sake of redundancy since it may prove useful to someone else.

Hey fellow scshers,

Warning: Newbie info follows, real hackers need not continue reading.

I’ve managed to build scsh 0.7 on my 32-bit Fedora box, but required slight tweaking of the Makefile so that scsh could see my Scheme48 1.9 development libraries. I had to tweak the Makefile vars like so:

CC = gcc -I/usr/local/include
LIBS = -lutil -L/usr/local/lib/scheme48-1.9T
SCHEME48 = /usr/local/bin/s48

Note that I already had Scheme 1.8 installed as `scheme48`, so I needed to call the 1.9 executable `s48`. Also note that installing the development version of Scheme48 1.9 (as required by scsh 0.7) will require you to have an already installed Scheme 48 of 1.6 or above.

I probably could have done this with a few `configure` flags like so (I think) 1:

./configure –libdir=/usr/local/lib/scheme48-1.9T
–includedir=/usr/local/include

Oh well. Posting this here if there are any mere lusers like myself hanging around.

It occurs to me that it might be nice to write up a step-by-step install guide that covers a number of systems (32- and 64-bit). I’ve got access to a 64-bit MacBook Pro; I think I’ll try that next and report back.

Best, Rich

## Footnotes:

1 I tried these `configure` flags and they don’t work. Use the directions for modifying the Makefile instead.