Advent of Code 2017, Day 2

This is my solution for Day 2 of this year’s Advent of Code.

You may also enjoy browsing the Day 2 solutions megathread on Reddit.

PROBLEM

The spreadsheet consists of rows of apparently-random numbers. To make sure the recovery process is on the right track, they need you to calculate the spreadsheet’s checksum. For each row, determine the difference between the largest value and the smallest value; the checksum is the sum of all of these differences.

For example, given the following spreadsheet:

5 1 9 5
7 5 3
2 4 6 8

The first row’s largest and smallest values are 9 and 1, and their difference is 8.

The second row’s largest and smallest values are 7 and 3, and their difference is 4.

The third row’s difference is 6.

In this example, the spreadsheet’s checksum would be 8 + 4 + 6 = 18.

SOLUTION

(define (line->list line)
  ;; String -> List
  (let ((read-ln (field-reader (infix-splitter (rx (+ whitespace)))))
        (in-port (make-string-input-port line)))
    (receive (record fields)
        (read-ln in-port)
      (map string->number fields))))

(define (read-spreadsheet file)
  ;; File -> List[List[Number]]
  (call-with-input-file file
    (lambda (port)
      (let loop ((line (read-line port))
                 (results '()))
        (if (eof-object? line)
            results
            (loop (read-line port) (cons line results)))))))

(define (main prog+args)
  (let ((rows (read-spreadsheet "/Users/rloveland/Code/personal/advent-of-code/2017/02/02.dat")))
    (write (apply + (map
                     (lambda (row)
                       (let* ((xs (line->list row))
                              (min (apply min xs))
                              (max (apply max xs)))
                         (- max min)))
                     rows)))
    (newline)))
Advertisements

A Mini Python and Shell Tutorial

wooly-mammoth-cp

The following is an email I sent to a couple of coworkers whom I’d been teaching a short Python course for technical writers, using Automate the Boring Stuff with Python. The email was meant to show them a real-life example of how a technical writer can use Python and shell scripting to automate something that is, well, boring. In this case, the task was to clean up a CSV file containing a list of git commits to the AppNexus REST APIs.

Because of the way we received this data, it had duplicate entries, and lots of non-interesting merge commits that were unrelated to a feature (a feature is generally associated with a JIRA ticket). Our task was to review the commits and see if there was anything interesting that should be added to our monthly API release notes.

(The names of my coworkers have been changed, obv.)


To: Jane X. (‘REDACTED@appnexus.com’)

Subject: Filtered API git commits to review (bonus: mini Python & shell tutorial)

From: Rich Loveland (‘REDACTED@appnexus.com’)

CC: Victoria Y. (‘REDACTED@appnexus.com’)

Date: Wed, 18 Nov 2015 16:59:04 -0500

+Victoria for the code fun

Jane, the file of commit logs for you to review is attached (along with some others). But so what, that’s boring! Let’s talk about how it was made.

To make the really boring task of reviewing API git commits less awful, let’s do some programming for fun. First let’s write a short Python script to pull out only those commits that have a JIRA ticket ID in them (since we don’t care about the other ones), and call it ‘filter-commit-messages.py’:

  #!/usr/bin/env python

  import re
  import sys

  jira_pat = "[A-Z]+-[0-9]+"

  for line in sys.stdin.readlines():
      m = re.search(jira_pat, line)
      if m:
          print(line)

This tries to match a regular expression against each line of its input (in this case the compiled API git commit list), and prints the line if the match occurs.

Let’s make it executable from our shell:

$ cd ~/bin
$ ln -s ~/work/code/filter-commit-messages.py filter-commit-messages
$ chmod +x ~/bin/filter-commit-messages
$ export PATH=$HOME/bin:$PATH

Then we can run it on the text file with the git commits like so:

$ filter-commit-messages < api-release-november-2015.csv

(The “<” in the shell means “Read your input from this place”.)

This prints out only the matching lines, but there are a lot of annoying extra lines in the output. We can get rid of those lines while sorting them like so:

$ filter-commit-messages < api-release-november-2015.csv | sort 

(The ”

” in the shell means “Pass your output through to this other command”.)

Now that we are extracting only the important lines, let’s throw them in a file:

$ filter-commit-messages < api-release-november-2015.csv | sort > api-release-november-2015-actual.csv

(The “>” near the end means “Write all of the output to this place”.)

We can see how much less reading we have to do now by running a word count program (‘wc’) on the before and after files:

$ wc -l api-release-november-2015.csv # old
     201 api-release-november-2015.csv
$ wc -l api-release-november-2015-actual.csv # new
     115 api-release-november-2015-actual.csv

(The “-l” means “count the lines”.)

Now, since Jane and I each have to review half of the commits, we can use the ‘split’ shell command to break the file in half. Since we know the file is 115 lines, we need to tell ‘split’ how many lines to put in each half with the ‘-l’ option (see ‘man split’ in your terminal):

$ split -l 58 api-release-november-2015-actual.csv COMMITS-TO-REVIEW

‘split’ takes the last argument, “COMMITS-TO-REVIEW”, and creates two files based on that, “COMMITS-TO-REVIEWaa” and “COMMITS-TO-REVIEWbb”, which we can rename for each reviewer:

$ mv COMMITS-TO-REVIEWaa COMMITS-TO-REVIEW-RICH
$ mv COMMITS-TO-REVIEWbb COMMITS-TO-REVIEW-JANE

A nice thing is that because we sorted the lines of the files, each reviewer gets commits by a sorted subset of the engineers, making it easier to see their related commits next to each other.

p.s. We didn’t actually need a Python program for the first part, we could have just used ‘grep’ and stayed with shell commands. But hey!

p.p.s. With more work, this could all be put together into a single program if we were inclined, but since it doesn’t get used that often it’s probably OK to type a few commands.

(Image courtesy William Hartman under Creative Commons license.)

Make your own iterators in Scheme using closures

idle-machines

Introduction

I recently bought a copy of Higher Order Perl (also known as HOP) by Mark Dominus. In chapter 4, he introduces iterators. At a high level, an iterator is a black box with an input funnel, an output funnel, and a big red “NEXT” button. You put something into the funnel at time T; then, at some later time T+n, you press the “NEXT” button and something useful comes out.

Iterators are useful in cases when you want to access a potentially huge list of things one at a time, but you don’t have access to a computer with infinite memory.

In this post we’ll look at a moderately useful iterator example; a directory walker that recurses down a directory tree on your file system. The implementation is given in Scheme (specifically, the scsh dialect); the code and the descriptions thereof are heavily inspired by Dominus’ excellent book. However, this post contains the work of an amateur. Any errors or omissions in the following code and explanations are mine.

Iterators are made with closures

In order to build a useful iterator, we need to have a basic understanding of closures. A closure is just a function packaged up with an environment. The environment allows us to store up state between function calls. As noted above, it’s easiest just to think of it as a little black box that we can pass around in our program.

In Scheme and Common Lisp, you build functions with LAMBDA. Here’s a very simple closure example in Scheme, translated from the first example given in HOP:

(define (upto m n)
  (set! m (- m 1)) ;needed since we don't have post-increment ($i++)
  (lambda ()
    (if (< m n)
        (begin (set! m (+ m 1)) m)
        #f)))

This is a function that returns a function (a closure). To use the closure returned by UPTO, assign it to a variable and call it whenever you want the next value. When the iterator is exhausted, it will return #f.

> (define 1to10 (upto 1 10))
> (1to10)
1
> (1to10)
2
...
> (1to10)
9
> (1to10)
10
> (1to10)
#f

A trivial file tree iterator

A more interesting and practical use of this technique is to walk a directory tree on your computer. In the Scheme procedure below, we take a list of directories and then build and return a closure that, when called, will walk the directory tree breadth-first, printing out the names of files as we go.

(define (dir-walk queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files)))
                          file))
                       ((file-regular? file) file)
                       (else #f)))))))

The important part to notice is the inner LAMBDA. This is where we create a closure by packaging up a procedure with an environment. The closure’s environment remembers the contents of the variable QUEUE in memory, so we can ask for the elements of QUEUE at our leisure.

Here it is running on my machine:

> (define dir-iter1 (dir-walk '("/Users/rloveland/Desktop/current/")))
> (dir-iter1)
"/Users/rloveland/Desktop/current/XProlog.jar"
> (dir-iter1)
"/Users/rloveland/Desktop/current/all-jira-issues.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/automate-campaign-setup.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/buflocal.scm"
> (dir-iter1)
"/Users/rloveland/Desktop/current/bundle.js"

Just for fun, here’s a version of the same function in Common Lisp. It’s essentially the same as the scsh version, save that it uses the nice CL pathname machinery instead of appending raw strings, and also makes use of the CL-FAD convenience library.

(defun dir-walk (queue)
  (let ((file nil))
    (lambda ()
      (if (not (null queue))
          (progn
            (setf file (car queue))
            (setf queue (cdr queue))
            (cond ((cl-fad:directory-pathname-p file)
                   (let ((new-files (cl-fad:list-directory file)))
                     (setf queue (append queue (mapcar #'(lambda (filename)
                                                           (merge-pathnames filename file))
                                                       new-files)))
                     file))
                  ((cl-fad:file-exists-p file) file)
                  (t nil)))))))

Usage is also essentially the same:

CL-USER> (defvar *tree-walker* (dir-walk '(#P"/Users/rloveland/Desktop/current/")))
*TREE-WALKER*
CL-USER> (type-of *tree-walker*)
FUNCTION
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Desktop/current/"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.debuggerDefaults"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.DS_Store"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hi"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hs"

A slightly less trivial file tree iterator

Listing out files and directories is nice, but not that useful. We’d like a way to see only those files and directories that have some interesting property.

This can be accomplished by passing in another argument to our DIR-WALK function: this argument will be yet another function that will test the current file to see whether it’s interesting to us. It’s pretty easy to change DIR-WALK to accept a function argument, INTERESTING?. This arbitrary function is used to check the file to see if we care about it.

This time around, when we build our internal queue, we use a call to FILTER to make sure that only interesting files get added.

(define (dir-walk* interesting? queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond
                  ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (filter interesting? (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files))))
                          (if (interesting? file) file)))
                  ((interesting? file) file)
                  (else #f)))
          #f))))

And here it is in use; in this example, we pass an INTERESTING? function that asks for only files that are not marked as executable:

> (define dir-iter2 (dir-walk* (lambda (f) (file-not-executable? f)) '("/home/rml/Desktop/current")))
> (dir-iter2)
> (dir-iter2)
"/home/rml/Desktop/current/A.hi"
> (dir-iter2)
"/home/rml/Desktop/current/A.hs"
> (dir-iter2)
"/home/rml/Desktop/current/A.o"
> (dir-iter2)
"/home/rml/Desktop/current/ABBREV.xlsx"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.class"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.java"

Conclusion

There are still bugs and unexpected behavior in DIR-WALK*. For example, on the first call to the resulting iterator there is no result due to the one-armed IF. There are also strange things that happen if we want to filter out directories from our results, since we mix together the FILE-DIRECTORY? check and the INTERESTING? check inside the iterator. However, despite these small nits, it can still do useful work on my machine 1, and it’s a good enough example of using closures to build iterators.

Here’s hoping you’ll enjoy playing with closures and iterators in your own code!

(Image courtesy zeitfanger.at under a Creative Commons License.)

Footnotes:

1 In the grandest internet tradition, “It works on my machine” ™.