Oh my, am I really considering XML?

Since writing Why Markdown is not my favorite text markup language, I’ve been thinking more about document formats.

More and more I begin to see the impetus for the design of XML, despite its sometimes ugly implementation. With XML you avoid much of the ambiguity of parsing plain-text-based formats and just write the document AST directly. Whether this is a good or bad thing seems to depend on the tools you have available to you, but I think I’m starting to see the light.

At $WORK, for example, I’ve been writing directly in the “XML-ish” Confluence storage format since it was introduced in Confluence 4. Combined with the right editing environment (such as that provided by Emacs’ nxml-mode), it’s easy to navigate XML “structurally” in such a way that you no longer really see the tags.

It’s sort of like being Neo in The Matrix except that, instead of making cool shit happen in an immersive virtual world you’re, um, writing XML.

However, not all is roses in XML-land. In an ideal world, you could maintain a set of XML documents and reliably transform them into other valid formats using a simple set of tools that are easy to learn and use. In reality, many of the extant XML tools such as XSLT exhibit a design aesthetic that is deeply unappealing to most programmers. The semantics of XSLT are interesting, but the syntax appears to be a result of the mistakes that are often made when programmers decide to create their own DSLs. Olin Shivers has a good discussion of the often-broken “little language” phenomenon in his scsh paper.

Speaking of Scheme, it’s possible that something reasonable can be built with SXML. I’ve also had good results using Perl and Mojo::DOM to build Graphviz diagrams of the links among Confluence wiki pages as part of a hacked-together “link checker” (Users of Confluence in an industrial setting will know that the built-in link-checking in Confluence only “sort of” works, which is indistinguishable in practice from not actually working — hence the need to build my own thing).

I’ve also been playing around with MIT Scheme’s built-in XML Parser, and so far I’m preferring it to the Perl or SXML way of doing things.

Advent of Code, Day 3

In this post I’ll describe my solution for Day 3 of the Advent of Code.

Problem Description

Day 3: Perfectly Spherical Houses in a Vacuum

Santa is delivering presents to an infinite two-dimensional grid of houses.

He begins by delivering a present to the house at his starting location, and then an elf at the North Pole calls him via radio and tells him where to move next. Moves are always exactly one house to the north (‘^’), south (‘v’), east (‘>’), or west (‘<‘). After each move, he delivers another present to the house at his new location.

However, the elf back at the north pole has had a little too much eggnog, and so his directions are a little off, and Santa ends up visiting some houses more than once. How many houses receive at least one present?

For example:

‘>’ delivers presents to 2 houses: one at the starting location, and one to the east.

‘^>v<‘ delivers presents to 4 houses in a square, including twice to the house at his starting/ending location.

‘^v^v^v^v^v’ delivers a bunch of presents to some very lucky children at only 2 houses.

Solution

Broadly speaking, my solution consisted of:

  • Reading the directions file to determine the largest x and y values of the grid
  • Making a shaped array using those dimensions (specifically, we double the array dimensions to allow for movement up, down, forward, and back)
  • Starting in the center of the shaped array, follow the instructions from the “map” and mark every house (called a “position” in the code) if it hasn’t already been visited
  • Every time we visit a house we haven’t already visited, we bump a counter

Here’s the Scheme code that accomplishes those steps:

;; read in the string
;; sum the ^ and v chars to get the height of the matrix (graph)
;; sum the < and > chars to get the width of the matrix (graph)

(define (north? ch) (char=? ch #\^))
(define (south? ch) (char=? ch #\v))
(define (east? ch) (char=? ch #\>))
(define (west? ch) (char=? ch #\<))

(define (make-shape width height)
  ;; Int Int Int -> Shape
  (shape 0 width 0 height))

(define (read-shape-file file)
  ;; Pathname -> Shape
  (with-input-from-file file
    (lambda ()
      (let loop ((width 0)
         (height 0)
         (min-width  0)
         (max-width 0)
         (min-height 0)
         (max-height 0)
         (ch (read-char)))
    (if (eof-object? ch)
        (make-shape
         (* 2 (- max-width min-width))
         (* 2 (-  max-height min-height)))
        (cond ((north? ch)
           (loop width (+ height 1)
             min-width max-width
             (min min-height height)
             (max max-height height)
             (read-char)))
          ((south? ch)
           (loop width (- height 1)
             min-width max-width
             (min min-height height)
             (max max-height height)
             (read-char)))
          ((east? ch)
           (loop (+ width 1) height
             (min min-width width)
             (max max-width width)
             min-height max-height
             (read-char)))
          ((west? ch)
           (loop (- width 1) height
             (min min-width width)
             (max max-width width)
             min-height max-height
             (read-char)))
          (else (error "WHOA"))))))))

;; We make a shaped, multi-dimensional array (SRFI-25) in the size
;; it's determined we need by our earlier check.

(define (make-grid shape)
  ;; Shape -> Array
  (make-array shape #f))

;; The POSITION data type

(define-record-type position
  (make-position x y)
  position?
  (x position-x set-position-x!)
  (y position-y set-position-y!))

(define (array-center arr)
  ;; Array -> Position
  (let ((len-x (array-length arr 0))
    (len-y (array-length arr 1)))
    (make-position (/ len-x 2)
           (/ len-y 2))))

(define (make-relative-position x y ch)
  ;; Int Int Char -> Position
  (let ((vals '()))
    (cond ((north? ch)
       (set! vals (list (+ x 1) y)))
      ((south? ch)
       (set! vals (list (- x 1) y)))
      ((east? ch)
       (set! vals (list x (- y 1))))
      ((west? ch)
       (set! vals (list x (+ y 1))))
      (else (set! vals (list x y))))
    (make-position (first vals)
           (second vals))))

(define (visited? arr x y)
  ;; Array Int Int -> Bool
  (array-ref arr x y))

(define (set-visited! arr x y)
  ;; Array Int Int -> Undefined
  (array-set! arr x y #t))

(define (visit-locations arr file)
  ;; Pathname -> Int
  (let ((visited-count 0))
    (with-input-from-file file
      (lambda ()
    (let loop ((ch (read-char))
           (current-position (array-center arr)))
      (if (eof-object? ch)
          visited-count
          (begin
        (let* ((current-x (position-x current-position))
               (current-y (position-y current-position))
               (next-position
            (make-relative-position current-x current-y ch)))
          (if (not (visited? arr current-x current-y))
              (begin
            (set! visited-count (+ visited-count 1))
            (set-visited! arr current-x current-y)
            (loop (read-char)
                  next-position))
              (loop (read-char) next-position))))))))
    visited-count))

;; eof

Once this code is loaded up in the REPL, you can use it as shown below. (Note that the answer shown at the end isn’t real to avoid a spoiler.)

(set! *the-array* (make-grid (read-shape-file (expand-file-name "~/Code/personal/advent-of-code/03.dat"))))
'#{Array:srfi-9-record-type-descriptor}

> (array-size *the-array*)
42588

> (array-center *the-array*)
'#{Position}

> (define *the-file* (expand-file-name "~/Code/personal/advent-of-code/03.dat"))

> (visit-locations *the-array* *the-file*)
12345

Related Posts

Advent of Code, Day 2

This post describes my solution for Day 2 of the Advent of Code.

Problem Description

First, the problem description (copied from the website):


Day 2: I Was Told There Would Be No Math

The elves are running low on wrapping paper, and so they need to submit an order for more. They have a list of the dimensions (length l, width w, and height h) of each present, and only want to order exactly as much as they need.

Fortunately, every present is a box (a perfect right rectangular prism), which makes calculating the required wrapping paper for each gift a little easier: find the surface area of the box, which is 2 x l x w + 2 x w x h + 2 x h x l. The elves also need a little extra paper for each present: the area of the smallest side.

For example:

  • A present with dimensions 2x3x4 requires 2 x 6 + 2 x 12 + 2 x 8 = 52 square feet of wrapping paper plus 6 square feet of slack, for a total of 58 square feet.
  • A present with dimensions 1x1x10 requires 2 x 1 + 2 x 10 + 2 x 10 = 42 square feet of wrapping paper plus 1 square foot of slack, for a total of 43 square feet.

All numbers in the elves’ list are in feet. How many total square feet of wrapping paper should they order?


Solution

Once again, we’ll be working in Scheme.

For this problem, I decided to create a “box” data type. In addition to the automatically generated accessors (thanks SRFI-9!), I wrote several procedures to perform calculations on boxes, namely:

  • SURFACE-AREA: Calculate the box’s surface area.
  • SMALLEST-SIDE: Determine which of the box’s sides has the smallest surface area (the extra material makes it easier to wrap).

WRAPPING-PAPER is just a “wrapper” (pun intended) around the first two.

LINE->BOX, READ-BOXES, and SUM-BOXES are all about parsing the input file contents and shuffling them into the box data type that we use to do the actual calculation. The only part that required a bit of thought was the line with STRING-TOKENIZE in LINE->BOX. In Perl I’d use my @params = split /x/, $line without even thinking, but I was less familiar with Scheme’s facility for solving this problem, so it took a few minutes to puzzle out the right part of Scheme’s “API”. (STRING-TOKENIZE was helpfully provided by SRFI-13.)

Abstract data types FTW! I’ll be using them more as the month’s challenges progress.

;; ,open srfi-9 srfi-13 sort

(define-record-type box
  (make-box l w h)
  box?
  (l box-length set-box-length!)
  (w box-width set-box-width!)
  (h box-height set-box-height!))

(define (surface-area box)
  ;; Box -> Int
  (let ((l (box-length box))
    (w (box-width box))
    (h (box-height box)))
    (+ (* 2 l w)
       (* 2 w h)
       (* 2 h l))))

(define (smallest-side box)
  ;; Box -> Int
  (define (smallest-two xs)
    ;; List -> List
    (let ((sorted (sort-list xs <)))
      (list (first sorted)
        (second sorted))))
  (let ((l (box-length box))
    (w (box-width box))
    (h (box-height box)))
    (apply * (smallest-two (list l w h)))))

(define (wrapping-paper box)
  ;; Box -> Int
  (let ((minimum (surface-area box))
    (extra (smallest-side box)))
    (+ minimum extra)))

(define (line->box line)
  ;; String -> Box
  (define (line->lon s)
    ;; String -> List<Number>
    (let ((xs (string-tokenize s (char-set-complement (char-set #\x)))))
      (map string->number xs)))
  (let* ((dims (line->lon line)))
    (let ((l (first dims))
      (w (second dims))
      (h (third dims)))
      (make-box l w h))))

(define (read-boxes file)
  ;; Pathname -> List<Box>
  (with-input-from-file file
    (lambda ()
      (let loop ((line (read-line))
         (ys '()))
    (if (eof-object? line)
        ys
        (loop
         (read-line)
         (cons (line->box line) ys)))))))

(define (sum-boxes boxes)
  ;; List<Box> -> Int
  (let ((xs (map wrapping-paper boxes)))
    (apply + xs)))

;; eof

Related Posts

Advent of Code, Day 1

origami-dragon

Advent of Code is a site that provides a programming problem for every day in December leading up to Christmas.

I’ve become a little obsessed with it over the last few days, and thought I’d write up my results. So far I’ve been working in Scheme.

Here’s the Day 1 problem description:


Day 1: Not Quite Lisp

Santa was hoping for a white Christmas, but his weather machine’s “snow” function is powered by stars, and he’s fresh out! To save Christmas, he needs you to collect fifty stars by December 25th.

Collect stars by helping Santa solve puzzles. Two puzzles will be made available on each day in the advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!

Here’s an easy puzzle to warm you up.

Santa is trying to deliver presents in a large apartment building, but he can’t find the right floor – the directions he got are a little confusing. He starts on the ground floor (floor 0) and then follows the instructions one character at a time.

An opening parenthesis, (, means he should go up one floor, and a closing parenthesis, ), means he should go down one floor.

The apartment building is very tall, and the basement is very deep; he will never find the top or bottom floors.

For example:

  • (()) and ()() both result in floor 0.
  • ((( and (()(()( both result in floor 3.
  • ))((((( also results in floor 3.
  • ()) and ))( both result in floor -1 (the first basement level).
  • ))) and )())()) both result in floor -3.

To what floor do the instructions take Santa?


The file of instructions looks something like this (but much larger):

()()((((()(((())(()(()((((((()(()(((())))((()(((()))((())(()((()

Here’s the (reasonably straight-forward) Scheme code. It basically just iterates through the input file, bumping a counter up or down based on the type of paren read from the input port. The definitions of UP-FLOOR? and DOWN-FLOOR? weren’t really necessary, but they made the main procedure a little easier to read.

(define (up-floor? x)
  (char=? x #\())

(define (down-floor? x)
  (char=? x #\)))

(define (find-floor file)
  (with-input-from-file file
    (lambda ()
      (let loop ((floor-number 0)
         (char (read-char)))
    (if (eof-object? char)
        floor-number
        (loop (cond ((up-floor? char)
             (+ floor-number 1))
            ((down-floor? char)
             (- floor-number 1))
            (else floor-number))
          (read-char)))))))

(Image courtesy Strongpaper under a Creative Commons license.)

A Solution to the Five Weekends Problem in Common Lisp

neuschwanstein

I’ve been enjoying playing around with Rosetta Code problems lately. Here’s a solution I posted last week for the Five Weekends problem in Common Lisp:

  ;;; http://rosettacode.org/wiki/Five_weekends

  ;; Given a date, get the day of the week.  Adapted from
  ;; http://lispcookbook.github.io/cl-cookbook/dates_and_times.html

  (defun day-of-week (day month year)
    (nth-value
     6
     (decode-universal-time
          (encode-universal-time 0 0 0 day month year 0)
          0)))

  (defparameter *long-months* '(1 3 5 7 8 10 12))

  (defun sundayp (day month year)
    (= (day-of-week day month year) 6))

  (defun ends-on-sunday-p (month year)
    (sundayp 31 month year))

  ;; We use the "long month that ends on Sunday" rule.
  (defun has-five-weekends-p (month year)
    (and (member month *long-months*)
             (ends-on-sunday-p month year)))

  ;; For the extra credit problem.
  (defun has-at-least-one-five-weekend-month-p (year)
    (let ((flag nil))
          (loop for month in *long-months* do
                   (if (has-five-weekends-p month year)
                           (setf flag t)))
          flag))

  (defun solve-it ()
    (let ((good-months '())
                  (bad-years 0))
          (loop for year from 1900 to 2100 do
             ;; First form in the PROGN is for the extra credit.
                   (progn (unless (has-at-least-one-five-weekend-month-p year)
                                    (incf bad-years))
                                  (loop for month in *long-months* do
                                           (when (has-five-weekends-p month year)
                                             (push (list month year) good-months)))))
          (let ((len (length good-months)))
            (format t "~A months have five weekends.~%" len)
            (format t "First 5 months: ~A~%" (subseq good-months (- len 5) len))
            (format t "Last 5 months: ~A~%" (subseq good-months 0 5))
            (format t "Years without a five-weekend month: ~A~%" bad-years))))

(Image copyright gacabo under Creative Commons license.)

Emacs Compilation Mode Regexp for Perl 6

Emacs Compilation Mode Regexp for Perl 6

Stick this in your .emacs if you want Perl 6 support in Emacs’ compilation mode:

  (add-to-list 'compilation-error-regexp-alist 'perl6-stack-trace)
  (add-to-list 'compilation-error-regexp-alist-alist
               '(perl6-stack-trace .
                                   ("at \\([A-Za-z-_]+\\(\.p[l6m]\\)?\\):\\([[:digit:]]+\\)"
                                    1 3)))

I don’t know if this would be a good addition to perl6-mode, or if compilation mode regexps are supposed to live elsewhere.

If you like this sort of thing, see also: flycheck-perl6.

Statistics over Git Repositories with Scsh

stream-small.jpg

Figure 1: A forest stream outside New Paltz, NY.

In this post I’ll share a scsh port of a nice shell script from Gary Bernhardt’s Destroy All Software screencasts. This is taken from season 1, episode 1 of the series 1.

The script is used to gather statistics on a git repository. You pass it a regex matching a filename, and it outputs a table showing how many lines of that type of file were included in each commit.

For example, I might want to see how the number of lines of documentation in Markdown files changed across commits:

$ repo-stats ".md$"
... snip! ...
52      c36cc6d First version of diff-checking code.
52      9ed53c3 Tweaks.
52      9e17d7e Add new service.
64      b293c3d Describe how to use the diffing code.
64      1886164 Update comments and documentation.
64      4a7ba26 Bump TODO prio.

The scsh code to do this is below; it’s a nearly 1:1 translation of Mr. Bernhardt’s bash code into scsh. It does differ in a few ways:

  • No dynamic/global variables: In the bash code there are variables being used inside functions that weren’t passed in as arguments to those functions. This is fine for small programs, but is probably not a Good Thing ™.
  • Since scsh is based on Scheme 48, we get a nice inspector/debugger for free.
  • At this program size, we don’t need to break out the Scheme 48 module system. However, if we wanted to integrate this scsh code cleanly with a larger system, we could do so fairly easily.
  • Something about how scsh is calling git and piping its output isn’t turning off git’s dumb (IMO) “I will behave differently depending on what kind of output I think I’m writing to” behavior. Therefore, unlike in Mr. Bernhardt’s example, we need to unset the GIT_PAGER environment variable.
  • Mr. Bernhardt used bash in his video due to its ubiquity. Scsh fails utterly in this regard, since almost no one uses it. However, that doesn’t really matter unless you need to distribute your code to a wider audience.2
  • Subjectively, Scheme is an immeasurably nicer language than whatever weird flavor of POSIXy sh is available.

Enough rambling, let’s have some code:

#!/usr/local/bin/scsh \
-e main -s
!#

(setenv "GIT_PAGER" "")

(define (revisions)
  (run/strings (git rev-list --reverse HEAD)))

(define (commit-description rev)
  (run/string (git log --oneline -1 ,rev)))

(define (number-of-lines file-pattern rev)
  (run/string
   (| (git ls-tree -r ,rev)
      (grep ,file-pattern)
      (awk "{print $3}")
      (xargs git show)
      (wc -l))))

(define (main prog+args)
  (let ((pat (second prog+args))
        (revs (revisions)))
    (for-each
     (lambda (rev)
       (let ((column-1 (string-trim-both (number-of-lines pat rev)))
             (column-2 (string-trim-both (commit-description rev))))
         (format #t "~A\t~A~%" column-1 column-2)))
     revs)))

Footnotes:

1

I feel like I should note for the record that:

  1. This is a legitimate, paid copy of Mr. Bernhardt’s videos that we’re working from.
  2. Although I’m only a few episodes into season 1, I am really enjoying the series and would recommend.
2

And if you do need to distribute your code to a wider audience, there is an easy way to dump a heap image that should be runnable by any other scsh VM of the same version. I’ve done this myself to distribute reasonably large/complex scripts to coworkers. I’m written a little scsh library to automate the process of installing an “app” in a heap image. I hope to write about it here soon.