Thinking about software documentation as the output of a lossy compression algorithm

How many times have you heard or read the following comments?

  1. “The docs are always out of date”
  2. “I don’t bother reading the docs, I just read the source code”
  3. “If you write self-documenting code, you don’t need to write docs”

If you work in software, I bet the answer is: a lot. They range in truth value from #1, which is a tautology, to #3, which is a fantasy. I can relate to #2, since I had to do it just yesterday when using a semi-undocumented Ruby library.

I think all of these points of view (and more) are explained when you think about software documentation as the output of a lossy compression algorithm. Many (most? all?) of the things that you love and/or hate about the documentation for a particular piece of software are explained by this.

Quoth wiki:

Well-designed lossy compression technology often reduces file sizes significantly before degradation is noticed by the end-user. Even when noticeable by the user, further data reduction may be desirable (e.g., for real-time communication, to reduce transmission times, or to reduce storage needs).

As such, the features of software documentation are similar to the features of other products of lossy compression algorithms.

Take mp3s for example. The goal of an mp3 is not to be the highest-fidelity replication of the audio experience it represents. The goal of an mp3 is to provide a “good enough” audio experience given the necessary tradeoffs that had to be made because of constraints such as:

  • Time: How expensive in CPU time is it to run the decompression algorithm? Can the CPU on the device handle it?
  • Space: How much disk space will the file take on the device?

Similarly, we might say that the goal of software documentation is not to be the highest-fidelity replication of the “understanding” experience it (theoretically) represents. The goal of a piece of documentation is to provide a “good enough” learning experience given the necessary tradeoffs that had to be made because of constraints such as:

  • Time: How expensive in person time is it to “run the decompression algorithm”, i.e., learn how the system works well enough to write something coherent? And then to actually write it? How many technical writers does the organization employ per engineer? (In many companies it’s ~1 per 40-50 engineers) How many concurrent projects is that writer working on, across how many teams?
  • Space: How much information does the user need to use the system? How little information can you get away with providing before users give up in disgust?

Remember that fewer, more effective materials cost more to produce. This is similar to the way better compression algorithms may cost more than worse ones along various axes you care about (dollar cost for proprietary algorithms, CPU, memory, etc.)

It takes longer to write more concise documentation, draw clear system diagrams, etc., since those are signs that you actually understand the system better, and have thus compressed the relevant information about it into fewer bytes.

And oh by the way, in practice in an “agile” (lol) environment you don’t have enough time to write the “best” docs for any given feature X. Just like the programmer who wrote feature X would likely admit that she didn’t have enough time to write the “best” implementation according to her standards.

Quoth Pascal:

I would have written a shorter letter, but I did not have the time.

So the next time you are frustrated by the docs for some piece of software (if any docs exist at all), instead of some platitude about docs sucking, think “oh, lossy compression”.

(Image courtesy Jonathan Sureau under Creative Commons license.)

The Josephus Problem

../img/origami-wreath.jpg

The Josephus Problem is deceptively simple. A number of people stand in a circle. (Let’s call this number N.) They agree to count around the circle, and every Mth person will be killed. As they go around the circle, in what order do they die? Another way of looking at it is: who is the last person left alive?

For an interesting introduction, I recommend checking out the Wikipedia article.

I recently found this problem described in the 2nd edition of Sedgwick’s Algorithms, and thought it would be fun to implement in Scheme. Because I’m not a real programmer (and I had to translate to Scheme from the Pascal in the book), it took me a couple of hours instead of a couple of minutes. But what an enjoyable couple of hours!

I started, as usual for algorithmic programming tasks, with a simulation. In the book, Sedgwick provides an example: If there are 9 people, and every 5th person in the circle is killed, the order in which they’ll die is: 5, 1, 7, 4, 3, 6, 9, 2, 8.

It’s usually easier to start work on a problem by thinking about a simpler case, so that’s what I did. I chose N = 5, and M = 2. Below I’ll show the results of my “paper” simulation (OK, the paper was actually an Emacs buffer).

First, let me show you how to read my “notation”. It should be pretty easy to figure out. Below is an example of one “state snapshot”. I’ve labelled each line with its meaning. If you are a Scheme programmer, you will surmise that each named item corresponds to a “named-let” loop variable. In addition, the vertical bar character in the XS row represents the state of the “cursor” as it goes around the circle of 5 people.

;; I.                           <-- State Number
;; ------------
;; |1 2 3 4 5                   <--    XS
;; '()                          <--    YS
;; 0                            <-- COUNT

Above, we are in state 1, at the start of the program. The cursor is at the beginning of the list XS, which represents the circle of people. YS is our results list, where we will store the “dead”. As we travel around the circle and people are killed, they are added to YS. We could think of it as a “death toll”, but really it’s just a list of integers. As we go around the circle of people, we keep incrementing COUNT. When COUNT reaches the value of M, that person will be “killed”, that is, be added to YS.

When this happens, we reset the COUNT to 0 and keep going around the circle. There’s a catch, however. As more dead people are added to our result list YS, we need to ignore the spaces those people used to occupy in our count as we build COUNT back up towards M. In other words, the circle will have 4 people in it after the first person is killed. This means that as the circle gets smaller and smaller, people are killed more frequently.

I’ll show you how I’ve solved that problem in Scheme in a moment; first, the simulation:

;; I.
;; ------------
;; |1 2 3 4 5
;; '()
;; 0

;; II.
;; ------------
;; 1| 2 3 4 5
;; '()
;; 1

;; III.
;; ------------
;; 1 2| 3 4 5
;; '()
;; 2

;; IV.
;; ------------
;; 1 _ 3| 4 5
;; '(2)
;; 1

;; V.
;; ------------
;; 1 _ 3 4| 5
;; '(2)
;; 2

;; VI.
;; ------------
;; 1 _ 3 _ 5|
;; '(4 2)
;; 1

;; VII.
;; ------------
;; 1| _ 3 _ 5
;; '(4 2)
;; 2

;; VIII.
;; ------------
;; _ _ 3| _ 5
;; '(1 4 2)
;; 1

;; IX.
;; ------------
;; _ _ 3 _ 5|
;; '(1 4 2)
;; 2

;; X.
;; ------------
;; _ _ 3| _ _
;; '(5 1 4 2)
;; 1

;; XI.
;; ------------
;; _ _ 3| _ _
;; '(5 1 4 2)
;; 2

;; XII.
;; ------------
;; _ _ _| _ _
;; '(3 5 1 4 2)
;; 2

Now that you’ve seen how the algorithm should work on paper, let’s write some Scheme! I should state for the record that I did not read the Wikipedia article linked above until after I wrote this code. (This will be pretty obvious when you look at the code.) Here it is:

(define (josephus xs m)
  (let ((orig-xs (list-copy xs))
        (true-m (- m 1)))
    (let ((len (length xs)))
      (let loop ((xs xs)
                 (ys '())
                 (count 0))
        (cond 
         ;; If the dead list grows to the same length as our original
         ;; list, we know we're finished.
         ((= len (length ys))
          (reverse ys))
         ;; If XS is null, we have gone once around the circle.  We
         ;; start around again, leaving the count unchanged.
         ((null? xs)
          (loop orig-xs ys count))
         ;; If the current person we're looking at is already dead
         ;; (a ghost), move on.  And don't bother bumping the
         ;; count.
         ((member (car xs) ys)
          (loop (cdr xs) ys count))
         ;; If we're here, it's also the case that the current person
         ;; we're looking at is *not* in the dead list.  Therefore we
         ;; check if the count is equal to M.  If so, they must die.
         ((= count true-m)
          (loop (cdr xs) (cons (car xs) ys) 0))
         ;; If we get here, the current person we're looking at is
         ;; alive.  We check if the count is *not* equal to M.  If it
         ;; is not, we skip this person and bump the count.
         ((not (= count true-m))
          (loop (cdr xs) ys (+ count 1)))
         ;; We should never get here.
         (else #f))))))

How does it compare to the output described in Sedgwick? It seems to work!

> (josephus '(1 2 3 4 5 6 7 8 9) 5)
'(5 1 7 4 3 6 9 2 8)

There are several inefficiencies we could tackle, though. The first and most obvious is that we should try to calculate the solution mathematically (as shown in some of the Wikipedia examples) instead of building up lists using CONS. Another is that we make a “canonical” copy of the input list for restarting the loop. A third is the use of MEMBER to determine whether the person we’re currently visiting is already in the “dead” list.

For example, here’s the trace output showing 19 calls to MEMBER for a pretty small input:

(josephus '(1 2 3 4 5) 2)
[Enter (member 1 '())
 Leave member #f]
... 17 more calls to member ...
[Enter (member 3 '(5 1 4 2))
 Leave member #f]
'(2 4 1 5 3)

That said, in this case I was happy just to get something working. If this were going to be used for anything “real”, I would probably rewrite it statefully using vectors. Most of the fun was in coming up with this first draft without resorting to any stateful idioms.
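As a head start on that mathematical version: the position of the last survivor satisfies the well-known recurrence J(1) = 0, J(n) = (J(n-1) + m) mod n, using zero-based positions. Here is a tiny sketch of that idea, separate from the list-walking code above (the name josephus-survivor is mine):

(define (josephus-survivor n m)
  ;; J(1) = 0; J(i) = (J(i-1) + m) mod i, with zero-based positions.
  (let loop ((i 2) (pos 0))
    (if (> i n)
        (+ pos 1)                ; convert back to one-based numbering
        (loop (+ i 1) (modulo (+ pos m) i)))))

;; (josephus-survivor 9 5) => 8, the last entry of '(5 1 7 4 3 6 9 2 8)
;; (josephus-survivor 5 2) => 3, the last entry of '(2 4 1 5 3)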

(Image courtesy Chris Lott via CC-BY license.)

Bottom Up Merge Sort in Scheme

../img/37-tiles.jpg

Recently I’ve begun a project to implement a number of basic algorithms in Scheme, which I’d like to eventually grow into a free (as in freedom) ebook. Having just done a Binary Search in Scheme, I thought it would be fun to give merge sort a try.

According to the mighty interwebs, merge sort is a good choice for sorting linked lists (a.k.a., Lisp lists). Unfortunately the only Lisp merge sort implementation examples I’ve been able to find on the web have been recursive, not iterative.

The implementation described here is an iterative, bottom-up merge sort, written in a functional style. (I daren’t say the functional style, lest any real Scheme wizards show up and burn me to a crisp.)

First, generate a list of random numbers

In order to have something to sort, we need a procedure that generates a list of random numbers – note that the docstring is allowed by MIT/GNU Scheme; YMMV with other Schemes.

(define (make-list-of-random-numbers list-length max)
  ;; Int Int -> List
  "Make a list of random integers less than MAX that's LIST-LENGTH long."
  (letrec ((maker
            (lambda (list-length max result)
              (let loop ((n list-length) (result '()))
                (if (= n 0)
                    result 
                    (loop (- n 1) (cons (random max) result)))))))
    (maker list-length max '())))
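A quick sanity check at the REPL (your numbers will differ, since they’re random):

(make-list-of-random-numbers 5 100)
;; => a five-element list of integers below 100, e.g. (57 3 99 41 12)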

Then, write a merge procedure

This implementation of the merge procedure is a straight port of the one described on the Wikipedia Merge Sort page, with one minor difference to make the sort faster [1].

An English description of the merge operation is as follows:

  • If both items passed in are numbers (or strings), wrap them up in lists and recur. (In this example we only care about sorting numbers)
  • If both lists are empty, return the result.
  • If neither list is empty:
    • If the first item in the first list is “less than” the first item in the second list, cons it onto the result and recur.
    • Otherwise, cons the first item in the second list on the result and recur.
  • If the first list still has items in it, cons the first item onto the result and recur.
  • If the second list still has items in it, cons the first item onto the result and recur.
  • If none of the above conditions are true, return #f. I put this here for debugging purposes while writing this code; now that the procedure is debugged, it is never reached. (Note: “debugged” just means “I haven’t found another bug yet”.)
(define (rml/merge pred l r)
  (letrec ((merge-aux
            (lambda (pred left right result)
              (cond 
               ((and (number? left)
                     (number? right))
                (merge-aux pred 
                           (list left) 
                           (list right) 
                           result))
               ((and (string? left)
                     (string? right))
                (merge-aux pred
                           (list left) 
                           (list right) 
                           result))
               ((and (null? left)
                     (null? right))
                (reverse result))
               ((and (not (null? left))
                     (not (null? right)))
                (if (pred (car left)
                          (car right))
                    (merge-aux pred
                               (cdr left)
                               right
                               (cons (car left) result))
                    (merge-aux pred
                               left
                               (cdr right)
                               (cons (car right) result))))
               ((not (null? left))
                (merge-aux pred (cdr left) right (cons (car left) result)))
               ((not (null? right))
                (merge-aux pred left (cdr right) (cons (car right) result)))
               (else #f)))))
    (merge-aux pred l r '())))

We can run a few merges to get a feel for how it works. The comparison predicate we pass as the first argument will let us sort all kinds of things, but for the purposes of this example we’ll stick to numbers:

(rml/merge < '(360 388 577) '(10 811 875 995))
;Value 11: (10 360 388 577 811 875 995)

(rml/merge < '(8 173 227 463 528 817) '(10 360 388 577 811 875 995))
;Value 12: (8 10 173 227 360 388 463 528 577 811 817 875 995)

(rml/merge < 
           '(218 348 486 520 639 662 764 766 886 957 961 964)
           '(8 10 173 227 360 388 463 528 577 811 817 875 995))
;Value 14: (8 10 173 218 227 348 360 388 463 486 520 528 577 639 662 764 766 811 817 875 886 957 961 964 995)

Finally, do a bottom up iterative merge sort

It took me a while to figure out how to do the iterative merge sort in a Schemely fashion. As usual, it wasn’t until I took the time to model the procedure on paper that I got somewhere. Here’s what I wrote in my notebook:

;;  XS                   |      RESULT
;;---------------------------------------------

'(5 1 2 9 7 8 4 3 6)            '()
    '(2 9 7 8 4 3 6)            '((1 5))
        '(7 8 4 3 6)            '((2 9) (1 5))
            '(4 3 6)            '((7 8) (2 9) (1 5))
                '(6)            '((3 4) (7 8) (2 9) (1 5))
                 '()            '((6) (3 4) (7 8) (2 9) (1 5))

;; XS is null, and RESULT is not of length 1 (meaning it isn't sorted
;; yet), so we recur, swapping the two:

'((6) (3 4) (7 8) (2 9) (1 5))  '()
          '((7 8) (2 9) (1 5))  '((3 4 6))
                      '((1 5))  '((2 7 8 9) (3 4 6))
                           '()  '((1 5) (2 7 8 9) (3 4 6))

;; Once more XS is null, but RESULT is still not sorted, so we swap
;; and recur again

'((1 5) (2 7 8 9) (3 4 6))      '()
                  '(3 4 6)      '((1 2 5 7 8 9))
                       '()      '((3 4 6) (1 2 5 7 8 9))

;; Same story: swap and recur!

'((3 4 6) (1 2 5 7 8 9))        '()
                     '()        '((1 2 3 4 5 6 7 8 9))

;; Finally, we reach our base case: XS is null, and RESULT is of
;; length 1, meaning that it contains a sorted list

'(1 2 3 4 5 6 7 8 9)

This was a really fun little problem to think about and visualize. It just so happens that it fell out in a functional style; usually I don’t mind doing a bit of state-bashing, especially if it’s procedure-local. Here’s the code that does the sort shown above:

(define (rml/merge-sort xs pred)
  (let loop ((xs xs)
             (result '()))
    (cond ((and (null? xs)
                (null? (cdr result)))
           (car result))
          ((null? xs)
           (loop result
                 xs))
          ((null? (cdr xs))
           (loop (cdr xs)
                 (cons (car xs) result)))
          (else
           (loop (cddr xs)
                 (cons (rml/merge pred
                                  (first xs)
                                  (second xs))
                       result))))))

That’s nice, but how does it perform?

A good test of our merge sort is to compare it to the system’s sort procedure. In the case of MIT/GNU Scheme, we’ll need to compile our code if we hope to get anywhere close to the system’s speed. If your Scheme is interpreted, you don’t have to bother of course.
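For reference, compiling in MIT/GNU Scheme is a single call to cf at the REPL; the extension of the output file (.com or .so) depends on how your Scheme was built:

;;; Compile mergesort.scm in the current working directory
(cf "mergesort")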

To make the test realistic, we’ll create three lists of random numbers: one with 20,000 items, another with 200,000, and finally a giant list of 2,000,000 random numbers. This should give us a good idea of our sort’s performance. Here’s the output of timing the first two sorts, with 20,000 and 200,000 items [2]:

;;; Load compiled code

(load "mergesort")
;Loading "mergesort.so"... done
;Value: rml/insertion-sort2

;;; Define our lists

(define unsorted-20000 (make-list-of-random-numbers 20000 200000))
;Value: unsorted-20000

(define unsorted-200000 (make-list-of-random-numbers 200000 2000000))
;Value: unsorted-200000

;;; Sort the list with 20,000 items

(with-timing-output (rml/merge-sort unsorted-20000 <))
;Run time:      .03
;GC time:       0.
;Actual time:   .03

(with-timing-output (sort unsorted-20000 <))
;Run time:      .02
;GC time:       0.
;Actual time:   .021

;;; Sort the list with 200,000 items

(with-timing-output (rml/merge-sort unsorted-200000 <))
;Run time:      .23
;GC time:       0.
;Actual time:   .252

(with-timing-output (sort unsorted-200000 <))
;Run time:      .3
;GC time:       0.
;Actual time:   .3

As you can see, our sort procedure is on par with the system’s for these inputs. Now let’s turn up the heat. How about a list with 2,000,000 random numbers?

;;; Sort the list with 2,000,000 items

(define unsorted-2000000 (make-list-of-random-numbers 2000000 20000000))
;Value: unsorted-2000000

(with-timing-output (rml/merge-sort unsorted-2000000 <))
;Aborting!: out of memory
;GC #34: took:   0.80 (100%) CPU time,   0.10 (100%) real time; free: 11271137
;GC #35: took:   0.70 (100%) CPU time,   0.90  (81%) real time; free: 11271917
;GC #36: took:   0.60 (100%) CPU time,   0.90  (99%) real time; free: 11271917

(with-timing-output (sort unsorted-2000000 <))
;Run time:      2.48
;GC time:       0.
;Actual time:   2.474

No go. On a MacBook with 4GB of RAM, our merge sort runs out of memory, while the system sort procedure works just fine. It seems the wizards who implemented this Scheme system knew what they were doing after all! :-}

It should be pretty clear at this point why we’re running out of memory. In MIT/GNU Scheme, the system sort procedure uses vectors and mutation (and is no doubt highly tuned for the compiler), whereas we take a relatively brain-dead approach that uses lists and lots of cons-ing. I leave it as an exercise for the reader (or perhaps my future self) to rewrite this code so that it doesn’t run out of memory.
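Since I will probably forget what I had in mind, here is a rough sketch of the kind of vector-based rewrite I mean (the name rml/merge-sort-vec is made up for this sketch, and I haven’t timed it against the runs above): copy the list into a vector once, merge ever-wider runs back and forth between it and a scratch vector, and only build a list again at the very end.

(define (rml/merge-sort-vec xs pred)
  (let* ((src (list->vector xs))
         (n (vector-length src))
         (dst (make-vector n)))
    ;; Merge the sorted runs src[lo..mid) and src[mid..hi) into dst[lo..hi).
    (define (merge-runs! lo mid hi)
      (let loop ((i lo) (j mid) (k lo))
        (cond ((= k hi) 'done)
              ((and (< i mid)
                    (or (= j hi)
                        (not (pred (vector-ref src j) (vector-ref src i)))))
               (vector-set! dst k (vector-ref src i))
               (loop (+ i 1) j (+ k 1)))
              (else
               (vector-set! dst k (vector-ref src j))
               (loop i (+ j 1) (+ k 1))))))
    (let pass ((width 1))
      (if (>= width n)
          (vector->list src)
          (begin
            (let run ((lo 0))
              (if (< lo n)
                  (let ((mid (min (+ lo width) n))
                        (hi (min (+ lo (* 2 width)) n)))
                    (merge-runs! lo mid hi)
                    (run hi))))
            ;; The merged runs now live in DST; swap the vectors and widen.
            (let ((tmp src)) (set! src dst) (set! dst tmp))
            (pass (* 2 width)))))))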

(Image courtesy mag3737 under Creative Commons license.)

Footnotes:

[1] An earlier implementation started off the sort by “exploding” the list to be sorted so that '(1 2 3) became '((1) (2) (3)). This is convenient for testing purposes, but very expensive. It’s also unnecessary after the first round of merging. We avoid the need to explode the list altogether by teaching merge to accept numbers and listify them when they appear. We could also do the same for strings and other types as necessary.

[2] For the definition of the with-timing-output macro, see here.

Binary Search in Scheme

../img/hex-twist-variation-2.jpg

Intro

Just for fun, I’ve begun translating some of the algorithms from Mastering Algorithms with Perl into Scheme. My hope is that I’ll get two things out of this: a better knowledge of algorithms, and of Scheme hacking.

Binary search is one of the first algorithms listed in the book; it’s tricky to write a correct binary search, but I had the Perl code to work from. Let’s see how I did.

What’s binary search?

Binary search is a method for finding a specific item in a sorted list. Here’s how it works:

  1. Take a guess that the item you want is in the middle of the current search “window” (when you start, the search window is the entire list).
  2. If the item is where you guessed it would be, return the index (the location of your guess).
  3. If your guess is “less than” the item you want (based on a comparison function you choose), recur, this time raising the “bottom” of the search window to the midway point.
  4. If your guess is “greater than” the item you want (based on your comparison function), recur, this time lowering the “top” of the search window to the midway point.

In other words, you cut the size of the search window in half every time through the loop. This gives you a worst-case running time of about (/ (log n) (log 2)) steps. This means you can find an item in a sorted list of 20,000,000,000 (twenty billion) items in about 34 steps.
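A quick back-of-the-envelope check of that claim at the REPL:

(/ (log 20000000000) (log 2))
;; => roughly 34.2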

Reading lines from a file

Before I could start writing a binary search, I needed a sorted list of items. I decided to work with a sorted list of words from /usr/share/dict/words, so I wrote a couple of little procedures to make a list of words from a subset of that file. (I didn’t want to read the entire large file into a list in memory.)

Note: Both format and the Lisp-inspired #!optional keyword are available in MIT Scheme; they made writing the re-matches? procedure more convenient.

  • re-matches? checks if a regular expression matches a string (in this case, a line from a file).
  • make-list-of-words-matching is used to loop over the lines of the words file and return a list of lines matching the provided regular expression.

Now I have the tools I need to make my word list.

(load-option 'format)

(define (re-matches? re line #!optional display-matches)
  ;; Regex String . Boolean -> Boolean
  "Attempt to match RE against LINE. Print the match if DISPLAY-MATCHES is set."
  (let ((match (re-string-match re line)))
    (if match
        (if (not (default-object? display-matches))
            (begin (format #t "|~A|~%" (re-match-extract line match 0))
                   #t)
            #t)
        #f)))

(define (make-list-of-words-matching re file)
  ;; Regex String -> List
  "Given a regular expression RE, loop over FILE, gathering matches."
  (call-with-input-file file
    (lambda (port)
      (let loop ((source (read-line port)) (sink '()))
        (if (eof-object? source)
            sink
            (loop (read-line port) (if (re-matches? re source)
                             (cons source sink)
                             sink)))))))
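For example, assuming a copy of the words file sits in the current working directory (the same assumption the tests below make), building the sorted word list used in the rest of this post looks like this:

(define sorted
  (sort (make-list-of-words-matching "acc" "words") string<?))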

Writing tests

Since I am not one of the 10% of programmers who can implement a correct binary search on paper, I started out by writing a test procedure. The test procedure grew over time as I found bugs and read an interesting discussion about the various edge cases a binary search procedure should handle. These include:

  • Empty list
  • List has one word
  • List has two words
  • Word is not there and “less than” anything in the list
  • Word is not there and “greater than” anything in the list
  • Word is first item
  • Word is last item
  • List is all one word
  • If multiple copies of the word are in the list, return the first one found (this could also be implemented to always return the first or the last duplicate)

Furthermore, I added a few “sanity checks” that check the return values against known outputs. Here are the relevant procedures:

  • assert= checks two numbers for equality and prints a result
  • assert-equal? checks two Scheme objects against each other with equal? and prints a result
  • run-binary-search-tests reads in words from a file and runs all of our tests
(define (assert= expected got #!optional noise)
  ;; Int Int -> IO
  (if (= expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (assert-equal? expected got #!optional noise)
  ;; Thing Thing -> IO
  (if (equal? expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (run-binary-search-tests)
  ;; -> IO
  "Run our binary search tests using known words from the 'words' file.
This file should be in the current working directory."
  (with-working-directory-pathname (pwd)
    (lambda ()
      (if (file-exists? "words")
          (begin
            (format #t "file 'words' exists, making a list...~%")
            (let* ((unsorted (make-list-of-words-matching "acc" "words"))
                   (sorted (sort unsorted string<?)))
              (format #t "doing binary searches...~%")
              (assert-equal? #f (binary-search "test" '())) ; empty list
              (assert-equal? #f (binary-search "aardvark" sorted)) ; element absent and too small
              (assert-equal? #f (binary-search "zebra" sorted)) ; element absent and too large
              (assert= 0 (binary-search "accusive" '("accusive"))) ; list of length one
              (assert= 0 (binary-search "acca" sorted)) ; first element of list
              (assert= 1 (binary-search "aardvark" '("aardvark" "aardvark" "babylon"))) ; multiple copies of word in list
              (assert= 1 (binary-search "barbaric" '("accusive" "barbaric"))) ; list of length two
              (assert= 98 (binary-search "acclamator" sorted))
              (assert= 127 (binary-search "aardvark" (map (lambda (x) "aardvark") test-list))) ; list is all one value
              (assert= 143 (binary-search "accomplice" sorted))
              (assert= 254 (binary-search "accustomedly" sorted))
              (assert= 255 (binary-search "accustomedness" sorted)))))))) ; last element of list

The binary search procedure

Finally, here’s the binary search procedure; it uses a couple of helper procedures for clarity.

  • ->int is a helper procedure that does a quick and dirty integer conversion on its argument
  • split-difference takes a low and high number and returns the floor of the halfway point between the two
  • binary-search takes an optional debug-print argument that I used a lot while debugging. The format statements and the optional argument tests add a lot of bulk – now that the procedure is debugged, they can probably be removed. (Aside: I wonder how much “elegant” code started out like this and was revised after sufficient initial testing and debugging?)
(define (->int n)
  ;; Number -> Int
  "Given a number N, return its integer representation.
N can be an integer or flonum (yes, it's quick and dirty)."
  (flo:floor->exact (exact->inexact n)))

(define (split-difference low high)
  ;; Int Int -> Int
  "Given two numbers, return their rough average."
  (if (= (- high low) 1)
      1
      (->int (/ (- high low) 2))))
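A couple of quick REPL checks of the helpers, worked out by hand:

(->int 4.5)
;; => 4

(split-difference 0 9)
;; => 4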

(define (binary-search word xs #!optional debug-print)
  ;; String List -> Int
  "Do binary search of list XS for WORD. Return the index found, or #f."
  (if (null? xs)
      #f
      (let loop ((low 0) (high (- (length xs) 1)))
        (let* ((try (+ low (split-difference low high)))
               (word-at-try (list-ref xs try)))
          (cond
           ((string=? word-at-try word) try)
           ((< (- high low) 1) #f)
           ((= (- high try) 1) 
            (if (string=? (list-ref xs low) word)
                low
                #f))
           ((string<? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #f "(string<? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop (+ 1 try) high)) ; raise the bottom of the window
                (loop (+ 1 try) high)))
           ((string>? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #f "(string>? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop low (+ 1 try))) ; lower the top of the window
                (loop low (+ 1 try))))
           (else #f))))))

Takeaways

This exercise has taught me a lot.

  1. Writing correct code is hard. (I’m confident that this code is not correct.) You need to figure out your invariants and edge cases first. I didn’t, and it made things a lot harder.
  2. It’s been said a million times, but tests are code. The tests required some debugging of their own.
  3. Once they worked, the tests were extremely helpful. Especially now that I’m at the point where (if this were “for real”) additional features would need to be added, the format calls removed, the procedure speeded up, and so on.

I hope this has been useful to some other aspiring Scheme wizards out there. Happy Hacking!

(Image courtesy Melisande under Creative Commons license.)

What is the Highest Sum of a Number Triangle?

https://logicgrimoire.files.wordpress.com/2012/02/wpid-paper-triangles.jpg

A Description of the Problem

We are given a triangle of numbers, and we are asked to write a program that computes the highest sum of numbers passed on a route that starts at the top and ends somewhere on the base.

Some additional restrictions:

  • Each step can go diagonally down to the left or the right.
  • The number of rows in the triangle will be between 1 and 100, inclusive.
  • The numbers that populate the triangle are integers between 0 and 99.

(Read the original description here.)

What are our Inputs and Outputs?

Our initial input data will live in a file called triangle-input.txt, which contains the following:

5
7
3 8
8 1 0 
2 7 4 4
4 5 2 6 5

Note that the first line of the file is not part of the triangle itself; it’s there to tell us how many levels deep the triangle goes.

We’ll place our output in a file called triangle-output.txt, which will contain a single integer:

30

The names of our input and output files live in the program constants INPUT_FILE and OUTPUT_FILE, respectively.

Reading the Input, Writing the Output

First, we’ll need to get the values out of triangle-input.txt and store them in a convenient structure. In this case, an array of arrays should do.

We begin by creating an empty array, tmp, which will act as a temporary storage space for the lines of triangle-input.txt.

We’ll read the lines of the file one at a time. For each line, we split the string, e.g., “1 2 3 4 5\n”, into a list, and push that line onto our array tmp.

tmp = []

File.open(INPUT_FILE, "r") do |f|
  f.lines.each do |line|
    tmp.push line.split
  end
end

Unfortunately, we’re pushing an array of strings onto tmp, since

"4 5 2 6 5\n".split

returns this:

["4", "5", "2", "6", "5"]

Therefore, we’ll take a second pass through and convert those arrays of strings into arrays of numbers. The resulting array will allow us to – finally! – begin our real work. Notice that this is a nested call to map, best read from the inside out. The value returned by the inner map is returned to the outer, with the results stored in variable tri, since the return value of map is always an array.

tri = tmp.map { |array| array.map { |elem| elem.to_i } }

We’ll wrap everything here up in a method, read_triangle_input, which will show up in the complete program listing below. We’ll also leave the write_triangle_output method for the listing; it requires little explanation.

Solving our Actual Problem

Now that the housekeeping chores are out of the way, we can jump in and begin the real work of finding the highest sum in our triangle.

We’d like to write a method, triangle_sum, which, given an array of arrays like the one we’ve just constructed, returns an integer representing the highest sum calculated on its imagined path “down through the triangle.”

Since our route through the triangle is restricted to “steps” that can “go diagonally down to the left or the right,” the most natural representation of this data is as a tree. We’ll simulate this in a crude way using an array of arrays.

The Inner Loop

Since our fundamental data structure is an array, we’ll need to choose a looping construct; we’ll stick with the “C-style” iterative constructs here since they map well onto this problem (no pun intended). We’ll use an index variable to keep track of where we are in the iteration.

Inside the loop, we want to know three things (among others):

  • Where am I?
  • Where is the element to my upper left?
  • Where is the element to my upper right?

We’ll denote the answers to these questions with the variables this, upleft, and upright in our program listing below.

Remember the problem description: Starting at the root of the tree, we’ll keep a running sum of all the numbers we’ve seen thus far on this particular path through the tree.

Visualizing Our Movement Down the Triangle

In order to solve this problem, we started with some hand-simulation of what would eventually become the final algorithm. The example below shows how it works: given where we are in the array of arrays, we look “up and to the left” and “up and to the right” of where we are. In the following diagrams, we denote our position in the algorithm with a lovely text arrow.

Index: 1

[7] <---
[3, 8]
[8, 1, 0]
[2, 7, 4, 4]
[4, 5, 2, 6, 5]

When index is 1, there isn’t anything to do, since we can’t look “up” at anything. Therefore, move on.

Index: 2

[7]
[10, 15] <---  Was: [3, 8]
[8, 1, 0]
[2, 7, 4, 4]
[4, 5, 2, 6, 5]

When index is 2, we look up and to the left (7), and up and to the right (also 7). Each time through the loop, we create a new temporary array called next_row, in which to store the running sums. In this case, we create a new array [10, 15] by adding 7 to each of [3, 8]. We then replace the old array with the new.

Index: 3

[7]
[10, 15]
[18, 11, 16, 15] <--- Was: [8, 1, 0]
[2, 7, 4, 4]
[4, 5, 2, 6, 5]

index is now 3. We perform the same operations as above: first, create an empty array. Then, for each element in the old array [8, 1, 0], we add the value of the element (which we’ll call this in our program code) to the values of upleft and upright (these are obviously our variable names for the “up and to the left” and “up and to the right” values we’ve already mentioned). In each case we push the results of these additions onto the new temporary array. Once we’ve finished, we replace the existing array with the new, as before.

Index: 4

[7]
[10, 15]
[18, 11, 16, 15]
[20, 25, 18, 15, 20, 20] <--- Was: [2, 7, 4, 4]
[4, 5, 2, 6, 5]

Index: 5

[7]
[10, 15]
[18, 11, 16, 15]
[20, 25, 18, 15, 20, 20]
[24, 25, 30, 27, 20, 24, 21, 20] <--- Was: [4, 5, 2, 6, 5]

Result: 30

Here we show two steps more of the process, and its completion. We can easily see that 30 is the largest sum in the last array, and our answer.

We notice that the “new” interior arrays we’re creating on each turn through the loop are longer than the originals they replace, so we’re not being as efficient with memory as we’d like. At least Array expansion is an O(n) operation!
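For comparison, the textbook dynamic-programming formulation keeps every row its original length by adding, to each entry, the larger of the two sums directly above it. Here is a rough sketch in Scheme (my usual scratch language; the name triangle-max-sum and the details are mine, not part of the Ruby program below):

(define (triangle-max-sum rows)
  ;; Fold each row into a row of "best path sums so far":
  ;; best[i] = row[i] + max(parent sums); the two ends have only one parent.
  (define (step prev row)
    (let loop ((i 0) (acc '()))
      (if (= i (length row))
          (reverse acc)
          (let ((parent (cond ((= i 0) (car prev))
                              ((= i (length prev)) (list-ref prev (- i 1)))
                              (else (max (list-ref prev (- i 1))
                                         (list-ref prev i))))))
            (loop (+ i 1) (cons (+ (list-ref row i) parent) acc))))))
  (let loop ((prev (car rows)) (rest (cdr rows)))
    (if (null? rest)
        (apply max prev)
        (loop (step prev (car rest)) (cdr rest)))))

;; (triangle-max-sum '((7) (3 8) (8 1 0) (2 7 4 4) (4 5 2 6 5))) => 30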

The Complete Program Listing of triangle.rb

This essay is over 1200 words long already according to wc -w. Therefore, since this algorithm can be described very succinctly in code, I’ll break the rules of literate programming and simply end with the program listing itself. Note the temporary variables next_row, this, upleft, and upright, which are described in the section “Visualizing Our Movement Down the Triangle” above.

As always, the contents of this literate program are available at Github.

(Update: better solutions and discussion over at the Ruby Reddit)

#!/usr/bin/env ruby

require 'test/unit'

INPUT_FILE  = "triangle-input.txt"
OUTPUT_FILE = "triangle-output.txt"

def read_triangle_input
  tmp = []

  File.open(INPUT_FILE, "r") do |f|
    f.lines.each do |line|
      tmp.push line.split
    end
  end

  tri = tmp.map { |array| array.map { |elem| elem.to_i } }
end

def write_triangle_output(result)
  File.open(OUTPUT_FILE, "w") do |f|
    f.print result
  end
end

def triangle_sum(tri)
  a = Array.new(tri)
  index = 1
  len = a.shift[0]-1
  while index <= len
    next_row = []
    for i in 0..index
      this = a[index][i]
      upleft = a[index-1][i-1]
      upright = a[index-1][i]

      if i == 0
        next_row.push this + upright
      elsif i == index
        next_row.push this + upleft
      else
        next_row.push this + upleft
        next_row.push this + upright
      end
    end
    a[index] = next_row
    index += 1
  end
  a[index-1].max
end

tri = read_triangle_input
highest_sum = triangle_sum(tri)
write_triangle_output(highest_sum)  

class TestTriangleSum < Test::Unit::TestCase
  def test_01
    tri = read_triangle_input
    expected = triangle_sum(tri)
    assert_equal expected, 30
  end
end

(Image courtesy Mélisande* under Creative Commons license.)

Finding the Equilibrium Index of an Array

The Problem Description

A zero-indexed array A consisting of N integers is given. An
equilibrium index of this array is any integer P such that 0 <=
P < N and the sum of elements of lower indices is equal to the sum
of elements of higher indices, i.e.,

A[0] + A[1] + ... + A[P-1] =
A[P+1] + ... + A[N-2] + A[N-1].

The sum of zero elements is assumed to be equal to 0.

Write a function that, given a zero-indexed array A consisting of
N integers, returns any of its equilibrium indices. The function
should return -1 if no equilibrium index exists.

  • Expected worst-case time complexity is O(N)
  • Expected worst-case space complexity is O(N) beyond input storage (not counting the storage required for input arguments)

The above is taken from the problem description by Codility; they discuss one solution (written in C) here.

Notes on this Implementation

First things first: we’ll avoid the return -1 “C-ism,” as it’s more natural to return nil in Ruby. In fact, Ruby treats -1 as truthy (only false and nil are falsy), so returning it would break many common truth-value checks, whereas nil won’t.

Experienced programmers will observe that this implementation does not meet the O(n) time and space complexity requirements as listed above. Absolute simplicity of implementation is the focus right now, but that may change in a future version.

Finally, notice that this short essay is itself a literate program, implemented using the wonderful Org-mode capabilities which are included with the Emacs text editor.

The Inner Loop

Since most of the program’s work is done inside a single while loop, we’ll begin there.

Let A be the Ruby array

[-7, 1, 5, 2, -4, 3, 0].

The equilibrium index of A can be computed by hand, so it’s a good place to start. It’s also our first test case (see below).

We’ve chosen to implement the iteration over A using a while loop rather than the more idiomatic Array#each method since we need to keep track of our current index into the array.

As we begin the loop, the variable index is initialized to the length of the array minus one, since Ruby’s arrays are zero-indexed and the last valid index is therefore Array#length - 1.

while index > 0
  left  = a[0..index-1].reduce(:+)
  right = a[index+1..len].reduce(:+)
  if left == right
    return index
  end
  index -= 1
end

We’ll iterate backwards through the array from the end. At each value of index, we’ll split A into two smaller arrays, left and right. This requires allocating two new arrays in memory, reading the values of the desired “slices” of A, and copying them into left and right, respectively. This operation provides for an implementation that is simple to visualize and understand, but it’s also the reason why we fail to meet the requirements for O(n) space and time complexity. A more efficient implementation would avoid these unnecessary allocation and copying steps, and we might change this program at some point to achieve that.
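As a note for that future version, here is the O(N) prefix-sum idea sketched in Scheme (my usual scratch language; the name equi-linear and the details are mine, and it returns the lowest equilibrium index rather than the highest, which the problem statement allows). It keeps a running sum of everything to the left of the current index and derives the right-hand sum from a precomputed total, so each index is checked in constant time:

(define (equi-linear xs)
  ;; TOTAL is the sum of the whole list; LEFT is the sum of the elements
  ;; before index I, so the right-hand sum is TOTAL - LEFT - (car xs).
  (let ((total (apply + xs)))
    (let loop ((xs xs) (i 0) (left 0))
      (cond ((null? xs) #f)
            ((= left (- total left (car xs))) i)
            (else (loop (cdr xs) (+ i 1) (+ left (car xs))))))))

;; (equi-linear '(-7 1 5 2 -4 3 0)) => 3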

Visualizing the Iteration

Let’s look at left and right at each stage of the iteration:

https://logicgrimoire.files.wordpress.com/2012/02/wpid-equi.png

We can see that when we split A at index, we always leave a “gap” in between, and it’s this gap which will provide us with the answer we seek: the equilibrium index (provided that the equilibrium index is defined for the given array, that is). At every iteration in the diagram above, we sum the values within left and right using the Array#reduce method. If left and right are equal, index is defined as the equilibrium index, and we return index. Otherwise, we’ll end up escaping the loop and returning nil.

The equi Function

Looking briefly at the entire equi function, we can see that it’s just a wrapper for the while loop. First we set the values of index and len as bookkeeping measures. We then enter the while loop as described above. If the loop completes without returning a value, control falls through to the final expression and the function returns nil, which lets our caller know that this array doesn’t have an equilibrium index.

def equi(a)
  index = a.length-1
  len   = a.length-1
  while index > 0
    left  = a[0..index-1].reduce(:+)
    right = a[index+1..len].reduce(:+)
    if left == right
      return index
    end
    index -= 1
  end
  nil
end

Test Cases

Over the course of developing the program, a number of test cases have come to mind. We’ll begin with the given array A that we started with.

def test_random
  sample_array = [-7, 1, 5, 2, -4, 3, 0]
  expected = equi(sample_array)
  assert_equal expected, 3
end

Here we’ve defined a trivial “pyramid” array of values that ascend and descend symmetrically.

def test_trivial_pyramid
  sample_array = [1, 2, 3, 4, 3, 2, 1]
  expected = equi(sample_array)
  assert_equal expected, 3
end

This test checks the case where the first value is equal to the sum of all the rest (save the equilibrium index, of course).

def test_biggest_first
  sample_array = [99, 0, 66, 32, 1]
  expected = equi(sample_array)
  assert_equal expected, 1
end

Similarly, we check for the case where the last value alone is equal to the sum of left.

def test_biggest_last
  sample_array = [66, 32, 1, 0, 99]
  expected = equi(sample_array)
  assert_equal expected, 3
end

We should return nil for an array with a single element, since the equilibrium index is not defined in that case.

def test_single_element
  sample_array = [0]
  expected = equi(sample_array)
  assert_equal expected, nil
end

The same is true of an array containing a single nil.

def test_single_nil
  sample_array = [nil]
  expected = equi(sample_array)
  assert_equal expected, nil
end

The Complete Program Listing

Finally, we have the complete program listing for the tangle'd file literate-equi.rb. Since we’ve included our tests by subclassing Test::Unit::TestCase, Ruby will run them for us when we invoke the program. Running ruby literate-equi.rb at the command shell should produce the following output:

~/Desktop/Dropbox/current/logicgrimoire $ ruby literate-equi.rb 
Loaded suite literate-equi
Started
......
Finished in 0.002195 seconds.

6 tests, 6 assertions, 0 failures, 0 errors

The program itself:

#!/usr/bin/env ruby

require 'test/unit'

def equi(a)
  index = a.length-1
  len   = a.length-1
  while index > 0
    left  = a[0..index-1].reduce(:+)
    right = a[index+1..len].reduce(:+)
    if left == right
      return index
    end
    index -= 1
  end
  nil
end


class TestEqui < Test::Unit::TestCase
  def test_random
    sample_array = [-7, 1, 5, 2, -4, 3, 0]
    expected = equi(sample_array)
    assert_equal expected, 3
  end

  def test_trivial_pyramid
    sample_array = [1, 2, 3, 4, 3, 2, 1]
    expected = equi(sample_array)
    assert_equal expected, 3
  end
  def test_biggest_first
    sample_array = [99, 0, 66, 32, 1]
    expected = equi(sample_array)
    assert_equal expected, 1
  end

  def test_biggest_last
    sample_array = [66, 32, 1, 0, 99]
    expected = equi(sample_array)
    assert_equal expected, 3
  end

  def test_single_element
    sample_array = [0]
    expected = equi(sample_array)
    assert_equal expected, nil
  end

  def test_single_nil
    sample_array = [nil]
    expected = equi(sample_array)
    assert_equal expected, nil
  end

end