A Portable Scheme Module System

collins-mustang

 

In this post I’d like to introduce load-module, a portable Scheme module system.

Why did I write a module system?

  • Simplicity: A single-file module system in about 200 lines of code
  • Understandability: The implementation avoids wizardry and should be accessible to anyone who knows the language
  • Portability: One system that can be used across multiple implementations

The way it works is this:

  1. You have a file (say, utils.scm) with Scheme code in it that implements stuff that you want to live in the same module.
  2. You create another file (utils.mod, but that extension is easy to change) which lists the procedures and syntax you want to export.
  3. The load-module procedure reads utils.scm, rewriting unexported procedure names such that only the procedures you want exported show up at the top-level. Everything else gets rewritten during load-time as an ignorable “gensym” of the form %--gensym-utils-random-integer-8190504171, where “utils” is the module name, and “random-integer” is the procedure internal to your module.

The module file format is very simple:

(define-module utils
  (exports random-integer atom? take))

The module system exports one procedure: load-module. Run it like so to get the procedures from the aforementioned hypothetical utils package into your environment:

> (load "load-module.scm")
> (load-module 'utils)
#t
> (random-integer 199)
76
> (atom? 199)
#t

If you care, there’s more information about over at the project README.

(Image courtesy Geoff Collins under Creative Commons license.)

Advertisements

The Josephus Problem

../img/origami-wreath.jpg

The Josephus Problem is deceptively simple. A number of people stand in a circle. (Let’s call this number N.) They agree to count around the circle, and every /M/th person will be killed. As they go around the circle, in what order do they die? Another way of looking at it is: who is the last person left alive?

For an interesting introduction, I recommend checking out the Wikipedia article.

I recently found this problem described in the 2nd edition of Sedgwick’s Algorithms, and thought it would be fun to implement in Scheme. Because I’m not a real programmer (and I had to translate to Scheme from the Pascal in the book), it took me a couple of hours instead of a couple of minutes. But what an enjoyable couple of hours!

I started, as usual for algorithmic programming tasks, with a simulation. In the book, Sedgwick provides an example: If there are 9 people, and every 5th person in the circle is killed, the order in which they’ll die is: 5, 1, 7, 4, 3, 6, 9, 2, 8.

It’s usually easier to start work on a problem by thinking about a simpler case, so that’s what I did. I chose N = 5, and M = 2. Below I’ll show the results of my “paper” simulation (OK, the paper was actually an Emacs buffer).

First, let me show you how to read my “notation”. It should be pretty easy to figure out. Below is an example of one “state snapshot”. I’ve labelled each line with its meaning. If you are a Scheme programmer, you will surmise that each named item corresponds to a “named-let” loop variable. In addition, the vertical bar character in the XS row represents the state of the “cursor” as it goes around the circle of 5 people.

;; I.                           <-- State Number
;; ------------
;; |1 2 3 4 5                   <--    XS
;; '()                          <--    YS
;; 0                            <-- COUNT

Above, we are in state 1, at the start of the program. The cursor is at the beginning of the list XS, which represents the circle of people. YS is our results list, where we will store the “dead”. As we travel around the circle and people are killed, they are added to YS. We could think of it as a “death toll”, but really it’s just a list of integers. As we go around the circle of people, we keep incrementing COUNT. When COUNT reaches the value of M, that person will be “killed”, that is, be added to YS.

When this happens, we reset the COUNT to 0 and keep going around the circle. There’s a catch, however. As more dead people are added to our result list YS, we need to ignore the spaces those people used to occupy in our count as we build COUNT back up towards M. In other words, the circle will have 4 people in it after the first person is killed. This means that as the circle gets smaller and smaller, people are killed more frequently.

I’ll show you how I’ve solved that problem in Scheme in a moment; first, the simulation:

;; I.
;; ------------
;; |1 2 3 4 5
;; '()
;; 0

;; II.
;; ------------
;; 1| 2 3 4 5
;; '()
;; 1

;; III.
;; ------------
;; 1 2| 3 4 5
;; '()
;; 1

;; IV.
;; ------------
;; 1 _ 3| 4 5
;; '(2)
;; 1

;; V.
;; ------------
;; 1 _ 3 4| 5
;; '(2)
;; 2

;; V.
;; ------------
;; 1 _ 3 _ 5|
;; '(4 2)
;; 1

;; VI.
;; ------------
;; 1| _ 3 _ 5
;; '(4 2)
;; 2

;; VII.
;; ------------
;; _ _ 3| _ 5
;; '(1 4 2)
;; 1

;; VIII.
;; ------------
;; _ _ 3 _ 5|
;; '(1 4 2)
;; 2

;; IX.
;; ------------
;; _ _ 3| _ _
;; '(5 1 4 2)
;; 1

;; X.
;; ------------
;; _ _ _| _ _
;; '(5 1 4 2)
;; 2

;; XI.
;; ------------
;; _ _ _| _ _
;; '(3 5 1 4 2)
;; 2

Now that you’ve seen how the algorithm should work on paper, let’s write some Scheme! I should state for the record that I did not read the Wikipedia article linked above until after I wrote this code. (This will be pretty obvious when you look at the code.) Here it is:

(define (josephus xs m)
  (let ((orig-xs (list-copy xs))
        (true-m (- m 1)))
    (let ((len (length xs)))
      (let loop ((xs xs)
                 (ys '())
                 (count 0))
        (cond 
         ;; If the dead list grows to the same length as our original
         ;; list, we know we're finished.
         ((= len (length ys))
          (reverse ys))
         ;; If XS is null, we have gone once around the circle.  We
         ;; start around again, leaving the count unchanged.
         ((null? xs)
          (loop orig-xs ys count))
         ;; If the current person we're looking at is already dead
         ;; (a ghost), move on.  And don't bother bumping the
         ;; count.
         ((member (car xs) ys)
          (loop (cdr xs) ys count))
         ;; If we're here, it's also the case that the current person
         ;; we're looking at is *not* in the dead list.  Therefore we
         ;; check if the count is equal to M.  If so, they must die.
         ((= count true-m)
          (loop (cdr xs) (cons (car xs) ys) 0))
         ;; If we get here, the current person we're looking at is
         ;; alive.  We check if the count is *not* equal to M.  If it
         ;; is not, we skip this person and bump the count.
         ((not (= count true-m))
          (loop (cdr xs) ys (+ count 1)))
         ;; We should never get here.
         (else #f))))))

How does it compare to the output described in Sedgwick? It seems to work!

> (josephus '(1 2 3 4 5 6 7 8 9) 5)
'(5 1 7 4 3 6 9 2 8)

There are several inefficiencies we could tackle, though. The first and most obvious is that we should try to calculate the solution mathematically (as shown in some of the Wikipedia examples) instead of building up lists using CONS. Another is that we make a “canonical” copy of the input list for restarting the loop. A third is the use of MEMBER to determine whether the person we’re currently visiting is already in the “dead” list.

For example, here’s the trace output showing 19 calls to MEMBER for a pretty small input:

(josephus '(1 2 3 4 5) 2)
[Enter (member 1 '())
 Leave member #f]
... 17 more calls to member ...
[Enter (member 3 '(5 1 4 2))
 Leave member #f]
'(2 4 1 5 3)

That said, in this case I was happy just to get something working. If this were going to be used for anything “real”, I would probably rewrite it statefully using vectors. Most of the fun was in coming up with this first draft without resorting to any stateful idioms.

(Image courtesy Chris Lott via CC-BY license.)

Bottom Up Merge Sort in Scheme

../img/37-tiles.jpg

Recently I’ve begun a project to implement a number of basic algorithms in Scheme, which I’d like to eventually grow into a free (as in freedom) ebook. Having just done a Binary Search in Scheme, I thought it would be fun to give merge sort a try.

According to the mighty interwebs, merge sort is a good choice for sorting linked lists (a.k.a., Lisp lists). Unfortunately the only Lisp merge sort implementation examples I’ve been able to find on the web have been recursive, not iterative.

The implementation described here is an iterative, bottom-up merge sort, written in a functional style. (I daren’t say the functional style, lest any real Scheme wizards show up and burn me to a crisp.)

First, generate a list of random numbers

In order to have something to sort, we need a procedure that generates a list of random numbers – note that the docstring is allowed by MIT/GNU Scheme; YMMV with other Schemes.

(define (make-list-of-random-numbers list-length max)
  ;; Int Int -> List
  "Make a list of random integers less than MAX that's LIST-LENGTH long."
  (letrec ((maker
            (lambda (list-length max result)
              (let loop ((n list-length) (result '()))
                (if (= n 0)
                    result 
                    (loop (- n 1) (cons (random max) result)))))))
    (maker list-length max '())))

Then, write a merge procedure

This implementation of the merge procedure is a straight port of the one described on the Wikipedia Merge Sort page, with one minor difference to make the sort faster 1.

An English description of the merge operation is as follows:

  • If both items passed in are numbers (or strings), wrap them up in lists and recur. (In this example we only care about sorting numbers)
  • If both lists are empty, return the result.
  • If neither list is empty:
    • If the first item in the first list is “less than” the first item in the second list, cons it onto the result and recur.
    • Otherwise, cons the first item in the second list on the result and recur.
  • If the first list still has items in it, cons the first item onto the result and recur.
  • If the second list still has items in it, cons the first item onto the result and recur.
  • If none of the above conditions are true, return #f. I put this here for debugging purposes while writing this code; now that the procedure is debugged, it is never reached. (Note: “debugged” just means “I haven’t found another bug yet”.)
(define (rml/merge pred l r)
  (letrec ((merge-aux
            (lambda (pred left right result)
              (cond 
               ((and (number? left)
                     (number? right))
                (merge-aux pred 
                           (list left) 
                           (list right) 
                           result))
               ((and (string? left)
                     (string? right))
                (merge-aux pred
                           (list left) 
                           (list right) 
                           result))
               ((and (null? left)
                     (null? right))
                (reverse result))
               ((and (not (null? left))
                     (not (null? right)))
                (if (pred (car left)
                          (car right))
                    (merge-aux pred
                               (cdr left)
                               right
                               (cons (car left) result))
                    (merge-aux pred
                               left
                               (cdr right)
                               (cons (car right) result))))
               ((not (null? left))
                (merge-aux pred (cdr left) right (cons (car left) result)))
               ((not (null? right))
                (merge-aux pred left (cdr right) (cons (car right) result)))
               (else #f)))))
    (merge-aux pred l r '())))

We can run a few merges to get a feel for how it works. The comparison predicate we pass as the first argument will let us sort all kinds of things, but for the purposes of this example we’ll stick to numbers:

(rml/merge < '(360 388 577) '(10 811 875 995))
;Value 11: (10 360 388 577 811 875 995)

(rml/merge < '(8 173 227 463 528 817) '(10 360 388 577 811 875 995))
;Value 12: (8 10 173 227 360 388 463 528 577 811 817 875 995)

(rml/merge < 
           '(218 348 486 520 639 662 764 766 886 957 961 964)
           '(8 10 173 227 360 388 463 528 577 811 817 875 995))
;Value 14: (8 10 173 218 227 348 360 388 463 486 520 528 577 639 662 764 766 811 817 875 886 957 961 964 995)

Finally, do a bottom up iterative merge sort

It took me a while to figure out how to do the iterative merge sort in a Schemely fashion. As usual, it wasn’t until I took the time to model the procedure on paper that I got somewhere. Here’s what I wrote in my notebook:

;;  XS                   |      RESULT
;;---------------------------------------------

'(5 1 2 9 7 8 4 3 6)            '()
    '(2 9 7 8 4 3 6)            '((1 5))
        '(7 8 4 3 6)            '((2 9) (1 5))
            '(4 3 6)            '((7 8) (2 9) (1 5))
                '(6)            '((3 4) (7 8) (2 9) (1 5))
                 '()            '((6) (3 4) (7 8) (2 9) (1 5))

;; XS is null, and RESULT is not of length 1 (meaning it isn't sorted
;; yet), so we recur, swapping the two:

'((6) (3 4) (7 8) (2 9) (1 5))  '()
          '((7 8) (2 9) (1 5))  '((3 4 6))
                      '((1 5))  '((2 7 8 9) (3 4 6))
                           '()  '((1 5) (2 7 8 9) (3 4 6))

;; Once more XS is null, but RESULT is still not sorted, so we swap
;; and recur again

'((1 5) (2 7 8 9) (3 4 6))      '()
                  '(3 4 6)      '((1 2 5 7 8 9))
                       '()      '((3 4 6) (1 2 5 7 8 9))

;; Same story: swap and recur!

'((3 4 6) (1 2 5 7 8 9))        '()
                     '()        '((1 2 3 4 5 6 7 8 9))

;; Finally, we reach our base case: XS is null, and RESULT is of
;; length 1, meaning that it contains a sorted list

'(1 2 3 4 5 6 7 8 9)

This was a really fun little problem to think about and visualize. It just so happens that it fell out in a functional style; usually I don’t mind doing a bit of state-bashing, especially if it’s procedure-local. Here’s the code that does the sort shown above:

(define (rml/merge-sort xs pred)
  (let loop ((xs xs)
             (result '()))
    (cond ((and (null? xs)
                (null? (cdr result)))
           (car result))
          ((null? xs)
           (loop result
                 xs))
          ((null? (cdr xs))
           (loop (cdr xs)
                 (cons (car xs) result)))
          (else
           (loop (cddr xs)
                 (cons (rml/merge <
                              (first xs)
                              (second xs))
                       result))))))

That’s nice, but how does it perform?

A good test of our merge sort is to compare it to the system’s sort procedure. In the case of MIT/GNU Scheme, we’ll need to compile our code if we hope to get anywhere close to the system’s speed. If your Scheme is interpreted, you don’t have to bother of course.

To make the test realistic, we’ll create three lists of random numbers: one with 20,000 items, another with 200,000, and finally a giant list of 2,000,000 random numbers. This should give us a good idea of our sort’s performance. Here’s the output of timing first two sorts, 20,000 and 200,000 2:

;;; Load compiled code

(load "mergesort")
;Loading "mergesort.so"... done
;Value: rml/insertion-sort2

;;; Define our lists

(define unsorted-20000 (make-list-of-random-numbers 20000 200000))
;Value: unsorted-20000

(define unsorted-200000 (make-list-of-random-numbers 200000 2000000))
;Value: unsorted-200000

;;; Sort the list with 20,000 items

(with-timing-output (rml/merge-sort unsorted-20000 <))
;Run time:      .03
;GC time:       0.
;Actual time:   .03

(with-timing-output (sort unsorted-20000 <))
;Run time:      .02
;GC time:       0.
;Actual time:   .021

;;; Sort the list with 200,000 items

(with-timing-output (rml/merge-sort unsorted-200000 <))
;Run time:      .23
;GC time:       0.
;Actual time:   .252

(with-timing-output (sort unsorted-200000 <))
;Run time:      .3
;GC time:       0.
;Actual time:   .3

As you can see, our sort procedure is on par with the system’s for these inputs. Now let’s turn up the heat. How about a list with 2,000,000 random numbers?

;;; Sort the list with 2,000,000 items

(define unsorted-2000000 (make-list-of-random-numbers 2000000 20000000))
;Value: unsorted-2000000

(with-timing-output (rml/merge-sort4 unsorted-2000000 <))
;Aborting!: out of memory
;GC #34: took:   0.80 (100%) CPU time,   0.10 (100%) real time; free: 11271137
;GC #35: took:   0.70 (100%) CPU time,   0.90  (81%) real time; free: 11271917
;GC #36: took:   0.60 (100%) CPU time,   0.90  (99%) real time; free: 11271917

(with-timing-output (sort unsorted-2000000 <))
;Run time:      2.48
;GC time:       0.
;Actual time:   2.474

No go. On a MacBook with 4GB of RAM, our merge sort runs out of memory, while the system sort procedure works just fine. It seems the wizards who implemented this Scheme system knew what they were doing after all! :-}

It should be pretty clear at this point why we’re running out of memory. In MIT/GNU Scheme, the system sort procedure uses vectors and mutation (and is no doubt highly tuned for the compiler), whereas we take a relatively brain-dead approach that uses lists and lots of cons-ing. I leave it as an exercise for the reader (or perhaps my future self) to rewrite this code so that it doesn’t run out of memory.

(Image courtesy mag3737 under Creative Commons license.)

Footnotes:

1 An earlier implementation started off the sort by “exploding” the list to be sorted so that ='(1 2 3)= became ='((1) (2) (3))=. This is convenient for testing purposes, but very expensive. It’s also unnecessary after the first round of merging. We avoid the need to explode the list altogether by teaching merge to accept numbers and listify them when they appear. We could also do the same for strings and other types as necessary.

2 For the definition of the with-timing-output macro, see here.

Binary Search in Scheme

../img/hex-twist-variation-2.jpg

Intro

Just for fun, I’ve begun translating some of the algorithms from Mastering Algorithms with Perl into Scheme. My hope is that I’ll get two things out of this: a better knowledge of algorithms, and of Scheme hacking.

Binary search is one of the first algorithms listed in the book; it’s tricky to write a correct binary search, but I had the Perl code to work from. Let’s see how I did.

What’s binary search?

Binary search is a method for finding a specific item in a sorted list. Here’s how it works:

  1. Take a guess that the item you want is in the middle of the current search “window” (when you start, the search window is the entire list).
  2. If the item is where you guessed it would be, return the index (the location of your guess).
  3. If your guess is “less than” the item you want (based on a comparison function you choose), recur, this time raising the “bottom” of the search window to the midway point.
  4. If your guess is “greater than” the item you want (based on your comparison function), recur, this time lowering the “top” of the search window to the midway point.

In other words, you cut the size of the search window in half every time through the loop. This gives you a worst-case running time of about (/ (log n) (log 2)) steps. This means you can find an item in a sorted list of 20,000,000,000 (twenty billion) items in about 34 steps.

Reading lines from a file

Before I could start writing a binary search, I needed a sorted list of items. I decided to work with a sorted list of words from /usr/share/dict/words, so I wrote a couple of little procedures to make a list of words from a subset of that file. (I didn’t want to read the entire large file into a list in memory.)

Note: Both format and the Lisp-inspired #!optional keyword are available in MIT Scheme; they made writing the re-matches? procedure more convenient.

  • re-matches? checks if a regular expression matches a string (in this case, a line from a file).
  • make-list-of-words-matching is used to loop over the lines of the words file and return a list of lines matching the provided regular expression.

Now I have the tools I need to make my word list.

(load-option 'format)

(define (re-matches? re line #!optional display-matches)
  ;; Regex String . Boolean -> Boolean
  "Attempt to match RE against LINE. Print the match if DISPLAY-MATCHES is set."
  (let ((match (re-string-match re line)))
    (if match
        (if (not (default-object? display-matches))
            (begin (format #t "|~A|~%" (re-match-extract line match 0))
                   #t)
            #t)
        #f)))

(define (make-list-of-words-matching re file)
  ;; Regex String -> List
  "Given a regular expression RE, loop over FILE, gathering matches."
  (call-with-input-file file
    (lambda (port)
      (let loop ((source (read-line port)) (sink '()))
        (if (eof-object? source)
            sink
            (loop (read-line port) (if (re-matches? re source)
                             (cons source sink)
                             sink)))))))

Writing tests

Since I am not one of the 10% of programmers who can implement a correct binary search on paper, I started out by writing a test procedure. The test procedure grew over time as I found bugs and read an interesting discussion about the various edge cases a binary search procedure should handle. These include:

  • Empty list
  • List has one word
  • List has two word
  • Word is not there and “less than” anything in the list
  • Word is not there and “greater than” anything in the list
  • Word is first item
  • Word is last item
  • List is all one word
  • If multiple copies of word are in list, return the first word found (this could be implemented to return the first or last duplicated word)

Furthermore, I added a few “sanity checks” that check the return values against known outputs. Here are the relevant procedures:

  • assert= checks two numbers for equality and prints a result
  • assert-equal checks two Scheme objects against each other with equal? and prints a result
  • run-binary-search-tests reads in words from a file and runs all of our tests
(define (assert= expected got #!optional noise)
  ;; Int Int -> IO
  (if (= expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (assert-equal? expected got #!optional noise)
  ;; Thing Thing -> IO
  (if (equal? expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (run-binary-search-tests)
  ;; -> IO
  "Run our binary search tests using known words from the 'words' file.
This file should be in the current working directory."
  (with-working-directory-pathname (pwd)
    (lambda ()
      (if (file-exists? "words")
          (begin
            (format #t "file 'words' exists, making a list...~%")
            (let* ((unsorted (make-list-of-words-matching "acc" "words"))
                   (sorted (sort unsorted string<?)))
              (format #t "doing binary searches...~%")
              (assert-equal? #f (binary-search "test" '())) ; empty list
              (assert-equal? #f (binary-search "aardvark" sorted)) ; element absent and too small
              (assert-equal? #f (binary-search "zebra" sorted)) ; element absent and too large
              (assert= 0 (binary-search "accusive" '("accusive"))) ; list of length one
              (assert= 0 (binary-search "acca" sorted)) ; first element of list
              (assert= 1 (binary-search "aardvark" '("aardvark" "aardvark" "babylon"))) ; multiple copies of word in list
              (assert= 1 (binary-search "barbaric" '("accusive" "barbaric"))) ; list of length two
              (assert= 98 (binary-search "acclamator" sorted))
              (assert= 127 (binary-search "aardvark" (map (lambda (x) "aardvark") test-list))) ; list is all one value
              (assert= 143 (binary-search "accomplice" sorted))
              (assert= 254 (binary-search "accustomedly" sorted))
              (assert= 255 (binary-search "accustomedness" sorted)))))))) ; last element of list

The binary search procedure

Finally, here’s the binary search procedure; it uses a couple of helper procedures for clarity.

  • ->int is a helper procedure that does a quick and dirty integer conversion on its argument
  • split-difference takes a low and high number and returns the floor of the halfway point between the two
  • binary-search takes an optional debug-print argument that I used a lot while debugging. The format statements and the optional argument tests add a lot of bulk – now that the procedure is debugged, they can probably be removed. (Aside: I wonder how much “elegant” code started out like this and was revised after sufficient initial testing and debugging?)
(define (->int n)
  ;; Number -> Int
  "Given a number N, return its integer representation.
N can be an integer or flonum (yes, it's quick and dirty)."
  (flo:floor->exact (exact->inexact n)))

(define (split-difference low high)
  ;; Int Int -> Int
  "Given two numbers, return their rough average."
  (if (= (- high low) 1)
      1
      (->int (/ (- high low) 2))))

(define (binary-search word xs #!optional debug-print)
  ;; String List -> Int
  "Do binary search of list XS for WORD. Return the index found, or #f."
  (if (null? xs)
      #f
      (let loop ((low 0) (high (- (length xs) 1)))
        (let* ((try (+ low (split-difference low high)))
               (word-at-try (list-ref xs try)))
          (cond
           ((string=? word-at-try word) try)
           ((< (- high low) 1) #f)
           ((= (- high try) 1) 
            (if (string=? (list-ref xs low) word)
                low
                #f))
           ((string<? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #f "(string<? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop (+ 1 try) high)) ; raise the bottom of the window
                (loop (+ 1 try) high)))
           ((string>? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #f "(string>? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop low (+ 1 try))) ; lower the top of the window
                (loop low (+ 1 try))))
           (else #f))))))

Takeaways

This exercise has taught me a lot.

  1. Writing correct code is hard. (I’m confident that this code is not correct.) You need to figure out your invariants and edge cases first. I didn’t, and it made things a lot harder.
  2. It’s been said a million times, but tests are code. The tests required some debugging of their own.
  3. Once they worked, the tests were extremely helpful. Especially now that I’m at the point where (if this were “for real”) additional features would need to be added, the format calls removed, the procedure speeded up, and so on.

I hope this has been useful to some other aspiring Scheme wizards out there. Happy Hacking!

(Image courtesy Melisande under Creative Commons license.)

SICP Exercises 1.10 through 1.13

../img/sinayskaya-12.jpg

Intro

In this post I’ll share some of the fun I had doing exercises 1.10 through 1.13 from Structure and Interpretation of Computer Programs. Remember that I’m working from the first edition, since it’s what I have. Fortunately it’s not too different from the second edition so far; I’ve written the exercises out so you don’t need the book to follow along.

Exercise 1.10

Draw the tree illustrating the process generated by the count-change procedure of section 1.2.2 in making change for 11 cents. What are the orders of growth of the space and time used by this process as the amount to be changed increases?

Having already drawn the tree out by hand for an earlier post, I noticed a lot of inefficient procedure calls, and sidetracked a little to implement a version called cc-faster which massively reduced their number. For example, when counting change for 600 cents (using all 5 types of coins), the number of procedure calls was reduced from 29,806,269 to 233,609. In other words, it made approximately 128 times fewer recursive calls for the same input! Even so, cc-faster still generates a tree-recursive process and is destined to blow up. Depending on your application it might be good enough though.

Space

But I digress. On p. 11 of the text, the authors state that

In general, the time required by a tree-recursive process will be proportional to the number of nodes in the tree, while the space required will be proportional to the maximum depth of the tree.

Therefore I’m happy to guesstimate the order of growth of the space used by (cc 11 5) to be somewhere in the neighborhood of O(n), where n is the amount to be changed. (Looking at the earlier post mentioned above, it looks like cc-faster is also O(n), but in the number of different kinds of coins.)

Time

Speaking of guesstimation, I decided to instrument the cc procedure in the following hackish manner:

;;; `C-CC': A VERSION OF `CC' THAT COUNTS HOW OFTEN IT CALLS ITSELF
;; Yes, a gross hack. But written quickly! :-)

(define *call-count* 0)

(define-syntax incf!
  (syntax-rules ()
    ((incf! var)
     (begin (set! var (+ 1 var))
            var))))

(define (c-cc amount kinds-of-coins)
  (begin (incf! *call-count*)
         (cond 
          ;; ((= kinds-of-coins 1) 1) ; Uncomment this to greatly
          ;; lessen the # of procedure calls by using the `cc-faster'
          ;; algorithm.
          ((= amount 0) 1)
          ((or (< amount 0) (= kinds-of-coins 0)) 0)
          (else (+
                 (c-cc amount (- kinds-of-coins 1))
                 (c-cc (- amount (first-denomination kinds-of-coins))
                       kinds-of-coins))))))

(define (print-call-count-statistics n #!optional display-proc)
  (set! *call-count* 0)
  ; Use the special version of `cc' above
  (c-cc n 5)
  (display *call-count*)
  (newline)
  (set! *call-count* 0)
  (if (not (default-object? display-proc))
      (begin (display (apply proc (list *call-count*)))
             (newline))))

This allowed me to do the following measurements:

;; CALL COUNT STATISTICS FOR THE ORIGINAL `CC' ALGORITHM
; (for-each (lambda (n) (print-call-count-statistics n)) '(50 100 150
; 200 250 300 350 400 450 500 550 600))
; Note: the number in the right column comes from dividing the current
; value by the previous.
;    1571
;   15499 0.101361378153
;   71775 0.215938697318
;  229589 0.312623862642
;  587331 0.390902234004
; 1292591 0.454382708838
; 2552159 ...
; 4642025
; 7917379
;12822611
;19901311
;29806269 0.667688767085
;Unspecified return value

I plotted the results of my measurements and tried to do a little curve-fitting by hand. I initially settled on x^e * ϕ, but it didn’t grow steeply enough to match cc‘s growth as you can see in the plot below. (This guess seemed reasonable at first because I only went up to 450 with cc.)

Then I started dividing the number of nodes in the call tree for input n by the number of nodes generated by input n+1 (see the right column in the code block above). At first glance it looks like cc‘s order of growth is a function that will asymptotically approach 1.

../img/call-counts.jpg

So what’s my final answer? cc is exponential in a bad way. Don’t use it! :-)

Exercise 1.11

Design a procedure that evolves an iterative exponentiation process that uses successive squaring and works in logarithmic time, as does fast-exp.

Unlike Exercise 1.9, the authors walk you through how to do this exercise. Even with their patient description, I managed to implement a completely wrong solution. I finally figured out what I was doing wrong by going back and reading the text, and then modeling the procedure by hand on paper. It turns out I hadn’t understood their explanation after all! So I guess the late Prof. Dijkstra really was onto something.

While trying to debug it, I wrote fast-expt-tester, which showed me how hopelessly off the rails I was, but not necessarily how to fix it. Thank goodness for pen and paper, the ultimate debugging tools (at least for a student-level problem like this).

; Call it like so: (fast-expt-iter 7 7 1)
(define (fast-expt-iter b n a)
  (cond 
   ((= n 0) a)
   ((even? n)
    (fast-expt-iter (square b) (/ n 2) a))
   (else (fast-expt-iter b (- n 1) (* b a)))))

(define (fast-expt-tester rounds)
    (let loop ((n rounds))
      (if (= n 0)
          (display "all done")
          (let* ((z1 (random 20))
                 (z2 (random 20))
                 (correct (expt z1 z2))
                 (maybe-correct (fast-expt-iter z1 z2 1))
                 (message (string-append
                           "expt's answer\n"
                           (number->string correct)
                           "\nis not equal to fast-expt-iter's answer\n"
                           (number->string maybe-correct)
                           "\nfor inputs: "
                           (number->string z1)
                           " "
                           (number->string z2)
                           "\n")))
            (if (not (= correct maybe-correct))
                (begin (display message)
                       (newline)
                       (loop (- n 1))))))))

Exercise 1.12

… One can perform integer multiplication by means of repeated addition. … Now suppose we include, together with addition, the operations double, which doubles an integer, and halve, which divides an (even) integer by 2. Using these, design a multiplication procedure analogous to fast-exp that works in logarithmic time.

This was slightly simpler to understand than 1.11 (at least for me).

(define (halve n)
  (/ n 2))

(define (double n)
  (+ n n))

(define (rml/fast-* a b)
  (cond ((= b 0) 0)
        ((even? b)
         (rml/fast-* (double a) (halve b)))
        (else
         (+ a (rml/fast-* a (- b 1))))))

(* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/fast-* 123123123123 123123123123)
;Value: 15159303447561417273129

Exercise 1.13

Using the results of exercises 1.11 and 1.12, devise a procedure that generates an iterative process for multiplying two integers in terms of adding, doubling, and halving and works in logarithmic time.

I have to admit it, expressing iteration using recursive procedure definitions is great fun!

(define (rml/iter-* a b c)
  (cond ((= b 0) c)
        ((even? b)
         (rml/iter-* (double a) (halve b) c))
        (else
         (rml/iter-* a (- b 1) (+ a c)))))

(* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/fast-* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/iter-* 123123123123 123123123123 0)
;Value: 15159303447561417273129

(Origami image courtesy Melisande under Creative Commons license.)

Scheme Idiom: Loop over an open file input port

Dear Scheme wizards, I have a confession to make: I can never remember how to write loops in Scheme using the named-let convention. I’m working on a problem from the British Informatics Olympiad which you can read about here, with my own ugly imperative Ruby solution here. (My apologies to real Ruby programmers, of course.)

I’ll no doubt be apologizing again after I share my Scheme solution. Until then, here is some code to read in the contents from a file. This works in MIT Scheme but should be portable to any R5RS Schemes.

Note that read will blow up if the file contains certain characters, like #\#. MIT Scheme provides additional procedures like read-line to solve this problem.

(with-input-from-file "/home/rml/Desktop/current/INPUT.TXT"
  (lambda ()
    (let loop ((source (read)) (sink '()))
        (if (eof-object? source)
            (reverse sink)
            (loop (read) (cons source sink))))))
;Value 36: (5 7 3 8 8 1 0 2 7 4 4 4 5 2 6 5)

Just for Fun: Estimating pi with Scheme

../img/double-star-from-pentagon-backlit.jpg

A while back I shared some Perl code for calculating the circumference of a circle without knowing 𝛑. Just for fun, and due to my longtime infatuation with all things Schemish, I’ve written a little pi approximator in Scheme. It uses the idea that we can approximate a circle using smaller and smaller triangles stacked on top of each other. (See previously for a better explanation with a picture.)

And now, the code!

;;;; pi.scm -- Estimate the value of 𝛑 using smaller and smaller
;;;; triangles.

;;; Call it like so: (pi-estimate n), where n is the number of
;;; iterations you'd like to go through. It doesn't take many to get
;;; pretty accurate.

(define reference-pi 3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679)

(define (circumference radius)
  (* 3.14159 2 radius))

(define (square x)
  (* x x))

(define (hypotenuse a b)
  (sqrt (+ (expt a 2) (expt b 2))))

(define (pi-iter radius a b count maxcount)
  (let* ((hyp (hypotenuse a b))
         (newbase (- radius (sqrt (- (square radius)
                                     (square (/ hyp 2)))))))
    (if (= count maxcount)
        (/
         (* 2 hyp (expt 2 (+ 1 count)))
         (* 2 radius))
        (pi-iter radius newbase (/ hyp 2) (+ count 1) maxcount))))

(define (pi-estimate iterations)
  (pi-iter 128 128 128 0 iterations))

(Origami image courtesy Melisande under Creative Commons license.)