Binary Search in Scheme

../img/hex-twist-variation-2.jpg

Intro

Just for fun, I’ve begun translating some of the algorithms from Mastering Algorithms with Perl into Scheme. My hope is that I’ll get two things out of this: a better knowledge of algorithms, and of Scheme hacking.

Binary search is one of the first algorithms listed in the book; it’s tricky to write a correct binary search, but I had the Perl code to work from. Let’s see how I did.

What’s binary search?

Binary search is a method for finding a specific item in a sorted list. Here’s how it works:

  1. Take a guess that the item you want is in the middle of the current search “window” (when you start, the search window is the entire list).
  2. If the item is where you guessed it would be, return the index (the location of your guess).
  3. If your guess is “less than” the item you want (based on a comparison function you choose), recur, this time raising the “bottom” of the search window to the midway point.
  4. If your guess is “greater than” the item you want (based on your comparison function), recur, this time lowering the “top” of the search window to the midway point.

In other words, you cut the size of the search window in half every time through the loop. This gives you a worst-case running time of about (/ (log n) (log 2)) steps. This means you can find an item in a sorted list of 20,000,000,000 (twenty billion) items in about 34 steps.

Reading lines from a file

Before I could start writing a binary search, I needed a sorted list of items. I decided to work with a sorted list of words from /usr/share/dict/words, so I wrote a couple of little procedures to make a list of words from a subset of that file. (I didn’t want to read the entire large file into a list in memory.)

Note: Both format and the Lisp-inspired #!optional keyword are available in MIT Scheme; they made writing the re-matches? procedure more convenient.

  • re-matches? checks if a regular expression matches a string (in this case, a line from a file).
  • make-list-of-words-matching is used to loop over the lines of the words file and return a list of lines matching the provided regular expression.

Now I have the tools I need to make my word list.

(load-option 'format)

(define (re-matches? re line #!optional display-matches)
  ;; Regex String . Boolean -> Boolean
  "Attempt to match RE against LINE. Print the match if DISPLAY-MATCHES is set."
  (let ((match (re-string-match re line)))
    (if match
        (if (not (default-object? display-matches))
            (begin (format #t "|~A|~%" (re-match-extract line match 0))
                   #t)
            #t)
        #f)))

(define (make-list-of-words-matching re file)
  ;; Regex String -> List
  "Given a regular expression RE, loop over FILE, gathering matches."
  (call-with-input-file file
    (lambda (port)
      (let loop ((source (read-line port)) (sink '()))
        (if (eof-object? source)
            sink
            (loop (read-line port) (if (re-matches? re source)
                             (cons source sink)
                             sink)))))))

Writing tests

Since I am not one of the 10% of programmers who can implement a correct binary search on paper, I started out by writing a test procedure. The test procedure grew over time as I found bugs and read an interesting discussion about the various edge cases a binary search procedure should handle. These include:

  • Empty list
  • List has one word
  • List has two word
  • Word is not there and “less than” anything in the list
  • Word is not there and “greater than” anything in the list
  • Word is first item
  • Word is last item
  • List is all one word
  • If multiple copies of word are in list, return the first word found (this could be implemented to return the first or last duplicated word)

Furthermore, I added a few “sanity checks” that check the return values against known outputs. Here are the relevant procedures:

  • assert= checks two numbers for equality and prints a result
  • assert-equal checks two Scheme objects against each other with equal? and prints a result
  • run-binary-search-tests reads in words from a file and runs all of our tests
(define (assert= expected got #!optional noise)
  ;; Int Int -> IO
  (if (= expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (assert-equal? expected got #!optional noise)
  ;; Thing Thing -> IO
  (if (equal? expected got)
      (format #t "~A is ~A\t...ok~%" expected got)
      (format #t "~A is not ~A\t...FAIL~%" expected got)))

(define (run-binary-search-tests)
  ;; -> IO
  "Run our binary search tests using known words from the 'words' file.
This file should be in the current working directory."
  (with-working-directory-pathname (pwd)
    (lambda ()
      (if (file-exists? "words")
          (begin
            (format #t "file 'words' exists, making a list...~%")
            (let* ((unsorted (make-list-of-words-matching "acc" "words"))
                   (sorted (sort unsorted string<?)))
              (format #t "doing binary searches...~%")
              (assert-equal? #f (binary-search "test" '())) ; empty list
              (assert-equal? #f (binary-search "aardvark" sorted)) ; element absent and too small
              (assert-equal? #f (binary-search "zebra" sorted)) ; element absent and too large
              (assert= 0 (binary-search "accusive" '("accusive"))) ; list of length one
              (assert= 0 (binary-search "acca" sorted)) ; first element of list
              (assert= 1 (binary-search "aardvark" '("aardvark" "aardvark" "babylon"))) ; multiple copies of word in list
              (assert= 1 (binary-search "barbaric" '("accusive" "barbaric"))) ; list of length two
              (assert= 98 (binary-search "acclamator" sorted))
              (assert= 127 (binary-search "aardvark" (map (lambda (x) "aardvark") test-list))) ; list is all one value
              (assert= 143 (binary-search "accomplice" sorted))
              (assert= 254 (binary-search "accustomedly" sorted))
              (assert= 255 (binary-search "accustomedness" sorted)))))))) ; last element of list

The binary search procedure

Finally, here’s the binary search procedure; it uses a couple of helper procedures for clarity.

  • ->int is a helper procedure that does a quick and dirty integer conversion on its argument
  • split-difference takes a low and high number and returns the floor of the halfway point between the two
  • binary-search takes an optional debug-print argument that I used a lot while debugging. The format statements and the optional argument tests add a lot of bulk – now that the procedure is debugged, they can probably be removed. (Aside: I wonder how much “elegant” code started out like this and was revised after sufficient initial testing and debugging?)
(define (->int n)
  ;; Number -> Int
  "Given a number N, return its integer representation.
N can be an integer or flonum (yes, it's quick and dirty)."
  (flo:floor->exact (exact->inexact n)))

(define (split-difference low high)
  ;; Int Int -> Int
  "Given two numbers, return their rough average."
  (if (= (- high low) 1)
      1
      (->int (/ (- high low) 2))))

(define (binary-search word xs #!optional debug-print)
  ;; String List -> Int
  "Do binary search of list XS for WORD. Return the index found, or #f."
  (if (null? xs)
      #f
      (let loop ((low 0) (high (- (length xs) 1)))
        (let* ((try (+ low (split-difference low high)))
               (word-at-try (list-ref xs try)))
          (cond
           ((string=? word-at-try word) try)
           ((< (- high low) 1) #f)
           ((= (- high try) 1) 
            (if (string=? (list-ref xs low) word)
                low
                #f))
           ((string<? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #f "(string<? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop (+ 1 try) high)) ; raise the bottom of the window
                (loop (+ 1 try) high)))
           ((string>? word-at-try word)
            (if (not (default-object? debug-print))
                (begin (format #f "(string>? ~A ~A) -> #t~%try: ~A high: ~A low: ~A ~2%"
                               word-at-try word try high low)
                       (loop low (+ 1 try))) ; lower the top of the window
                (loop low (+ 1 try))))
           (else #f))))))

Takeaways

This exercise has taught me a lot.

  1. Writing correct code is hard. (I’m confident that this code is not correct.) You need to figure out your invariants and edge cases first. I didn’t, and it made things a lot harder.
  2. It’s been said a million times, but tests are code. The tests required some debugging of their own.
  3. Once they worked, the tests were extremely helpful. Especially now that I’m at the point where (if this were “for real”) additional features would need to be added, the format calls removed, the procedure speeded up, and so on.

I hope this has been useful to some other aspiring Scheme wizards out there. Happy Hacking!

(Image courtesy Melisande under Creative Commons license.)

SICP Exercises 1.10 through 1.13

../img/sinayskaya-12.jpg

Intro

In this post I’ll share some of the fun I had doing exercises 1.10 through 1.13 from Structure and Interpretation of Computer Programs. Remember that I’m working from the first edition, since it’s what I have. Fortunately it’s not too different from the second edition so far; I’ve written the exercises out so you don’t need the book to follow along.

Exercise 1.10

Draw the tree illustrating the process generated by the count-change procedure of section 1.2.2 in making change for 11 cents. What are the orders of growth of the space and time used by this process as the amount to be changed increases?

Having already drawn the tree out by hand for an earlier post, I noticed a lot of inefficient procedure calls, and sidetracked a little to implement a version called cc-faster which massively reduced their number. For example, when counting change for 600 cents (using all 5 types of coins), the number of procedure calls was reduced from 29,806,269 to 233,609. In other words, it made approximately 128 times fewer recursive calls for the same input! Even so, cc-faster still generates a tree-recursive process and is destined to blow up. Depending on your application it might be good enough though.

Space

But I digress. On p. 11 of the text, the authors state that

In general, the time required by a tree-recursive process will be proportional to the number of nodes in the tree, while the space required will be proportional to the maximum depth of the tree.

Therefore I’m happy to guesstimate the order of growth of the space used by (cc 11 5) to be somewhere in the neighborhood of O(n), where n is the amount to be changed. (Looking at the earlier post mentioned above, it looks like cc-faster is also O(n), but in the number of different kinds of coins.)

Time

Speaking of guesstimation, I decided to instrument the cc procedure in the following hackish manner:

;;; `C-CC': A VERSION OF `CC' THAT COUNTS HOW OFTEN IT CALLS ITSELF
;; Yes, a gross hack. But written quickly! :-) 

(define *call-count* 0)

(define-syntax incf!
  (syntax-rules ()
    ((incf! var)
     (begin (set! var (+ 1 var))
            var))))

(define (c-cc amount kinds-of-coins)
  (begin (incf! *call-count*)
         (cond 
          ;; ((= kinds-of-coins 1) 1) ; Uncomment this to greatly
          ;; lessen the # of procedure calls by using the `cc-faster'
          ;; algorithm.
          ((= amount 0) 1)
          ((or (< amount 0) (= kinds-of-coins 0)) 0)
          (else (+
                 (c-cc amount (- kinds-of-coins 1))
                 (c-cc (- amount (first-denomination kinds-of-coins))
                       kinds-of-coins))))))

(define (print-call-count-statistics n #!optional display-proc)
  (set! *call-count* 0)
  ; Use the special version of `cc' above
  (c-cc n 5)
  (display *call-count*)
  (newline)
  (set! *call-count* 0)
  (if (not (default-object? display-proc))
      (begin (display (apply proc (list *call-count*)))
             (newline))))

This allowed me to do the following measurements:

;; CALL COUNT STATISTICS FOR THE ORIGINAL `CC' ALGORITHM
; (for-each (lambda (n) (print-call-count-statistics n)) '(50 100 150
; 200 250 300 350 400 450 500 550 600))
; Note: the number in the right column comes from dividing the current
; value by the previous.
;    1571
;   15499 0.101361378153
;   71775 0.215938697318
;  229589 0.312623862642
;  587331 0.390902234004
; 1292591 0.454382708838
; 2552159 ...
; 4642025
; 7917379
;12822611
;19901311
;29806269 0.667688767085
;Unspecified return value

I plotted the results of my measurements and tried to do a little curve-fitting by hand. I initially settled on x^e * ϕ, but it didn’t grow steeply enough to match cc‘s growth as you can see in the plot below. (This guess seemed reasonable at first because I only went up to 450 with cc.)

Then I started dividing the number of nodes in the call tree for input n by the number of nodes generated by input n+1 (see the right column in the code block above). At first glance it looks like cc‘s order of growth is a function that will asymptotically approach 1.

../img/call-counts.jpg

So what’s my final answer? cc is exponential in a bad way. Don’t use it! :-)

Exercise 1.11

Design a procedure that evolves an iterative exponentiation process that uses successive squaring and works in logarithmic time, as does fast-exp.

Unlike Exercise 1.9, the authors walk you through how to do this exercise. Even with their patient description, I managed to implement a completely wrong solution. I finally figured out what I was doing wrong by going back and reading the text, and then modeling the procedure by hand on paper. It turns out I hadn’t understood their explanation after all! So I guess the late Prof. Dijkstra really was onto something.

While trying to debug it, I wrote fast-expt-tester, which showed me how hopelessly off the rails I was, but not necessarily how to fix it. Thank goodness for pen and paper, the ultimate debugging tools (at least for a student-level problem like this).

; Call it like so: (fast-expt-iter 7 7 1)
(define (fast-expt-iter b n a)
  (cond 
   ((= n 0) a)
   ((even? n)
    (fast-expt-iter (square b) (/ n 2) a))
   (else (fast-expt-iter b (- n 1) (* b a)))))

(define (fast-expt-tester rounds)
    (let loop ((n rounds))
      (if (= n 0)
          (display "all done")
          (let* ((z1 (random 20))
                 (z2 (random 20))
                 (correct (expt z1 z2))
                 (maybe-correct (fast-expt-iter z1 z2 1))
                 (message (string-append
                           "expt's answer\n"
                           (number->string correct)
                           "\nis not equal to fast-expt-iter's answer\n"
                           (number->string maybe-correct)
                           "\nfor inputs: "
                           (number->string z1)
                           " "
                           (number->string z2)
                           "\n")))
            (if (not (= correct maybe-correct))
                (begin (display message)
                       (newline)
                       (loop (- n 1))))))))

Exercise 1.12

… One can perform integer multiplication by means of repeated addition. … Now suppose we include, together with addition, the operations double, which doubles an integer, and halve, which divides an (even) integer by 2. Using these, design a multiplication procedure analogous to fast-exp that works in logarithmic time.

This was slightly simpler to understand than 1.11 (at least for me).

(define (halve n)
  (/ n 2))

(define (double n)
  (+ n n))

(define (rml/fast-* a b)
  (cond ((= b 0) 0)
        ((even? b)
         (rml/fast-* (double a) (halve b)))
        (else
         (+ a (rml/fast-* a (- b 1))))))

(* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/fast-* 123123123123 123123123123)
;Value: 15159303447561417273129

Exercise 1.13

Using the results of exercises 1.11 and 1.12, devise a procedure that generates an iterative process for multiplying two integers in terms of adding, doubling, and halving and works in logarithmic time.

I have to admit it, expressing iteration using recursive procedure definitions is great fun!

(define (rml/iter-* a b c)
  (cond ((= b 0) c)
        ((even? b)
         (rml/iter-* (double a) (halve b) c))
        (else
         (rml/iter-* a (- b 1) (+ a c)))))

(* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/fast-* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/iter-* 123123123123 123123123123 0)
;Value: 15159303447561417273129

(Origami image courtesy Melisande under Creative Commons license.)

Scheme Idiom: Loop over an open file input port

Dear Scheme wizards, I have a confession to make: I can never remember how to write loops in Scheme using the named-let convention. I’m working on a problem from the British Informatics Olympiad which you can read about here, with my own ugly imperative Ruby solution here. (My apologies to real Ruby programmers, of course.)

I’ll no doubt be apologizing again after I share my Scheme solution. Until then, here is some code to read in the contents from a file. This works in MIT Scheme but should be portable to any R5RS Schemes.

Note that read will blow up if the file contains certain characters, like #\#. MIT Scheme provides additional procedures like read-line to solve this problem.

(with-input-from-file "/home/rml/Desktop/current/INPUT.TXT"
  (lambda ()
    (let loop ((source (read)) (sink '()))
        (if (eof-object? source)
            (reverse sink)
            (loop (read) (cons source sink))))))
;Value 36: (5 7 3 8 8 1 0 2 7 4 4 4 5 2 6 5)

Just for Fun: Estimating pi with Scheme

../img/double-star-from-pentagon-backlit.jpg

A while back I shared some Perl code for calculating the circumference of a circle without knowing 𝛑. Just for fun, and due to my longtime infatuation with all things Schemish, I’ve written a little pi approximator in Scheme. It uses the idea that we can approximate a circle using smaller and smaller triangles stacked on top of each other. (See previously for a better explanation with a picture.)

And now, the code!

;;;; pi.scm -- Estimate the value of 𝛑 using smaller and smaller
;;;; triangles.

;;; Call it like so: (pi-estimate n), where n is the number of
;;; iterations you'd like to go through. It doesn't take many to get
;;; pretty accurate.

(define reference-pi 3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679)

(define (circumference radius)
  (* 3.14159 2 radius))

(define (square x)
  (* x x))

(define (hypotenuse a b)
  (sqrt (+ (expt a 2) (expt b 2))))

(define (pi-iter radius a b count maxcount)
  (let* ((hyp (hypotenuse a b))
         (newbase (- radius (sqrt (- (square radius)
                                     (square (/ hyp 2)))))))
    (if (= count maxcount)
        (/
         (* 2 hyp (expt 2 (+ 1 count)))
         (* 2 radius))
        (pi-iter radius newbase (/ hyp 2) (+ count 1) maxcount))))

(define (pi-estimate iterations)
  (pi-iter 128 128 128 0 iterations))

(Origami image courtesy Melisande under Creative Commons license.)

Recommendations for Technical Writers

This document describes my recommendations for technical writers working in the software industry (at least the web-focused corner of it). Be forewarned: it’s opinionated. I don’t even live up to it, so you can think of it as an extended note to self.

Learn the UNIX command line

The entire modern internet computing infrastructure is built on open source, UNIX-like operating systems (mostly Linux). If you’re not familiar with them, you will have a more difficult time learning about how large-scale software systems running on networked computers work, otherwise known as: virtually all software systems of importance 1.

To get there, you’ll need to know how to interact with the command line interface (also known as a “terminal”). This is a very plain-looking text-only interface where you will type commands formatted in a strict language understood by the computer. You will learn (if you haven’t already) that the computer always does exactly what you’ve asked it to, but not necessarily what you wanted it to do.

There are a number of “shells” you can use on a UNIX-like OS. I recommend Bash because it’s ubiquitous.

Recommended Reading:

Learn REST API principles

Here’s a light-speed introduction to REST APIs. A large software system has lots of “objects”, which are generally just pieces of stored data about something. For example, where I work, there are objects that represent an advertising campaign or a user in the system.

It doesn’t really matter what data the objects represent, the point of an API is that it creates a single point of contact for external entities (usually programs) that want to interact with the data in the system. It ensures that only certain operations can be performed, and only by authorized parties.

Here’s where the REST part comes in: for any object represented in the system, you can (in theory) perform 4 basic operations on it:

  • Create: Make a new object of that type
  • Read: Look at what information the object contains
  • Update: Change the information the object contains
  • Delete: Delete the object from the system altogether

Unfortunately, I know about RESTful APIs only from practice, not study. There’s a copy of O’Reilly’s RESTful Web Services at work that I’ve been meaning to read to fill in the gaps. See? This really is a note to self.

Recommended Reading:

Learn to use a programmer’s text editor

You’re a professional who gets paid to work with text all day. Programmers are also professionals who get paid to work with text all day. You should be following their example by using a powerful plain text editor; it will allow you to automate away some of the tedious parts of editing.

Remember, word processors are written by programmers for others to use. Text editors are written by programmers for themselves to use. It’s not even a contest.

I recommend vim or Emacs. Pick one, it doesn’t matter.

Recommended Reading:

Learn a programming language with strong UNIX integration

As noted above, you are going to be documenting software running on networked computers doing … networky stuff. For example, you might be making API calls to a “cloud” of some kind 2. A lot of it will be tedious and potentially time-consuming, which means that you should automate it using a programming language. Once you are pretty familiar with a range of terminal commands, it won’t be too difficult to pick up a language with strong UNIX integration. Tastes vary; I am most familiar with Perl. I do not recommend wasting much time writing shell scripts, as there are too many weird edge cases for beginners to trip over.

Recommended Reading:

  • Learning Perl by Randal L. Schwartz, brian d foy, and Tom Phoenix
  • Modern Perl by chromatic

Footnotes:

1 Where “importance” is defined as: someone will pay you good money to document it. Personally, I’d love to get a paying job rewriting half of the Emacs manual. But I digress.

2 A Freudian slip by the tech industry perhaps? After all, “the cloud” brings to mind an ephemeral, floating entity that is mostly vaporous, and exists at the whim of forces you don’t really understand, and certainly can’t control. In addition, it’s highly likely to be blown away at any time.

An Iterative Change-Counting Procedure

… the process is really iterative: Its state is captured completely
by its three state variables, and an interpreter need keep track of
only three variables in order to execute the program.

– SICP (1st ed.), p.33

After much head-scratching and experimentation, I have implemented the following iterative change-counting procedure, which I call cc-linear (See also previously, previously):

(define (cc-linear amt kinds)
  (define (cc-linear-aux amount kinds-of-coins count stack)
    (cond ((or (= amount 0)
               (= kinds-of-coins 1))
           (if (null? stack)
               (+ count 1)
               (let* ((new-amt-pair (car stack))
                      (new-stack (cdr stack)))
                 (cc-linear-aux (car new-amt-pair) (cdr new-amt-pair) (+ 1 count) new-stack))))
          ((or (< amount 0)
               (= kinds-of-coins 0))
           (if (null? stack)
               count
               (let ((new-amt-pair (car stack))
                     (new-stack (cdr stack)))
                 (cc-linear-aux (car new-amt-pair) (cdr new-amt-pair) count new-stack))))
          (else (cc-linear-aux amount
                             (- kinds-of-coins 1)
                             count
                             (cons (cons (- amount (first-denomination kinds-of-coins))
                                         kinds-of-coins)
                                   stack)))))
  (cc-linear-aux amt kinds 0 '()))

Note that the definition of the first-denomination procedure is given in the text, as well as being downloadable from the SICP website.

This procedure returns the correct answer. There are a few issues with it, however:

  • It uses cons, let*, car, cdr, and null?, none of which are available to the student at this point in the text.
  • It simulates a stack by passing the list of delayed computations as an argument to subsequent invocations of itself. This means that it doesn’t generate a “linear iterative” process, since the amount of space used is not constant.

As for the use of car, cdr, and friends, I’m willing to let it pass since I still learned a lot by forcing myself to think in terms of iterating via recursive procedure calls.

As far as the shape of the process being generated is concerned, it can be described as “resembling linear recursion”. In the course of implementing this solution I wrote out the following simulation by hand of changing $0.12 using only two kinds of coins, a nickel and a penny.

Pre-visualizing the process to be generated by `cc-linear’

(12 2) 0 ()
( 2 2) 0 ((12 1))
(-3 2) 0 ((2 1) (12 1))
( 2 1) 0 ((12 1))
( 1 1) 0 ((2 0) (12 1))
( 0 1) 0 ((2 0) (12 1))
( 2 0) 1 ((12 1))
(12 1) 1 ()
(11 1) 1 ((12 0))
(10 1) 1 ((11 0) (12 0))
( 9 1) 1 ((10 0) (11 0) (12 0))
( 8 1) 1 (( 9 0) (10 0) (11 0) (12 0))
( 7 1) 1 (( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 6 1) 1 (( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 5 1) 1 (( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 4 1) 1 (( 5 0) ( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 3 1) 1 (( 4 0) ( 5 0) ( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 2 1) 1 (( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 1 1) 1 (( 2 0) ( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 0 1) 1 (( 1 0) ( 2 0) ( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))
( 1 0) 2 (( 2 0) ( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) ( 8 0) ( 9 0) (10 0) (11 0) (12 0))

Et cetera…

As you can see, the stack continues to grow, and passing that huge association list in as a procedure argument is not without cost, as the following performance analysis shows (for the definitions of with-timing-output and cc-faster, see here.):

Performance analysis of `cc’ vs. `cc-faster’ vs. `cc-linear’ – `cc-linear’ spends a fair amount of time in garbage collection, presumably due to all of the list manipulation.

(with-timing-output (cc 292 5))
Run time: 3.42
GC time: .03
Actual time: 3.478
;Value: 8526

(with-timing-output (cc-faster 292 5))
Run time: .05
GC time: 0.
Actual time: .047
;Value: 8526

(with-timing-output (cc-linear 292 5))
Run time: .11
GC time: .04
Actual time: .15
;Value: 8526

Attempting to Count Change, Iteratively

I have discovered what I believe is a nice optimization for the tree-recursive change-counting procedure, cc, mentioned in a previous post. (In the first edition, it’s Exercise 1.9.)

Unfortunately I’m still struggling with the real problem: how to implement this problem to evolve an iterative process, as requested by the book. My attempts to recast the computation using only variables passed as procedure arguments are failing. These attempts always need more “registers” than I have available, and I start thinking in terms of some kind of “storage” where I can stash the intermediate problems that I can’t work on yet.

Of course, these “registers” are arguments to the function, and this “storage” is also known as the stack. And growing the stack is what we’re trying to avoid.

Scannning ahead in the text, Exercise 1.10 asks the reader to

Draw the tree illustrating the process generated by the (cc) procedure … making change for 11 cents.

Here’s what appeared in my notebook – Note that the procedure calls surrounded by boxes are the ones that return a value of 1. In this case, the return value of (cc 11 3) is 4, shown by the 4 boxed procedure calls:

../img/cc-larger-tree-scaled.jpg

While drawing out this tree, I noticed that once the procedure tries to make change for an amount using only pennies, it makes (* 2 amount) unnecessary procedure calls. It’s easy to see that they just bang down the list from amount to 0 and finally return 1. My idea was to reformulate the procedure so that we perform this check first, like so:

(define (cc-faster amount kinds-of-coins)
  (cond ((= kinds-of-coins 1) 1) ; Are we making change with pennies? If so, there's only one way to do it!
        ((= amount 0) 1)
        ((or (< amount 0)
             (= kinds-of-coins 0)) 
         0)
        (else (+
               (cc-faster amount (- kinds-of-coins 1))
               (cc-faster (- amount (first-denomination kinds-of-coins)) 
                   kinds-of-coins)))))

Calling this procedure results in the following tree of procedure calls:

../img/cc-smaller-tree-scaled.jpg

Note that instead of 55 procedure calls, we now make only 15! And the discrepancy only grows wider as we count larger amounts of change, prompting me to make the following measurements using MIT Scheme – the with-timing-output macro is described at the bottom:

(with-timing-output (cc 700 5))
; Run time:     75.81
; GC time:      .34
; Actual time:  76.641
;Value: 207530

(with-timing-output (cc-faster 700 5))
; Run time:     .56
; GC time:      .02
; Actual time:  .608
;Value: 207530

This reduces running time from over a minute to less than a second, which is kind of exciting! It’s still not an iterative process, though.

Just for completeness, here’s how long the same procedure takes using m-cc, the memoizing version of cc described previously. Obviously looking stuff up in memory is faster than computing it over and over again:

(with-timing-output (m-cc 700 5))
; Run time:     0.
; GC time:      0.
; Actual time:  0.
;Value: 207530

And here’s the definition of the with-timing-output macro. It’s just sugar on top of the with-timings measurement procedure built into MIT Scheme:

(define-syntax with-timing-output
  (syntax-rules ()
    ((with-timing-output body)
     (with-timings 
         (lambda () body) 
       (lambda (run-time gc-time real-time)
         (display "Run time:\t")
         (write (internal-time/ticks->seconds run-time))
         (newline)
         (display "GC time:\t")
         (write (internal-time/ticks->seconds gc-time)) 
         (newline)
         (display "Actual time:\t")
         (write (internal-time/ticks->seconds real-time))
         (newline))))))

Next step: ITERATION!