Category Archives: SICP

The Debugger is a Notebook

penrose-tiling-based-modular-origami.jpg

My style of programming has changed since the ODB. I now write insanely fast, making numerous mistakes. This gives me something to search for with the ODB. It’s fun.

– Bil Lewis, creator of the Omniscient Debugger

I. Programmers and Poets

In this short essay I will explore some similarities between the way (some) programmers work when doing exploratory programming and the way poets write poems. I will also sprinkle in some attempts to make the case that a good debugger is core to the experience of composing a new program you don’t understand very well yet, and compare that with the experience of writing a poem. I know a lot about the former because I am not a very good programmer, so many programs that explore computer science concepts are “exploratory” for me; I know a lot about the latter because I am a reasonably good poet who has written many poems (which is to say, I have edited many poems, which is really more important).

This work is largely inspired by:

  • The experience of working on programs for SICP exercises and getting popped into the MIT Scheme debugger a lot! 1
  • Using the Scheme 48/scsh inspector a bunch while working on geiser-scsh
  • Writing a lot of poems

II. Generating {Program,Poem} Text

Computer program texts are compressed descriptions of computational processes designed to be experienced by computers and humans. Similarly, poems are compressed descriptions of cognitive and emotional processes designed to be experienced by humans.

Both artifacts strive to encapsulate something that was understood by the author(s) at one point in time and convey that understanding to a reader (human or machine) at another point in time. In the poetry world, there are a number of different ways to work. There are ostensibly some writers who think really hard for a moment and write down a fully-formed sentence. Then they think for a few moments more and write down another fully-formed sentence. And so on.

In reality, there are very few poets who work this way. Most people work using an approximation of what Sussman beautifully describes as “problem-solving by debugging almost-right plans”. 2 This is actually how human beings create new things! As my professor told our writing workshop, “I can’t teach you how to write. I can only teach you how to edit your own work”. Few people write well, and fewer edit well. But in the end, writing and editing are actually the same thing. When you first edit a poem, you may correct minor errors in the text. The more important work is “running the program” of the poem in your head, reading it over and over, reciting it aloud, testing whether it achieves the aesthetic purpose you have set for it. You will add a pronoun in one place, and replace an adjective in another. You might remove the last line, or add another stanza entirely. Do this for long enough, and you may find the poem has changed substantially over the course of having been “debugged”. It may also achieve a goal that you didn’t know existed when you began writing it. I suspect there is something very similar at work when people are doing exploratory programming sessions.

III. Debugger as Crutch/Enabler

Debuggers are seen by some as a crutch. I agree that debuggers are a crutch. But there’s a reason crutches were invented: without them, you would have to crawl along, dragging your broken leg behind you in the dirt. And we all have a “broken leg” of sorts when we’re working on a problem we don’t understand very well.

I’d like to propose a better metaphor for debuggers. The debugger is a notebook where you can sketch out different versions of your program. You may correct minor errors in a variable declaration, or change a parameter to a procedure. You might redefine an existing procedure as a higher-order procedure that replaces two or three more verbose ones. And so on. All inside the debugger!

A sufficiently powerful debugger will give you the freedom to sketch out an idea quickly, watch it break, and play around in the environment where the breakage occurred, reading variable bindings, entering new definitions, etc.

I think of this style of programming as “sketching in your notebook” because you don’t write poems by staring at a blank sheet of paper for two minutes and then carefully writing a fully-formed sentence. You have an initial idea or impulse, and then you express it as fast as you can! You write down whatever parts of it you can manage to catch hold of, since your brain is working and associating much faster than your pen can really capture. What you end up with is a pile of things, some of which you will discard, some of which are worth keeping just as they are, and some of which are worth keeping but non-optimal and will need to be rewritten. If you actually have an idea worth expressing, you are in much greater danger of not capturing something than you are of making a mess. You will always start by making a mess and then cleaning it up 3.

I submit that a sufficiently powerful, expressive debugging environment is as necessary to the programmer as a pocket notebook is to the poet.

(Image courtesy fdecomite under Creative Commons License.)

Footnotes:

1. For more information about how cool the MIT Scheme debugger is, see Joe Marshall’s informative blog post.

2. This technique is mentioned on his CSAIL page here. For more information, see the link to his paper elsewhere on this page.

3. You may enjoy an interesting essay with this title: Make a Mess, Clean it Up!

SICP Exercises 1.5 through 1.7

../img/origami-pockets.jpg

In this installment, we’ll look at exercises 1.5 through 1.7. Note that these exercises are taken from the first edition.

Exercise 1.5

The GOOD-ENOUGH? test used in computing square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers. Explain these statements, with tests showing how the test fails for small and large numbers. An alternative strategy for implementing GOOD-ENOUGH? is to watch how GUESS changes from one iteration to the next and to stop when the change is a very small fraction of the guess. Design a square-root procedure that uses this kind of test. Does this work better for small and large numbers?

First, here are the original definitions of GOOD-ENOUGH? and friends from the text:

(define (square x)
  (* x x))

(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess
      (sqrt-iter (improve guess x)
                 x)))

(define (improve guess x)
  (average guess (/ x guess)))

(define (average x y)
  (/ (+ x y) 2))

(define (good-enough? guess x)
  (< (abs (- (square guess) x)) 0.001))

(define (new-sqrt x)
  (sqrt-iter 1.0 x))

Let’s test the operation of NEW-SQRT. In the results below, NEW-SQRT is off by a factor of ten when taking the root of .00001 – a small number.

(sqrt .00001)
0.00316228 ;Three one thousandths.
(new-sqrt .00001)
0.0313565 ;Three one hundredths. Wat?

Why are we so far off? Let’s trace GOOD-ENOUGH?, since it’s used to decide when our result is, er, good enough.

> ,trace good-enough?
> (new-sqrt .00001)
[Enter (good-enough? 1. 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.500005 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.250012 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.125026 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.0625531 1.e-05)
 Leave good-enough? #f]
[Enter (good-enough? 0.0313565 1.e-05)
 Leave good-enough? #t]
0.0313565

In the last call to GOOD-ENOUGH?, we’re still off by a factor of ten! This is because inside GOOD-ENOUGH?, we’re checking our result against a fixed absolute tolerance (a “magic number”), which is dumb. As suggested by the text, let’s implement a better GOOD-ENOUGH? that watches how GUESS changes from one iteration to the next and returns true when the value of GUESS stops changing by much. This means we’ll have to update SQRT-ITER, too.

(define (good-enough? last-guess this-guess x)
  ;; Watch how GUESS changes from one iteration to the next.
  (< (abs (- this-guess last-guess)) .0001))

(define (sqrt-iter last-guess this-guess x)
  (if (good-enough? last-guess this-guess x)
      this-guess
      (sqrt-iter this-guess (improve this-guess x) x)))

(define (new-sqrt x)
  (sqrt-iter 0 1.0 x))

Now we’ll test our new GOOD-ENOUGH?:

> (sqrt .0001)
0.01
> (new-sqrt .0001)
0.01
> (sqrt .1)
0.316228
> (new-sqrt .1)
0.316228
> (sqrt .0000001)
0.000316228
> (new-sqrt .0000001)
0.000319804 ;Pretty darn close!

As we can see from the tests, this version of GOOD-ENOUGH? starts to lose accuracy when we work with millionths. I’m happy with that accuracy for this application. :-)
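
For what it’s worth, the text literally suggests stopping when the change is a small fraction of the guess. Here’s a sketch of that variant (the RELATIVE-GOOD-ENOUGH? name and body are mine, not from the book); rename it to GOOD-ENOUGH? to drop it into SQRT-ITER, and it should hold up even in the millionths, since the tolerance scales with the answer:

(define (relative-good-enough? last-guess this-guess x)
  ;; Stop when the change is a small fraction of the guess itself,
  ;; rather than comparing against a fixed magic number.
  (< (abs (- this-guess last-guess))
     (* .0001 (abs this-guess))))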

Exercise 1.6

Newton’s method for cube roots is based on the fact that if y is an approximation for the cube root of x, then a better approximation is given by the value

(x/y**2 + 2y) / 3

Use this formula to implement a cube-root procedure analogous to the square root procedure.

After doing the last problem, this is pretty much a “color by numbers” implementation:

(define (cube-iter last-guess this-guess x)
  (if (good-enough? last-guess this-guess x)
      this-guess
      (cube-iter this-guess (cube-improve this-guess x) x)))

(define (cube-improve guess x)
  (/ (+ (/ x (square guess)) (* 2 guess)) 3))

(define (cube-root x)
  (cube-iter 0 1.0 x))

Testing it is also straightforward:

  • Take the cube root Y of a number, X
  • Use EXPT to cube Y
  • See if the number you get back is X

The tests look good. Note that GOOD-ENOUGH? is still being traced (just for fun).

> (cube-root 16)
[Enter (good-enough? 0 1. 16)
 Leave good-enough? #f]
[Enter (good-enough? 1. 6. 16)
 Leave good-enough? #f]
[Enter (good-enough? 6. 4.14815 16)
 Leave good-enough? #f]
[Enter (good-enough? 4.14815 3.07538 16)
 Leave good-enough? #f]
[Enter (good-enough? 3.07538 2.61415 16)
 Leave good-enough? #f]
[Enter (good-enough? 2.61415 2.5232 16)
 Leave good-enough? #f]
[Enter (good-enough? 2.5232 2.51985 16)
 Leave good-enough? #f]
[Enter (good-enough? 2.51985 2.51984 16)
 Leave good-enough? #t]
2.51984
> ##
2.51984
> (expt ## 3)
16.
> (expt 16 3)
4096
> (cube-root ##)
[Enter (good-enough? 0 1. 4096)
 Leave good-enough? #f]
[Enter (good-enough? 1. 1366. 4096)
 Leave good-enough? #f]
[Enter (good-enough? 1366. 910.667 4096)
 Leave good-enough? #f]
[Enter (good-enough? 910.667 607.113 4096)
 Leave good-enough? #f]
[Enter (good-enough? 607.113 404.746 4096)
 Leave good-enough? #f]
[Enter (good-enough? 404.746 269.839 4096)
 Leave good-enough? #f]
[Enter (good-enough? 269.839 179.911 4096)
 Leave good-enough? #f]
[Enter (good-enough? 179.911 119.983 4096)
 Leave good-enough? #f]
[Enter (good-enough? 119.983 80.0836 4096)
 Leave good-enough? #f]
[Enter (good-enough? 80.0836 53.6019 4096)
 Leave good-enough? #f]
[Enter (good-enough? 53.6019 36.2098 4096)
 Leave good-enough? #f]
[Enter (good-enough? 36.2098 25.1812 4096)
 Leave good-enough? #f]
[Enter (good-enough? 25.1812 18.9407 4096)
 Leave good-enough? #f]
[Enter (good-enough? 18.9407 16.4329 4096)
 Leave good-enough? #f]
[Enter (good-enough? 16.4329 16.0113 4096)
 Leave good-enough? #f]
[Enter (good-enough? 16.0113 16. 4096)
 Leave good-enough? #f]
[Enter (good-enough? 16. 16. 4096)
 Leave good-enough? #t]
16.

Exercise 1.7

Each of the following two procedures defines a method for adding two positive integers in terms of the more primitive operators 1+, which increments its argument by 1, and -1+, which decrements its argument by 1.

(define (incr n) (+ n 1))               ;Modern 1+
(define (decr n) (- n 1))               ;Modern -1+

;; The FOO and BAR designators are just used to differentiate these
;; procedures from our built-in addition procedure (and from each
;; other).
(define (foo+ a b)
  (if (= a 0)
      b
      (incr (foo+ (decr a) b))))

(define (bar+ a b)
  (if (= a 0)
      b
      (bar+ (decr a)
            (incr b))))

Using the substitution model, illustrate the process generated by each procedure in evaluating (+ 4 5). Are these processes iterative or recursive?

Let’s start by tracing FOO+. (The book asks about (+ 4 5); I’m tracing (foo+ 3 4) instead, but the shape of the process is the same.) This wasn’t strictly necessary, but I like tracing things.

> (foo+ 3 4)
[Enter (foo+ 3 4)
[Enter (foo+ 2 4)
[Enter (foo+ 1 4)
[Enter (foo+ 0 4)
 Leave foo+ 4]
 Leave foo+ 5]
 Leave foo+ 6]
 Leave foo+ 7]
7

As you can see, we keep deferring operations until we reach our base case. You can see this in the way we enter the recursive FOO+ call four times: each time we decrement the variable A, while B remains the same. Then, when we hit our base case (A equal to 0), we perform all the INCR calls we’ve been storing up. You can tell this is happening because every time we `Leave’ a call to FOO+, we’re still adding to the value.

Here’s another, possibly better way to visualize it:

(foo+ 3 4)
(incr (foo+ (decr 3) 4))
(incr (incr (foo+ (decr 2) 4)))
(incr (incr (incr (foo+ (decr 1) 4))))
(incr (incr (incr 4)))
(incr (incr 5))
(incr 6)
7

Therefore this procedure takes O(n) time and O(n) space, where n is the first argument, A (assuming it’s called as (FOO+ A B)).

Now let’s look at BAR+. Here’s the text of the procedure again:

(define (bar+ a b)
  (if (= a 0)
      b
      (bar+ (decr a)
            (incr b))))

BAR+ looks like it will generate an iterative process. Its self-call appears to be in tail position. Let’s see how it looks in the REPL:

> (bar+ 3 4)
[Enter (bar+ 3 4)
[Enter (bar+ 2 5)
[Enter (bar+ 1 6)
[Enter (bar+ 0 7)
 Leave bar+ 7]
 Leave bar+ 7]
 Leave bar+ 7]
 Leave bar+ 7]
7

In this case, we can be pretty sure we’re generating an iterative process (with no deferred computations being built up) because when we `Leave’ the call to BAR+ we’re not doing anything additional to the result, it’s just being passed back up to the top-level unchanged.

Here’s a trace written out by hand:

(BAR+ 4 3)
(BAR+ 3 4)
(BAR+ 2 5)
(BAR+ 1 6)
(BAR+ 0 7)
7

As you can see, we don’t defer operations that take up additional space. Therefore we’re O(1) in space, and O(n) in time for this implementation.

(Image courtesy SallySetsForth under Creative Commons license.)

SICP Exercises 1.1 through 1.4

../img/necker-cube.png

Some time ago, I got a copy of the first edition of SICP on eBay. It’s a fascinating book, for many reasons. I’ve been slowly working through it in fits and starts, but I’ve decided that this year I’m going to really focus on finishing this book and doing all of the exercises.

Therefore, for the sake of completeness, I’d like to backtrack a bit and record my work on SICP exercises 1.1 through 1.4. My goal is to be able to do every single exercise from SICP (1st ed.) and explain it here. (You can see all of the SICP-related posts here.)

For each of the exercises below, I’ll first provide the problem text from the book, and then my answer (or attempt at an answer).

Exercise 1.1

Below is a sequence of expressions. What is the result printed by
the interpreter in response to each expression? Assume that the
sequence is to be evaluated in the order in which it is presented.

Here are the expressions, followed by my answers.

Expression                                          Result
10                                                  10
(+ 5 3 4)                                           12
(- 9 1)                                             8
(/ 6 2)                                             3
(+ (* 2 4) (- 4 6))                                 6
(define a 3)                                        UNDEFINED
(define b (+ a 1))                                  UNDEFINED
(+ a b (* a b))                                     19
(= a b)                                             #f
(if (and (> b a) (< b (* a b))) b a)                4
(cond ((= a 4) 6) ((= b 4) (+ 6 7 a)) (else 25))    16

Exercise 1.2

Define a procedure that takes three numbers as arguments and
returns the sum of the squares of the two larger numbers.

Here is the procedure definition, along with a couple of helpers.

(define (square x)
  (* x x))

(define (sos x y)
  (+ (square x) (square y)))

(define (sos-largest-two a b c)
  (let* ((sorted (sort (list a b c) >))
         (large (first sorted))
         (medium (second sorted)))
    (sos medium large)))
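
SORT, FIRST, and SECOND haven’t been introduced at this point in the text, so here’s an alternative sketch (mine) that needs nothing but comparisons: find the smallest argument and sum the squares of the other two.

(define (sos-largest-two* a b c)
  (cond ((and (<= a b) (<= a c)) (sos b c)) ; A is smallest: keep B and C.
        ((and (<= b a) (<= b c)) (sos a c)) ; B is smallest: keep A and C.
        (else (sos a b))))                  ; Otherwise C is smallest.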

Exercise 1.3

Ben Bitdiddle has invented a test to determine whether the
interpreter he is faced with is using applicative-order evaluation
or normal-order evaluation. He defines the following two
procedures:

(define (p) (p))

(define (test x y)
  (if (= x 0)
      0
      y))

Then he evaluates the expression

(test 0 (p))   

What behavior will Ben observe with an interpreter that uses
applicative-order evaluation? What behavior will he observe with an
interpreter that uses normal-order evaluation? Explain your answer.
(Assume that the evaluation rule for the special form IF is the
same whether the interpreter is using normal or applicative order:
The predicate expression is evaluated first, and the result
determines whether to evaluate the consequent or the alternative
expression.)

An interpreter that uses applicative-order evaluation will enter an infinite recursion.

An interpreter that uses normal-order evaluation will return 0.

The reason is that an applicative-order interpreter follows the usual “evaluate the arguments, then apply” recipe (I believe this is also known as “call-by-value”). Evaluating the operand (p) means calling p, which immediately calls itself again, forever: we never even reach the body of test.

The normal-order interpreter never gets around to evaluating the second argument to test because it doesn’t need to (and of course, once inside test, the if form behaves as in the problem description).
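
Here’s the normal-order evaluation sketched out with the substitution model:

(test 0 (p))
(if (= 0 0) 0 (p))   ; operands substituted, not evaluated
0                    ; the predicate is true, so (p) is never touched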

Exercise 1.4

Alyssa P. Hacker doesn’t see why if needs to be provided as a
special form. “Why can’t I just define it as an ordinary
procedure in terms of cond?” she asks. Alyssa’s friend Eva Lu
Ator claims this can indeed be done, and she defines a new version
of if:

(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
        (else else-clause)))

Eva demonstrates the program for Alyssa:

==> (new-if (= 2 3) 0 5)
5

==> (new-if (= 1 1) 0 5)
0

Delighted, Alyssa uses new-if to rewrite the square-root program:

(define (sqrt-iter guess x)
  (new-if (good-enough? guess x)
          guess
          (sqrt-iter (improve guess x) x)))

What happens when Alyssa attempts to use this to compute square
roots? Explain.

Running this procedure exceeded MIT Scheme’s maximum recursion depth. I decided to trace the procedure to better see what was happening (note that I renamed the procedures to alyssa-sqrt-iter and alyssa/sqrt to differentiate them from the original versions).

(trace alyssa-sqrt-iter)
;Unspecified return value

(alyssa/sqrt 100)
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 1.
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 50.5
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 26.24009900990099
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 15.025530119986813
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 10.840434673026925
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 10.032578510960604
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 10.000052895642693
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 10.000000000139897
;;           100]
;; [Entering #[compound-procedure 12 alyssa-sqrt-iter]
;;     Args: 10.
;;           100]
;;  ... et cetera ...

;Quit!

The last line, where the arguments to alyssa-sqrt-iter are 10 and 100, repeats hundreds of times. Even though the call to good-enough? (shown again below) returns #t, new-if still evaluates the `else-clause’ and recurses forever, instead of returning the current value of guess (as it should).

(new-if (good-enough? guess x)
        guess
        (sqrt-iter (improve guess x) x))

This problem illustrates the reason we need if to be a special form: it must evaluate only one of its two branches. An ordinary procedure like new-if has all of its arguments (including the recursive call) evaluated before its body ever runs.
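
You can demonstrate the applicative-order behavior directly. (Scheme doesn’t specify the order in which arguments are evaluated, but it does evaluate all of them.)

;; Both strings are printed -- in whichever order your Scheme
;; evaluates arguments -- before NEW-IF's COND ever runs. An
;; ordinary IF would have evaluated only the consequent.
(new-if #t (display "then ") (display "else "))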

SICP Exercises 1.10 through 1.13

../img/sinayskaya-12.jpg

Intro

In this post I’ll share some of the fun I had doing exercises 1.10 through 1.13 from Structure and Interpretation of Computer Programs. Remember that I’m working from the first edition, since it’s what I have. Fortunately it’s not too different from the second edition so far; I’ve written the exercises out so you don’t need the book to follow along.

Exercise 1.10

Draw the tree illustrating the process generated by the count-change procedure of section 1.2.2 in making change for 11 cents. What are the orders of growth of the space and time used by this process as the amount to be changed increases?

Having already drawn the tree out by hand for an earlier post, I noticed a lot of inefficient procedure calls, and sidetracked a little to implement a version called cc-faster which massively reduced their number. For example, when counting change for 600 cents (using all 5 types of coins), the number of procedure calls was reduced from 29,806,269 to 233,609. In other words, it made approximately 128 times fewer recursive calls for the same input! Even so, cc-faster still generates a tree-recursive process and is destined to blow up. Depending on your application it might be good enough though.

Space

But I digress. On p. 11 of the text, the authors state that

In general, the time required by a tree-recursive process will be proportional to the number of nodes in the tree, while the space required will be proportional to the maximum depth of the tree.

Therefore I’m happy to guesstimate the order of growth of the space used by (cc 11 5) to be somewhere in the neighborhood of O(n), where n is the amount to be changed. (Looking at the earlier post mentioned above, it looks like cc-faster is also O(n), but in the number of different kinds of coins.)

Time

Speaking of guesstimation, I decided to instrument the cc procedure in the following hackish manner:

;;; `C-CC': A VERSION OF `CC' THAT COUNTS HOW OFTEN IT CALLS ITSELF
;; Yes, a gross hack. But written quickly! :-)

(define *call-count* 0)

(define-syntax incf!
  (syntax-rules ()
    ((incf! var)
     (begin (set! var (+ 1 var))
            var))))

(define (c-cc amount kinds-of-coins)
  (begin (incf! *call-count*)
         (cond 
          ;; ((= kinds-of-coins 1) 1) ; Uncomment this to greatly
          ;; lessen the # of procedure calls by using the `cc-faster'
          ;; algorithm.
          ((= amount 0) 1)
          ((or (< amount 0) (= kinds-of-coins 0)) 0)
          (else (+
                 (c-cc amount (- kinds-of-coins 1))
                 (c-cc (- amount (first-denomination kinds-of-coins))
                       kinds-of-coins))))))

(define (print-call-count-statistics n #!optional display-proc)
  (set! *call-count* 0)
  ; Use the special version of `cc' above
  (c-cc n 5)
  (display *call-count*)
  (newline)
  ; Hand the count to DISPLAY-PROC (if one was given) before resetting.
  (if (not (default-object? display-proc))
      (begin (display (apply display-proc (list *call-count*)))
             (newline)))
  (set! *call-count* 0))
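
The optional display-proc can be any one-argument procedure. For example, here’s a toy invocation (my own) that also prints the count rescaled by a thousand:

(print-call-count-statistics 100 (lambda (count) (/ count 1000.)))
15499
15.499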

This allowed me to do the following measurements:

;; CALL COUNT STATISTICS FOR THE ORIGINAL `CC' ALGORITHM
; (for-each (lambda (n) (print-call-count-statistics n)) '(50 100 150
; 200 250 300 350 400 450 500 550 600))
; Note: the number in the right column comes from dividing the
; previous value by the current one.
;    1571
;   15499 0.101361378153
;   71775 0.215938697318
;  229589 0.312623862642
;  587331 0.390902234004
; 1292591 0.454382708838
; 2552159 ...
; 4642025
; 7917379
;12822611
;19901311
;29806269 0.667688767085
;Unspecified return value

I plotted the results of my measurements and tried to do a little curve-fitting by hand. I initially settled on x^e * ϕ, but it didn’t grow steeply enough to match cc‘s growth as you can see in the plot below. (This guess seemed reasonable at first because I only went up to 450 with cc.)

Then I started dividing the number of nodes in the call tree for each input by the number of nodes for the next input in the list (see the right column in the code block above). At first glance it looks like this ratio asymptotically approaches 1.

../img/call-counts.jpg

So what’s my final answer? The ratios creeping toward 1 suggest cc’s growth is polynomial (of embarrassingly high degree) rather than truly exponential, but either way it blows up fast. Don’t use it! :-)

Exercise 1.11

Design a procedure that evolves an iterative exponentiation process that uses successive squaring and works in logarithmic time, as does fast-exp.

Unlike Exercise 1.9, the authors walk you through how to do this exercise. Even with their patient description, I managed to implement a completely wrong solution. I finally figured out what I was doing wrong by going back and reading the text, and then modeling the procedure by hand on paper. It turns out I hadn’t understood their explanation after all! So I guess the late Prof. Dijkstra really was onto something.

While trying to debug it, I wrote fast-expt-tester, which showed me how hopelessly off the rails I was, but not necessarily how to fix it. Thank goodness for pen and paper, the ultimate debugging tools (at least for a student-level problem like this).

; Call it like so: (fast-expt-iter 7 7 1)
(define (fast-expt-iter b n a)
  (cond 
   ((= n 0) a)
   ((even? n)
    (fast-expt-iter (square b) (/ n 2) a))
   (else (fast-expt-iter b (- n 1) (* b a)))))
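
The key fact that makes the iterative version work is the invariant: the quantity a * b^n stays the same from call to call, so when n reaches 0, a holds the answer. A hand trace of the suggested call:

(fast-expt-iter 7 7 1)         ; 1 * 7^7
(fast-expt-iter 7 6 7)         ; 7 * 7^6      (n was odd)
(fast-expt-iter 49 3 7)        ; 7 * 49^3     (n was even)
(fast-expt-iter 49 2 343)      ; 343 * 49^2   (n was odd)
(fast-expt-iter 2401 1 343)    ; 343 * 2401   (n was even)
(fast-expt-iter 2401 0 823543) ; n = 0, so return a
823543                         ; = 7^7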

(define (fast-expt-tester rounds)
  (let loop ((n rounds))
    (if (= n 0)
        (display "all done")
        (let* ((z1 (random 20))
               (z2 (random 20))
               (correct (expt z1 z2))
               (maybe-correct (fast-expt-iter z1 z2 1))
               (message (string-append
                         "expt's answer\n"
                         (number->string correct)
                         "\nis not equal to fast-expt-iter's answer\n"
                         (number->string maybe-correct)
                         "\nfor inputs: "
                         (number->string z1)
                         " "
                         (number->string z2)
                         "\n")))
          ;; Complain only on a mismatch, but keep looping either way.
          (if (not (= correct maybe-correct))
              (begin (display message)
                     (newline)))
          (loop (- n 1))))))

Exercise 1.12

… One can perform integer multiplication by means of repeated addition. … Now suppose we include, together with addition, the operations double, which doubles an integer, and halve, which divides an (even) integer by 2. Using these, design a multiplication procedure analogous to fast-exp that works in logarithmic time.

This was slightly simpler to understand than 1.11 (at least for me).

(define (halve n)
  (/ n 2))

(define (double n)
  (+ n n))

(define (rml/fast-* a b)
  (cond ((= b 0) 0)
        ((even? b)
         (rml/fast-* (double a) (halve b)))
        (else
         (+ a (rml/fast-* a (- b 1))))))

(* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/fast-* 123123123123 123123123123)
;Value: 15159303447561417273129

Exercise 1.13

Using the results of exercises 1.11 and 1.12, devise a procedure that generates an iterative process for multiplying two integers in terms of adding, doubling, and halving and works in logarithmic time.

I have to admit it, expressing iteration using recursive procedure definitions is great fun!

(define (rml/iter-* a b c)
  (cond ((= b 0) c)
        ((even? b)
         (rml/iter-* (double a) (halve b) c))
        (else
         (rml/iter-* a (- b 1) (+ a c)))))

(* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/fast-* 123123123123 123123123123)
;Value: 15159303447561417273129

(rml/iter-* 123123123123 123123123123 0)
;Value: 15159303447561417273129
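
Same idea as Exercise 1.11, with a different invariant: (a * b) + c never changes from call to call, so when b reaches 0, c is the product.

(rml/iter-* 6 7 0)   ; (* 6 7)  + 0  = 42
(rml/iter-* 6 6 6)   ; (* 6 6)  + 6  = 42  (b was odd)
(rml/iter-* 12 3 6)  ; (* 12 3) + 6  = 42  (b was even)
(rml/iter-* 12 2 18) ; (* 12 2) + 18 = 42  (b was odd)
(rml/iter-* 24 1 18) ; (* 24 1) + 18 = 42  (b was even)
(rml/iter-* 24 0 42) ; b = 0, so return c
42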

(Origami image courtesy Melisande under Creative Commons license.)

An Iterative Change-Counting Procedure


… the process is really iterative: Its state is captured completely
by its three state variables, and an interpreter need keep track of
only three variables in order to execute the program.

– SICP (1st ed.), p.33

After much head-scratching and experimentation, I have implemented the following iterative change-counting procedure, which I call cc-linear (See also previously, previously):

(define (cc-linear amt kinds)
  (define (cc-linear-aux amount kinds-of-coins count stack)
    (cond ((or (= amount 0)
               (= kinds-of-coins 1))
           (if (null? stack)
               (+ count 1)
               (let* ((new-amt-pair (car stack))
                      (new-stack (cdr stack)))
                 (cc-linear-aux (car new-amt-pair) (cdr new-amt-pair) (+ 1 count) new-stack))))
          ((or (< amount 0)
               (= kinds-of-coins 0))
           (if (null? stack)
               count
               (let ((new-amt-pair (car stack))
                     (new-stack (cdr stack)))
                 (cc-linear-aux (car new-amt-pair) (cdr new-amt-pair) count new-stack))))
          (else (cc-linear-aux amount
                             (- kinds-of-coins 1)
                             count
                             (cons (cons (- amount (first-denomination kinds-of-coins))
                                         kinds-of-coins)
                                   stack)))))
  (cc-linear-aux amt kinds 0 '()))

Note that the definition of the first-denomination procedure is given in the text, as well as being downloadable from the SICP website.
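
As a quick sanity check, it agrees with the book’s tree-recursive cc on a small input (there are 4 ways to make change for 11 cents):

(cc 11 5)        ;Value: 4
(cc-linear 11 5) ;Value: 4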

This procedure returns the correct answer. There are a few issues with it, however:

  • It uses cons, let*, car, cdr, and null?, none of which are available to the student at this point in the text.
  • It simulates a stack by passing the list of delayed computations as an argument to subsequent invocations of itself. This means that it doesn’t generate a “linear iterative” process, since the amount of space used is not constant.

As for the use of car, cdr, and friends, I’m willing to let it pass since I still learned a lot by forcing myself to think in terms of iterating via recursive procedure calls.

As far as the shape of the process being generated is concerned, it can be described as “resembling linear recursion”. In the course of implementing this solution I wrote out the following simulation by hand of changing $0.12 using only two kinds of coins, a nickel and a penny.

Pre-visualizing the process to be generated by `cc-linear’

(12 2) 0 ()
( 7 2) 0 ((12 1))
( 2 2) 0 (( 7 1) (12 1))
(-3 2) 0 (( 2 1) ( 7 1) (12 1))
( 2 1) 0 (( 7 1) (12 1))
( 1 1) 0 (( 2 0) ( 7 1) (12 1))
( 0 1) 0 (( 1 0) ( 2 0) ( 7 1) (12 1))
( 1 0) 1 (( 2 0) ( 7 1) (12 1))
( 2 0) 1 (( 7 1) (12 1))
( 7 1) 1 ((12 1))
( 6 1) 1 (( 7 0) (12 1))
( 5 1) 1 (( 6 0) ( 7 0) (12 1))
( 4 1) 1 (( 5 0) ( 6 0) ( 7 0) (12 1))
( 3 1) 1 (( 4 0) ( 5 0) ( 6 0) ( 7 0) (12 1))
( 2 1) 1 (( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) (12 1))
( 1 1) 1 (( 2 0) ( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) (12 1))
( 0 1) 1 (( 1 0) ( 2 0) ( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) (12 1))
( 1 0) 2 (( 2 0) ( 3 0) ( 4 0) ( 5 0) ( 6 0) ( 7 0) (12 1))

Et cetera…

As you can see, the stack continues to grow, and passing that huge association list in as a procedure argument is not without cost, as the following performance analysis shows (for the definitions of with-timing-output and cc-faster, see here.):

Performance analysis of `cc’ vs. `cc-faster’ vs. `cc-linear’ – `cc-linear’ spends a fair amount of time in garbage collection, presumably due to all of the list manipulation.

(with-timing-output (cc 292 5))
Run time: 3.42
GC time: .03
Actual time: 3.478
;Value: 8526

(with-timing-output (cc-faster 292 5))
Run time: .05
GC time: 0.
Actual time: .047
;Value: 8526

(with-timing-output (cc-linear 292 5))
Run time: .11
GC time: .04
Actual time: .15
;Value: 8526

Attempting to Count Change, Iteratively

I have discovered what I believe is a nice optimization for the tree-recursive change-counting procedure, cc, mentioned in a previous post. (In the first edition, it’s Exercise 1.9.)

Unfortunately I’m still struggling with the real problem: writing a procedure that evolves an iterative process, as requested by the book. My attempts to recast the computation using only variables passed as procedure arguments are failing. These attempts always need more “registers” than I have available, and I start thinking in terms of some kind of “storage” where I can stash the intermediate problems that I can’t work on yet.

Of course, these “registers” are arguments to the function, and this “storage” is also known as the stack. And growing the stack is what we’re trying to avoid.

Scanning ahead in the text, Exercise 1.10 asks the reader to

Draw the tree illustrating the process generated by the (cc) procedure … making change for 11 cents.

Here’s what appeared in my notebook – Note that the procedure calls surrounded by boxes are the ones that return a value of 1. In this case, the return value of (cc 11 3) is 4, shown by the 4 boxed procedure calls:

../img/cc-larger-tree-scaled.jpg

While drawing out this tree, I noticed that once the procedure tries to make change for an amount using only pennies, it makes (* 2 amount) unnecessary procedure calls. It’s easy to see that they just bang down the list from amount to 0 and finally return 1. My idea was to reformulate the procedure so that we perform this check first, like so:

(define (cc-faster amount kinds-of-coins)
  (cond ((= kinds-of-coins 1) 1) ; Are we making change with pennies? If so, there's only one way to do it!
        ((= amount 0) 1)
        ((or (< amount 0)
             (= kinds-of-coins 0))
         0)
        (else (+
               (cc-faster amount (- kinds-of-coins 1))
               (cc-faster (- amount (first-denomination kinds-of-coins))
                          kinds-of-coins)))))

Calling this procedure results in the following tree of procedure calls:

../img/cc-smaller-tree-scaled.jpg

Note that instead of 55 procedure calls, we now make only 15! And the discrepancy only grows wider as we count larger amounts of change, prompting me to make the following measurements using MIT Scheme – the with-timing-output macro is described at the bottom:

(with-timing-output (cc 700 5))
; Run time:     75.81
; GC time:      .34
; Actual time:  76.641
;Value: 207530

(with-timing-output (cc-faster 700 5))
; Run time:     .56
; GC time:      .02
; Actual time:  .608
;Value: 207530

This reduces running time from over a minute to less than a second, which is kind of exciting! It’s still not an iterative process, though.

Just for completeness, here’s how long the same procedure takes using m-cc, the memoizing version of cc described previously. Obviously looking stuff up in memory is faster than computing it over and over again:

(with-timing-output (m-cc 700 5))
; Run time:     0.
; GC time:      0.
; Actual time:  0.
;Value: 207530

And here’s the definition of the with-timing-output macro. It’s just sugar on top of the with-timings measurement procedure built into MIT Scheme:

(define-syntax with-timing-output
  (syntax-rules ()
    ((with-timing-output body)
     (with-timings 
         (lambda () body) 
       (lambda (run-time gc-time real-time)
         (display "Run time:\t")
         (write (internal-time/ticks->seconds run-time))
         (newline)
         (display "GC time:\t")
         (write (internal-time/ticks->seconds gc-time)) 
         (newline)
         (display "Actual time:\t")
         (write (internal-time/ticks->seconds real-time))
         (newline))))))

Next step: ITERATION!

Trading Space for Time

Working through SICP for the first time. Found a used first edition (hardcover, no less) on eBay. Some misguided person had marked each of the covers up with a giant X in black marker.

The first edition differs from the second in a few ways. For example, at the end of the tree-recursive change-counting problem on pages 39-40 of the second edition, the authors say “it is not obvious how to design a better algorithm for computing the result, and we leave this problem as a challenge”. They also note that memoization can be used to reduce the number of redundant computations that occur.

In other words, we can trade space for time.

In the first edition, Exercise 1.9 asks you to design a procedure that evolves an iterative process for solving the change-counting problem. I’m working on a solution to that problem; in the meantime, I implemented a procedure that I call memoizing-apply which does the following:

  1. See if an answer has already been computed and stored in a table *mr* (short for `memoized-results’).
  2. If the answer hasn’t already been computed, compute it. Then stick the answer in the table *mr*.

This process is known as memoization or tabulation, and *mr* serves as a lookup table.

We define a table *mr*, and a procedure memoized? that checks if the procedure has already been called with the current arguments. This assumes that the procedure always returns the same answer when you give it the same arguments.

(define *mr* '())

(define (memoized? args)
  (let ((result (assoc args *mr*)))
    (if result
        (cdr result)
        #f)))

Memoizing-apply does what we described above: it checks if the answer has already been computed (is already in the lookup table), and if so it just returns the answer. If not, it computes the answer and sticks it in the table.

(define (memoizing-apply f xs)
  (or (memoized? xs)
      (let ((result (apply f xs)))
        (push! (cons xs result) *mr*)
        result)))

Note that push! is not a standard Scheme procedure; here’s the definition:

(define-syntax push!
  (syntax-rules ()
    ((push! item seq)
     (set! seq (cons item seq)))))
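
Here’s a quick interactive check, with the built-in EXPT standing in for an expensive pure function:

(memoizing-apply expt '(2 10)) ; Computed, then cached in *mr*.
;Value: 1024

(memoizing-apply expt '(2 10)) ; This time it's just a table lookup.
;Value: 1024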

Let’s try it out. (Note that you can get the definitions for the procedures cc and first-denomination from the freely available copy of the book at the SICP website.)

Here’s my modified version of cc, the change-counting procedure. I call it m-cc, and it uses the memoizing-apply procedure above to avoid doing redundant work:

(define (m-cc amount kinds-of-coins)
  (cond ((= amount 0) 1)
        ((or (< amount 0) (= kinds-of-coins 0)) 0)
        (else (+
               (memoizing-apply m-cc (list amount (- kinds-of-coins 1)))
               (memoizing-apply m-cc (list (- amount (first-denomination kinds-of-coins))
                                           kinds-of-coins))))))

The above code is different from the original in that it uses memoizing-apply to make the recursive call to itself. In my testing this is much faster than the equivalent cc version.

The memoization technique works very nicely, but it’s only appropriate in situations where memory is freely available. On most computers, most of the time, it’s probably the best way to go, since the tree-recursive implementation stays readable and easy to understand. The exercise that asks for an iterative process will likely require something more complicated, but I’m not sure yet.