Make your own iterators in Scheme using closures

idle-machines

Introduction

I recently bought a copy of Higher Order Perl (also known as HOP) by Mark Dominus. In chapter 4, he introduces iterators. At a high level, an iterator is a black box with an input funnel, an output funnel, and a big red “NEXT” button. You put something into the funnel at time T; then, at some later time T+n, you press the “NEXT” button and something useful comes out.

Iterators are useful in cases when you want to access a potentially huge list of things one at a time, but you don’t have access to a computer with infinite memory.

In this post we’ll look at a moderately useful iterator example; a directory walker that recurses down a directory tree on your file system. The implementation is given in Scheme (specifically, the scsh dialect); the code and the descriptions thereof are heavily inspired by Dominus’ excellent book. However, this post contains the work of an amateur. Any errors or omissions in the following code and explanations are mine.

Iterators are made with closures

In order to build a useful iterator, we need to have a basic understanding of closures. A closure is just a function packaged up with an environment. The environment allows us to store up state between function calls. As noted above, it’s easiest just to think of it as a little black box that we can pass around in our program.

In Scheme and Common Lisp, you build functions with LAMBDA. Here’s a very simple closure example in Scheme, translated from the first example given in HOP:

(define (upto m n)
  (set! m (- m 1)) ;needed since we don't have post-increment ($i++)
  (lambda ()
    (if (< m n)
        (begin (set! m (+ m 1)) m)
        #f)))

This is a function that returns a function (a closure). To use the closure returned by UPTO, assign it to a variable and call it whenever you want the next value. When the iterator is exhausted, it will return #f.

> (define 1to10 (upto 1 10))
> (1to10)
1
> (1to10)
2
...
> (1to10)
9
> (1to10)
10
> (1to10)
#f

A trivial file tree iterator

A more interesting and practical use of this technique is to walk a directory tree on your computer. In the Scheme procedure below, we take a list of directories and then build and return a closure that, when called, will walk the directory tree breadth-first, printing out the names of files as we go.

(define (dir-walk queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files)))
                          file))
                       ((file-regular? file) file)
                       (else #f)))))))

The important part to notice is the inner LAMBDA. This is where we create a closure by packaging up a procedure with an environment. The closure’s environment remembers the contents of the variable QUEUE in memory, so we can ask for the elements of QUEUE at our leisure.

Here it is running on my machine:

> (define dir-iter1 (dir-walk '("/Users/rloveland/Desktop/current/")))
> (dir-iter1)
"/Users/rloveland/Desktop/current/XProlog.jar"
> (dir-iter1)
"/Users/rloveland/Desktop/current/all-jira-issues.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/automate-campaign-setup.txt"
> (dir-iter1)
"/Users/rloveland/Desktop/current/buflocal.scm"
> (dir-iter1)
"/Users/rloveland/Desktop/current/bundle.js"

Just for fun, here’s a version of the same function in Common Lisp. It’s essentially the same as the scsh version, save that it uses the nice CL pathname machinery instead of appending raw strings, and also makes use of the CL-FAD convenience library.

(defun dir-walk (queue)
  (let ((file nil))
    (lambda ()
      (if (not (null queue))
          (progn
            (setf file (car queue))
            (setf queue (cdr queue))
            (cond ((cl-fad:directory-pathname-p file)
                   (let ((new-files (cl-fad:list-directory file)))
                     (setf queue (append queue (mapcar #'(lambda (filename)
                                                           (merge-pathnames filename file))
                                                       new-files)))
                     file))
                  ((cl-fad:file-exists-p file) file)
                  (t nil)))))))

Usage is also essentially the same:

CL-USER> (defvar *tree-walker* (dir-walk '(#P"/Users/rloveland/Desktop/current/")))
*TREE-WALKER*
CL-USER> (type-of *tree-walker*)
FUNCTION
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Desktop/current/"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.debuggerDefaults"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/.DS_Store"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hi"
CL-USER> (funcall *tree-walker*)
#P"/Users/rloveland/Dropbox/current/A.hs"

A slightly less trivial file tree iterator

Listing out files and directories is nice, but not that useful. We’d like a way to see only those files and directories that have some interesting property.

This can be accomplished by passing in another argument to our DIR-WALK function: this argument will be yet another function that will test the current file to see whether it’s interesting to us. It’s pretty easy to change DIR-WALK to accept a function argument, INTERESTING?. This arbitrary function is used to check the file to see if we care about it.

This time around, when we build our internal queue, we use a call to FILTER to make sure that only interesting files get added.

(define (dir-walk* interesting? queue)
  (let ((file #f))
    (lambda ()
      (if (not (null? queue))
          (begin (set! file (car queue))
                 (set! queue (cdr queue))
                 (cond
                  ((file-directory? file)
                        (let ((new-files (directory-files file)))
                          (set! queue (append queue (filter interesting? (map (lambda (filename)
                                                           (string-append file "/" filename))
                                                         new-files))))
                          (if (interesting? file) file)))
                  ((interesting? file) file)
                  (else #f)))
          #f))))

And here it is in use; in this example, we pass an INTERESTING? function that asks for only files that are not marked as executable:

> (define dir-iter2 (dir-walk* (lambda (f) (file-not-executable? f)) '("/home/rml/Desktop/current")))
> (dir-iter2)
> (dir-iter2)
"/home/rml/Desktop/current/A.hi"
> (dir-iter2)
"/home/rml/Desktop/current/A.hs"
> (dir-iter2)
"/home/rml/Desktop/current/A.o"
> (dir-iter2)
"/home/rml/Desktop/current/ABBREV.xlsx"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.class"
> (dir-iter2)
"/home/rml/Desktop/current/AddInts.java"

Conclusion

There are still bugs and unexpected behavior in DIR-WALK*. For example, on the first call to the resulting iterator there is no result due to the one-armed IF. There are also strange things that happen if we want to filter out directories from our results, since we mix together the FILE-DIRECTORY? check and the INTERESTING? check inside the iterator. However, despite these small nits, it can still do useful work on my machine 1, and it’s a good enough example of using closures to build iterators.

Here’s hoping you’ll enjoy playing with closures and iterators in your own code!

(Image courtesy zeitfanger.at under a Creative Commons License.)

Footnotes:

1 In the grandest internet tradition, “It works on my machine” ™.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s