Debugging in Lisp
Column Tag: AI Applications
By Andy Cohen, AI Contributing Editor
[Since the last discussion of MacScheme, Semantic Microsystems has updated its product. The company is now shipping version 1.1, which includes access via MacScheme primitives to the QuickDraw routines in ROM. Version 1.1 is also compatible with the Mac+ and HFS. Registered owners of earlier versions can update by sending $20 and their original MacScheme disk to Semantic Microsystems. Additionally, Semantic Microsystems announced the first part of a new development system called MacScheme+Toolsmith. This product includes Pascal-like access to the Toolbox routines in ROM from MacScheme. It also allows the programmer to design an apparently freestanding application without the MacScheme environment. A minimized run-time system will be required; however, the application and the run-time system may be sold without royalties to Semantic Microsystems. MacScheme+Toolsmith will be sold at $250, while MacScheme is still $125 (which makes us wonder, why $20 for the update?!). We've been pleased with the MacScheme product and the developers' responses to help requests. This month we feature a simple description of how to use the debugger in MacScheme. - Andy Cohen, A.I. Contributing Editor]
Semantic Microsystems, Inc.
Most Lisp systems include good debugging tools, in part because Lisp has traditionally been an interactive, interpreted language. I am going to explain the sorts of debugging tools that are typically provided by Lisp systems, how to take advantage of them and, in particular, how to use the debugging aids provided by the MacScheme Lisp system.
What kinds of debugging help can a system give you? For starters, it can tell you that an error exists. It can also help you find the error that it has told you about. Of course, it can't detect all errors. For example, only you can tell if your program is doing the right thing or generating the output you desire. However, if you know that a program produces the wrong output, or at least questionable output, a debugging tool can help you figure out why. Once you have located the source of the error, Lisp debugging tools can help you fix your buggy program and test it conveniently.
There are several standard debugging tools. Compile-time and run-time error checks discover errors. Inspectors let you examine the scene of the crime. If you suspect that there might be something wrong with a particular procedure, then tracers, steppers, and the ability to insert breakpoints in a program can help you see what is happening as the program is executed. Lastly, editors and incremental compilers can make it easy to modify small parts of your program and move on to the next bug. (One of the reasons Lisp interpreters are popular is that they give you incremental compilation for free.)
MacScheme has all of these tools except a stepper. I'll be using MacScheme in the examples that follow.
Error checks can be performed at compile-time or at run-time. A compiler checks for illegal syntax and some kinds of domain errors. (Compile-time checks for domain errors are usually called type-checks.) In MacScheme, when a compile-time error is detected, it is reported and the system performs a reset. For example, "if" is a reserved word, so assigning to it is a syntax error:
>>> (set! if 3)
ERROR: Keywords of special forms may
not be used as variables
(set! if 3)
If a procedure is known to the compiler, then the compiler can make sure it is given the right number of arguments. This is an example of a compile-time domain check.
>>> (cons 3 4 5)
ERROR: Wrong number of arguments to
(cons 3 4 5)
MacScheme, like most Lisps, performs very few compile-time domain checks. This is because Lisp associates types with run-time objects instead of with variables, so compile-time checking is not possible in general. This puts Lisp at a disadvantage compared to Pascal because run-time domain checks are performed only on the pieces of code that are actually executed. If you have a program that contains several branches, you cannot be sure that it is free of domain errors until you test each of the branches. In Pascal, on the other hand, many domain errors are caught at compile-time, placing less of a burden on testing. Lisp must make up for this disadvantage by having excellent run-time domain checking.
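To see why every branch has to be exercised, consider a small sketch in portable Scheme. The procedure size-label is hypothetical, invented only for this illustration. One of its branches contains a domain error, but nothing complains until that branch is actually evaluated.

```scheme
;; size-label is a hypothetical procedure, used only for illustration.
(define (size-label n)
  (if (< n 100)
      "small"
      (+ "large" n)))   ; latent domain error: + applied to a string

(size-label 5)          ; returns "small"; the faulty branch is never evaluated

;; (size-label 500)     ; would signal a run-time domain error in the else branch
```

A test suite that only ever passes small numbers would never uncover the bad call to +.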
Interpretive Lisp systems excel at run-time error checks. An interpreter will detect undefined variables and domain errors such as passing the wrong number of arguments to a procedure, using a non-procedural object as a procedure, attempting to add two strings, etc. Most systems interrupt execution, print a helpful error message on the screen, and place you inside an inspector so you can examine the context of the error. This is a far cry from a core dump or a bomb window.
The following example is from page 72 of The Little Lisper. vec+ is a function which takes two vectors and returns the vector that is their sum. (The Little Lisper represents vectors as lists instead of using Scheme's built-in vector data type.)
>>> (define vec+
      (lambda (vec1 vec2)
        (cond ((null? vec1) ())
              (else
               (cons (+ (car vec1) (car vec2))
                     (vec+ (cdr vec1)
                           (cdr vec2)))))))
>>> (vec+ '(3 4 5) '(1 1 1))
(4 5 6)
>>> (vec+ '(1 2 3 4) '(0 2))
ERROR: Bad argument to cdr
Entering debugger. Enter ? for help.
Oops! Let's track down the cause of this error message. We first ask for the name of the procedure in which the error occurred.
debug:> i ; "i" means "identify procedure"
So the error occurred in our vec+ routine. Let's look at the code for vec+.
debug:> c ; "c" means "show code"
(lambda (vec1 vec2)
  (cond ((null? vec1) ())
        (else
         (cons (+ (car vec1) (car vec2))
               (vec+ (cdr vec1)
                     (cdr vec2))))))
Since the error message told us that cdr was passed a bad argument of (), we should think about the calls to cdr in vec+. (In Scheme, cdr is defined only on non-null lists. When applied to a non-null list, it returns a list consisting of everything in its argument except the first element.) Let's check out the values of vec1 and vec2 since these are the two possible arguments to cdr that could be causing the error.
debug:> a ; "a" means "all variables"
vec1 = (3 4)
vec2 = ()
So vec2 is the problem. Looking more carefully at the code, we can see that we're testing to see if vec1 is null but not vec2. We need to add a test for vec2.
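One possible repair is sketched below in portable Scheme. It assumes that the sum should simply stop when either vector runs out; signalling an error on vectors of unequal length would be an equally reasonable choice. The empty list is written '() here because not every Scheme accepts an unquoted ().

```scheme
;; A sketch of one possible fix: also stop when vec2 is exhausted.
;; Assumption: truncating to the shorter vector is acceptable.
(define vec+
  (lambda (vec1 vec2)
    (cond ((null? vec1) '())
          ((null? vec2) '())          ; the test that was missing
          (else
           (cons (+ (car vec1) (car vec2))
                 (vec+ (cdr vec1) (cdr vec2)))))))

(vec+ '(3 4 5) '(1 1 1))    ; returns (4 5 6), as before
(vec+ '(1 2 3 4) '(0 2))    ; returns (1 4) instead of entering the debugger
```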
Inside the MacScheme debugger you can find out the name of the procedure in which the error occurred, the arguments that were passed to that procedure, the variables accessible at that point and their values, and the chain of procedures awaiting results. You can move back to the context of any procedure awaiting a result and look at the variables that are accessible to that procedure. You can evaluate arbitrary expressions in the environment of any of the procedures along the chain. You can modify arguments passed to those procedures, and then resume computation. You can modify the values of variables and procedure definitions within the environments of any of the procedures along the chain and resume computation. In addition, you can exit the MacScheme debugger, perform some other computation, and then decide that you want to do additional inspecting of the last bug you encountered. You can reenter the debugger and be back in the state of the most recent error.
Breakpoints, Tracers, and Steppers
Let's shift gears now, and consider the following situation. Suppose your program is producing incorrect output. The traditional approach is to insert print statements in it. The problem with inserting print statements is that it is difficult to know what information to print, and if you print too much the information is unwieldy. It is better to insert breakpoints into your program. In MacScheme, when (break) is evaluated, execution is interrupted and you are placed in the MacScheme debugger. You can now use the full power of the inspector to examine the context of the breakpoint. Instead of having to decide at compile-time what features will be of interest (as you had to do when you debugged by inserting print statements), you get to decide interactively what to look at. For breakpoints to be really useful, your inspector must let you resume the computation. Then you can move on to the next breakpoint that you have set if you decide that things look just fine in the context of the first breakpoint.
It's also useful to be able to induce a break manually to snoop around in computations that seem to be taking a long time or acting weirdly. MacScheme lets you do so by selecting Break from the menu. This interrupts a computation and places you in the MacScheme debugger. When you have finished looking around, you can type an "r" to resume the computation.
If you suspect that a particular routine embedded in your program is producing bad outputs or receiving bad inputs, a convenient way to test your hypothesis is to trace the routine. Whenever a traced routine is called, the MacScheme tracer prints its name and the arguments it was passed. When the routine returns, its name and the value it returns are printed.
>>> (define (fact n)
      (if (zero? n)
          0
          (* n (fact (- n 1)))))
>>> (fact 3)
>>> (trace fact)
>>> (fact 3)
Computing (#<PROCEDURE fact> 3)
Computing (#<PROCEDURE fact> 2)
Computing (#<PROCEDURE fact> 1)
Computing (#<PROCEDURE fact> 0)
(#<PROCEDURE fact> 0) --> 0
(#<PROCEDURE fact> 1) --> 0
(#<PROCEDURE fact> 2) --> 0
(#<PROCEDURE fact> 3) --> 0
Here we see that the recursive calls are passing the correct arguments, but the wrong results are returned.
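The trace shows the call with argument 0 returning 0, and every product further up the chain is multiplied by that zero. Assuming fact is meant to compute the factorial, the likely repair is to have the base case return 1, as in this sketch:

```scheme
;; Sketch of a corrected fact, assuming the intent is n factorial.
(define (fact n)
  (if (zero? n)
      1                         ; base case: 0! is 1, not 0
      (* n (fact (- n 1)))))

(fact 3)    ; returns 6
```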
Once you have determined the problem to be a routine that is receiving good inputs and producing bad outputs, and you have little idea why, you might want to step through the procedure to see what is going wrong. A stepper allows you to view the evaluation of each subexpression of a routine as it is happening. Some steppers let you use the full power of the inspector at each step along the way. You can view a stepper as the ability to insert breakpoints automatically between every expression of your program. A stepper is the one tool I've mentioned that MacScheme doesn't have. To explain why MacScheme does not have a stepper, we must look at the interaction between compiling and debugging.
Interpreters and Compilers
Interpreters are best for development and debugging because they work with the source code, which programmers understand. Compilers turn nice, readable source code into the gobbledygook of machine language--but that machine language sure runs fast. Most Lisp compilers also give up some run-time checking in order to get more speed. One of the main advantages of special-purpose Lisp machines over comparably priced conventional computers is that Lisp machines have special hardware to perform run-time checks very quickly.
MacScheme, like Smalltalk-80, uses a compromise. It compiles to byte code--the machine language for a hypothetical Lisp machine--and then interprets the byte code. This approach gives most of the speed of compiled code together with most of the nice debugging associated with interpreted code. The byte code is also more compact than either native code or source code if the source code is discarded after compilation, but MacScheme normally keeps the source code around to aid in debugging.
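To make the compile-then-interpret idea concrete, here is a toy sketch in portable Scheme. It is not MacScheme's actual byte code. The instruction names (push, add, mul) and the procedures compile-expr and run-code are invented for illustration, and real byte code would be a compact binary encoding rather than a list of symbols.

```scheme
;; Toy illustration only: NOT MacScheme's real instruction set.
;; compile-expr flattens a nested arithmetic expression into a list of
;; stack-machine instructions; run-code interprets that list.

(define (compile-expr expr)
  (cond ((number? expr)
         (list (list 'push expr)))
        ((and (pair? expr) (memq (car expr) '(+ *)))
         ;; compile both operands first, then emit the operator
         (append (compile-expr (cadr expr))
                 (compile-expr (caddr expr))
                 (list (list (if (eq? (car expr) '+) 'add 'mul)))))
        (else (error "compile-expr: unsupported expression" expr))))

(define (run-code code)
  (let loop ((code code) (stack '()))
    (if (null? code)
        (car stack)                       ; result is left on top of the stack
        (let ((instr (car code)))
          (case (car instr)
            ((push) (loop (cdr code) (cons (cadr instr) stack)))
            ((add)  (loop (cdr code)
                          (cons (+ (cadr stack) (car stack)) (cddr stack))))
            ((mul)  (loop (cdr code)
                          (cons (* (cadr stack) (car stack)) (cddr stack))))
            (else (error "run-code: unknown instruction" instr)))))))

(compile-expr '(+ 1 (* 2 3)))
;; returns ((push 1) (push 2) (push 3) (mul) (add))

(run-code (compile-expr '(+ 1 (* 2 3))))
;; returns 7
```

Notice that the flat instruction list no longer records which source subexpression produced which instruction.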
Though MacScheme keeps the source code for each user-defined procedure, it does not try to remember the correspondence between individual byte code instructions and source code subexpressions. There just isn't enough memory on a Macintosh to maintain the large tables that would be necessary, which is why MacScheme doesn't have a stepper. Smalltalk-80, on the other hand, does something very clever. Whenever you run its stepper, Smalltalk constructs those tables incrementally, by de-compiling the byte code if necessary.
Texas Instruments' PC Scheme for the IBM PC and TI Professional also uses byte code. Compared to MacScheme, PC Scheme emphasizes speed at some cost to debugging. For example, PC Scheme normally does not retain source code. Because the PC is slower than the Mac and can address less memory, this is a reasonable engineering compromise.
No set of debugging tools can make debugging easy, but they certainly can make it easier, faster, and more fun. We've looked at the help provided by compile-time and run-time error checking, inspectors, breakpoints, tracers, and steppers. These tools were first developed for Lisp systems in the 1960s, yet are still hard to find in other languages. Lisp's supportive programming environment is one of the main reasons why so many Lisp programmers are fanatical about their favorite language.
Daniel Friedman and Matthias Felleisen. The Little Lisper. Second Edition. Chicago: Science Research Associates, Inc. (ISBN 0-574-21955-2) 1986.
Adele Goldberg. Smalltalk-80: The Interactive Programming Environment. Menlo Park: Addison-Wesley Publishing Co. 1984.
How to Turn Off the Debugger
If being dropped into the debugger automatically is confusing, or just a nuisance, try the following procedure sent to us by the folks at Semantic Microsystems:
;;; (debugger #!false) turns off the debugger.
;;; (debugger #!true) turns on the debugger.
;;; (debugger) returns #!true if the debugger is on,
;;; otherwise it returns #!false
(define debugger
  (let ((state #!true)
        (real-debug debug)   ; assumed: saves the original debug procedure
        (fake-debug (lambda args (reset))))
    (lambda flags ; flags is a list of arguments
      (if (null? flags)
          state ; no arguments
          (begin
            (set! debug (if (car flags) real-debug fake-debug))
            (set! state (car flags))
            state)))))
A couple of typos from the last column on MacScheme were identified by the author. The first was in one of the code samples using sort. The following is the correct code:
>>> (sort '("Even" "mathematicians" "are" "accustomed" "to" "treating"
"functions" "as" "underprivileged" "objects")
(lambda (x y)
(or (<? (string-length x) (string-length y))
(string<? x y))))
The second error was in the code defining the procedure make-counter. The code should have included the argument n as follows:
(set! n (+ n 1))
There was also an error in the third reference. The reference was supposed to read Joseph Stoy's Denotational Semantics of Programming Languages. Our thanks to Will Clinger of the Tektronix Computer Research Laboratory (and also of Semantic Microsystems).