TweetFollow Us on Twitter

Routines as Values

Volume Number: 13 (1997)
Issue Number: 10
Column Tag: Programming Techniques

Routines As Values

by Lawrence D'Oliveiro

The High-Level Programmer's Guide to Generating Code at Run-Time

Introduction

This article is about passing around routines (procedures and functions) as values. That is, you take the address of a routine, pass it to another routine as an argument, maybe even store it in a data structure for later use. Ultimately the whole point with a routine is that you make a call to it; but many interesting things can happen on the way there.

There are many useful programming techniques to do with routines as values. In my view, these techniques deserve to be more widely known than they are, because they can contribute so much to the principles of reusable code and information hiding that good programmers value. This article will give you an introduction to some of these techniques.

To make sense of this article, it helps to have some understanding of 68K assembly language and how it interfaces to the Mac OS. The definitive reference to 68K machine language is the 68000 Family Programmer's Reference, available from Motorola's Web site as an Acrobat PDF file at http://www.mot.com/hpesd/aesop/68k/68kprm.pdf. Links to related documents can be found at http://www.mot.com/SPS/HPESD/docs/pubs/index.html. For a description of 68K calling conventions on the Mac OS, see the Trap Manager chapter of Inside Macintosh: Operating System Utilities, available on-line at ftp://ftpdev.info.apple.com//Developer_Services/Technical_Documentation/Inside_Macintosh/Operating_System_Utilities/. [Another useful and brief article introducing the basics of 68K ASM is "The Secrets of the Machine" by Malcolm Teas in MacTech 9.5. It is available from www.mactech.com/Articles/Vol.09/09.05/Read-Assembly/index.html -- ed].

Language Notes

Most of the code examples in this article are written in Modula-2. Specifically, they were written for the (now-defunct) Metrowerks Modula-2 compiler that runs under Apple's MPW (Macintosh Programmer's Workshop). I use this combination because it's been the best language and development environment for the kinds of programs I write. I recently got hold of the p1 Modula-2 compiler from Germany, but have yet to start using that in earnest.

At the end of the article, I will discuss the applicability of these techniques to other languages: everybody's favorite, C, and everybody else's favorite, Pascal.

In the meantime, I'll explain a few things for those who might be unfamiliar with Modula-2, but are familiar with Pascal. And if you're not familiar with Pascal, then you're not a real programmer...

Modula-2 has a lot in common with Pascal, and most of the following code should be comprehensible to Pascal programmers. It has the same signed integer types as Pascal (INTEGER for 16-bit integers and LONGINT for 32-bit ones), but it also has unsigned integer types: CARDINAL for 16-bit ones and LONGCARD for 32-bit ones. Pointer types are defined using the syntax "POINTER TO t" instead of Pascal's "^t". And there is a general untyped pointer type, ADDRESS, which is compatible with any pointer type. The Pascal compilers on the Mac don't seem to have any equivalent to this; it is like the "void *" type in C. The built-in ADR function returns the address of something; it is equivalent to the "@" operator in Pascal or "&" operator in C. Like the Pascal version (but unlike the C one), it always returns an untyped pointer value, that is, its return type is ADDRESS.

If you want to do unsafe type conversions, there are two ways. One is the traditional record-variant trick from Pascal (or unions in C). The other is the use of the CAST function. You will see both in use later. Unlike Pascal or C, you can't just use a type name as a function to do a cast; this makes unsafe conversions a little easier to find in the code!

The big difference from Pascal, for the purposes of this article, is that Modula-2 allows you to declare routine types. That is, you can write pieces of code like this:

  TYPE
    MyProcType =
      PROCEDURE
       (
        INTEGER
       );

  VAR
    ProcVar : MyProcType;
    K : INTEGER;

  PROCEDURE MyProc
   (
    Arg : INTEGER
   );

  BEGIN
    ...
  END MyProc;

BEGIN (* main program *)
  MyProc(K); (* invokes MyProc *)
  ProcVar := MyProc;
  ProcVar(K); (* also invokes MyProc *)
  ...

In Pascal, you can take the address of the routine:

  Var
    ProcVar : Ptr;

  Procedure MyProc
   (
    Arg : Integer
   );

 Begin
  ...
 End {MyProc} ;

Begin
{ main program } 
  MyProc(K);
  ProcVar := @MyProc;
  ...

But there is no way to define a type like MyProcType in the Modula-2 version, and the language provides no standard mechanism for calling the routine once you get its address in ProcVar. Of course, you can get around this, in a type-unsafe way, with appropriate use of inline machine code.

C programmers will already be familiar with all this, since C has similar sorts of facilities to Modula-2 in this regard, albeit with slightly less elegant syntax:

  typedef void (*MyProcType)(short);

  MyProcType ProcVar;
  short K;

void MyProc
 (
  short Arg
 )
 {
  ...
 } /*MyProc*/

void main
 (
  void
 )
 {
  MyProc(K); /* invokes MyProc */
  ProcVar = &MyProc;
  (*ProcVar)(K); /* also invokes MyProc */
  ...

I did a quick search for Modula-2-related information on the Web, and was amazed at how much I found. Here are a few URLs to get you started: http://www.infm.ulst.ac.uk/~george/mp/pract/modula.html, http://www.ee.newcastle.edu.au/users/staff/peter/m2/Modula2.html, http://www.ise.fhg.de/personal_home_pages/hubertus/texte/tabcont.html.

Anyway, enough with the preliminaries, on with the show...

Action Routines

There is frequently a need to traverse a complex data structure, looking for elements that match a particular criterion, and doing some processing on them. For example, consider the System 7 process list: you might want to scan it to see if a particular program is running. Or you might want to total up information about all processes, or display them in a list for the user's information. In all these applications, there is a certain recurring pattern to the code:

  • Declare a variable named, say, CurrentProcess, of type ProcessSerialNumber. Initialize it to kNoProcess.
  • Call GetNextProcess, passing it CurrentProcess. If you get an error, exit the loop.
  • Perform whatever processing is appropriate on the process with the serial number in CurrentProcess. Then repeat from the previous step.

The code in the first two steps would be the same in every case. Ideally, you should be able to write it just once, and reuse this code everywhere you need to. How can you do this?

The answer is to write a generic process-traversal routine, which calls an action routine that you specify for each process. In Modula-2, you could define the type of the action routine as follows:

TYPE
  ProcessActions =
    PROCEDURE
     (
      VAR (*ThisProcess :*) ProcessSerialNumber,
      VAR (*ProcessInfo :*) ProcessInfoRec,
      VAR (*KeepGoing :*) BOOLEAN,
      (*ActionArg :*) ADDRESS
     );

The first two arguments pass all the information about the process that the Process Manager provides. The KeepGoing argument may be set to false by the action routine to stop the traversal. This is useful if, for example, you're searching for a particular process, and you don't need to search further once you've found it. The last argument allows the passing of additional data whose meaning is up to the caller; we'll see one interesting use for this later.

And here's the generic process-traversal routine:

PROCEDURE TraverseProcesses
 (
  Action : ProcessActions;
  ActionArg : ADDRESS
 );

  VAR
    ThisProcess : ProcessSerialNumber;
    ProcessInfo : ProcessInfoRec;
    ProcessName : Str255;
    AppSpec : FSSpec;
    KeepGoing : BOOLEAN;
    Err : OSErr;

BEGIN
  ThisProcess.highLongOfPSN := 0;
  ThisProcess.lowLongOfPSN := kNoProcess;
  ProcessInfo.processInfoLength := SIZE(ProcessInfoRec);
  ProcessInfo.processName := ADR(ProcessName);
  ProcessInfo.processAppSpec := ADR(AppSpec);
  KeepGoing := TRUE; (* initially *)
  LOOP
    Err := GetNextProcess(ThisProcess);
    IF Err <> noErr THEN
      EXIT
    END (*IF*);
    Err := GetProcessInformation(ThisProcess, ProcessInfo);
    IF Err = noErr THEN
      Action
       (
        ThisProcess,
        ProcessInfo,
        KeepGoing,
        ActionArg
       );
      IF NOT KeepGoing THEN
        EXIT
      END (*IF*)
    END (*IF*)
  END (*LOOP*)
END TraverseProcesses;

A sorting example

Another use for action routines is to isolate assumptions about data structures. Consider a common operation: sorting. You've probably got a collection of favorite sorting routines that you like to use. What sorts of assumptions do they make about the data being sorted? Does it have to be in an array? Can it be in a disk file? Do the sort keys have to reside together with the rest of the data? Why make any assumptions at all?

Many sorting routines can function by making only two kinds of accesses to the data: they need to be able to compare the keys of two records, and they need to be able to swap the positions of two records to get them into the right order. Let's define interfaces for action routines to perform these operations:

TYPE
  ComesBeforeProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  SwapProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     );

The one assumption I will make is that records are identifiable by index numbers starting from 1. A ComesBeforeProc returns true if the record with index First is supposed to come before the one with index Second (that is, they're already in the right order); it returns false otherwise. A SwapProc is called to swap two out-of-order elements: after it executes, the record which had index First should now have index Second, and vice versa.

As with the process-traversal example, each action routine takes an additional argument which may be used to pass additional caller-specified information.

Here's a sorting routine that uses the above action routines to implement a simple Shellsort:

PROCEDURE Sort
 (
  NrElements : LONGCARD;
  ComesBefore : ComesBeforeProc;
  ComesBeforeArg : ADDRESS;
  Swap : SwapProc;
  SwapArg : ADDRESS
 );

  VAR
    i, j, k, l, m : LONGCARD;

BEGIN
  i := 1;
  WHILE i < NrElements DO
    i := i + i
  END (*WHILE*);
  m := (i - 1) DIV 2;
  IF (m = 0) AND (NrElements > 1) THEN
   (* do at least one pass when sorting 2 elements *)
    m := 1
  END (*IF*);
  WHILE m > 0 DO
    k := NrElements - m;
    FOR j := 1 TO k DO
      i := j;
      LOOP
        l := i + m;
        IF ComesBefore(i, l, ComesBeforeArg)
        THEN
          EXIT
        END (*IF*);
        Swap(i, l, SwapArg);
        IF i <= m THEN
          EXIT
        END (*IF*);
        i := i - m
      END (*LOOP*)
    END (*FOR*);
    m := m DIV 2
  END (*WHILE*)
END Sort;

Of course, you could replace the body of this routine with just about any sorting algorithm you like, and keep the same interface. Here's a very simple, almost trivial example use of this routine to sort an array of integers. First we declare a data array, and comparison and swap routines for the sort:

TYPE
  ElementType = INTEGER;
VAR
  MyData : POINTER TO
    POINTER TO ARRAY [1 .. 10000] OF ElementType;
  MyNrElements : CARDINAL;

PROCEDURE MyComesBefore
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 ) : BOOLEAN;

BEGIN
  RETURN
      MyData^^[First]
    <=
      MyData^^[Second]
END MyComesBefore;

PROCEDURE MySwap
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 );

  VAR
    Temp : ElementType;

BEGIN
  Temp := MyData^^[First];
  MyData^^[First] := MyData^^[Second];
  MyData^^[Second] := Temp
END MySwap;

And here's the call that does the actual sort:

Sort
 (
  (*NrElements :=*) MyNrElements,
  (*ComesBefore :=*) MyComesBefore,
  (*ComesBeforeArg :=*) NIL,
  (*Swap :=*) MySwap,
  (*SwapArg :=*) NIL
 )

This example makes no use of the extra arguments that can be passed to the comparison and swap routines. Later we will see an interesting use for these extra arguments.

Closures

In languages like POP-11 and LISP, which are primarily used for Artificial Intelligence work, you can have a lot of fun with functions. You can even create new functions at run-time!

But why should AI researchers have all the fun? Can't you create functions at run-time in a common language like Pascal, Modula-2 or C? Certainly you can!

The bad news is: the techniques for doing this rely on particular features of the compiler and machine. It takes some work to get them right the first time. And certain compiler optimizations can signficant problems.

The good news is: once you get it right, it's a lot easier to do again. And these become very useful techniques, that you find yourself using over and over!

Technically, a Curried function (it's named after a person, not a dish) is created when you take a function with many arguments, and "bind" some of them to particular value; the result is a function of fewer arguments. A closure, on the other hand, is created when you bind a function to its parent environment, to create a new function which automatically sets up the correct environment for the original function when it's called.

To illustrate closures and environments, consider a common problem on the Mac: accessing A5 globals from an interrupt routine.

In most Mac languages, when you declare a bunch of global variables, the compiler allocates them in a block of memory accessed relative to register A5. A5 is also used to access the application's segment table, as well as QuickDraw's global storage. Thus, A5 is a very important register, and your application code is liable to behave very strangely if it's called with an incorrect value in A5, and it tries to access any global data.

Instead of putting special code to deal with this into all of your interrupt routines, the principle of reusable code dictates that you should write this code just once, and use it over and over. That is, you should write a general library routine which, given an interrupt routine, creates a "wrapper" routine for it that passes the correct A5 value. Here's an example of what I mean:

TYPE
  BoundInterruptProc = ARRAY [1 .. 9] OF CARDINAL;

A BoundInterruptProc is an array which will hold the the code for the wrapper routine. The wrapper code itself is filled in by BindInterruptProc, which is defined as follows:

PROCEDURE CurrentA5() : LONGCARD;
(* returns the current value of the global base register.*)

  CODE
    02E8DH;          (* move.l a5, (sp) *)

PROCEDURE BindInterruptProc
 (
  TheProc : ADDRESS;
  VAR Result : BoundInterruptProc
 );
(* constructs the wrapper around the specified procedure.*)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F0DH;          (* move.l a5, -(sp) *)
                              (* save previous A5 *)
  Result[2] := 02A7CH;          (* move.l #n, a5 *)
  TwoWords.LongValue := CurrentA5(); 
                              (* n for previous instr *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 04EB9H;          (* jsr p *)
  TwoWords.LongValue := TheProc;  (* p for previous instr *)
  Result[6] := TwoWords.HighWord;
  Result[7] := TwoWords.LowWord;
  Result[8] := 02A5FH;          (* move.l (sp)+, a5 *)
                              (* restore previous A5 *)
  Result[9] := 04E75H;          (* rts *)
  FlushCaches
END BindInterruptProc;

So instead of passing the address of your actual interrupt routine to the system, you pass the address of the appropriately-filled-in BoundInterruptProc structure, for example:

VAR
  Params : HParamBlockRec;
  BoundCompletion : BoundInterruptProc;
...

BindInterruptProc(ADR(MyCompletion), BoundCompletion);
Params.ioCompletion := ADR(BoundCompletion);
...
Err := PBReadAsync(ADR(Params));

and in your interrupt routine you have full access to all your A5 globals!

Note that FlushCaches call at the end of the definition of BindInterruptProc. This is necessary because of a piece of brain damage in Motorola's 68030 and 68040 processors. The problem is that, on these processors, memory references are divided into two kinds, data and code, and there is a separate CPU cache for each, with no attempt to maintain consistency between them. Thus, if a given memory location ends up in both caches (as could happen when you generate code at run-time, as BindInterruptProc does), it is possible for it to be modified from the data cache, without the instruction cache picking this up! Then, when you later try to execute the instruction at that memory location, you get the wrong value, and crash...

Here's how FlushCaches is defined. This definition is safe to use on all members of the 68000 CPU family, with system version 6.0 and later. It even works on Power Macs, where it flushes the cache of compiled code on models with the DR ("dynamic recompilation") emulator, and does nothing on older Power Macs.

PROCEDURE HWPrivPresent() : BOOLEAN;
(* has this system got the HWPriv trap. *)

  CONST
    UnimplementedTrap = 009FH;
    HWPrivTrap = 0A198H;

BEGIN
  RETURN
      GetOSTrapAddress(HWPrivTrap)
    <>
      GetOSTrapAddress(UnimplementedTrap)
END HWPrivPresent;

PROCEDURE FlushInstructionCache;
(* actually flushes both instruction and data caches. *)

  CODE
    07001H, (* moveq.l #1, d0 *)
    0A198H; (* _HWPriv *)

PROCEDURE FlushCaches;
 (* flushes instruction and data caches, if present. *)

BEGIN
  IF HWPrivPresent() THEN
    FlushInstructionCache
  END (*IF*)
END FlushCaches;

Binding Actual Arguments

Often, in a routine of yours that is called by the system, you don't need access to all your A5 globals, but you do need access to a particular block of data. Depending on the situation, the system may provide all kinds of different ways of passing the address of this block as an additional argument to your routine -- or it might not provide any way at all.

For example, the action routine you pass to the Control Manager's TrackControl routine gets the control handle as an argument, so you could use the refCon field in the control record to pass information to it. That is, unless the part of the control you're tracking is an indicator (such as a scroll bar thumb), in which case you don't get any arguments at all. Of course, if your code is part of the currently-running application, you could probably get away with using A5 globals in this case.

There are other, equally complicated situations. Time Manager tasks didn't use to get any arguments, until System 6.0.3, when they started getting the address of the TMInfo record in register A1. Memory Manager GrowZoneProcs don't get any arguments, and you can't guarantee that some part of the system (or some weird INIT) won't try to allocate some memory in your heap while the A5 value is set to something nonsensical.

Now, what do you think: would you prefer to resort to a range of different fiddly techniques to deal with each of these situations, or would you rather stick with one general, uniform technique that can handle them all?

I know which I prefer. I would Curry my routine by binding it to the address of my data, using the following Bind routine:

TYPE
 (* structure for holding Curried routine *)
  BoundProc = ARRAY [1 .. 8] OF CARDINAL;

PROCEDURE Bind
 (
  TheProc : ADDRESS;
  ToArg : ADDRESS;
  VAR Result : BoundProc
 );
 (*    creates a new procedure which calls TheProc
      with ToArg as an extra argument. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F17H;          (* move.l (sp), -(sp) *) )
  Result[2] := 02F7CH;          (* move.l #m, n(sp) *) 
  TwoWords.LongValue := ToArg;    (* m for above move *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 4;                (* n for above move *)
  Result[6] := 04EF9H;          (* jmp n *)
  TwoWords.LongValue := TheProc;  (* n for jmp *)
  Result[7] := TwoWords.HighWord;
  Result[8] := TwoWords.LowWord;
  FlushCaches
END Bind;

Note that you can use Bind on any procedure or function of just about any number of arguments, and (if a function) returning a result of any type. The only restrictions are that it must follow Pascal calling conventions (as Metrowerks Modula-2 does), the argument you are binding must be the last one in the argument list, and it must be 32 bits in size (such as an address).

Converting Calling Conventions

Another important use for run-time closures is where the system invokes some code, that you supply, in a nonstandard way. One case that has bitten many Macintosh programmers over the years is the infamous TextEdit clikLoop callback.

If you specify a clikLoop routine in your TextEdit record, TEClick will call it repeatedly as long as the mouse remains down. One use for this callback is to implement autoscrolling: TextEdit's default clikLoop action, like the rest of TextEdit, doesn't know anything about scrollbars, so you have to provide your own code to keep these in sync.

The clikLoop routine is invoked in a slightly odd way: it must preserve register D2 (normally on the Mac, any routine you call is allowed to trash A0, A1, D0, D1 and D2), and it must return a boolean result in the bottom byte of D0. If this value is ever false, then TEClick returns immediately, as though the mouse has been released.

The trap lies in an inocuous-seeming library routine, that Apple provides, called TESetClickLoop. It looks very tempting to use, as it allows you to pass a clikLoop callback that conforms to straight Pascal calling conventions. Of course, it uses its own glue code internally to call your routine and convert appropriately between calling conventions.

The problem lies in how this glue code finds your routine: the address of your routine is stored in a global variable! What this means is, if you have two separate TextEdit records in existence at once, and you try to specify a separate clikLoop routine for each one, then the last one you specify gets called for both, since there is only room to remember a single one!

The correct solution is to create the glue code on the fly. Here's a routine that does this:

TYPE
  ClikLoopProc =
   (* has one more arg than TextEdit.ClikLoopProc *)
    PROCEDURE
     (
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  BoundClikLoopProc = ARRAY [1 .. 12] OF CARDINAL;

PROCEDURE BindClikLoopProc
 (
  TheProc : ClikLoopProc;
  ToArg : ADDRESS;
  VAR Result : BoundClikLoopProc
 );
 (* binds TheProc to the specified argument to create
  a closure that can be invoked by TextEdit as a
  ClikLoop procedure. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F02H;            (* move.l d2, -(sp)*)
                                (* save vulnerable register *)
  Result[2] := 04227H;            (* clr.b -(sp) *)
  Result[3] := 02F3CH;            (* move.l #n, -(sp) *)
                                (* insert additional arg *)
  TwoWords.LongValue := ToArg;
  Result[4] := TwoWords.HighWord  ;  (* n for above *)
  Result[5] := TwoWords.LowWord;    (* ditto *)
  Result[6] := 04EB9H;            (* jsr l *)
  TwoWords.LongValue := CAST(LONGCARD, TheProc);
  Result[7] := TwoWords.HighWord;    (* high word of l *)
  Result[8] := TwoWords.LowWord;    (* low word of l *)
  Result[9] := 07000H;            (* moveq.l #0, d0 *)
  Result[10] := 0101FH;            (* move.b (sp)+, d0 *)
  Result[11] := 0241FH;            (* move.l (sp)+, d2 *)
                                (* restore saved register *)
  Result[12] := 4E75H;            (* rts *)
  FlushCaches
END BindClikLoopProc;

Note how, while I'm at it, I cleverly change the type of the ClikLoopProc that you must pass, to include an extra argument. This allows you to pass extra information to your ClikLoopProc when it is called, for whatever purpose you want.

Up-Level References

In Pascal and Modula-2, you can declare one routine local to another. In this case, the environment of the inner routine includes everything declared locally to the outer routine. In particular, it is possible for the inner routine to access variables declared local to the outer routine.

How is this done? I know of two different techniques that compilers may use. One involves the use of a global display array of pointers to stack frames, indexed by the source nesting level of the routine. The other way is to pass a static link when calling a nested routine -- this is just a pointer to the enclosing routine's stack frame.

On the Mac, one of these techniques is the right way, and the other is the wrong way. The display array is the wrong way, because it requires global data that isn't always available. The static link method is the right way. Thankfully, the Mac compilers I'm aware of all do it the right way.

To illustrate how a static link works, have a look at the following piece of source:

PROCEDURE Outer;

  VAR
    OuterVar : CARDINAL;

  PROCEDURE Inner
   (
    NewValue : CARDINAL
   );

  BEGIN
    OuterVar := NewValue
  END Inner;

BEGIN (*Outer*)
  OuterVar := 1;
  Inner(2)
END Outer;

Here's the code that the Metrowerks compiler generates for for procedure Outer:

LINK    A6,#$FFFE
MOVE.W   #$0001,-$0002(A6) ; OuterVar := 1
MOVE.W   #$0002,-(A7) ; arg for call to Inner
PEA    (A6)
JSR    Inner
UNLK    A6
RTS    

Note that the variable OuterVar is at offset -2 within Outer's stack frame. Also note that when it calls Inner, it doesn't just pass the cardinal value 2 for the argument NewValue; it pushes an additional longword, which is the value of its A6 (stack frame) register. To see the use of this, here's the code for Inner:

LINK    A6,#$0000
MOVEA.L  $0008(A6),A4 ; Outer's A6 to access OuterVar
MOVE.W   $000C(A6),-$0002(A4) ; OuterVar := NewValue
UNLK    A6
MOVEA.L  (A7)+,A0
ADDQ.L   #$6,A7
JMP    (A0)

Notice how Inner gets to OuterVar by using the stack frame pointer that was passed to it by Outer. If Outer was itself declared local to another routine, it is similarly possible for Inner to access the local variables in that stack frame by following the frame pointer value passed to Outer by its parent.

To see the relevance of this, here's an example use of the TraverseProcesses routine I gave up above. It's a routine called FindProcessBySignature, that scans the process list until it finds a process with a given application signature. It returns true and the process serial number if it finds a match, false otherwise.

PROCEDURE FindProcessBySignature
 (
  Signature : OSType;
  VAR MatchingProcess : ProcessSerialNumber
 ) : BOOLEAN;

  VAR
    Found : BOOLEAN;

  PROCEDURE MatchSignature
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

    VAR
      Err : OSErr;

  BEGIN
    IF ProcessInfo.processSignature = Signature THEN
      MatchingProcess := ThisProcess;
      Found := TRUE;
      KeepGoing := FALSE
    END (*IF*)
  END MatchSignature;

BEGIN (*FindProcessBySignature*)
  Found := FALSE; (* initial assumption *)
  TraverseProcesses
   (
    (*Action :=*) CAST(ProcessActions,
    ADR(MatchSignature)),
    (*ActionArg :=*) CurrentA6()
   );
  RETURN
    Found
END FindProcessBySignature;

First of all, compare the declaration of the inner MatchSignature routine with the original definition of the ProcessActions type. Something is missing: What has happened to the ActionArg?

The answer, of course, is that this is now an implicit additional argument, being the outer routine's stack frame pointer! When FindProcessBySignature passes the address of MatchSignature to TraverseProcesses as the action routine, it tells the latter to call it with that additional argument, which is the result of calling the function CurrentA6. This is defined as follows:

PROCEDURE CurrentA6() : LONGCARD;

  CODE
    02E8EH; (* move.l a6, (sp) *)

It returns the value that is currently in the A6,which is the stack frame register -- exactly what the inner routine needs when it's called.

The main ugly thing about this technique is that Modula-2 doesn't allow you to pass a nested routine as a routine value; this is only allowed for globally-declared routines (as in the sorting example earlier). Luckily, Metrowerks allows you to use the ADR (address-of) function on a nested routine, and you can do a CAST to force the result of this to the appropriate type. (Unfortunately, the p1 compiler doesn't seem to support this feature.)

Why not just declare MatchSignature globally outside FindProcessBySignature, and avoid this trickiness with stack frame pointers? The reason is because, instead of passing the stack frame pointer as the additional argument, you'll now have to pass pointers to the local variables Found and MatchingProcess. Or if you try using globals instead, your code is no longer reentrant! This may not seem a big deal at first sight, but it's very important in library code, which will be written once and reused in a wide variety of situations you never envisaged. Alternatively, you could combine both the local variables into a single structure, and pass the address of that. That will work, but you lose some of the simplicity of the nesting version.

Besides, the MatchSignature routine doesn't need to be visible globally. One of the safety principles you learn when you're writing large programs is: if something doesn't need to be visible globally, don't make it so. In a large piece of software, the complexity comes not from the number of parts, but from the number of potential interactions between them. To minimize unwanted interactions, things should only be accessible in places where they need to be. This is called information hiding.

There is one subtle pitfall with static links you need to watch out for. Consider the following code fragment:

PROCEDURE Parent;

  VAR
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) CurrentA6() (***WRONG***)
     )
  END Brother;

BEGIN (*Parent*)
  ...
  Brother
  ...
END Parent;

What is wrong with this? The problem is, that the use of CurrentA6() in Brother means that it is passing Sister a pointer to its own stack frame, rather than the one belonging to Parent! This means that, where Sister expects to access variables local to Parent, it accesses ones local to Brother instead. This is obviously wrong, since the language rules don't allow variables local to Brother to be directly visible to Sister.

The following is one way to modify this to work correctly:

PROCEDURE Parent;

  VAR
    ParentA6 : LONGCARD;
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) ParentA6
     )
  END Brother;

BEGIN (*Parent*)
  ParentA6 := CurrentA6();
  ...
  Brother
  ...
END Parent;

Here the CurrentA6() call has been moved to Parent, and its value saved for use by Brother when calling Sister.

The rule is simple: when calling a nested routine, you must always pass the stack frame pointer for the routine textually immediately enclosing the one you want to call. If you stick to this rule, all the more complicated cases -- references to local variables up more than one level, recursion, inner routines calling outer ones, and so on -- are guaranteed to work correctly!

Application To Other Languages

For some odd reason, most Mac programmers don't use Modula-2. Most use C, and most of the rest use Pascal. And you're wondering: how applicable are these techniques to those languages?

You can use them to some extent. Modula-2 is modeled fairly closely on Pascal (they were both designed by the same person, after all), and I'm pretty sure all the Mac Pascals use A6 as a stack frame pointer in the same way as Modula-2. There are two main drawbacks to Pascal:

  • There are no routine types. Once you take the address of a procedure or function, you're completely on your own.
  • I understand THINK Pascal doesn't even allow you to take the address of a routine that's local to another routine. That means you can't use any of the A6 stuff.

Note that it is possible in Pascal to pass a routine as an argument to another, in a type-safe fashion. This feature is powerful enough to deal with all the examples using static links I have given above. However, it doesn't allow you to save routine addresses in data structures, which can come in useful in some cases I haven't covered in this article.

The major drawback with C is that it doesn't allow you to declare one routine inside another. This means you have to go through the extra complexity I mentioned above in passing additional arguments to action routines. At least C has proper procedure and function types.

Conclusion

Assembly-language hackers are occasionally fond of a programming technique known as self-modifying code, where an instruction sequence pokes some bits in an instruction elsewhere in the routine before executing it.

I want to make a clear distinction between self-modifying code, and the techniques for generating code at run-time that I have been describing above. Self-modifying code was a technique concocted to squeeze a few more instruction cycles out of time-critical code. It has been quite rightly criticized as leading to programs that are hard to understand, hard to reuse and hard to maintain.

The techniques I have been describing, on the other hand, are more structured than this. The code which is being generated at run-time follows a simple, easy-to-understand form, and its use is really quite straightforward. My aim has not been to optimize time-critical code, but precisely to create code which is easy to understand, easy to reuse and easy to maintain!


Lawrence D'Oliveiro has been doing both high-level and low-level programming on a variety of different platforms over the last eighteen years. When he's not playing with his techno-toys, or worrying about which cool technology Apple is going to abandon next, he likes cycling and walking around the nearby Waikato River.

 
AAPL
$500.72
Apple Inc.
+2.04
MSFT
$34.67
Microsoft Corpora
+0.18
GOOG
$894.72
Google Inc.
+12.71

MacTech Search:
Community Search:

Software Updates via MacUpdate

Apple HP Printer Drivers 2.16.1 - For OS...
Apple HP Printer Drivers includes the latest HP printing and scanning software for Mac OS X 10.6, 10.7 and 10.8. For information about supported printer models, see this page.Version 2.16.1: This... Read more
Yep 3.5.1 - Organize and manage all your...
Yep is a document organization and management tool. Like iTunes for music or iPhoto for photos, Yep lets you search and view your documents in a comfortable interface, while offering the ability to... Read more
Apple Canon Laser Printer Drivers 2.11 -...
Apple Canon Laser Printer Drivers is the latest Canon Laser printing and scanning software for Mac OS X 10.6, 10.7 and 10.8. For information about supported printer models, see this page.Version 2.11... Read more
Apple Java for Mac OS X 10.6 Update 17 -...
Apple Java for Mac OS X 10.6 delivers improved security, reliability, and compatibility by updating Java SE 6.Version Update 17: Java for Mac OS X 10.6 Update 17 delivers improved security,... Read more
Arq 3.3 - Online backup (requires Amazon...
Arq is online backup for the Mac using Amazon S3 and Amazon Glacier. It backs-up and faithfully restores all the special metadata of Mac files that other products don't, including resource forks,... Read more
Apple Java 2013-005 - For OS X 10.7 and...
Apple Java for OS X 2013-005 delivers improved security, reliability, and compatibility by updating Java SE 6 to 1.6.0_65. On systems that have not already installed Java for OS X 2012-006, this... Read more
DEVONthink Pro 2.7 - Knowledge base, inf...
Save 10% with our exclusive coupon code: MACUPDATE10 DEVONthink Pro is your essential assistant for today's world, where almost everything is digital. From shopping receipts to important research... Read more
VirtualBox 4.3.0 - x86 virtualization so...
VirtualBox is a family of powerful x86 virtualization products for enterprise as well as home use. Not only is VirtualBox an extremely feature rich, high performance product for enterprise customers... Read more
Merlin 2.9.2 - Project management softwa...
Merlin is the only native network-based collaborative Project Management solution for Mac OS X. This version offers many features propelling Merlin to the top of Mac OS X professional project... Read more
Eye Candy 7.1.0.1191 - 30 professional P...
Eye Candy renders realistic effects that are difficult or impossible to achieve in Photoshop alone, such as Fire, Chrome, and the new Lightning. Effects like Animal Fur, Smoke, and Reptile Skin are... Read more

The Blockheads Creator David Frampton Gi...
The Blockheads Creator David Frampton Gives a Postmortem on the Creation Process of the Game Posted by Andrew Stevens on October 16th, 2013 [ permalink ] Hey, a | Read more »
Sorcery! Enhances the Gameplay in Latest...
Sorcery! | Read more »
PROVERBidioms Paints English Sayings in...
PROVERBidioms Paints English Sayings in a Picture for Users to Find Posted by Andrew Stevens on October 16th, 2013 [ permalink ] | Read more »
OmniFocus 2 for iPhone Review
OmniFocus 2 for iPhone Review By Carter Dotson on October 16th, 2013 Our Rating: :: OMNIPOTENTiPhone App - Designed for the iPhone, compatible with the iPad OmniFocus 2 for iPhone is a task management app for people who absolutely... | Read more »
Ingress – Google’s Augmented-Reality Gam...
Ingress – Google’s Augmented-Reality Game to Make its Way to iOS Next Year Posted by Andrew Stevens on October 16th, 2013 [ permalink ] | Read more »
CSR Classics is Full of Ridiculously Pre...
CSR Classics is Full of Ridiculously Pretty Classic Automobiles Posted by Rob Rich on October 16th, 2013 [ permalink ] | Read more »
Costume Quest Review
Costume Quest Review By Blake Grundman on October 16th, 2013 Our Rating: :: SLIGHTLY SOURUniversal App - Designed for iPhone and iPad This bite sized snack lacks the staying power to appeal beyond the haunting season.   | Read more »
Artomaton – The AI Painter is an Artific...
Artomaton – The AI Painter is an Artificial Artistic Intelligence That Paints From Photos You’ve Taken Posted by Andrew Stevens on October 16th, 2013 [ | Read more »
Hills of Glory 3D Review
Hills of Glory 3D Review By Carter Dotson on October 16th, 2013 Our Rating: :: BREACHED DEFENSEUniversal App - Designed for iPhone and iPad Hills of Glory 3D is the most aggravating kind of game: one with good ideas but sloppy... | Read more »
FitStar: Tony Gonzalez Adds New 7 Minute...
FitStar: Tony Gonzalez Adds New 7 Minute Workout Program for Those Who Are in a Hurry Posted by Andrew Stevens on October 16th, 2013 [ permalink ] | Read more »

Price Scanner via MacPrices.net

Updated MacBook Price Trackers
We’ve updated our MacBook Price Trackers with the latest information on prices, bundles, and availability on MacBook Airs, MacBook Pros, and the MacBook Pros with Retina Displays from Apple’s... Read more
13-inch Retina MacBook Pros on sale for up to...
B&H Photo has the 13″ 2.5GHz Retina MacBook Pro on sale for $1399 including free shipping. Their price is $100 off MSRP. They have the 13″ 2.6GHz Retina MacBook Pro on sale for $1580 which is $... Read more
AppleCare Protection Plans on sale for up to...
B&H Photo has 3-Year AppleCare Warranties on sale for up to $105 off MSRP including free shipping plus NY sales tax only: - Mac Laptops 15″ and Above: $244 $105 off MSRP - Mac Laptops 13″ and... Read more
Apple’s 64-bit A7 Processor: One Step Closer...
PC Pro’s Darien Graham-Smith reported that Canonical founder and Ubuntu Linux creator Mark Shuttleworth believes Apple intends to follow Ubuntu’s lead and merge its desktop and mobile operating... Read more
MacBook Pro First, Followed By iPad At The En...
French site Info MacG’s Florian Innocente says he has received availability dates and order of arrival for the next MacBook Pro and the iPad from the same contact who had warned hom of the arrival of... Read more
Chart: iPad Value Decline From NextWorth
With every announcement of a new Apple device, serial upgraders begin selling off their previous models – driving down the resale value. So, with the Oct. 22 Apple announcement date approaching,... Read more
SOASTA Survey: What App Do You Check First in...
SOASTA Inc., the leader in cloud and mobile testing announced the results of its recent survey showing which mobile apps are popular with smartphone owners in major American markets. SOASTA’s survey... Read more
Apple, Samsung Reportedly Both Developing 12-...
Digitimes’ Aaron Lee and Joseph Tsai report that Apple and Samsung Electronics are said to both be planning to release 12-inch tablets, and that Apple is currently cooperating with Quanta Computer on... Read more
Apple’s 2011 MacBook Pro Lineup Suffering Fro...
Appleinsider’s Shane Cole says that owners of early-2011 15-inch and 17-inch MacBook Pros are reporting issues with those models’ discrete AMD graphics processors, which in some cases results in the... Read more
Global Notebook Shipments To Grow Less Than 3...
Digitimes Research’s Joanne Chien reports that Taiwan’s notebook shipments grew only 2.5% sequentially, and dropped 8.6% year-over-year in the third quarter despite the fact that notebook ODMs have... Read more

Jobs Board

Senior Mac / *Apple* Systems Engineer - 318...
318 Inc, a top provider of Apple solutions is seeking a new Senior Apple Systems Engineer to be based out of our Santa Monica, California location. We are a Read more
*Apple* Retail - Manager - Apple Inc. (Unite...
Job Summary Keeping an Apple Store thriving requires a diverse set of leadership skills, and as a Manager, you’re a master of them all. In the store’s fast-paced, Read more
*Apple* Solutions Consultant - Apple (United...
**Job Summary** Apple Solutions Consultant (ASC) - Retail Representatives Apple Solutions Consultants are trained by Apple on selling Apple -branded products Read more
Associate *Apple* Solutions Consultant - Ap...
**Job Summary** The Associate ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The Associate ASC's role is to Read more
*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.