TweetFollow Us on Twitter

Routines as Values

Volume Number: 13 (1997)
Issue Number: 10
Column Tag: Programming Techniques

Routines As Values

by Lawrence D'Oliveiro

The High-Level Programmer's Guide to Generating Code at Run-Time

Introduction

This article is about passing around routines (procedures and functions) as values. That is, you take the address of a routine, pass it to another routine as an argument, maybe even store it in a data structure for later use. Ultimately the whole point with a routine is that you make a call to it; but many interesting things can happen on the way there.

There are many useful programming techniques to do with routines as values. In my view, these techniques deserve to be more widely known than they are, because they can contribute so much to the principles of reusable code and information hiding that good programmers value. This article will give you an introduction to some of these techniques.

To make sense of this article, it helps to have some understanding of 68K assembly language and how it interfaces to the Mac OS. The definitive reference to 68K machine language is the 68000 Family Programmer's Reference, available from Motorola's Web site as an Acrobat PDF file at http://www.mot.com/hpesd/aesop/68k/68kprm.pdf. Links to related documents can be found at http://www.mot.com/SPS/HPESD/docs/pubs/index.html. For a description of 68K calling conventions on the Mac OS, see the Trap Manager chapter of Inside Macintosh: Operating System Utilities, available on-line at ftp://ftpdev.info.apple.com//Developer_Services/Technical_Documentation/Inside_Macintosh/Operating_System_Utilities/. [Another useful and brief article introducing the basics of 68K ASM is "The Secrets of the Machine" by Malcolm Teas in MacTech 9.5. It is available from www.mactech.com/Articles/Vol.09/09.05/Read-Assembly/index.html -- ed].

Language Notes

Most of the code examples in this article are written in Modula-2. Specifically, they were written for the (now-defunct) Metrowerks Modula-2 compiler that runs under Apple's MPW (Macintosh Programmer's Workshop). I use this combination because it's been the best language and development environment for the kinds of programs I write. I recently got hold of the p1 Modula-2 compiler from Germany, but have yet to start using that in earnest.

At the end of the article, I will discuss the applicability of these techniques to other languages: everybody's favorite, C, and everybody else's favorite, Pascal.

In the meantime, I'll explain a few things for those who might be unfamiliar with Modula-2, but are familiar with Pascal. And if you're not familiar with Pascal, then you're not a real programmer...

Modula-2 has a lot in common with Pascal, and most of the following code should be comprehensible to Pascal programmers. It has the same signed integer types as Pascal (INTEGER for 16-bit integers and LONGINT for 32-bit ones), but it also has unsigned integer types: CARDINAL for 16-bit ones and LONGCARD for 32-bit ones. Pointer types are defined using the syntax "POINTER TO t" instead of Pascal's "^t". And there is a general untyped pointer type, ADDRESS, which is compatible with any pointer type. The Pascal compilers on the Mac don't seem to have any equivalent to this; it is like the "void *" type in C. The built-in ADR function returns the address of something; it is equivalent to the "@" operator in Pascal or "&" operator in C. Like the Pascal version (but unlike the C one), it always returns an untyped pointer value, that is, its return type is ADDRESS.

If you want to do unsafe type conversions, there are two ways. One is the traditional record-variant trick from Pascal (or unions in C). The other is the use of the CAST function. You will see both in use later. Unlike Pascal or C, you can't just use a type name as a function to do a cast; this makes unsafe conversions a little easier to find in the code!

The big difference from Pascal, for the purposes of this article, is that Modula-2 allows you to declare routine types. That is, you can write pieces of code like this:

  TYPE
    MyProcType =
      PROCEDURE
       (
        INTEGER
       );

  VAR
    ProcVar : MyProcType;
    K : INTEGER;

  PROCEDURE MyProc
   (
    Arg : INTEGER
   );

  BEGIN
    ...
  END MyProc;

BEGIN (* main program *)
  MyProc(K); (* invokes MyProc *)
  ProcVar := MyProc;
  ProcVar(K); (* also invokes MyProc *)
  ...

In Pascal, you can take the address of the routine:

  Var
    ProcVar : Ptr;

  Procedure MyProc
   (
    Arg : Integer
   );

 Begin
  ...
 End {MyProc} ;

Begin
{ main program } 
  MyProc(K);
  ProcVar := @MyProc;
  ...

But there is no way to define a type like MyProcType in the Modula-2 version, and the language provides no standard mechanism for calling the routine once you get its address in ProcVar. Of course, you can get around this, in a type-unsafe way, with appropriate use of inline machine code.

C programmers will already be familiar with all this, since C has similar sorts of facilities to Modula-2 in this regard, albeit with slightly less elegant syntax:

  typedef void (*MyProcType)(short);

  MyProcType ProcVar;
  short K;

void MyProc
 (
  short Arg
 )
 {
  ...
 } /*MyProc*/

void main
 (
  void
 )
 {
  MyProc(K); /* invokes MyProc */
  ProcVar = &MyProc;
  (*ProcVar)(K); /* also invokes MyProc */
  ...

I did a quick search for Modula-2-related information on the Web, and was amazed at how much I found. Here are a few URLs to get you started: http://www.infm.ulst.ac.uk/~george/mp/pract/modula.html, http://www.ee.newcastle.edu.au/users/staff/peter/m2/Modula2.html, http://www.ise.fhg.de/personal_home_pages/hubertus/texte/tabcont.html.

Anyway, enough with the preliminaries, on with the show...

Action Routines

There is frequently a need to traverse a complex data structure, looking for elements that match a particular criterion, and doing some processing on them. For example, consider the System 7 process list: you might want to scan it to see if a particular program is running. Or you might want to total up information about all processes, or display them in a list for the user's information. In all these applications, there is a certain recurring pattern to the code:

  • Declare a variable named, say, CurrentProcess, of type ProcessSerialNumber. Initialize it to kNoProcess.
  • Call GetNextProcess, passing it CurrentProcess. If you get an error, exit the loop.
  • Perform whatever processing is appropriate on the process with the serial number in CurrentProcess. Then repeat from the previous step.

The code in the first two steps would be the same in every case. Ideally, you should be able to write it just once, and reuse this code everywhere you need to. How can you do this?

The answer is to write a generic process-traversal routine, which calls an action routine that you specify for each process. In Modula-2, you could define the type of the action routine as follows:

TYPE
  ProcessActions =
    PROCEDURE
     (
      VAR (*ThisProcess :*) ProcessSerialNumber,
      VAR (*ProcessInfo :*) ProcessInfoRec,
      VAR (*KeepGoing :*) BOOLEAN,
      (*ActionArg :*) ADDRESS
     );

The first two arguments pass all the information about the process that the Process Manager provides. The KeepGoing argument may be set to false by the action routine to stop the traversal. This is useful if, for example, you're searching for a particular process, and you don't need to search further once you've found it. The last argument allows the passing of additional data whose meaning is up to the caller; we'll see one interesting use for this later.

And here's the generic process-traversal routine:

PROCEDURE TraverseProcesses
 (
  Action : ProcessActions;
  ActionArg : ADDRESS
 );

  VAR
    ThisProcess : ProcessSerialNumber;
    ProcessInfo : ProcessInfoRec;
    ProcessName : Str255;
    AppSpec : FSSpec;
    KeepGoing : BOOLEAN;
    Err : OSErr;

BEGIN
  ThisProcess.highLongOfPSN := 0;
  ThisProcess.lowLongOfPSN := kNoProcess;
  ProcessInfo.processInfoLength := SIZE(ProcessInfoRec);
  ProcessInfo.processName := ADR(ProcessName);
  ProcessInfo.processAppSpec := ADR(AppSpec);
  KeepGoing := TRUE; (* initially *)
  LOOP
    Err := GetNextProcess(ThisProcess);
    IF Err <> noErr THEN
      EXIT
    END (*IF*);
    Err := GetProcessInformation(ThisProcess, ProcessInfo);
    IF Err = noErr THEN
      Action
       (
        ThisProcess,
        ProcessInfo,
        KeepGoing,
        ActionArg
       );
      IF NOT KeepGoing THEN
        EXIT
      END (*IF*)
    END (*IF*)
  END (*LOOP*)
END TraverseProcesses;

A sorting example

Another use for action routines is to isolate assumptions about data structures. Consider a common operation: sorting. You've probably got a collection of favorite sorting routines that you like to use. What sorts of assumptions do they make about the data being sorted? Does it have to be in an array? Can it be in a disk file? Do the sort keys have to reside together with the rest of the data? Why make any assumptions at all?

Many sorting routines can function by making only two kinds of accesses to the data: they need to be able to compare the keys of two records, and they need to be able to swap the positions of two records to get them into the right order. Let's define interfaces for action routines to perform these operations:

TYPE
  ComesBeforeProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  SwapProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     );

The one assumption I will make is that records are identifiable by index numbers starting from 1. A ComesBeforeProc returns true if the record with index First is supposed to come before the one with index Second (that is, they're already in the right order); it returns false otherwise. A SwapProc is called to swap two out-of-order elements: after it executes, the record which had index First should now have index Second, and vice versa.

As with the process-traversal example, each action routine takes an additional argument which may be used to pass additional caller-specified information.

Here's a sorting routine that uses the above action routines to implement a simple Shellsort:

PROCEDURE Sort
 (
  NrElements : LONGCARD;
  ComesBefore : ComesBeforeProc;
  ComesBeforeArg : ADDRESS;
  Swap : SwapProc;
  SwapArg : ADDRESS
 );

  VAR
    i, j, k, l, m : LONGCARD;

BEGIN
  i := 1;
  WHILE i < NrElements DO
    i := i + i
  END (*WHILE*);
  m := (i - 1) DIV 2;
  IF (m = 0) AND (NrElements > 1) THEN
   (* do at least one pass when sorting 2 elements *)
    m := 1
  END (*IF*);
  WHILE m > 0 DO
    k := NrElements - m;
    FOR j := 1 TO k DO
      i := j;
      LOOP
        l := i + m;
        IF ComesBefore(i, l, ComesBeforeArg)
        THEN
          EXIT
        END (*IF*);
        Swap(i, l, SwapArg);
        IF i <= m THEN
          EXIT
        END (*IF*);
        i := i - m
      END (*LOOP*)
    END (*FOR*);
    m := m DIV 2
  END (*WHILE*)
END Sort;

Of course, you could replace the body of this routine with just about any sorting algorithm you like, and keep the same interface. Here's a very simple, almost trivial example use of this routine to sort an array of integers. First we declare a data array, and comparison and swap routines for the sort:

TYPE
  ElementType = INTEGER;
VAR
  MyData : POINTER TO
    POINTER TO ARRAY [1 .. 10000] OF ElementType;
  MyNrElements : CARDINAL;

PROCEDURE MyComesBefore
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 ) : BOOLEAN;

BEGIN
  RETURN
      MyData^^[First]
    <=
      MyData^^[Second]
END MyComesBefore;

PROCEDURE MySwap
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 );

  VAR
    Temp : ElementType;

BEGIN
  Temp := MyData^^[First];
  MyData^^[First] := MyData^^[Second];
  MyData^^[Second] := Temp
END MySwap;

And here's the call that does the actual sort:

Sort
 (
  (*NrElements :=*) MyNrElements,
  (*ComesBefore :=*) MyComesBefore,
  (*ComesBeforeArg :=*) NIL,
  (*Swap :=*) MySwap,
  (*SwapArg :=*) NIL
 )

This example makes no use of the extra arguments that can be passed to the comparison and swap routines. Later we will see an interesting use for these extra arguments.

Closures

In languages like POP-11 and LISP, which are primarily used for Artificial Intelligence work, you can have a lot of fun with functions. You can even create new functions at run-time!

But why should AI researchers have all the fun? Can't you create functions at run-time in a common language like Pascal, Modula-2 or C? Certainly you can!

The bad news is: the techniques for doing this rely on particular features of the compiler and machine. It takes some work to get them right the first time. And certain compiler optimizations can signficant problems.

The good news is: once you get it right, it's a lot easier to do again. And these become very useful techniques, that you find yourself using over and over!

Technically, a Curried function (it's named after a person, not a dish) is created when you take a function with many arguments, and "bind" some of them to particular value; the result is a function of fewer arguments. A closure, on the other hand, is created when you bind a function to its parent environment, to create a new function which automatically sets up the correct environment for the original function when it's called.

To illustrate closures and environments, consider a common problem on the Mac: accessing A5 globals from an interrupt routine.

In most Mac languages, when you declare a bunch of global variables, the compiler allocates them in a block of memory accessed relative to register A5. A5 is also used to access the application's segment table, as well as QuickDraw's global storage. Thus, A5 is a very important register, and your application code is liable to behave very strangely if it's called with an incorrect value in A5, and it tries to access any global data.

Instead of putting special code to deal with this into all of your interrupt routines, the principle of reusable code dictates that you should write this code just once, and use it over and over. That is, you should write a general library routine which, given an interrupt routine, creates a "wrapper" routine for it that passes the correct A5 value. Here's an example of what I mean:

TYPE
  BoundInterruptProc = ARRAY [1 .. 9] OF CARDINAL;

A BoundInterruptProc is an array which will hold the the code for the wrapper routine. The wrapper code itself is filled in by BindInterruptProc, which is defined as follows:

PROCEDURE CurrentA5() : LONGCARD;
(* returns the current value of the global base register.*)

  CODE
    02E8DH;          (* move.l a5, (sp) *)

PROCEDURE BindInterruptProc
 (
  TheProc : ADDRESS;
  VAR Result : BoundInterruptProc
 );
(* constructs the wrapper around the specified procedure.*)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F0DH;          (* move.l a5, -(sp) *)
                              (* save previous A5 *)
  Result[2] := 02A7CH;          (* move.l #n, a5 *)
  TwoWords.LongValue := CurrentA5(); 
                              (* n for previous instr *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 04EB9H;          (* jsr p *)
  TwoWords.LongValue := TheProc;  (* p for previous instr *)
  Result[6] := TwoWords.HighWord;
  Result[7] := TwoWords.LowWord;
  Result[8] := 02A5FH;          (* move.l (sp)+, a5 *)
                              (* restore previous A5 *)
  Result[9] := 04E75H;          (* rts *)
  FlushCaches
END BindInterruptProc;

So instead of passing the address of your actual interrupt routine to the system, you pass the address of the appropriately-filled-in BoundInterruptProc structure, for example:

VAR
  Params : HParamBlockRec;
  BoundCompletion : BoundInterruptProc;
...

BindInterruptProc(ADR(MyCompletion), BoundCompletion);
Params.ioCompletion := ADR(BoundCompletion);
...
Err := PBReadAsync(ADR(Params));

and in your interrupt routine you have full access to all your A5 globals!

Note that FlushCaches call at the end of the definition of BindInterruptProc. This is necessary because of a piece of brain damage in Motorola's 68030 and 68040 processors. The problem is that, on these processors, memory references are divided into two kinds, data and code, and there is a separate CPU cache for each, with no attempt to maintain consistency between them. Thus, if a given memory location ends up in both caches (as could happen when you generate code at run-time, as BindInterruptProc does), it is possible for it to be modified from the data cache, without the instruction cache picking this up! Then, when you later try to execute the instruction at that memory location, you get the wrong value, and crash...

Here's how FlushCaches is defined. This definition is safe to use on all members of the 68000 CPU family, with system version 6.0 and later. It even works on Power Macs, where it flushes the cache of compiled code on models with the DR ("dynamic recompilation") emulator, and does nothing on older Power Macs.

PROCEDURE HWPrivPresent() : BOOLEAN;
(* has this system got the HWPriv trap. *)

  CONST
    UnimplementedTrap = 009FH;
    HWPrivTrap = 0A198H;

BEGIN
  RETURN
      GetOSTrapAddress(HWPrivTrap)
    <>
      GetOSTrapAddress(UnimplementedTrap)
END HWPrivPresent;

PROCEDURE FlushInstructionCache;
(* actually flushes both instruction and data caches. *)

  CODE
    07001H, (* moveq.l #1, d0 *)
    0A198H; (* _HWPriv *)

PROCEDURE FlushCaches;
 (* flushes instruction and data caches, if present. *)

BEGIN
  IF HWPrivPresent() THEN
    FlushInstructionCache
  END (*IF*)
END FlushCaches;

Binding Actual Arguments

Often, in a routine of yours that is called by the system, you don't need access to all your A5 globals, but you do need access to a particular block of data. Depending on the situation, the system may provide all kinds of different ways of passing the address of this block as an additional argument to your routine -- or it might not provide any way at all.

For example, the action routine you pass to the Control Manager's TrackControl routine gets the control handle as an argument, so you could use the refCon field in the control record to pass information to it. That is, unless the part of the control you're tracking is an indicator (such as a scroll bar thumb), in which case you don't get any arguments at all. Of course, if your code is part of the currently-running application, you could probably get away with using A5 globals in this case.

There are other, equally complicated situations. Time Manager tasks didn't use to get any arguments, until System 6.0.3, when they started getting the address of the TMInfo record in register A1. Memory Manager GrowZoneProcs don't get any arguments, and you can't guarantee that some part of the system (or some weird INIT) won't try to allocate some memory in your heap while the A5 value is set to something nonsensical.

Now, what do you think: would you prefer to resort to a range of different fiddly techniques to deal with each of these situations, or would you rather stick with one general, uniform technique that can handle them all?

I know which I prefer. I would Curry my routine by binding it to the address of my data, using the following Bind routine:

TYPE
 (* structure for holding Curried routine *)
  BoundProc = ARRAY [1 .. 8] OF CARDINAL;

PROCEDURE Bind
 (
  TheProc : ADDRESS;
  ToArg : ADDRESS;
  VAR Result : BoundProc
 );
 (*    creates a new procedure which calls TheProc
      with ToArg as an extra argument. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F17H;          (* move.l (sp), -(sp) *) )
  Result[2] := 02F7CH;          (* move.l #m, n(sp) *) 
  TwoWords.LongValue := ToArg;    (* m for above move *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 4;                (* n for above move *)
  Result[6] := 04EF9H;          (* jmp n *)
  TwoWords.LongValue := TheProc;  (* n for jmp *)
  Result[7] := TwoWords.HighWord;
  Result[8] := TwoWords.LowWord;
  FlushCaches
END Bind;

Note that you can use Bind on any procedure or function of just about any number of arguments, and (if a function) returning a result of any type. The only restrictions are that it must follow Pascal calling conventions (as Metrowerks Modula-2 does), the argument you are binding must be the last one in the argument list, and it must be 32 bits in size (such as an address).

Converting Calling Conventions

Another important use for run-time closures is where the system invokes some code, that you supply, in a nonstandard way. One case that has bitten many Macintosh programmers over the years is the infamous TextEdit clikLoop callback.

If you specify a clikLoop routine in your TextEdit record, TEClick will call it repeatedly as long as the mouse remains down. One use for this callback is to implement autoscrolling: TextEdit's default clikLoop action, like the rest of TextEdit, doesn't know anything about scrollbars, so you have to provide your own code to keep these in sync.

The clikLoop routine is invoked in a slightly odd way: it must preserve register D2 (normally on the Mac, any routine you call is allowed to trash A0, A1, D0, D1 and D2), and it must return a boolean result in the bottom byte of D0. If this value is ever false, then TEClick returns immediately, as though the mouse has been released.

The trap lies in an inocuous-seeming library routine, that Apple provides, called TESetClickLoop. It looks very tempting to use, as it allows you to pass a clikLoop callback that conforms to straight Pascal calling conventions. Of course, it uses its own glue code internally to call your routine and convert appropriately between calling conventions.

The problem lies in how this glue code finds your routine: the address of your routine is stored in a global variable! What this means is, if you have two separate TextEdit records in existence at once, and you try to specify a separate clikLoop routine for each one, then the last one you specify gets called for both, since there is only room to remember a single one!

The correct solution is to create the glue code on the fly. Here's a routine that does this:

TYPE
  ClikLoopProc =
   (* has one more arg than TextEdit.ClikLoopProc *)
    PROCEDURE
     (
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  BoundClikLoopProc = ARRAY [1 .. 12] OF CARDINAL;

PROCEDURE BindClikLoopProc
 (
  TheProc : ClikLoopProc;
  ToArg : ADDRESS;
  VAR Result : BoundClikLoopProc
 );
 (* binds TheProc to the specified argument to create
  a closure that can be invoked by TextEdit as a
  ClikLoop procedure. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F02H;            (* move.l d2, -(sp)*)
                                (* save vulnerable register *)
  Result[2] := 04227H;            (* clr.b -(sp) *)
  Result[3] := 02F3CH;            (* move.l #n, -(sp) *)
                                (* insert additional arg *)
  TwoWords.LongValue := ToArg;
  Result[4] := TwoWords.HighWord  ;  (* n for above *)
  Result[5] := TwoWords.LowWord;    (* ditto *)
  Result[6] := 04EB9H;            (* jsr l *)
  TwoWords.LongValue := CAST(LONGCARD, TheProc);
  Result[7] := TwoWords.HighWord;    (* high word of l *)
  Result[8] := TwoWords.LowWord;    (* low word of l *)
  Result[9] := 07000H;            (* moveq.l #0, d0 *)
  Result[10] := 0101FH;            (* move.b (sp)+, d0 *)
  Result[11] := 0241FH;            (* move.l (sp)+, d2 *)
                                (* restore saved register *)
  Result[12] := 4E75H;            (* rts *)
  FlushCaches
END BindClikLoopProc;

Note how, while I'm at it, I cleverly change the type of the ClikLoopProc that you must pass, to include an extra argument. This allows you to pass extra information to your ClikLoopProc when it is called, for whatever purpose you want.

Up-Level References

In Pascal and Modula-2, you can declare one routine local to another. In this case, the environment of the inner routine includes everything declared locally to the outer routine. In particular, it is possible for the inner routine to access variables declared local to the outer routine.

How is this done? I know of two different techniques that compilers may use. One involves the use of a global display array of pointers to stack frames, indexed by the source nesting level of the routine. The other way is to pass a static link when calling a nested routine -- this is just a pointer to the enclosing routine's stack frame.

On the Mac, one of these techniques is the right way, and the other is the wrong way. The display array is the wrong way, because it requires global data that isn't always available. The static link method is the right way. Thankfully, the Mac compilers I'm aware of all do it the right way.

To illustrate how a static link works, have a look at the following piece of source:

PROCEDURE Outer;

  VAR
    OuterVar : CARDINAL;

  PROCEDURE Inner
   (
    NewValue : CARDINAL
   );

  BEGIN
    OuterVar := NewValue
  END Inner;

BEGIN (*Outer*)
  OuterVar := 1;
  Inner(2)
END Outer;

Here's the code that the Metrowerks compiler generates for for procedure Outer:

LINK    A6,#$FFFE
MOVE.W   #$0001,-$0002(A6) ; OuterVar := 1
MOVE.W   #$0002,-(A7) ; arg for call to Inner
PEA    (A6)
JSR    Inner
UNLK    A6
RTS    

Note that the variable OuterVar is at offset -2 within Outer's stack frame. Also note that when it calls Inner, it doesn't just pass the cardinal value 2 for the argument NewValue; it pushes an additional longword, which is the value of its A6 (stack frame) register. To see the use of this, here's the code for Inner:

LINK    A6,#$0000
MOVEA.L  $0008(A6),A4 ; Outer's A6 to access OuterVar
MOVE.W   $000C(A6),-$0002(A4) ; OuterVar := NewValue
UNLK    A6
MOVEA.L  (A7)+,A0
ADDQ.L   #$6,A7
JMP    (A0)

Notice how Inner gets to OuterVar by using the stack frame pointer that was passed to it by Outer. If Outer was itself declared local to another routine, it is similarly possible for Inner to access the local variables in that stack frame by following the frame pointer value passed to Outer by its parent.

To see the relevance of this, here's an example use of the TraverseProcesses routine I gave up above. It's a routine called FindProcessBySignature, that scans the process list until it finds a process with a given application signature. It returns true and the process serial number if it finds a match, false otherwise.

PROCEDURE FindProcessBySignature
 (
  Signature : OSType;
  VAR MatchingProcess : ProcessSerialNumber
 ) : BOOLEAN;

  VAR
    Found : BOOLEAN;

  PROCEDURE MatchSignature
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

    VAR
      Err : OSErr;

  BEGIN
    IF ProcessInfo.processSignature = Signature THEN
      MatchingProcess := ThisProcess;
      Found := TRUE;
      KeepGoing := FALSE
    END (*IF*)
  END MatchSignature;

BEGIN (*FindProcessBySignature*)
  Found := FALSE; (* initial assumption *)
  TraverseProcesses
   (
    (*Action :=*) CAST(ProcessActions,
    ADR(MatchSignature)),
    (*ActionArg :=*) CurrentA6()
   );
  RETURN
    Found
END FindProcessBySignature;

First of all, compare the declaration of the inner MatchSignature routine with the original definition of the ProcessActions type. Something is missing: What has happened to the ActionArg?

The answer, of course, is that this is now an implicit additional argument, being the outer routine's stack frame pointer! When FindProcessBySignature passes the address of MatchSignature to TraverseProcesses as the action routine, it tells the latter to call it with that additional argument, which is the result of calling the function CurrentA6. This is defined as follows:

PROCEDURE CurrentA6() : LONGCARD;

  CODE
    02E8EH; (* move.l a6, (sp) *)

It returns the value that is currently in the A6,which is the stack frame register -- exactly what the inner routine needs when it's called.

The main ugly thing about this technique is that Modula-2 doesn't allow you to pass a nested routine as a routine value; this is only allowed for globally-declared routines (as in the sorting example earlier). Luckily, Metrowerks allows you to use the ADR (address-of) function on a nested routine, and you can do a CAST to force the result of this to the appropriate type. (Unfortunately, the p1 compiler doesn't seem to support this feature.)

Why not just declare MatchSignature globally outside FindProcessBySignature, and avoid this trickiness with stack frame pointers? The reason is because, instead of passing the stack frame pointer as the additional argument, you'll now have to pass pointers to the local variables Found and MatchingProcess. Or if you try using globals instead, your code is no longer reentrant! This may not seem a big deal at first sight, but it's very important in library code, which will be written once and reused in a wide variety of situations you never envisaged. Alternatively, you could combine both the local variables into a single structure, and pass the address of that. That will work, but you lose some of the simplicity of the nesting version.

Besides, the MatchSignature routine doesn't need to be visible globally. One of the safety principles you learn when you're writing large programs is: if something doesn't need to be visible globally, don't make it so. In a large piece of software, the complexity comes not from the number of parts, but from the number of potential interactions between them. To minimize unwanted interactions, things should only be accessible in places where they need to be. This is called information hiding.

There is one subtle pitfall with static links you need to watch out for. Consider the following code fragment:

PROCEDURE Parent;

  VAR
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) CurrentA6() (***WRONG***)
     )
  END Brother;

BEGIN (*Parent*)
  ...
  Brother
  ...
END Parent;

What is wrong with this? The problem is, that the use of CurrentA6() in Brother means that it is passing Sister a pointer to its own stack frame, rather than the one belonging to Parent! This means that, where Sister expects to access variables local to Parent, it accesses ones local to Brother instead. This is obviously wrong, since the language rules don't allow variables local to Brother to be directly visible to Sister.

The following is one way to modify this to work correctly:

PROCEDURE Parent;

  VAR
    ParentA6 : LONGCARD;
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) ParentA6
     )
  END Brother;

BEGIN (*Parent*)
  ParentA6 := CurrentA6();
  ...
  Brother
  ...
END Parent;

Here the CurrentA6() call has been moved to Parent, and its value saved for use by Brother when calling Sister.

The rule is simple: when calling a nested routine, you must always pass the stack frame pointer for the routine textually immediately enclosing the one you want to call. If you stick to this rule, all the more complicated cases -- references to local variables up more than one level, recursion, inner routines calling outer ones, and so on -- are guaranteed to work correctly!

Application To Other Languages

For some odd reason, most Mac programmers don't use Modula-2. Most use C, and most of the rest use Pascal. And you're wondering: how applicable are these techniques to those languages?

You can use them to some extent. Modula-2 is modeled fairly closely on Pascal (they were both designed by the same person, after all), and I'm pretty sure all the Mac Pascals use A6 as a stack frame pointer in the same way as Modula-2. There are two main drawbacks to Pascal:

  • There are no routine types. Once you take the address of a procedure or function, you're completely on your own.
  • I understand THINK Pascal doesn't even allow you to take the address of a routine that's local to another routine. That means you can't use any of the A6 stuff.

Note that it is possible in Pascal to pass a routine as an argument to another, in a type-safe fashion. This feature is powerful enough to deal with all the examples using static links I have given above. However, it doesn't allow you to save routine addresses in data structures, which can come in useful in some cases I haven't covered in this article.

The major drawback with C is that it doesn't allow you to declare one routine inside another. This means you have to go through the extra complexity I mentioned above in passing additional arguments to action routines. At least C has proper procedure and function types.

Conclusion

Assembly-language hackers are occasionally fond of a programming technique known as self-modifying code, where an instruction sequence pokes some bits in an instruction elsewhere in the routine before executing it.

I want to make a clear distinction between self-modifying code, and the techniques for generating code at run-time that I have been describing above. Self-modifying code was a technique concocted to squeeze a few more instruction cycles out of time-critical code. It has been quite rightly criticized as leading to programs that are hard to understand, hard to reuse and hard to maintain.

The techniques I have been describing, on the other hand, are more structured than this. The code which is being generated at run-time follows a simple, easy-to-understand form, and its use is really quite straightforward. My aim has not been to optimize time-critical code, but precisely to create code which is easy to understand, easy to reuse and easy to maintain!


Lawrence D'Oliveiro has been doing both high-level and low-level programming on a variety of different platforms over the last eighteen years. When he's not playing with his techno-toys, or worrying about which cool technology Apple is going to abandon next, he likes cycling and walking around the nearby Waikato River.

 

Community Search:
MacTech Search:

Software Updates via MacUpdate

MarsEdit 3.6.8 - Quick and convenient bl...
MarsEdit is a blog editor for OS X that makes editing your blog like writing email, with spell-checking, drafts, multiple windows, and even AppleScript support. It works with with most blog services... Read more
BBEdit 11.0.3 - Powerful text and HTML e...
BBEdit is the leading professional HTML and text editor for the Mac. Specifically crafted in response to the needs of Web authors and software developers, this award-winning product provides a... Read more
Microsoft Office Preview 15.8 - Popular...
Welcome to the new and modern Microsoft Office for Mac. You will receive regular updates automatically until the official release in the second half of 2015. With the redesigned Ribbon and your... Read more
Yosemite Cache Cleaner 9.0.5 - Clear cac...
Yosemite Cache Cleaner is an award-winning general purpose tool for OS X. YCC makes system maintenance simple with an easy point-and-click interface to many OS X functions. Novice and expert users... Read more
ExpanDrive 4.3.2 - Access cloud storage...
ExpanDrive builds cloud storage in every application, acts just like a USB drive plugged into your Mac. With ExpanDrive, you can securely access any remote file server directly from the Finder or... Read more
RapidWeaver 6.0.8 - Create template-base...
RapidWeaver is a next-generation Web design application to help you easily create professional-looking Web sites in minutes. No knowledge of complex code is required, RapidWeaver will take care of... Read more
Artlantis Studio 5.1.2.7 - 3D rendering...
Artlantis Studio is a unique and ideal tool for performing very high resolution rendering easily and in real time. The new FastRadiosity engine now lets you compute images in radiosity-even in... Read more
MacUpdate Desktop 6.0.5 - Search and ins...
MacUpdate Desktop 6 brings seamless 1-click installs and version updates to your Mac. With a free MacUpdate account and MacUpdate Desktop 6, Mac users can now install almost any Mac app on macupdate.... Read more
BitTorrent Sync 2.0.82 - Sync files secu...
BitTorrent Sync allows you to sync unlimited files between your own devices, or share a folder with friends and family to automatically sync anything. File transfers are encrypted. Your information... Read more
Google Drive 1.20 - File backup and shar...
Google Drive is a place where you can create, share, collaborate, and keep all of your stuff. Whether you're working with a friend on a joint research project, planning a wedding with your fiancé, or... Read more

Bored? MyLeisure FreeTime Maximizer Will...
Bored? MyLeisure FreeTime Maximizer Will Take Care of That! Posted by Jessica Fisher on March 5th, 2015 [ permalink ] iPhone App - Designed for the iPhone, compatible with the iPad | Read more »
New Publisher Allstar Games Heads West w...
Allstar Games has announced its first mobile title designed for western audiences, Allstar Heroes. The game will be a massive online battle arena (MOBA) that offers dozens of heroes for you to collect and pit against your opponents. As each hero has... | Read more »
RAD Boarding Review
RAD Boarding Review By Jennifer Allen on March 5th, 2015 Our Rating: :: NEARLY RADUniversal App - Designed for iPhone and iPad RAD Boarding isn’t quite one of the greats, but it has potential.   | Read more »
Presenting the International Mobile Gami...
11th Annual International Mobile Gaming Awards ceremony, hosted by actress Allison Haislip, gathered mobile game developers and publishers from around the world. They chose 13 winners out of the 93 nominations. British studio USTWO won the the Grand... | Read more »
AG Drive Review
AG Drive Review By Tre Lawrence on March 5th, 2015 Our Rating: :: FUTURISTIC STREET RACING.Universal App - Designed for iPhone and iPad Futuristic racing… interstellar style.   | Read more »
GDC 2015 – Nightmare Guardians is an Int...
GDC 2015 – Nightmare Guardians is an Interesting Hybrid of MOBA and Lane Defense Posted by Rob Rich on March 5th, 2015 [ permalink ] I have to say that lane defense (i.e. | Read more »
Overkill 3 Review
Overkill 3 Review By Tre Lawrence on March 5th, 2015 Our Rating: :: WHO'S NEXT?Universal App - Designed for iPhone and iPad Cover system gameplay in the third-person.   Developer: Craneballs Price: Free Version Reviewed: 1.1.6... | Read more »
Warner Bros. Interactive Entertainment A...
Warner Bros. has some exciting games coming down the pipe! | Read more »
GDC 2015 – Star Trek Timelines will Prob...
GDC 2015 – Star Trek Timelines will Probably Make Your Inner Trekkie Squeal With Glee Posted by Rob Rich on March 4th, 2015 [ permalink ] Any popular fictional universe has its fair share of fan fiction – where belo | Read more »
Protect Yourself from an Onslaught of Ca...
Surprise Attack Games has announced a Cat-astrophic new physics puzzler called Fort Meow! In the game, a young girl named Nia finds her grandfather’s journal which triggers an all mighty feline attack! Why do the cats want the journal? Who knows,... | Read more »

Price Scanner via MacPrices.net

Apple restocks refurbished 15-inch Retina Mac...
The Apple Store has restocked Apple Certified Refurbished 2014 15″ Retina MacBook Pros, available for up to $400 off the cost of new models. An Apple one-year warranty is included with each model,... Read more
Roundup of MacBook Air sale prices, models up...
B&H Photo has MacBook Airs on sale for up to $100 off MSRP. Shipping is free, and B&H charges NY sales tax only: - 11″ 128GB MacBook Air: $799 100 off MSRP - 11″ 256GB MacBook Air: $999 $100... Read more
New Firstrade Mobile App Enables On-The-Go Tr...
Firstrade Securities Inc. has announced its new mobile app, which gives investors immediate access to the company’s trading platform on all mobile devices. The app was developed in-house and was... Read more
Sonnet Introduces USB 3.0 + eSATA Thunderbolt...
Sonnet has announced the launch of its new USB 3.0 + eSATA Thunderbolt Adapter for easy connectivity to USB 3.0 devices and eSATA storage, and USB 3.0 + Gigabit Ethernet Thunderbolt Adapter for easy... Read more
Apple restocks refurbished 27-inch 5K iMacs f...
The Apple Store has restocked Apple Certified Refurbished 27″ 3.5GHz 5K iMacs for $2119 including free shipping. Their price is $380 off the cost of new models, and it’s the lowest price available... Read more
Free Clean Reader Mobile App Hides Swear Word...
The new Clean Reader app, now available in the Apple App Store and Google Play, delivers the opportunity of reading any book without being exposed to profanity. By selecting how clean they want their... Read more
Kinsa Launches “Groups” App to Monitor Illnes...
Kinsa, makers of the first FDA approved app-enabled smartphone thermometer thst won the 2013 Cleveland Clinic Medical Innovation Grand Prize and recently appeared in Apple’s “Parenthood” TV... Read more
iPad: A More Positive Outlook – The ‘Book Mys...
It’s good to hear someone saying positive things about the iPad. I’ve been trying to bend my mind around how Apple’s tablet could have gone from zero to bestselling personal computing device on the... Read more
Mac Pros on sale for up to $279 off MSRP
Amazon has Mac Pros in stock and on sale for up to $279 off MSRP. Shipping is free: - 4-Core Mac Pro: $2725.87, $273 off MSRP (9%) - 6-Core Mac Pro: $3719.99, $279 off MSRP (7%) Read more
Sale! 13-inch Retina MacBook Pros for up to $...
B&H Photo has 13″ Retina MacBook Pros on sale for up to $205 off MSRP. Shipping is free, and B&H charges NY sales tax only: - 13″ 2.6GHz/128GB Retina MacBook Pro: $1219.99 save $80 - 13″ 2.... Read more

Jobs Board

*Apple* Retail - Multiple Positions (US) - A...
Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, you're also the Read more
*Apple* Solutions Consultant - Retail Sales...
**Job Summary** As an Apple Solutions Consultant (ASC) you are the link between our customers and our products. Your role is to drive the Apple business in a retail Read more
Position Opening at *Apple* - Apple (United...
…Summary** As a Specialist, you help create the energy and excitement around Apple products, providing the right solutions and getting products into customers' hands. You Read more
Position Opening at *Apple* - Apple (United...
**Job Summary** The Apple Store is a retail environment like no other - uniquely focused on delivering amazing customer experiences. As an Expert, you introduce people Read more
*Apple* Solutions Consultant - Retail Sales...
**Job Summary** As an Apple Solutions Consultant (ASC) you are the link between our customers and our products. Your role is to drive the Apple business in a retail Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.