TweetFollow Us on Twitter

Routines as Values

Volume Number: 13 (1997)
Issue Number: 10
Column Tag: Programming Techniques

Routines As Values

by Lawrence D'Oliveiro

The High-Level Programmer's Guide to Generating Code at Run-Time

Introduction

This article is about passing around routines (procedures and functions) as values. That is, you take the address of a routine, pass it to another routine as an argument, maybe even store it in a data structure for later use. Ultimately the whole point with a routine is that you make a call to it; but many interesting things can happen on the way there.

There are many useful programming techniques to do with routines as values. In my view, these techniques deserve to be more widely known than they are, because they can contribute so much to the principles of reusable code and information hiding that good programmers value. This article will give you an introduction to some of these techniques.

To make sense of this article, it helps to have some understanding of 68K assembly language and how it interfaces to the Mac OS. The definitive reference to 68K machine language is the 68000 Family Programmer's Reference, available from Motorola's Web site as an Acrobat PDF file at http://www.mot.com/hpesd/aesop/68k/68kprm.pdf. Links to related documents can be found at http://www.mot.com/SPS/HPESD/docs/pubs/index.html. For a description of 68K calling conventions on the Mac OS, see the Trap Manager chapter of Inside Macintosh: Operating System Utilities, available on-line at ftp://ftpdev.info.apple.com//Developer_Services/Technical_Documentation/Inside_Macintosh/Operating_System_Utilities/. [Another useful and brief article introducing the basics of 68K ASM is "The Secrets of the Machine" by Malcolm Teas in MacTech 9.5. It is available from www.mactech.com/Articles/Vol.09/09.05/Read-Assembly/index.html -- ed].

Language Notes

Most of the code examples in this article are written in Modula-2. Specifically, they were written for the (now-defunct) Metrowerks Modula-2 compiler that runs under Apple's MPW (Macintosh Programmer's Workshop). I use this combination because it's been the best language and development environment for the kinds of programs I write. I recently got hold of the p1 Modula-2 compiler from Germany, but have yet to start using that in earnest.

At the end of the article, I will discuss the applicability of these techniques to other languages: everybody's favorite, C, and everybody else's favorite, Pascal.

In the meantime, I'll explain a few things for those who might be unfamiliar with Modula-2, but are familiar with Pascal. And if you're not familiar with Pascal, then you're not a real programmer...

Modula-2 has a lot in common with Pascal, and most of the following code should be comprehensible to Pascal programmers. It has the same signed integer types as Pascal (INTEGER for 16-bit integers and LONGINT for 32-bit ones), but it also has unsigned integer types: CARDINAL for 16-bit ones and LONGCARD for 32-bit ones. Pointer types are defined using the syntax "POINTER TO t" instead of Pascal's "^t". And there is a general untyped pointer type, ADDRESS, which is compatible with any pointer type. The Pascal compilers on the Mac don't seem to have any equivalent to this; it is like the "void *" type in C. The built-in ADR function returns the address of something; it is equivalent to the "@" operator in Pascal or "&" operator in C. Like the Pascal version (but unlike the C one), it always returns an untyped pointer value, that is, its return type is ADDRESS.

If you want to do unsafe type conversions, there are two ways. One is the traditional record-variant trick from Pascal (or unions in C). The other is the use of the CAST function. You will see both in use later. Unlike Pascal or C, you can't just use a type name as a function to do a cast; this makes unsafe conversions a little easier to find in the code!

The big difference from Pascal, for the purposes of this article, is that Modula-2 allows you to declare routine types. That is, you can write pieces of code like this:

  TYPE
    MyProcType =
      PROCEDURE
       (
        INTEGER
       );

  VAR
    ProcVar : MyProcType;
    K : INTEGER;

  PROCEDURE MyProc
   (
    Arg : INTEGER
   );

  BEGIN
    ...
  END MyProc;

BEGIN (* main program *)
  MyProc(K); (* invokes MyProc *)
  ProcVar := MyProc;
  ProcVar(K); (* also invokes MyProc *)
  ...

In Pascal, you can take the address of the routine:

  Var
    ProcVar : Ptr;

  Procedure MyProc
   (
    Arg : Integer
   );

 Begin
  ...
 End {MyProc} ;

Begin
{ main program } 
  MyProc(K);
  ProcVar := @MyProc;
  ...

But there is no way to define a type like MyProcType in the Modula-2 version, and the language provides no standard mechanism for calling the routine once you get its address in ProcVar. Of course, you can get around this, in a type-unsafe way, with appropriate use of inline machine code.

C programmers will already be familiar with all this, since C has similar sorts of facilities to Modula-2 in this regard, albeit with slightly less elegant syntax:

  typedef void (*MyProcType)(short);

  MyProcType ProcVar;
  short K;

void MyProc
 (
  short Arg
 )
 {
  ...
 } /*MyProc*/

void main
 (
  void
 )
 {
  MyProc(K); /* invokes MyProc */
  ProcVar = &MyProc;
  (*ProcVar)(K); /* also invokes MyProc */
  ...

I did a quick search for Modula-2-related information on the Web, and was amazed at how much I found. Here are a few URLs to get you started: http://www.infm.ulst.ac.uk/~george/mp/pract/modula.html, http://www.ee.newcastle.edu.au/users/staff/peter/m2/Modula2.html, http://www.ise.fhg.de/personal_home_pages/hubertus/texte/tabcont.html.

Anyway, enough with the preliminaries, on with the show...

Action Routines

There is frequently a need to traverse a complex data structure, looking for elements that match a particular criterion, and doing some processing on them. For example, consider the System 7 process list: you might want to scan it to see if a particular program is running. Or you might want to total up information about all processes, or display them in a list for the user's information. In all these applications, there is a certain recurring pattern to the code:

  • Declare a variable named, say, CurrentProcess, of type ProcessSerialNumber. Initialize it to kNoProcess.
  • Call GetNextProcess, passing it CurrentProcess. If you get an error, exit the loop.
  • Perform whatever processing is appropriate on the process with the serial number in CurrentProcess. Then repeat from the previous step.

The code in the first two steps would be the same in every case. Ideally, you should be able to write it just once, and reuse this code everywhere you need to. How can you do this?

The answer is to write a generic process-traversal routine, which calls an action routine that you specify for each process. In Modula-2, you could define the type of the action routine as follows:

TYPE
  ProcessActions =
    PROCEDURE
     (
      VAR (*ThisProcess :*) ProcessSerialNumber,
      VAR (*ProcessInfo :*) ProcessInfoRec,
      VAR (*KeepGoing :*) BOOLEAN,
      (*ActionArg :*) ADDRESS
     );

The first two arguments pass all the information about the process that the Process Manager provides. The KeepGoing argument may be set to false by the action routine to stop the traversal. This is useful if, for example, you're searching for a particular process, and you don't need to search further once you've found it. The last argument allows the passing of additional data whose meaning is up to the caller; we'll see one interesting use for this later.

And here's the generic process-traversal routine:

PROCEDURE TraverseProcesses
 (
  Action : ProcessActions;
  ActionArg : ADDRESS
 );

  VAR
    ThisProcess : ProcessSerialNumber;
    ProcessInfo : ProcessInfoRec;
    ProcessName : Str255;
    AppSpec : FSSpec;
    KeepGoing : BOOLEAN;
    Err : OSErr;

BEGIN
  ThisProcess.highLongOfPSN := 0;
  ThisProcess.lowLongOfPSN := kNoProcess;
  ProcessInfo.processInfoLength := SIZE(ProcessInfoRec);
  ProcessInfo.processName := ADR(ProcessName);
  ProcessInfo.processAppSpec := ADR(AppSpec);
  KeepGoing := TRUE; (* initially *)
  LOOP
    Err := GetNextProcess(ThisProcess);
    IF Err <> noErr THEN
      EXIT
    END (*IF*);
    Err := GetProcessInformation(ThisProcess, ProcessInfo);
    IF Err = noErr THEN
      Action
       (
        ThisProcess,
        ProcessInfo,
        KeepGoing,
        ActionArg
       );
      IF NOT KeepGoing THEN
        EXIT
      END (*IF*)
    END (*IF*)
  END (*LOOP*)
END TraverseProcesses;

A sorting example

Another use for action routines is to isolate assumptions about data structures. Consider a common operation: sorting. You've probably got a collection of favorite sorting routines that you like to use. What sorts of assumptions do they make about the data being sorted? Does it have to be in an array? Can it be in a disk file? Do the sort keys have to reside together with the rest of the data? Why make any assumptions at all?

Many sorting routines can function by making only two kinds of accesses to the data: they need to be able to compare the keys of two records, and they need to be able to swap the positions of two records to get them into the right order. Let's define interfaces for action routines to perform these operations:

TYPE
  ComesBeforeProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  SwapProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     );

The one assumption I will make is that records are identifiable by index numbers starting from 1. A ComesBeforeProc returns true if the record with index First is supposed to come before the one with index Second (that is, they're already in the right order); it returns false otherwise. A SwapProc is called to swap two out-of-order elements: after it executes, the record which had index First should now have index Second, and vice versa.

As with the process-traversal example, each action routine takes an additional argument which may be used to pass additional caller-specified information.

Here's a sorting routine that uses the above action routines to implement a simple Shellsort:

PROCEDURE Sort
 (
  NrElements : LONGCARD;
  ComesBefore : ComesBeforeProc;
  ComesBeforeArg : ADDRESS;
  Swap : SwapProc;
  SwapArg : ADDRESS
 );

  VAR
    i, j, k, l, m : LONGCARD;

BEGIN
  i := 1;
  WHILE i < NrElements DO
    i := i + i
  END (*WHILE*);
  m := (i - 1) DIV 2;
  IF (m = 0) AND (NrElements > 1) THEN
   (* do at least one pass when sorting 2 elements *)
    m := 1
  END (*IF*);
  WHILE m > 0 DO
    k := NrElements - m;
    FOR j := 1 TO k DO
      i := j;
      LOOP
        l := i + m;
        IF ComesBefore(i, l, ComesBeforeArg)
        THEN
          EXIT
        END (*IF*);
        Swap(i, l, SwapArg);
        IF i <= m THEN
          EXIT
        END (*IF*);
        i := i - m
      END (*LOOP*)
    END (*FOR*);
    m := m DIV 2
  END (*WHILE*)
END Sort;

Of course, you could replace the body of this routine with just about any sorting algorithm you like, and keep the same interface. Here's a very simple, almost trivial example use of this routine to sort an array of integers. First we declare a data array, and comparison and swap routines for the sort:

TYPE
  ElementType = INTEGER;
VAR
  MyData : POINTER TO
    POINTER TO ARRAY [1 .. 10000] OF ElementType;
  MyNrElements : CARDINAL;

PROCEDURE MyComesBefore
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 ) : BOOLEAN;

BEGIN
  RETURN
      MyData^^[First]
    <=
      MyData^^[Second]
END MyComesBefore;

PROCEDURE MySwap
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 );

  VAR
    Temp : ElementType;

BEGIN
  Temp := MyData^^[First];
  MyData^^[First] := MyData^^[Second];
  MyData^^[Second] := Temp
END MySwap;

And here's the call that does the actual sort:

Sort
 (
  (*NrElements :=*) MyNrElements,
  (*ComesBefore :=*) MyComesBefore,
  (*ComesBeforeArg :=*) NIL,
  (*Swap :=*) MySwap,
  (*SwapArg :=*) NIL
 )

This example makes no use of the extra arguments that can be passed to the comparison and swap routines. Later we will see an interesting use for these extra arguments.

Closures

In languages like POP-11 and LISP, which are primarily used for Artificial Intelligence work, you can have a lot of fun with functions. You can even create new functions at run-time!

But why should AI researchers have all the fun? Can't you create functions at run-time in a common language like Pascal, Modula-2 or C? Certainly you can!

The bad news is: the techniques for doing this rely on particular features of the compiler and machine. It takes some work to get them right the first time. And certain compiler optimizations can signficant problems.

The good news is: once you get it right, it's a lot easier to do again. And these become very useful techniques, that you find yourself using over and over!

Technically, a Curried function (it's named after a person, not a dish) is created when you take a function with many arguments, and "bind" some of them to particular value; the result is a function of fewer arguments. A closure, on the other hand, is created when you bind a function to its parent environment, to create a new function which automatically sets up the correct environment for the original function when it's called.

To illustrate closures and environments, consider a common problem on the Mac: accessing A5 globals from an interrupt routine.

In most Mac languages, when you declare a bunch of global variables, the compiler allocates them in a block of memory accessed relative to register A5. A5 is also used to access the application's segment table, as well as QuickDraw's global storage. Thus, A5 is a very important register, and your application code is liable to behave very strangely if it's called with an incorrect value in A5, and it tries to access any global data.

Instead of putting special code to deal with this into all of your interrupt routines, the principle of reusable code dictates that you should write this code just once, and use it over and over. That is, you should write a general library routine which, given an interrupt routine, creates a "wrapper" routine for it that passes the correct A5 value. Here's an example of what I mean:

TYPE
  BoundInterruptProc = ARRAY [1 .. 9] OF CARDINAL;

A BoundInterruptProc is an array which will hold the the code for the wrapper routine. The wrapper code itself is filled in by BindInterruptProc, which is defined as follows:

PROCEDURE CurrentA5() : LONGCARD;
(* returns the current value of the global base register.*)

  CODE
    02E8DH;          (* move.l a5, (sp) *)

PROCEDURE BindInterruptProc
 (
  TheProc : ADDRESS;
  VAR Result : BoundInterruptProc
 );
(* constructs the wrapper around the specified procedure.*)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F0DH;          (* move.l a5, -(sp) *)
                              (* save previous A5 *)
  Result[2] := 02A7CH;          (* move.l #n, a5 *)
  TwoWords.LongValue := CurrentA5(); 
                              (* n for previous instr *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 04EB9H;          (* jsr p *)
  TwoWords.LongValue := TheProc;  (* p for previous instr *)
  Result[6] := TwoWords.HighWord;
  Result[7] := TwoWords.LowWord;
  Result[8] := 02A5FH;          (* move.l (sp)+, a5 *)
                              (* restore previous A5 *)
  Result[9] := 04E75H;          (* rts *)
  FlushCaches
END BindInterruptProc;

So instead of passing the address of your actual interrupt routine to the system, you pass the address of the appropriately-filled-in BoundInterruptProc structure, for example:

VAR
  Params : HParamBlockRec;
  BoundCompletion : BoundInterruptProc;
...

BindInterruptProc(ADR(MyCompletion), BoundCompletion);
Params.ioCompletion := ADR(BoundCompletion);
...
Err := PBReadAsync(ADR(Params));

and in your interrupt routine you have full access to all your A5 globals!

Note that FlushCaches call at the end of the definition of BindInterruptProc. This is necessary because of a piece of brain damage in Motorola's 68030 and 68040 processors. The problem is that, on these processors, memory references are divided into two kinds, data and code, and there is a separate CPU cache for each, with no attempt to maintain consistency between them. Thus, if a given memory location ends up in both caches (as could happen when you generate code at run-time, as BindInterruptProc does), it is possible for it to be modified from the data cache, without the instruction cache picking this up! Then, when you later try to execute the instruction at that memory location, you get the wrong value, and crash...

Here's how FlushCaches is defined. This definition is safe to use on all members of the 68000 CPU family, with system version 6.0 and later. It even works on Power Macs, where it flushes the cache of compiled code on models with the DR ("dynamic recompilation") emulator, and does nothing on older Power Macs.

PROCEDURE HWPrivPresent() : BOOLEAN;
(* has this system got the HWPriv trap. *)

  CONST
    UnimplementedTrap = 009FH;
    HWPrivTrap = 0A198H;

BEGIN
  RETURN
      GetOSTrapAddress(HWPrivTrap)
    <>
      GetOSTrapAddress(UnimplementedTrap)
END HWPrivPresent;

PROCEDURE FlushInstructionCache;
(* actually flushes both instruction and data caches. *)

  CODE
    07001H, (* moveq.l #1, d0 *)
    0A198H; (* _HWPriv *)

PROCEDURE FlushCaches;
 (* flushes instruction and data caches, if present. *)

BEGIN
  IF HWPrivPresent() THEN
    FlushInstructionCache
  END (*IF*)
END FlushCaches;

Binding Actual Arguments

Often, in a routine of yours that is called by the system, you don't need access to all your A5 globals, but you do need access to a particular block of data. Depending on the situation, the system may provide all kinds of different ways of passing the address of this block as an additional argument to your routine -- or it might not provide any way at all.

For example, the action routine you pass to the Control Manager's TrackControl routine gets the control handle as an argument, so you could use the refCon field in the control record to pass information to it. That is, unless the part of the control you're tracking is an indicator (such as a scroll bar thumb), in which case you don't get any arguments at all. Of course, if your code is part of the currently-running application, you could probably get away with using A5 globals in this case.

There are other, equally complicated situations. Time Manager tasks didn't use to get any arguments, until System 6.0.3, when they started getting the address of the TMInfo record in register A1. Memory Manager GrowZoneProcs don't get any arguments, and you can't guarantee that some part of the system (or some weird INIT) won't try to allocate some memory in your heap while the A5 value is set to something nonsensical.

Now, what do you think: would you prefer to resort to a range of different fiddly techniques to deal with each of these situations, or would you rather stick with one general, uniform technique that can handle them all?

I know which I prefer. I would Curry my routine by binding it to the address of my data, using the following Bind routine:

TYPE
 (* structure for holding Curried routine *)
  BoundProc = ARRAY [1 .. 8] OF CARDINAL;

PROCEDURE Bind
 (
  TheProc : ADDRESS;
  ToArg : ADDRESS;
  VAR Result : BoundProc
 );
 (*    creates a new procedure which calls TheProc
      with ToArg as an extra argument. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F17H;          (* move.l (sp), -(sp) *) )
  Result[2] := 02F7CH;          (* move.l #m, n(sp) *) 
  TwoWords.LongValue := ToArg;    (* m for above move *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 4;                (* n for above move *)
  Result[6] := 04EF9H;          (* jmp n *)
  TwoWords.LongValue := TheProc;  (* n for jmp *)
  Result[7] := TwoWords.HighWord;
  Result[8] := TwoWords.LowWord;
  FlushCaches
END Bind;

Note that you can use Bind on any procedure or function of just about any number of arguments, and (if a function) returning a result of any type. The only restrictions are that it must follow Pascal calling conventions (as Metrowerks Modula-2 does), the argument you are binding must be the last one in the argument list, and it must be 32 bits in size (such as an address).

Converting Calling Conventions

Another important use for run-time closures is where the system invokes some code, that you supply, in a nonstandard way. One case that has bitten many Macintosh programmers over the years is the infamous TextEdit clikLoop callback.

If you specify a clikLoop routine in your TextEdit record, TEClick will call it repeatedly as long as the mouse remains down. One use for this callback is to implement autoscrolling: TextEdit's default clikLoop action, like the rest of TextEdit, doesn't know anything about scrollbars, so you have to provide your own code to keep these in sync.

The clikLoop routine is invoked in a slightly odd way: it must preserve register D2 (normally on the Mac, any routine you call is allowed to trash A0, A1, D0, D1 and D2), and it must return a boolean result in the bottom byte of D0. If this value is ever false, then TEClick returns immediately, as though the mouse has been released.

The trap lies in an inocuous-seeming library routine, that Apple provides, called TESetClickLoop. It looks very tempting to use, as it allows you to pass a clikLoop callback that conforms to straight Pascal calling conventions. Of course, it uses its own glue code internally to call your routine and convert appropriately between calling conventions.

The problem lies in how this glue code finds your routine: the address of your routine is stored in a global variable! What this means is, if you have two separate TextEdit records in existence at once, and you try to specify a separate clikLoop routine for each one, then the last one you specify gets called for both, since there is only room to remember a single one!

The correct solution is to create the glue code on the fly. Here's a routine that does this:

TYPE
  ClikLoopProc =
   (* has one more arg than TextEdit.ClikLoopProc *)
    PROCEDURE
     (
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  BoundClikLoopProc = ARRAY [1 .. 12] OF CARDINAL;

PROCEDURE BindClikLoopProc
 (
  TheProc : ClikLoopProc;
  ToArg : ADDRESS;
  VAR Result : BoundClikLoopProc
 );
 (* binds TheProc to the specified argument to create
  a closure that can be invoked by TextEdit as a
  ClikLoop procedure. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F02H;            (* move.l d2, -(sp)*)
                                (* save vulnerable register *)
  Result[2] := 04227H;            (* clr.b -(sp) *)
  Result[3] := 02F3CH;            (* move.l #n, -(sp) *)
                                (* insert additional arg *)
  TwoWords.LongValue := ToArg;
  Result[4] := TwoWords.HighWord  ;  (* n for above *)
  Result[5] := TwoWords.LowWord;    (* ditto *)
  Result[6] := 04EB9H;            (* jsr l *)
  TwoWords.LongValue := CAST(LONGCARD, TheProc);
  Result[7] := TwoWords.HighWord;    (* high word of l *)
  Result[8] := TwoWords.LowWord;    (* low word of l *)
  Result[9] := 07000H;            (* moveq.l #0, d0 *)
  Result[10] := 0101FH;            (* move.b (sp)+, d0 *)
  Result[11] := 0241FH;            (* move.l (sp)+, d2 *)
                                (* restore saved register *)
  Result[12] := 4E75H;            (* rts *)
  FlushCaches
END BindClikLoopProc;

Note how, while I'm at it, I cleverly change the type of the ClikLoopProc that you must pass, to include an extra argument. This allows you to pass extra information to your ClikLoopProc when it is called, for whatever purpose you want.

Up-Level References

In Pascal and Modula-2, you can declare one routine local to another. In this case, the environment of the inner routine includes everything declared locally to the outer routine. In particular, it is possible for the inner routine to access variables declared local to the outer routine.

How is this done? I know of two different techniques that compilers may use. One involves the use of a global display array of pointers to stack frames, indexed by the source nesting level of the routine. The other way is to pass a static link when calling a nested routine -- this is just a pointer to the enclosing routine's stack frame.

On the Mac, one of these techniques is the right way, and the other is the wrong way. The display array is the wrong way, because it requires global data that isn't always available. The static link method is the right way. Thankfully, the Mac compilers I'm aware of all do it the right way.

To illustrate how a static link works, have a look at the following piece of source:

PROCEDURE Outer;

  VAR
    OuterVar : CARDINAL;

  PROCEDURE Inner
   (
    NewValue : CARDINAL
   );

  BEGIN
    OuterVar := NewValue
  END Inner;

BEGIN (*Outer*)
  OuterVar := 1;
  Inner(2)
END Outer;

Here's the code that the Metrowerks compiler generates for for procedure Outer:

LINK    A6,#$FFFE
MOVE.W   #$0001,-$0002(A6) ; OuterVar := 1
MOVE.W   #$0002,-(A7) ; arg for call to Inner
PEA    (A6)
JSR    Inner
UNLK    A6
RTS    

Note that the variable OuterVar is at offset -2 within Outer's stack frame. Also note that when it calls Inner, it doesn't just pass the cardinal value 2 for the argument NewValue; it pushes an additional longword, which is the value of its A6 (stack frame) register. To see the use of this, here's the code for Inner:

LINK    A6,#$0000
MOVEA.L  $0008(A6),A4 ; Outer's A6 to access OuterVar
MOVE.W   $000C(A6),-$0002(A4) ; OuterVar := NewValue
UNLK    A6
MOVEA.L  (A7)+,A0
ADDQ.L   #$6,A7
JMP    (A0)

Notice how Inner gets to OuterVar by using the stack frame pointer that was passed to it by Outer. If Outer was itself declared local to another routine, it is similarly possible for Inner to access the local variables in that stack frame by following the frame pointer value passed to Outer by its parent.

To see the relevance of this, here's an example use of the TraverseProcesses routine I gave up above. It's a routine called FindProcessBySignature, that scans the process list until it finds a process with a given application signature. It returns true and the process serial number if it finds a match, false otherwise.

PROCEDURE FindProcessBySignature
 (
  Signature : OSType;
  VAR MatchingProcess : ProcessSerialNumber
 ) : BOOLEAN;

  VAR
    Found : BOOLEAN;

  PROCEDURE MatchSignature
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

    VAR
      Err : OSErr;

  BEGIN
    IF ProcessInfo.processSignature = Signature THEN
      MatchingProcess := ThisProcess;
      Found := TRUE;
      KeepGoing := FALSE
    END (*IF*)
  END MatchSignature;

BEGIN (*FindProcessBySignature*)
  Found := FALSE; (* initial assumption *)
  TraverseProcesses
   (
    (*Action :=*) CAST(ProcessActions,
    ADR(MatchSignature)),
    (*ActionArg :=*) CurrentA6()
   );
  RETURN
    Found
END FindProcessBySignature;

First of all, compare the declaration of the inner MatchSignature routine with the original definition of the ProcessActions type. Something is missing: What has happened to the ActionArg?

The answer, of course, is that this is now an implicit additional argument, being the outer routine's stack frame pointer! When FindProcessBySignature passes the address of MatchSignature to TraverseProcesses as the action routine, it tells the latter to call it with that additional argument, which is the result of calling the function CurrentA6. This is defined as follows:

PROCEDURE CurrentA6() : LONGCARD;

  CODE
    02E8EH; (* move.l a6, (sp) *)

It returns the value that is currently in the A6,which is the stack frame register -- exactly what the inner routine needs when it's called.

The main ugly thing about this technique is that Modula-2 doesn't allow you to pass a nested routine as a routine value; this is only allowed for globally-declared routines (as in the sorting example earlier). Luckily, Metrowerks allows you to use the ADR (address-of) function on a nested routine, and you can do a CAST to force the result of this to the appropriate type. (Unfortunately, the p1 compiler doesn't seem to support this feature.)

Why not just declare MatchSignature globally outside FindProcessBySignature, and avoid this trickiness with stack frame pointers? The reason is because, instead of passing the stack frame pointer as the additional argument, you'll now have to pass pointers to the local variables Found and MatchingProcess. Or if you try using globals instead, your code is no longer reentrant! This may not seem a big deal at first sight, but it's very important in library code, which will be written once and reused in a wide variety of situations you never envisaged. Alternatively, you could combine both the local variables into a single structure, and pass the address of that. That will work, but you lose some of the simplicity of the nesting version.

Besides, the MatchSignature routine doesn't need to be visible globally. One of the safety principles you learn when you're writing large programs is: if something doesn't need to be visible globally, don't make it so. In a large piece of software, the complexity comes not from the number of parts, but from the number of potential interactions between them. To minimize unwanted interactions, things should only be accessible in places where they need to be. This is called information hiding.

There is one subtle pitfall with static links you need to watch out for. Consider the following code fragment:

PROCEDURE Parent;

  VAR
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) CurrentA6() (***WRONG***)
     )
  END Brother;

BEGIN (*Parent*)
  ...
  Brother
  ...
END Parent;

What is wrong with this? The problem is, that the use of CurrentA6() in Brother means that it is passing Sister a pointer to its own stack frame, rather than the one belonging to Parent! This means that, where Sister expects to access variables local to Parent, it accesses ones local to Brother instead. This is obviously wrong, since the language rules don't allow variables local to Brother to be directly visible to Sister.

The following is one way to modify this to work correctly:

PROCEDURE Parent;

  VAR
    ParentA6 : LONGCARD;
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) ParentA6
     )
  END Brother;

BEGIN (*Parent*)
  ParentA6 := CurrentA6();
  ...
  Brother
  ...
END Parent;

Here the CurrentA6() call has been moved to Parent, and its value saved for use by Brother when calling Sister.

The rule is simple: when calling a nested routine, you must always pass the stack frame pointer for the routine textually immediately enclosing the one you want to call. If you stick to this rule, all the more complicated cases -- references to local variables up more than one level, recursion, inner routines calling outer ones, and so on -- are guaranteed to work correctly!

Application To Other Languages

For some odd reason, most Mac programmers don't use Modula-2. Most use C, and most of the rest use Pascal. And you're wondering: how applicable are these techniques to those languages?

You can use them to some extent. Modula-2 is modeled fairly closely on Pascal (they were both designed by the same person, after all), and I'm pretty sure all the Mac Pascals use A6 as a stack frame pointer in the same way as Modula-2. There are two main drawbacks to Pascal:

  • There are no routine types. Once you take the address of a procedure or function, you're completely on your own.
  • I understand THINK Pascal doesn't even allow you to take the address of a routine that's local to another routine. That means you can't use any of the A6 stuff.

Note that it is possible in Pascal to pass a routine as an argument to another, in a type-safe fashion. This feature is powerful enough to deal with all the examples using static links I have given above. However, it doesn't allow you to save routine addresses in data structures, which can come in useful in some cases I haven't covered in this article.

The major drawback with C is that it doesn't allow you to declare one routine inside another. This means you have to go through the extra complexity I mentioned above in passing additional arguments to action routines. At least C has proper procedure and function types.

Conclusion

Assembly-language hackers are occasionally fond of a programming technique known as self-modifying code, where an instruction sequence pokes some bits in an instruction elsewhere in the routine before executing it.

I want to make a clear distinction between self-modifying code, and the techniques for generating code at run-time that I have been describing above. Self-modifying code was a technique concocted to squeeze a few more instruction cycles out of time-critical code. It has been quite rightly criticized as leading to programs that are hard to understand, hard to reuse and hard to maintain.

The techniques I have been describing, on the other hand, are more structured than this. The code which is being generated at run-time follows a simple, easy-to-understand form, and its use is really quite straightforward. My aim has not been to optimize time-critical code, but precisely to create code which is easy to understand, easy to reuse and easy to maintain!


Lawrence D'Oliveiro has been doing both high-level and low-level programming on a variety of different platforms over the last eighteen years. When he's not playing with his techno-toys, or worrying about which cool technology Apple is going to abandon next, he likes cycling and walking around the nearby Waikato River.

 
AAPL
$567.77
Apple Inc.
+43.02
MSFT
$39.86
Microsoft Corpora
+0.17
GOOG
$525.16
Google Inc.
-1.78

MacTech Search:
Community Search:

Software Updates via MacUpdate

Ember 1.5.1 - Versatile digital scrapboo...
Ember (formerly LittleSnapper) is your digital scrapbook of things that inspire you: websites, photos, apps or other things. Just drag in images that you want to keep, organize them into relevant... Read more
Cyberduck 4.4.4 - FTP and SFTP browser....
Cyberduck is a robust FTP/FTP-TLS/SFTP browser for the Mac whose lack of visual clutter and cleverly intuitive features make it easy to use. Support for external editors and system technologies such... Read more
TechTool Pro 7.0.3 - Hard drive and syst...
TechTool Pro is now 7, and this is the most advanced version of the acclaimed Macintosh troubleshooting utility created in its 20-year history. Micromat has redeveloped TechTool Pro 7 to be fully 64... Read more
MacFamilyTree 7.1.6 - Create and explore...
MacFamilyTree gives genealogy a facelift: it's modern, interactive, incredibly fast, and easy to use. We're convinced that generations of chroniclers would have loved to trade in their genealogy... Read more
EtreCheck 1.9.9 - For troubleshooting yo...
EtreCheck is a simple little app to display the important details of your system configuration and allow you to copy that information to the Clipboard. It is meant to be used with Apple Support... Read more
TeamViewer 9.0.28116 - Establish remote...
TeamViewer gives you remote control of any computer or Mac over the Internet within seconds, or can be used for online meetings. Find out why more than 200 million users trust TeamViewer! Free for... Read more
DiskAid 6.6.3 - Use your iOS device as a...
DiskAid is the ultimate transfer tool for accessing the iPod, iPhone or iPad directly from the desktop. Access data such as: music, video, photos, contacts, notes, call history, text messages (SMS),... Read more
Viber 4.1.0 - Send messages and make cal...
Viber lets you send free messages and make free calls to other Viber users, on any device and network, in any country! Viber syncs your contacts, messages and call history with your mobile device,... Read more
Apple iOS 7.1.1 - The latest version of...
The latest version of iOS can be downloaded through iTunes. Apple iOS 7 brings an all-new design and all-new features. Simplicity Simplicity is often equated with minimalism. Yet true simplicity is... Read more
1Password 4.3 - Powerful password manage...
1Password is a password manager that uniquely brings you both security and convenience. It is the only program that provides anti-phishing protection and goes beyond password management by adding Web... Read more

Latest Forum Discussions

See All

Pen & Ink Review
Pen & Ink Review By Jennifer Allen on April 24th, 2014 Our Rating: :: CONVENIENT ARTISTRYiPad Only App - Designed for the iPad Pen & Ink is an ideal way to sketch down some visual ideas or simply spark your imagination.   | Read more »
Sonic & All-Stars Racing Transformed...
Sonic & All-Stars Racing Transformed Now Free, to Add New Iconic Characters and Elements Posted by Tre Lawrence on April 24th, 2014 [ | Read more »
Soccer Rally 2 Review
Soccer Rally 2 Review By Carter Dotson on April 24th, 2014 Our Rating: :: GOALKEEPINGUniversal App - Designed for iPhone and iPad Soccer Rally 2 is the most serious vehicular soccer game ever created.   | Read more »
Galaxy Conquerors Review
Galaxy Conquerors Review By Jennifer Allen on April 24th, 2014 Our Rating: :: RETRO SHOOTINGUniversal App - Designed for iPhone and iPad Old school shooting is fun but inaccurate in Galaxy Conquerors.   | Read more »
Yomi Review
Yomi Review By Rob Thomas on April 24th, 2014 Our Rating: :: C-C-C-COMBO BREAKERiPad Only App - Designed for the iPad Round One – Fight! No quarters required for this iOS adaptation of a tabletop adaptation of the arcade fighting... | Read more »
Injustice: Gods Among Us Updated with Ne...
Injustice: Gods Among Us Updated with New Characters, Leaderboards, Gear, and Online Multiplayer Posted by Rob Rich on April 24th, 2014 [ | Read more »
Spin It Review
Spin It Review By Jordan Minor on April 24th, 2014 Our Rating: :: SPIN ME RIGHT ROUNDUniversal App - Designed for iPhone and iPad Spin It has a fine puzzle game model, but its execution lacks energy.   | Read more »
Productivity App NoteSuite is Having its...
Productivity App NoteSuite is Having its Biggest Sale Ever, Just for One Week Posted by Rob Rich on April 24th, 2014 [ permalink ] | Read more »
Wayward Souls Review
Wayward Souls Review By Carter Dotson on April 24th, 2014 Our Rating: :: CARRY ON, WAYWARD SONUniversal App - Designed for iPhone and iPad Wayward Souls is a roguelike-inspired action-RPG that sets a high bar for other games to... | Read more »
The Sandbox Gets Update, Receives New Ca...
The Sandbox Gets Update, Receives New Campaign and New Elements Posted by Tre Lawrence on April 24th, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »

Price Scanner via MacPrices.net

Strong iPhone Sales Drive Apple Record March...
Apple on Wednesday announced financial results for its fiscal 2014 second quarter ended March 29, 2014. The Company posted quarterly revenue of $45.6 billion and quarterly net profit of $10.2 billion... Read more
Award-Winning NoteSuite Productivity App is $...
Minneapolis based Theory.io has announced an 80-Percent Markdown NoteSuite for iPad. NoteSuite helps users stay organized by capturing their notes, to-dos and documents in one organized place.... Read more
16GB 1st generation iPad mini available for $...
Radio Shack has a select number of refurbished 1st generation 16GB WiFi iPad minis available for $199.99 on their online store. Choose free shipping or free ship-to-store. We expect these to sell out... Read more
13-inch 2.5GHz MacBook Pro on sale for $100 o...
B&H Photo has the 13″ 2.5GHz MacBook Pro on sale for $1099 including free shipping plus NY sales tax only. Their price is $100 off MSRP. Read more
iPad Sales “Lull” A Reality Correction Of Unm...
I have lots of time for Jean-Louis Gassée, the former Apple Computer executive (1981 to 1990) who succeeded Steve Jobs as head of Macintosh development when the latter was dismissed in 1985. Mr.... Read more
Apple Makes OS X Betas Available To All – Wit...
Apple’s OS X Beta Seed Program, which lets you install the latest pre-release builds, try it out, and submit your feedback, is now open to anyone who wants to sign on rather than to developers and... Read more
Apple Releases iOS 7.1.1 Update
The latest iOS 7.1.1 update contains improvements, bug fixes and security updates, including: • Further improvements to Touch ID fingerprint recognition • Fixes a bug that could impact keyboard... Read more
Logitech Announces Thinner, Lighter, More Fle...
Logitech has announced an update to its Ultrathin for iPad Air, iPad mini and iPad mini with Retina display, improving the flexibility and design of its award-winning predecessor with an even thinner... Read more
Logitech Introduces Hinge, Big Bang and Turna...
Logitech has announced expansion of its tablet product line with three new cases – the Logitech Hinge, the Logitech Big Bang and the Logitech Turnaround – each for the iPad Air, iPad mini and iPad... Read more
WaterField’s Rough Rider Leather Messenger Ba...
WaterField Designs have announced the new 15-inch size of their popular Rough Rider leather messenger bag, a vintage-looking bag that combines Old West charm and ruggedness with distinctly modern... Read more

Jobs Board

*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
Position Opening at *Apple* - Apple (United...
…customers purchase our products, you're the one who helps them get more out of their new Apple technology. Your day in the Apple Store is filled with a range of Read more
*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
*Apple* Inc. Research Data Specialist - Appl...
…of Worldwide Market Research & Intelligence. The team is responsible for conducting Apple branded consumer market research. It is also responsible for analyzing data Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.