TweetFollow Us on Twitter

Routines as Values

Volume Number: 13 (1997)
Issue Number: 10
Column Tag: Programming Techniques

Routines As Values

by Lawrence D'Oliveiro

The High-Level Programmer's Guide to Generating Code at Run-Time

Introduction

This article is about passing around routines (procedures and functions) as values. That is, you take the address of a routine, pass it to another routine as an argument, maybe even store it in a data structure for later use. Ultimately the whole point with a routine is that you make a call to it; but many interesting things can happen on the way there.

There are many useful programming techniques to do with routines as values. In my view, these techniques deserve to be more widely known than they are, because they can contribute so much to the principles of reusable code and information hiding that good programmers value. This article will give you an introduction to some of these techniques.

To make sense of this article, it helps to have some understanding of 68K assembly language and how it interfaces to the Mac OS. The definitive reference to 68K machine language is the 68000 Family Programmer's Reference, available from Motorola's Web site as an Acrobat PDF file at http://www.mot.com/hpesd/aesop/68k/68kprm.pdf. Links to related documents can be found at http://www.mot.com/SPS/HPESD/docs/pubs/index.html. For a description of 68K calling conventions on the Mac OS, see the Trap Manager chapter of Inside Macintosh: Operating System Utilities, available on-line at ftp://ftpdev.info.apple.com//Developer_Services/Technical_Documentation/Inside_Macintosh/Operating_System_Utilities/. [Another useful and brief article introducing the basics of 68K ASM is "The Secrets of the Machine" by Malcolm Teas in MacTech 9.5. It is available from www.mactech.com/Articles/Vol.09/09.05/Read-Assembly/index.html -- ed].

Language Notes

Most of the code examples in this article are written in Modula-2. Specifically, they were written for the (now-defunct) Metrowerks Modula-2 compiler that runs under Apple's MPW (Macintosh Programmer's Workshop). I use this combination because it's been the best language and development environment for the kinds of programs I write. I recently got hold of the p1 Modula-2 compiler from Germany, but have yet to start using that in earnest.

At the end of the article, I will discuss the applicability of these techniques to other languages: everybody's favorite, C, and everybody else's favorite, Pascal.

In the meantime, I'll explain a few things for those who might be unfamiliar with Modula-2, but are familiar with Pascal. And if you're not familiar with Pascal, then you're not a real programmer...

Modula-2 has a lot in common with Pascal, and most of the following code should be comprehensible to Pascal programmers. It has the same signed integer types as Pascal (INTEGER for 16-bit integers and LONGINT for 32-bit ones), but it also has unsigned integer types: CARDINAL for 16-bit ones and LONGCARD for 32-bit ones. Pointer types are defined using the syntax "POINTER TO t" instead of Pascal's "^t". And there is a general untyped pointer type, ADDRESS, which is compatible with any pointer type. The Pascal compilers on the Mac don't seem to have any equivalent to this; it is like the "void *" type in C. The built-in ADR function returns the address of something; it is equivalent to the "@" operator in Pascal or "&" operator in C. Like the Pascal version (but unlike the C one), it always returns an untyped pointer value, that is, its return type is ADDRESS.

If you want to do unsafe type conversions, there are two ways. One is the traditional record-variant trick from Pascal (or unions in C). The other is the use of the CAST function. You will see both in use later. Unlike Pascal or C, you can't just use a type name as a function to do a cast; this makes unsafe conversions a little easier to find in the code!

The big difference from Pascal, for the purposes of this article, is that Modula-2 allows you to declare routine types. That is, you can write pieces of code like this:

  TYPE
    MyProcType =
      PROCEDURE
       (
        INTEGER
       );

  VAR
    ProcVar : MyProcType;
    K : INTEGER;

  PROCEDURE MyProc
   (
    Arg : INTEGER
   );

  BEGIN
    ...
  END MyProc;

BEGIN (* main program *)
  MyProc(K); (* invokes MyProc *)
  ProcVar := MyProc;
  ProcVar(K); (* also invokes MyProc *)
  ...

In Pascal, you can take the address of the routine:

  Var
    ProcVar : Ptr;

  Procedure MyProc
   (
    Arg : Integer
   );

 Begin
  ...
 End {MyProc} ;

Begin
{ main program } 
  MyProc(K);
  ProcVar := @MyProc;
  ...

But there is no way to define a type like MyProcType in the Modula-2 version, and the language provides no standard mechanism for calling the routine once you get its address in ProcVar. Of course, you can get around this, in a type-unsafe way, with appropriate use of inline machine code.

C programmers will already be familiar with all this, since C has similar sorts of facilities to Modula-2 in this regard, albeit with slightly less elegant syntax:

  typedef void (*MyProcType)(short);

  MyProcType ProcVar;
  short K;

void MyProc
 (
  short Arg
 )
 {
  ...
 } /*MyProc*/

void main
 (
  void
 )
 {
  MyProc(K); /* invokes MyProc */
  ProcVar = &MyProc;
  (*ProcVar)(K); /* also invokes MyProc */
  ...

I did a quick search for Modula-2-related information on the Web, and was amazed at how much I found. Here are a few URLs to get you started: http://www.infm.ulst.ac.uk/~george/mp/pract/modula.html, http://www.ee.newcastle.edu.au/users/staff/peter/m2/Modula2.html, http://www.ise.fhg.de/personal_home_pages/hubertus/texte/tabcont.html.

Anyway, enough with the preliminaries, on with the show...

Action Routines

There is frequently a need to traverse a complex data structure, looking for elements that match a particular criterion, and doing some processing on them. For example, consider the System 7 process list: you might want to scan it to see if a particular program is running. Or you might want to total up information about all processes, or display them in a list for the user's information. In all these applications, there is a certain recurring pattern to the code:

  • Declare a variable named, say, CurrentProcess, of type ProcessSerialNumber. Initialize it to kNoProcess.
  • Call GetNextProcess, passing it CurrentProcess. If you get an error, exit the loop.
  • Perform whatever processing is appropriate on the process with the serial number in CurrentProcess. Then repeat from the previous step.

The code in the first two steps would be the same in every case. Ideally, you should be able to write it just once, and reuse this code everywhere you need to. How can you do this?

The answer is to write a generic process-traversal routine, which calls an action routine that you specify for each process. In Modula-2, you could define the type of the action routine as follows:

TYPE
  ProcessActions =
    PROCEDURE
     (
      VAR (*ThisProcess :*) ProcessSerialNumber,
      VAR (*ProcessInfo :*) ProcessInfoRec,
      VAR (*KeepGoing :*) BOOLEAN,
      (*ActionArg :*) ADDRESS
     );

The first two arguments pass all the information about the process that the Process Manager provides. The KeepGoing argument may be set to false by the action routine to stop the traversal. This is useful if, for example, you're searching for a particular process, and you don't need to search further once you've found it. The last argument allows the passing of additional data whose meaning is up to the caller; we'll see one interesting use for this later.

And here's the generic process-traversal routine:

PROCEDURE TraverseProcesses
 (
  Action : ProcessActions;
  ActionArg : ADDRESS
 );

  VAR
    ThisProcess : ProcessSerialNumber;
    ProcessInfo : ProcessInfoRec;
    ProcessName : Str255;
    AppSpec : FSSpec;
    KeepGoing : BOOLEAN;
    Err : OSErr;

BEGIN
  ThisProcess.highLongOfPSN := 0;
  ThisProcess.lowLongOfPSN := kNoProcess;
  ProcessInfo.processInfoLength := SIZE(ProcessInfoRec);
  ProcessInfo.processName := ADR(ProcessName);
  ProcessInfo.processAppSpec := ADR(AppSpec);
  KeepGoing := TRUE; (* initially *)
  LOOP
    Err := GetNextProcess(ThisProcess);
    IF Err <> noErr THEN
      EXIT
    END (*IF*);
    Err := GetProcessInformation(ThisProcess, ProcessInfo);
    IF Err = noErr THEN
      Action
       (
        ThisProcess,
        ProcessInfo,
        KeepGoing,
        ActionArg
       );
      IF NOT KeepGoing THEN
        EXIT
      END (*IF*)
    END (*IF*)
  END (*LOOP*)
END TraverseProcesses;

A sorting example

Another use for action routines is to isolate assumptions about data structures. Consider a common operation: sorting. You've probably got a collection of favorite sorting routines that you like to use. What sorts of assumptions do they make about the data being sorted? Does it have to be in an array? Can it be in a disk file? Do the sort keys have to reside together with the rest of the data? Why make any assumptions at all?

Many sorting routines can function by making only two kinds of accesses to the data: they need to be able to compare the keys of two records, and they need to be able to swap the positions of two records to get them into the right order. Let's define interfaces for action routines to perform these operations:

TYPE
  ComesBeforeProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  SwapProc =
    PROCEDURE
     (
      (*First :*) LONGCARD,
      (*Second :*) LONGCARD,
      (*Arg :*) ADDRESS
     );

The one assumption I will make is that records are identifiable by index numbers starting from 1. A ComesBeforeProc returns true if the record with index First is supposed to come before the one with index Second (that is, they're already in the right order); it returns false otherwise. A SwapProc is called to swap two out-of-order elements: after it executes, the record which had index First should now have index Second, and vice versa.

As with the process-traversal example, each action routine takes an additional argument which may be used to pass additional caller-specified information.

Here's a sorting routine that uses the above action routines to implement a simple Shellsort:

PROCEDURE Sort
 (
  NrElements : LONGCARD;
  ComesBefore : ComesBeforeProc;
  ComesBeforeArg : ADDRESS;
  Swap : SwapProc;
  SwapArg : ADDRESS
 );

  VAR
    i, j, k, l, m : LONGCARD;

BEGIN
  i := 1;
  WHILE i < NrElements DO
    i := i + i
  END (*WHILE*);
  m := (i - 1) DIV 2;
  IF (m = 0) AND (NrElements > 1) THEN
   (* do at least one pass when sorting 2 elements *)
    m := 1
  END (*IF*);
  WHILE m > 0 DO
    k := NrElements - m;
    FOR j := 1 TO k DO
      i := j;
      LOOP
        l := i + m;
        IF ComesBefore(i, l, ComesBeforeArg)
        THEN
          EXIT
        END (*IF*);
        Swap(i, l, SwapArg);
        IF i <= m THEN
          EXIT
        END (*IF*);
        i := i - m
      END (*LOOP*)
    END (*FOR*);
    m := m DIV 2
  END (*WHILE*)
END Sort;

Of course, you could replace the body of this routine with just about any sorting algorithm you like, and keep the same interface. Here's a very simple, almost trivial example use of this routine to sort an array of integers. First we declare a data array, and comparison and swap routines for the sort:

TYPE
  ElementType = INTEGER;
VAR
  MyData : POINTER TO
    POINTER TO ARRAY [1 .. 10000] OF ElementType;
  MyNrElements : CARDINAL;

PROCEDURE MyComesBefore
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 ) : BOOLEAN;

BEGIN
  RETURN
      MyData^^[First]
    <=
      MyData^^[Second]
END MyComesBefore;

PROCEDURE MySwap
 (
  First, Second : LONGCARD;
  Unused : ADDRESS
 );

  VAR
    Temp : ElementType;

BEGIN
  Temp := MyData^^[First];
  MyData^^[First] := MyData^^[Second];
  MyData^^[Second] := Temp
END MySwap;

And here's the call that does the actual sort:

Sort
 (
  (*NrElements :=*) MyNrElements,
  (*ComesBefore :=*) MyComesBefore,
  (*ComesBeforeArg :=*) NIL,
  (*Swap :=*) MySwap,
  (*SwapArg :=*) NIL
 )

This example makes no use of the extra arguments that can be passed to the comparison and swap routines. Later we will see an interesting use for these extra arguments.

Closures

In languages like POP-11 and LISP, which are primarily used for Artificial Intelligence work, you can have a lot of fun with functions. You can even create new functions at run-time!

But why should AI researchers have all the fun? Can't you create functions at run-time in a common language like Pascal, Modula-2 or C? Certainly you can!

The bad news is: the techniques for doing this rely on particular features of the compiler and machine. It takes some work to get them right the first time. And certain compiler optimizations can signficant problems.

The good news is: once you get it right, it's a lot easier to do again. And these become very useful techniques, that you find yourself using over and over!

Technically, a Curried function (it's named after a person, not a dish) is created when you take a function with many arguments, and "bind" some of them to particular value; the result is a function of fewer arguments. A closure, on the other hand, is created when you bind a function to its parent environment, to create a new function which automatically sets up the correct environment for the original function when it's called.

To illustrate closures and environments, consider a common problem on the Mac: accessing A5 globals from an interrupt routine.

In most Mac languages, when you declare a bunch of global variables, the compiler allocates them in a block of memory accessed relative to register A5. A5 is also used to access the application's segment table, as well as QuickDraw's global storage. Thus, A5 is a very important register, and your application code is liable to behave very strangely if it's called with an incorrect value in A5, and it tries to access any global data.

Instead of putting special code to deal with this into all of your interrupt routines, the principle of reusable code dictates that you should write this code just once, and use it over and over. That is, you should write a general library routine which, given an interrupt routine, creates a "wrapper" routine for it that passes the correct A5 value. Here's an example of what I mean:

TYPE
  BoundInterruptProc = ARRAY [1 .. 9] OF CARDINAL;

A BoundInterruptProc is an array which will hold the the code for the wrapper routine. The wrapper code itself is filled in by BindInterruptProc, which is defined as follows:

PROCEDURE CurrentA5() : LONGCARD;
(* returns the current value of the global base register.*)

  CODE
    02E8DH;          (* move.l a5, (sp) *)

PROCEDURE BindInterruptProc
 (
  TheProc : ADDRESS;
  VAR Result : BoundInterruptProc
 );
(* constructs the wrapper around the specified procedure.*)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F0DH;          (* move.l a5, -(sp) *)
                              (* save previous A5 *)
  Result[2] := 02A7CH;          (* move.l #n, a5 *)
  TwoWords.LongValue := CurrentA5(); 
                              (* n for previous instr *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 04EB9H;          (* jsr p *)
  TwoWords.LongValue := TheProc;  (* p for previous instr *)
  Result[6] := TwoWords.HighWord;
  Result[7] := TwoWords.LowWord;
  Result[8] := 02A5FH;          (* move.l (sp)+, a5 *)
                              (* restore previous A5 *)
  Result[9] := 04E75H;          (* rts *)
  FlushCaches
END BindInterruptProc;

So instead of passing the address of your actual interrupt routine to the system, you pass the address of the appropriately-filled-in BoundInterruptProc structure, for example:

VAR
  Params : HParamBlockRec;
  BoundCompletion : BoundInterruptProc;
...

BindInterruptProc(ADR(MyCompletion), BoundCompletion);
Params.ioCompletion := ADR(BoundCompletion);
...
Err := PBReadAsync(ADR(Params));

and in your interrupt routine you have full access to all your A5 globals!

Note that FlushCaches call at the end of the definition of BindInterruptProc. This is necessary because of a piece of brain damage in Motorola's 68030 and 68040 processors. The problem is that, on these processors, memory references are divided into two kinds, data and code, and there is a separate CPU cache for each, with no attempt to maintain consistency between them. Thus, if a given memory location ends up in both caches (as could happen when you generate code at run-time, as BindInterruptProc does), it is possible for it to be modified from the data cache, without the instruction cache picking this up! Then, when you later try to execute the instruction at that memory location, you get the wrong value, and crash...

Here's how FlushCaches is defined. This definition is safe to use on all members of the 68000 CPU family, with system version 6.0 and later. It even works on Power Macs, where it flushes the cache of compiled code on models with the DR ("dynamic recompilation") emulator, and does nothing on older Power Macs.

PROCEDURE HWPrivPresent() : BOOLEAN;
(* has this system got the HWPriv trap. *)

  CONST
    UnimplementedTrap = 009FH;
    HWPrivTrap = 0A198H;

BEGIN
  RETURN
      GetOSTrapAddress(HWPrivTrap)
    <>
      GetOSTrapAddress(UnimplementedTrap)
END HWPrivPresent;

PROCEDURE FlushInstructionCache;
(* actually flushes both instruction and data caches. *)

  CODE
    07001H, (* moveq.l #1, d0 *)
    0A198H; (* _HWPriv *)

PROCEDURE FlushCaches;
 (* flushes instruction and data caches, if present. *)

BEGIN
  IF HWPrivPresent() THEN
    FlushInstructionCache
  END (*IF*)
END FlushCaches;

Binding Actual Arguments

Often, in a routine of yours that is called by the system, you don't need access to all your A5 globals, but you do need access to a particular block of data. Depending on the situation, the system may provide all kinds of different ways of passing the address of this block as an additional argument to your routine -- or it might not provide any way at all.

For example, the action routine you pass to the Control Manager's TrackControl routine gets the control handle as an argument, so you could use the refCon field in the control record to pass information to it. That is, unless the part of the control you're tracking is an indicator (such as a scroll bar thumb), in which case you don't get any arguments at all. Of course, if your code is part of the currently-running application, you could probably get away with using A5 globals in this case.

There are other, equally complicated situations. Time Manager tasks didn't use to get any arguments, until System 6.0.3, when they started getting the address of the TMInfo record in register A1. Memory Manager GrowZoneProcs don't get any arguments, and you can't guarantee that some part of the system (or some weird INIT) won't try to allocate some memory in your heap while the A5 value is set to something nonsensical.

Now, what do you think: would you prefer to resort to a range of different fiddly techniques to deal with each of these situations, or would you rather stick with one general, uniform technique that can handle them all?

I know which I prefer. I would Curry my routine by binding it to the address of my data, using the following Bind routine:

TYPE
 (* structure for holding Curried routine *)
  BoundProc = ARRAY [1 .. 8] OF CARDINAL;

PROCEDURE Bind
 (
  TheProc : ADDRESS;
  ToArg : ADDRESS;
  VAR Result : BoundProc
 );
 (*    creates a new procedure which calls TheProc
      with ToArg as an extra argument. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F17H;          (* move.l (sp), -(sp) *) )
  Result[2] := 02F7CH;          (* move.l #m, n(sp) *) 
  TwoWords.LongValue := ToArg;    (* m for above move *)
  Result[3] := TwoWords.HighWord;
  Result[4] := TwoWords.LowWord;
  Result[5] := 4;                (* n for above move *)
  Result[6] := 04EF9H;          (* jmp n *)
  TwoWords.LongValue := TheProc;  (* n for jmp *)
  Result[7] := TwoWords.HighWord;
  Result[8] := TwoWords.LowWord;
  FlushCaches
END Bind;

Note that you can use Bind on any procedure or function of just about any number of arguments, and (if a function) returning a result of any type. The only restrictions are that it must follow Pascal calling conventions (as Metrowerks Modula-2 does), the argument you are binding must be the last one in the argument list, and it must be 32 bits in size (such as an address).

Converting Calling Conventions

Another important use for run-time closures is where the system invokes some code, that you supply, in a nonstandard way. One case that has bitten many Macintosh programmers over the years is the infamous TextEdit clikLoop callback.

If you specify a clikLoop routine in your TextEdit record, TEClick will call it repeatedly as long as the mouse remains down. One use for this callback is to implement autoscrolling: TextEdit's default clikLoop action, like the rest of TextEdit, doesn't know anything about scrollbars, so you have to provide your own code to keep these in sync.

The clikLoop routine is invoked in a slightly odd way: it must preserve register D2 (normally on the Mac, any routine you call is allowed to trash A0, A1, D0, D1 and D2), and it must return a boolean result in the bottom byte of D0. If this value is ever false, then TEClick returns immediately, as though the mouse has been released.

The trap lies in an inocuous-seeming library routine, that Apple provides, called TESetClickLoop. It looks very tempting to use, as it allows you to pass a clikLoop callback that conforms to straight Pascal calling conventions. Of course, it uses its own glue code internally to call your routine and convert appropriately between calling conventions.

The problem lies in how this glue code finds your routine: the address of your routine is stored in a global variable! What this means is, if you have two separate TextEdit records in existence at once, and you try to specify a separate clikLoop routine for each one, then the last one you specify gets called for both, since there is only room to remember a single one!

The correct solution is to create the glue code on the fly. Here's a routine that does this:

TYPE
  ClikLoopProc =
   (* has one more arg than TextEdit.ClikLoopProc *)
    PROCEDURE
     (
      (*Arg :*) ADDRESS
     ) : BOOLEAN;
  BoundClikLoopProc = ARRAY [1 .. 12] OF CARDINAL;

PROCEDURE BindClikLoopProc
 (
  TheProc : ClikLoopProc;
  ToArg : ADDRESS;
  VAR Result : BoundClikLoopProc
 );
 (* binds TheProc to the specified argument to create
  a closure that can be invoked by TextEdit as a
  ClikLoop procedure. *)

  VAR
    TwoWords :
    (* used to access a longword as two words *)
      RECORD
      CASE : BOOLEAN OF
      | FALSE:
        LongValue : LONGCARD
      | TRUE:
        HighWord, LowWord : CARDINAL
      END (*CASE*)
      END (*RECORD*);

BEGIN
  Result[1] := 02F02H;            (* move.l d2, -(sp)*)
                                (* save vulnerable register *)
  Result[2] := 04227H;            (* clr.b -(sp) *)
  Result[3] := 02F3CH;            (* move.l #n, -(sp) *)
                                (* insert additional arg *)
  TwoWords.LongValue := ToArg;
  Result[4] := TwoWords.HighWord  ;  (* n for above *)
  Result[5] := TwoWords.LowWord;    (* ditto *)
  Result[6] := 04EB9H;            (* jsr l *)
  TwoWords.LongValue := CAST(LONGCARD, TheProc);
  Result[7] := TwoWords.HighWord;    (* high word of l *)
  Result[8] := TwoWords.LowWord;    (* low word of l *)
  Result[9] := 07000H;            (* moveq.l #0, d0 *)
  Result[10] := 0101FH;            (* move.b (sp)+, d0 *)
  Result[11] := 0241FH;            (* move.l (sp)+, d2 *)
                                (* restore saved register *)
  Result[12] := 4E75H;            (* rts *)
  FlushCaches
END BindClikLoopProc;

Note how, while I'm at it, I cleverly change the type of the ClikLoopProc that you must pass, to include an extra argument. This allows you to pass extra information to your ClikLoopProc when it is called, for whatever purpose you want.

Up-Level References

In Pascal and Modula-2, you can declare one routine local to another. In this case, the environment of the inner routine includes everything declared locally to the outer routine. In particular, it is possible for the inner routine to access variables declared local to the outer routine.

How is this done? I know of two different techniques that compilers may use. One involves the use of a global display array of pointers to stack frames, indexed by the source nesting level of the routine. The other way is to pass a static link when calling a nested routine -- this is just a pointer to the enclosing routine's stack frame.

On the Mac, one of these techniques is the right way, and the other is the wrong way. The display array is the wrong way, because it requires global data that isn't always available. The static link method is the right way. Thankfully, the Mac compilers I'm aware of all do it the right way.

To illustrate how a static link works, have a look at the following piece of source:

PROCEDURE Outer;

  VAR
    OuterVar : CARDINAL;

  PROCEDURE Inner
   (
    NewValue : CARDINAL
   );

  BEGIN
    OuterVar := NewValue
  END Inner;

BEGIN (*Outer*)
  OuterVar := 1;
  Inner(2)
END Outer;

Here's the code that the Metrowerks compiler generates for for procedure Outer:

LINK    A6,#$FFFE
MOVE.W   #$0001,-$0002(A6) ; OuterVar := 1
MOVE.W   #$0002,-(A7) ; arg for call to Inner
PEA    (A6)
JSR    Inner
UNLK    A6
RTS    

Note that the variable OuterVar is at offset -2 within Outer's stack frame. Also note that when it calls Inner, it doesn't just pass the cardinal value 2 for the argument NewValue; it pushes an additional longword, which is the value of its A6 (stack frame) register. To see the use of this, here's the code for Inner:

LINK    A6,#$0000
MOVEA.L  $0008(A6),A4 ; Outer's A6 to access OuterVar
MOVE.W   $000C(A6),-$0002(A4) ; OuterVar := NewValue
UNLK    A6
MOVEA.L  (A7)+,A0
ADDQ.L   #$6,A7
JMP    (A0)

Notice how Inner gets to OuterVar by using the stack frame pointer that was passed to it by Outer. If Outer was itself declared local to another routine, it is similarly possible for Inner to access the local variables in that stack frame by following the frame pointer value passed to Outer by its parent.

To see the relevance of this, here's an example use of the TraverseProcesses routine I gave up above. It's a routine called FindProcessBySignature, that scans the process list until it finds a process with a given application signature. It returns true and the process serial number if it finds a match, false otherwise.

PROCEDURE FindProcessBySignature
 (
  Signature : OSType;
  VAR MatchingProcess : ProcessSerialNumber
 ) : BOOLEAN;

  VAR
    Found : BOOLEAN;

  PROCEDURE MatchSignature
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

    VAR
      Err : OSErr;

  BEGIN
    IF ProcessInfo.processSignature = Signature THEN
      MatchingProcess := ThisProcess;
      Found := TRUE;
      KeepGoing := FALSE
    END (*IF*)
  END MatchSignature;

BEGIN (*FindProcessBySignature*)
  Found := FALSE; (* initial assumption *)
  TraverseProcesses
   (
    (*Action :=*) CAST(ProcessActions,
    ADR(MatchSignature)),
    (*ActionArg :=*) CurrentA6()
   );
  RETURN
    Found
END FindProcessBySignature;

First of all, compare the declaration of the inner MatchSignature routine with the original definition of the ProcessActions type. Something is missing: What has happened to the ActionArg?

The answer, of course, is that this is now an implicit additional argument, being the outer routine's stack frame pointer! When FindProcessBySignature passes the address of MatchSignature to TraverseProcesses as the action routine, it tells the latter to call it with that additional argument, which is the result of calling the function CurrentA6. This is defined as follows:

PROCEDURE CurrentA6() : LONGCARD;

  CODE
    02E8EH; (* move.l a6, (sp) *)

It returns the value that is currently in the A6,which is the stack frame register -- exactly what the inner routine needs when it's called.

The main ugly thing about this technique is that Modula-2 doesn't allow you to pass a nested routine as a routine value; this is only allowed for globally-declared routines (as in the sorting example earlier). Luckily, Metrowerks allows you to use the ADR (address-of) function on a nested routine, and you can do a CAST to force the result of this to the appropriate type. (Unfortunately, the p1 compiler doesn't seem to support this feature.)

Why not just declare MatchSignature globally outside FindProcessBySignature, and avoid this trickiness with stack frame pointers? The reason is because, instead of passing the stack frame pointer as the additional argument, you'll now have to pass pointers to the local variables Found and MatchingProcess. Or if you try using globals instead, your code is no longer reentrant! This may not seem a big deal at first sight, but it's very important in library code, which will be written once and reused in a wide variety of situations you never envisaged. Alternatively, you could combine both the local variables into a single structure, and pass the address of that. That will work, but you lose some of the simplicity of the nesting version.

Besides, the MatchSignature routine doesn't need to be visible globally. One of the safety principles you learn when you're writing large programs is: if something doesn't need to be visible globally, don't make it so. In a large piece of software, the complexity comes not from the number of parts, but from the number of potential interactions between them. To minimize unwanted interactions, things should only be accessible in places where they need to be. This is called information hiding.

There is one subtle pitfall with static links you need to watch out for. Consider the following code fragment:

PROCEDURE Parent;

  VAR
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) CurrentA6() (***WRONG***)
     )
  END Brother;

BEGIN (*Parent*)
  ...
  Brother
  ...
END Parent;

What is wrong with this? The problem is, that the use of CurrentA6() in Brother means that it is passing Sister a pointer to its own stack frame, rather than the one belonging to Parent! This means that, where Sister expects to access variables local to Parent, it accesses ones local to Brother instead. This is obviously wrong, since the language rules don't allow variables local to Brother to be directly visible to Sister.

The following is one way to modify this to work correctly:

PROCEDURE Parent;

  VAR
    ParentA6 : LONGCARD;
    TargetProcess : ProcessSerialNumber;

  PROCEDURE Sister
   (
    VAR ThisProcess : ProcessSerialNumber;
    VAR ProcessInfo : ProcessInfoRec;
    VAR KeepGoing : BOOLEAN
   );

  BEGIN
    ...
    IF ThisIsTheOneIWant THEN
      TargetProcess := ThisProcess;
      KeepGoing := FALSE
    END (*IF*)
  END Sister;

  PROCEDURE Brother;

  BEGIN
    TraverseProcesses
     (
      (*Action :=*) CAST(ProcessActions, ADR(Sister)),
      (*ActionArg :=*) ParentA6
     )
  END Brother;

BEGIN (*Parent*)
  ParentA6 := CurrentA6();
  ...
  Brother
  ...
END Parent;

Here the CurrentA6() call has been moved to Parent, and its value saved for use by Brother when calling Sister.

The rule is simple: when calling a nested routine, you must always pass the stack frame pointer for the routine textually immediately enclosing the one you want to call. If you stick to this rule, all the more complicated cases -- references to local variables up more than one level, recursion, inner routines calling outer ones, and so on -- are guaranteed to work correctly!

Application To Other Languages

For some odd reason, most Mac programmers don't use Modula-2. Most use C, and most of the rest use Pascal. And you're wondering: how applicable are these techniques to those languages?

You can use them to some extent. Modula-2 is modeled fairly closely on Pascal (they were both designed by the same person, after all), and I'm pretty sure all the Mac Pascals use A6 as a stack frame pointer in the same way as Modula-2. There are two main drawbacks to Pascal:

  • There are no routine types. Once you take the address of a procedure or function, you're completely on your own.
  • I understand THINK Pascal doesn't even allow you to take the address of a routine that's local to another routine. That means you can't use any of the A6 stuff.

Note that it is possible in Pascal to pass a routine as an argument to another, in a type-safe fashion. This feature is powerful enough to deal with all the examples using static links I have given above. However, it doesn't allow you to save routine addresses in data structures, which can come in useful in some cases I haven't covered in this article.

The major drawback with C is that it doesn't allow you to declare one routine inside another. This means you have to go through the extra complexity I mentioned above in passing additional arguments to action routines. At least C has proper procedure and function types.

Conclusion

Assembly-language hackers are occasionally fond of a programming technique known as self-modifying code, where an instruction sequence pokes some bits in an instruction elsewhere in the routine before executing it.

I want to make a clear distinction between self-modifying code, and the techniques for generating code at run-time that I have been describing above. Self-modifying code was a technique concocted to squeeze a few more instruction cycles out of time-critical code. It has been quite rightly criticized as leading to programs that are hard to understand, hard to reuse and hard to maintain.

The techniques I have been describing, on the other hand, are more structured than this. The code which is being generated at run-time follows a simple, easy-to-understand form, and its use is really quite straightforward. My aim has not been to optimize time-critical code, but precisely to create code which is easy to understand, easy to reuse and easy to maintain!


Lawrence D'Oliveiro has been doing both high-level and low-level programming on a variety of different platforms over the last eighteen years. When he's not playing with his techno-toys, or worrying about which cool technology Apple is going to abandon next, he likes cycling and walking around the nearby Waikato River.

 
AAPL
$118.62
Apple Inc.
-0.38
MSFT
$47.94
Microsoft Corpora
+0.19
GOOG
$538.77
Google Inc.
-1.60

MacTech Search:
Community Search:

Software Updates via MacUpdate

Skype 7.2.0.412 - Voice-over-internet ph...
Skype allows you to talk to friends, family and co-workers across the Internet without the inconvenience of long distance telephone charges. Using peer-to-peer data transmission technology, Skype... Read more
HoudahSpot 3.9.6 - Advanced file search...
HoudahSpot is a powerful file search tool built upon MacOS X Spotlight. Spotlight unleashed Create detailed queries to locate the exact file you need Narrow down searches. Zero in on files Save... Read more
RapidWeaver 6.0.3 - Create template-base...
RapidWeaver is a next-generation Web design application to help you easily create professional-looking Web sites in minutes. No knowledge of complex code is required, RapidWeaver will take care of... Read more
iPhoto Library Manager 4.1.10 - Manage m...
iPhoto Library Manager lets you organize your photos into multiple iPhoto libraries. Separate your high school and college photos from your latest summer vacation pictures. Or keep some photo... Read more
iExplorer 3.5.1.9 - View and transfer al...
iExplorer is an iPhone browser for Mac lets you view the files on your iOS device. By using a drag and drop interface, you can quickly copy files and folders between your Mac and your iPhone or... Read more
MacUpdate Desktop 6.0.3 - Discover and i...
MacUpdate Desktop 6 brings seamless 1-click installs and version updates to your Mac. With a free MacUpdate account and MacUpdate Desktop 6, Mac users can now install almost any Mac app on macupdate.... Read more
SteerMouse 4.2.2 - Powerful third-party...
SteerMouse is an advanced driver for USB and Bluetooth mice. It also supports Apple Mighty Mouse very well. SteerMouse can assign various functions to buttons that Apple's software does not allow,... Read more
iMazing 1.1 - Complete iOS device manage...
iMazing (was DiskAid) is the ultimate iOS device manager with capabilities far beyond what iTunes offers. With iMazing and your iOS device (iPhone, iPad, or iPod), you can: Copy music to and from... Read more
PopChar X 7.0 - Floating window shows av...
PopChar X helps you get the most out of your font collection. With its crystal-clear interface, PopChar X provides a frustration-free way to access any font's special characters. Expanded... Read more
OneNote 15.4 - Free digital notebook fro...
OneNote is your very own digital notebook. With OneNote, you can capture that flash of genius, that moment of inspiration, or that list of errands that's too important to forget. Whether you're at... Read more

Latest Forum Discussions

See All

Proun+ Review
Proun+ Review By Jennifer Allen on November 28th, 2014 Our Rating: :: TWITCHY RACINGUniversal App - Designed for iPhone and iPad Twitchy racing aplenty in Proun+, an enjoyably tricky title.   | Read more »
Lucha Amigos (Games)
Lucha Amigos 1.0 Device: iOS Universal Category: Games Price: $1.99, Version: 1.0 (iTunes) Description: Forget Ninja Turtles, and meet Wrestlers Turtles! Crazier, Spicier and…Bouncier! Sling carapaces of 7 Luchadores to knock all... | Read more »
Record of Agarest War Zero (Games)
Record of Agarest War Zero 1.0 Device: iOS Universal Category: Games Price: $7.99, Version: 1.0 (iTunes) Description: HyperDevbox Holiday Turkey Black Friday Special Pricing! To celebrate the opening of the holiday season HyperDevbox... | Read more »
Raby (Games)
Raby 1.0.3 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0.3 (iTunes) Description: ***WARNING - Raby runs on: iPhone 5, iPhone 5C, iPhone 5S, iPhone 6, iPhone 6 Plus, iPad Mini Retina, iPad Mini 3, iPad 4, iPad Air,... | Read more »
Oddworld: Stranger's Wrath (Games)
Oddworld: Stranger's Wrath 1.0 Device: iOS Universal Category: Games Price: $5.99, Version: 1.0 (iTunes) Description: ** PLEASE NOTE: Oddworld Stranger's Wrath requires at least an iPhone 4S, iPad 2, iPad Mini or iPod Touch 5th gen... | Read more »
Bounce On Back (Games)
Bounce On Back 1.0 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0 (iTunes) Description: | Read more »
Dwelp (Games)
Dwelp 1.0 Device: iOS Universal Category: Games Price: $.99, Version: 1.0 (iTunes) Description: === 50% off for a limited time, to celebrate release === Dwelp is an elegant little puzzler with a brand new game mechanic. To complete a... | Read more »
Make Way for Fat Chicken, from the Maker...
Make Way for Fat Chicken, from the Makers of Scrap Squad Posted by Jessica Fisher on November 26th, 2014 [ permalink ] Relevant Games has announced they will be releasing their reverse tower defense game, | Read more »
Tripnary Review
Tripnary Review By Jennifer Allen on November 26th, 2014 Our Rating: :: TRAVEL BUCKET LISTiPhone App - Designed for the iPhone, compatible with the iPad Want to create a travel bucket list? Tripnary is a fun way to do exactly that... | Read more »
Ossian Studios’ RPG, The Shadow Sun, is...
Ossian Studios’ RPG, The Shadow Sun, is Now Available for $4.99 Posted by Jessica Fisher on November 26th, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »

Price Scanner via MacPrices.net

Karalux Announces 24K Gold-Plated iPhone 6
Karalux, a Vietnam-based jewellery firm, has launched a unique 24 karat gold-plated iPhone 6 version with gold-cast monolithic dragon on its back panel. The real 24 karat gold plated enclosure doesn’... Read more
Black Friday: 13-inch 2.6GHz Retina MacBook P...
 B&H Photo has lowered their price for the 13″ 2.6GHz/128GB Retina MacBook Pro to $1159 for Black Friday. That’s $140 off MSRP, and it’s the lowest price for this model (except for Apple’s $1099... Read more
View all the Black Friday sales on our Mac Pr...
We’ve updated our Mac Price Trackers with the latest information on prices, bundles, and availability on systems from Apple’s authorized internet/catalog resellers. View Black Friday sale prices at a... Read more
Black Friday: 11-inch MacBook Air for $779, s...
 Best Buy has lowered their price for the 2014 11″ 1.4GHz/128GB MacBook Air to $779.99 for Black Friday. That’s $120 off MSRP. Choose free shipping or free local store pickup (if available). Sale... Read more
Apple Store Black Friday sale for 2014: $100...
BLACK FRIDAY The Apple Store has posted their Black Friday deals for 2014. Receive a $100 PRODUCT(RED) branded iTunes gift card with the purchase of select Macs, $50 with iPads, and $25 with iPods,... Read more
Black Friday: 15% off iTunes Gift Cards
Staples is offering 15% off $50 and $100 iTunes Gift Cards on their online store as part of their Black Friday sale. Click here for more information. Shipping is free. Best Buy is offering $100... Read more
BEVL Releases Dock Tailored for iPhone 6 and...
Seattle based BEVL has released their first product: an iPhone dock that is divergent in build quality, rock-solid function and visual simplicity to complement the iPhone. BEVL is now accepting... Read more
Black Friday: $150 off 13-inch Retina MacBook...
 Best Buy has 13-inch 2.6GHz Retina MacBook Pros on sale for $150 off MSRP on their online store as part of their Black Friday sale. Choose free shipping or free local store pickup (if available).... Read more
Black Friday: $300 off 15-inch Retina MacBook...
 B&H Photo has the new 2014 15″ Retina MacBook Pros on sale for $300 off MSRP as part of their Black Friday sale. Shipping is free, and B&H charges NY sales tax only: - 15″ 2.2GHz Retina... Read more
Black Friday: Up to $140 off MacBook Airs, fr...
 B&H Photo has 2014 MacBook Airs on sale for up to $140 off MSRP as part of their Black Friday sale. Shipping is free, and B&H charges NY sales tax only: - 11″ 128GB MacBook Air: $799 $100... Read more

Jobs Board

*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
Senior Event Manager, *Apple* Retail Market...
…This senior level position is responsible for leading and imagining the Apple Retail Team's global event strategy. Delivering an overarching brand story; in-store, Read more
*Apple* Retail - Multiple Positions (US) - A...
Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, you're also the Read more
*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.