TweetFollow Us on Twitter

Volume Number:2
Issue Number:4
Column Tag:Threaded Code: Mach 1

A Multi-tasking Forth System

By Jörg Langowsk, EMBL, C/O I.L.L., Grenoble, Cedex, France, MacTutor Editorial Board

"Subroutine threaded Forth - the Mach 1 system"

This time I am going to introduce to you a new Forth-83 system that I recently got to review. This system is implemented in a somewhat different way compared with MacForth, NEON or other Forths, so let me digress a little and make some comments about Forth implementations on 68000 based systems.

As most of this column's readers know, 'threaded code' means program code that is organized in a very hierarchical manner; a 'main program' or colon definition consists of a list of words that describe other lower-level definitions that in turn are lists of even lower level definitions and so on ... until one arrives at the definitions of the very lowest level that are directly executable machine code (See Figure 1 below).

How do Forth systems implement this structure? The definition of every Forth word, colon definition, variable, etc. contains one short piece of executable machine code at the very beginning. This is called the code field and determines in which way the information following it is handled. For example, in a colon definition in MacForth, the code field contains a TRAP #F (V2.0) or TRAP #E (V2.4) instruction that jumps to code that in turn:

-pushes the instruction pointer (A3) on the return stack,

-sets the instruction pointer to the address of the first word of the new definition to be executed,

-jumps to the NEXT routine which executes the tokens that follow the code field.

Mac Forth is an example of 'token threaded Forth' which means that the numbers following the code field are not valid addresses of other Forth words, but 'tokens' that are to be converted into addresses by some algorithm. In Mac Forth, positive token values are interpreted as offsets from a base pointer (A4), while negative token values are negative offsets from the top of application space (A5) into a token table, where the 32-bit address of the Forth word is found. The token table is used to handle programs that exceed the 32 K address space that can be handled through the first method of addressing.

The advantage of MacForth's and similar structures is that we need only 16 bits for each word compiled into a definition. The disadvantage is the limited address space (32K) for the simple A4-indexed addressing mode and the need for a token table to convert tokens into addresses that lie outside the 32K addressing region. Time is lost on each NEXT loop by having to test whether the token is negative or positive, then eventually by accessing the token table ( See Figure 2 ).

The Forth implementation that was used to create NEON does not need a token table since each word uses 32 bits, but more space is used in the compiled definitions that way.

Fig. 3 gives a summary of some possible addressing modes for Forth interpreters with their execution times (in system clocks, from the MC68000 manual). Some of these examples have been taken from a Dr. Dobb's Journal article by Joe Barnhart, 'Forth and the Motorola 68000', DDJ 83 (1983) 18-26.

MacForth does not conform to any of the 'pure' definitions here, but (cf. Fig. 2) is a mixture of a base-offset direct word and base-offset indirect long threaded code (with an additional branch required).

You see that in all the examples given the inner interpreter requires considerable overhead to 'set up' the forth words for execution.

Additional overhead is created by the execution of the small piece of machine code starting at the CFA. In MacForth this is a TRAP instruction (38 cycles) plus three lines of assembly code that the trap vector points to (32 cycles). Counting everything up, execution of one colon-defined Forth word requires 100 or 120 cycles under MacForth, depending whether or not the token table has to be accessed. In a system like NEON, the CFA does not contain a trap, but a JSR to a routine that sets up the inner interpreter, with a similar amount of time involved.

One solution around this problem is to code time-critical things in assembly language, so that the whole definition is made up of executable 68000 code, starting with the code field. I have given an example for this in MT V1#9 by defining macro words that compile inline code into a definition. The speedup was of the order of 50% for MacForth. Of course, if you write inline macros, you are not really creating threaded code, but linear code, and the size of the definition tends to get rather large.

Still, there is another way to create threaded, i.e. hierarchical, code that does away with the needs for an 'inner interpreter' loop completely, since the threaded code will be directly executable 68000 machine language. This sounds strange, but is actually very simple to achieve. The keyword is 'subroutine threading'. As you saw from the inline code example, we can do without a 'code field' by just starting to execute our definition at the very beginnning, provided it is all bona fide 68000 code. Now, all we have to do is to persuade the Forth system not to compile a list of addresses into a colon definition, but a list of JSR (addr). This way, we don't need the inner interpreter at all, Forth instructions are 68000 instructions; the subroutine return stack, in this case, is the normal 68000 stack, pointed to by A7 (Mach 1 uses still a different stack for loop indices, more about that later).

The speed increase over the previous examples is considerable, with only 18 cycles of overhead for the JSR instruction if the jump is taken within the segment, or 30 cycles for one more .JMP if a jump table has to be used to branch to a different segment. This is one-fifth to one-third of the overhead of any of the other implementations given on the next page in Fig. 3.

Actually, subroutine threading gives us a Forth compiler that creates native 68000 code, which won't even need a runtime Forth nucleus, but will nicely execute by itself.

Drawbacks of this strategy: A7 cannot be the Forth data stack pointer any longer, some other register has to be used. This does not really create a problem as long as one stays within Forth; the 68000 couldn't care less which stack pointer you use for passing arguments between routines. Toolbox routines, however, expect their arguments on the A7 stack. Their calling sequence, therefore, becomes a little bit more complicated - and more time consuming - in a subroutine-threaded Forth implementation, since all the arguments have to be transferred from the data stack to the A7 stack first, then any result has to be swapped back from the A7 stack to the data stack. Also, since toolbox calling should be transparent to the user, the Forth compiler should know about the parameters that all the toolbox traps expect and create the 'glue' code automatically.

Subroutine-threading, considering all the pros and cons, is certainly one of the more elegant ways to implement Forth. Given a good compiler, the user will see no difference to a 'classical' Forth, except that the resulting code runs faster by a factor of (approximately) two. The 68000, with its ability to use any address register as a stack pointer, is especially suited for this kind of approach.

Such were my thoughts at the end of last year, and I had started doing some experiments implementing my own Forth on the Mac, which was going to be subroutine threaded. That was when I received a letter which described a system that did all the things that I just mentioned, and some more. With the effect that I realised that perhaps in a year or two my own efforts would get me to where others had gone already. Too bad.

The letter came from a company called Palo Alto Shipping, and they had developed a system called Mach1, a subroutine threaded Forth-83, multi-tasking above all. After a long introduction, let us jump right in the middle of things with a description what Mach1 is and what it can do.

The Mach1 system

With my evaluation package I got one disk and a deceptively small handbook. Opening the latter revealed a phrase on page IV that I wish to see on more software products: All rights reserved. Nothing more, no legalese for which you need to take courses to understand, just a simple copyright notice as you would find it in any serious textbook. Being a strong adherent of Jerry Pournelle's policy concerning Silly Licensing Agreements, I found this part of the manual extremely sympathetic. Later, it is explicitly said that you may make as many backup copies as you like and use them - one at a time only - on any machine you like. Good.

The handbook gives a short introduction to Forth and refers beginners to Starting Forth for a good tutorial. The system is Forth-83, with some additions. Of course, these 'additions' are where the fun comes in, so here are the main features that make Mach1 comfortable:

-program loading from normal text files, no blocks.

-local variables in a local stack frame created by the LINK/UNLK instruction.

-a floating point package for access to the SANE library. Unfortunately, no fast 32-bit floating point (only Fortran has it so far).

-words that give easy access to the MacinTalk speech synthesizer, which comes with the system.

-vectored input/output (e.g., if you really want, you could use the system from a serial terminal).

-an assembler that recognizes standard Motorola syntax 68000 code,

-a debugger/disassembler/decompiler that can be used like Macsbug and recognizes almost the same commands, but at the same time decompiles Forth words.

-the possibility to save the system with all definitions that you added so that you can quit and resume work without having to reload everything. This is called 'snapshot' in MacForth and 'workspace' in Mach1.

-a very easy way to create stand-alone 'turnkey' applications which contain a stripped-down version of the Forth runtime system and are not too large. These applications may be sold without any additional licensing fees.

-true multi-tasking with an arbitrary number of tasks.

The toolbox is accessed through the word CALL which (on compile) looks up the necessary parameter setup for the trap, swaps them from the data stack (A6) to the subroutine stack (A7) and calls the trap. Three traps cannot be called, UnlodeSeg and LodeSeg - this is understandable because of the heavy use of the segment loader by Mach itself - and TECalText, which I wasn't able to figure out why.

For the creation and management of windows, controls, and menus, sufficient support is given through high-level words in the Mach1 system. There is a 'templates' menu with which you can define a window's characteristics in a dialog box, and the Forth commands to create that window will be automatically copied to the clipboard; same for menus and controls. Like in NEON, you may load from the clipboard.

Mach1 is packaged with the switcher and MDS Edit. You're supposed to set up the system in a way that you can switch between Edit and Mach1 for development. I found myself pretty soon trashing Edit and using the Editor desk accessory for writing programs, which overall is faster, though Edit does a better job of formatting.

Multi-tasking under Mach1

Before we get to the program example, let's take a quick look at the multi-tasking features of Mach1. You will soon see that multi-tasking is so essential in this system that the style of programming in Mach1 will be quite different from other Forths.

Under Mach1, you may define tasks. Each task has some private stack and user variable space assigned. Tasks are arranged in an endless chain, with each task containing a pointer to the next task in the chain at the base of its user variable area. Just below this pointer is a word that contains the status of the task, SLEEP or WAKE. This word actually is executable 68000 code; if the task is aSLEEP, the code looks like

4EF9 XXXX XXXX   JMP next task,

if it is aWAKE, it is

 4E40  ----  ----TRAP #0 .

So tasks that are dormant will just be skipped by a jump to the base address of the next task; active ones execute the trap instruction, which falls into a routine that restores the task's register file and continues where it quit the last time. Any task will execute until the word PAUSE is encountered, at which point the local registers will be saved and execution transferred to the next task.

PAUSE can be compiled into a task's code at strategic points, for example in a loop that is part of a lengthy calculation. Also some built-in words (mainly I/O, like EMIT) contain a PAUSE. The requirement for the task to call the scheduler, instead of the scheduler switching tasks automatically, is the only thing that distinguishes Mach1 from a full implementation of multi-tasking. So multi-user would be not practical, because if one user entered into a pauseless loop, the others would be cut off; but for single-user multi-tasking this system works very well.

I have written a background task that acts as a printer spooler for this month's example. One of the problems with Mach1 is multiple file handling; each task can only handle one open file at a time (from how I understood the manual). Therefore, multi-tasking comes in automatically if one wants to work with several open files (I've heard that this is done sometimes...).

Another drawback is that the file handling routines cannot accept a file name string as an external parameter; it is compiled into the definition that calls the file routine. Therefore, when a new spool file is opened, my program resorts to a kludge, storing a unique 4 digit decimal number into the appropriate position in the word's code (which I was able to find with the debugger).

The example below creates a task, spool_task, that executes spool in the background. This is an infinite loop that looks for the numbers of spool files contained in a list; if the list is non-empty (list.busy), it will open the spool file with that number and print it until it encounters a zero byte (end-of-file for a text file). The spool file is not deleted, so your disk might fill up rather quickly; reason is that there is no predefined word for deletion of files in Mach1, and I was too lazy to write one.

If you have Mach1, load this program, create a second active window, like the "Skylight" window in the example on page vi of the manual. Then say new_spool in the main window, which will create a new spool file, open it and redirect the output to both the file and the window. Click the second window and do the same there. Now you can create output in one of the windows, e.g. by ?FREE or WORDS, let the thing go and do some output in the other window. When you close_spool in any window, the background task will automatically start printing the spool file while you can continue doing (almost) whatever you like in one of the windows.

Sieve benchmark

The famous Erastothenes Sieve (see back issues of this magazine, e.g. V1#9) executes in 12.8 sec (in my hands). This is almost twice as fast as MacForth or NEON (both slightly above 20 sec) and only twice as slow as compiled C of average quality.


The Assembler contained in Mach1 accepts standard Motorola syntax. Words defined under Mach1 may be used in the operand field of assembly statements, e.g. in a JSR. One important feature is the automatic creation of inline code; when you say MACH after defining a word, the compiler will insert the code comprising the word into other definitions, instead of compiling a JSR to the code's address. Simple words like DUP and SWAP are defined this way; when you disassemble a definition containing these words, you will find no JSRs, but code that does DUP and SWAP directly.


This review, by no means complete, should have given you a flavor of Mach1 and what can be done with it. I am impressed of the speed of the code created; an easy-to-use assembler and debugger and the MACH inline code feature help speeding up the system even more. For speed considerations, a fast 32-bit floating point package would be desirable.

The multi-tasking feature is so easy to use that one often splits up programs into tasks when one wouldn't have thought about doing so in other languages. Due to the intrinsically high speed (subroutine threading) the scheduler also doesn't create too much overhead, if one uses PAUSE with consideration.

The vocabularies that come with the system are rather small compared with other systems; nevertheless, full Forth-83 has been implemented and all the necessary things are there. With the MACH macro feature, many of the 'combined' words such as IC! are not needed anymore. The existing set of words is explained very thoroughly in the manual, going into depths of the assembly coding where necessary. Dictionary structures, memory maps, task structures all are explained in detail.

Access to the Macintosh toolbox routines is convenient, and the template facility to set up standard windows and controls comes in very handy. There is even a special kind of scroll bar that will automatically resize with the window.

Stand-alone application may be created very easily, spanning several code segments if necessary. The applications need no run time kernel or library to run, which is nice (in fact, I consider it almost essential). The right to sell Mach1-created applications is contained in the purchasing price of $49.95.

Still, there are some bugs in the system. The debugger will work only once with the debugging switch, using the switch a second time crashed the system (in my hands). Also, some instructions are not disassembled correctly. Those bugs (and probably others that I haven't found) are nothing more and nothing less than one would expect for a freshly released system and I expect them to be fixed soon (upgrade charge $5).

The file handling system is not as convenient as it could be. Multi-tasking makes up for many of the inconveniencies, but not for all of them. A reasonable set of open/close/position/delete routines that accept stack input should be included. Also, the TECaltext trap should be accessible through the normal CALL mechanism. Finally, a fast floating point package would make a real nice addition.

All in all, I can highly recommend Mach1 for both Forth beginners and developers. What intrigues me most is that with a rather small vocabulary one has a system at hand that makes near optimal use of the Macintosh's special features and is fast, too. One last recommendation to Palo Alto Shipping: 20 instead of 2 pages of index would improve the handbook dramatically.

( Mach 1 background printer spooler )
( © J. Langowski / MacTutor Feb. 1986 )

only mac also i/o also forth

: wkg_spool working" spool.####" ;
: cre_spool  create" spool.####" ;
: use_spool   using" spool.####" ;

  variable filecount
  variable printlist 400 vallot
  variable print.ptr
  190 user active.spool 

: file#  filecount  @ ;
: print# print.ptr  @ 4 - @ ;
: incr.file  file#  1+ filecount  ! ;
: incr.print print.ptr @ 4 + print.ptr ! ;
: decr.print print.ptr @ 4 - print.ptr ! ;
: reset.file  1 filecount  ! ;
: reset.print printlist print.ptr ! ;
: add.spool ( n - ) 
    printlist print.ptr @  
 do i 4 - @ i ! -4 +loop  incr.print 
    printlist ! ;
: list.busy print.ptr @ printlist <> ; 

11 constant ##offset 
( offset from beginning of file routine to file name )

: wkg_spool# ( n - ) <# # # # # #>  ( four digits )
  ['] wkg_spool ##offset + swap  cmove 
 ( change file name, append number )
    wkg_spool ;
: cre_spool# ( n - ) <# # # # # #> 
  ['] cre_spool ##offset + swap  cmove  cre_spool ;
: use_spool# ( n - ) <# # # # # #> 
  ['] use_spool ##offset + swap  cmove  use_spool ;

: new_spool 
 file# cre_spool# file# wkg_spool#  
 file# active.spool !
 ." Your spool file is spool." file# . cr
 file output 12 emit ( form feed )
 ." SPOOL FILE # " file# . cr
 console file + output 
: close_spool console output close-file 
 active.spool @ add.spool ;

: print_file 0 
 begin dup virtual c@ dup while emit 1+ repeat 
 2drop ;
: print_spool 
    list.busy if
        print# decr.print use_spool#  disk 4 + w@
        not if print_file close-file  
 else 10 call sysbeep then 
    then ;
400 1000 background spool_task
spool_task build
: spool activate 
    1cc0a mode2  comm2 output
    begin print_spool pause again
reset.file reset.print
spool_task spool

Community Search:
MacTech Search:

Software Updates via MacUpdate

Apple Security Update 2015-005 - For OS...
Apple Security Update 2015-005 is recommended for all users and improves the security of OS X. For detailed information about the security content of this update, please visit: Read more
Apple HP Printer Drivers 3.1 - For OS X...
Apple HP Printer Drivers includes the latest HP printing and scanning software for OS X Lion or later. For information about supported printer models, see this page. Version 3.1: The latest printing... Read more
Epson Printer Drivers 3.1 - For OS X 10....
Epson Printer Drivers installs the latest software for your EPSON printer or scanner for OS X Yosemite, OS X Mavericks, OS X Mountain Lion, and OS X Lion. For more information about printing and... Read more
Xcode 6.4 - Integrated development envir...
Xcode provides everything developers need to create great applications for Mac, iPhone, and iPad. Xcode brings user interface design, coding, testing, and debugging into a united workflow. The Xcode... Read more
OS X Yosemite 10.10.4 - Apple's lat...
OS X Yosemite is Apple's newest operating system for Mac. An elegant design that feels entirely fresh, yet inherently familiar. The apps you use every day, enhanced with new features. And a... Read more
Dash 3.0.2 - Instant search and offline...
Dash is an API Documentation Browser and Code Snippet Manager. Dash helps you store snippets of code, as well as instantly search and browse documentation for almost any API you might use (for a full... Read more
FontExplorer X Pro 5.0 - Font management...
FontExplorer X Pro is optimized for professional use; it's the solution that gives you the power you need to manage all your fonts. Now you can more easily manage, activate and organize your... Read more
Typinator 6.6 - Speedy and reliable text...
Typinator turbo-charges your typing productivity. Type a little. Typinator does the rest. We've all faced projects that require repetitive typing tasks. With Typinator, you can store commonly used... Read more
Arq 4.12.1 - Online backup to Google Dri...
Arq is super-easy online backup for the Mac. Back up to your own Google Drive storage (15GB free storage), your own Amazon Glacier ($.01/GB per month storage) or S3, or any SFTP server. Arq backs up... Read more
Gutenprint 5.2.11-pre1 - Quality drivers...
Gutenprint is a suite of printer drivers that may be used with most common UNIX print spooling systems, including CUPS, lpr, LPRng, or others. Gutenprint currently supports over 2000 printer models.... Read more

Vector 2 is Officially a Thing and it...
Vector is a pretty cool parkour-driven runner that's gotten a pretty decent following since it first came out - although personally I think more people could stand to show it some love. Anyway, Nekki has announced that a sequel isofficially on its... | Read more »
This Week at 148Apps:June 22-26, 2015
June's Summer Journey Continues With 148Apps How do you know what apps are worth your time and money? Just look to the review team at 148Apps. We sort through the chaos and find the apps you're looking for. The ones we love become Editor’s Choice,... | Read more »
LEGO® Minifigures Online (Games)
LEGO® Minifigures Online 1.0.1 Device: iOS iPhone Category: Games Price: $4.99, Version: 1.0.1 (iTunes) Description: | Read more »
World of Tanks Blitz celebrates its firs...
Today marks the first anniversary of the launch of World of Tanks Blitz, the mobile version of the PC tank battler, World of Tanks. World of Tanks Blitz launched on iOS and Android on June 26th last year and to celebrate, Wargaming is giving all... | Read more »
Heroes and Castles 2 Has its Own Standal...
Heroes and Castles 2 is a third-person castle defense game from the same team behind Block Fortress and Bug Heroes. It's cool, it's fun, and now it has its very own free version. [Read more] | Read more »
Formula Cartoon All-Stars Lets You Race...
Ever want to pit your favorite characters from shows like Steven Universe, Adventure Time, and Regular Show against each other in a not quite death race? Well once upon a time you could, but Formula All Stars Touch N' Go doesn't exist anymore. Hope... | Read more »
Retype - Typography Photo Editor (Photo...
Retype - Typography Photo Editor 1.0 Device: iOS Universal Category: Photography Price: $2.99, Version: 1.0 (iTunes) Description: Retype is built out of passion for great typography and it's all about adding text to photo with style... | Read more »
Hungry Shark Evolution Celebrates Shark...
Shark Week is almost here, as is Independence Day, so naturally Hungry Shark Evolution is going to get in on the action. Yes, even the fireworks. [Read more] | Read more »
The New Trivia Crack Will Feature a Musi...
It's official: iHeartMedia (you may know them from iHeartRadio) will be in charge of providing music-related questions for Trivia Crack's upcoming sequel. Also Trivia Crack is getting a sequel. [Read more] | Read more »
Toca Life: City (Education)
Toca Life: City 1.0 Device: iOS Universal Category: Education Price: $2.99, Version: 1.0 (iTunes) Description: Welcome to Toca Life: City, a metropolis filled with everyday fun! Customize characters, explore exciting locations and... | Read more »

Price Scanner via

OtterBox Releases New Symmetry Series Metalli...
Otterbox’s new Symmetry Series of smartphone cases flaunts the best of both both street style and street smarts with their new metallic finishes and trusted OtterBox protection for iPhone 6 and... Read more
MacBook Airs on sale for up to $75 off MSRP
Save up to $75 on the purchase of a new 2015 13″ or 11″ 1.6GHz MacBook Air at the following resellers. Shipping is free with each model: 11" 128GB MSRP $899 11" 256GB... Read more
Apple’s Education discount saves up to $300 o...
Purchase a new Mac or iPad at The Apple Store for Education and take up to $300 off MSRP. All teachers, students, and staff of any educational institution qualify for the discount. Shipping is free,... Read more
Save up to $600 with Apple refurbished Mac Pr...
The Apple Store has Apple Certified Refurbished Mac Pros available for up to $600 off the cost of new models. An Apple one-year warranty is included with each Mac Pro, and shipping is free. The... Read more
Mac Pros on sale for up to $260 off MSRP
B&H Photo has Mac Pros on sale for up to $260 off MSRP. Shipping is free, and B&H charges sales tax in NY only: - 3.7GHz 4-core Mac Pro: $2799, $200 off MSRP - 3.5GHz 6-core Mac Pro: $3719.99... Read more
Save up to $400 on 2014 15-inch Retina MacBoo...
B&H Photo has previous-generation 2014 15″ Retina MacBook Pros on sale for up to $400 off original MSRP. Shipping is free, and B&H charges NY sales tax only: - 15″ 2.2GHz Retina MacBook Pro... Read more
15-inch Retina MacBook Pros on sale for up to...
B&H Photo has new 2015 15″ Retina MacBook Pros on sale for up to $125 off MSRP including free shipping plus NY sales tax only: - 15″ 2.2GHz Retina MacBook Pro: $1899.99 $100 off - 15″ 2.5GHz... Read more
College Student Deals: Additional $100 off Ma...
Take an additional $100 off all MacBooks and iMacs at Best Buy Online with their College Students Deals Savings, valid through July 11, 2015. Anyone with a valid .EDU email address can take advantage... Read more
Apple refurbished Time Capsules available for...
The Apple Store has certified refurbished Time Capsules available for $100 off MSRP. Apple’s one-year warranty is included with each Time Capsule, and shipping is free: - 2TB Time Capsule: $199, $100... Read more
Newsweek Launches iPhone App
The venerable weekly news magazine Newsweek, owned by IBT Media, has announced the launch of its first iPhone app. The new app is available through Apple’s App Store and will allow consumers to read... Read more

Jobs Board

*Apple* TV Live Streaming Frameworks Test En...
**Job Summary** Work and contribute towards the engineering of Apple 's state-of-the-art products involving video, audio, and graphics in Interactive Media Group (IMG) at Read more
Project Manager, WW *Apple* Fulfillment Ope...
…a senior project manager / business analyst to work within our Worldwide Apple Fulfillment Operations and the Business Process Re-engineering team. This role will work Read more
Senior Data Scientist, *Apple* Retail - Onl...
**Job Summary** Apple Retail - Online sells Apple products to customers around the world. In addition to selling Apple products with unique services such as iPad Read more
*Apple* Music Producer - Apple (United State...
**Job Summary** Apple Music seeks a Producer to help shepherd some of the most important content and editorial initiatives within the music app, with a particular focus Read more
Sr. Technical Services Consultant, *Apple*...
**Job Summary** Apple Professional Services (APS) has an opening for a senior technical position that contributes to Apple 's efforts for strategic and transactional Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.