TweetFollow Us on Twitter

Multi-tasking
Volume Number:2
Issue Number:4
Column Tag:Threaded Code: Mach 1

A Multi-tasking Forth System

By Jörg Langowsk, EMBL, C/O I.L.L., Grenoble, Cedex, France, MacTutor Editorial Board

"Subroutine threaded Forth - the Mach 1 system"

This time I am going to introduce to you a new Forth-83 system that I recently got to review. This system is implemented in a somewhat different way compared with MacForth, NEON or other Forths, so let me digress a little and make some comments about Forth implementations on 68000 based systems.

As most of this column's readers know, 'threaded code' means program code that is organized in a very hierarchical manner; a 'main program' or colon definition consists of a list of words that describe other lower-level definitions that in turn are lists of even lower level definitions and so on ... until one arrives at the definitions of the very lowest level that are directly executable machine code (See Figure 1 below).

How do Forth systems implement this structure? The definition of every Forth word, colon definition, variable, etc. contains one short piece of executable machine code at the very beginning. This is called the code field and determines in which way the information following it is handled. For example, in a colon definition in MacForth, the code field contains a TRAP #F (V2.0) or TRAP #E (V2.4) instruction that jumps to code that in turn:

-pushes the instruction pointer (A3) on the return stack,

-sets the instruction pointer to the address of the first word of the new definition to be executed,

-jumps to the NEXT routine which executes the tokens that follow the code field.

Mac Forth is an example of 'token threaded Forth' which means that the numbers following the code field are not valid addresses of other Forth words, but 'tokens' that are to be converted into addresses by some algorithm. In Mac Forth, positive token values are interpreted as offsets from a base pointer (A4), while negative token values are negative offsets from the top of application space (A5) into a token table, where the 32-bit address of the Forth word is found. The token table is used to handle programs that exceed the 32 K address space that can be handled through the first method of addressing.

The advantage of MacForth's and similar structures is that we need only 16 bits for each word compiled into a definition. The disadvantage is the limited address space (32K) for the simple A4-indexed addressing mode and the need for a token table to convert tokens into addresses that lie outside the 32K addressing region. Time is lost on each NEXT loop by having to test whether the token is negative or positive, then eventually by accessing the token table ( See Figure 2 ).

The Forth implementation that was used to create NEON does not need a token table since each word uses 32 bits, but more space is used in the compiled definitions that way.

Fig. 3 gives a summary of some possible addressing modes for Forth interpreters with their execution times (in system clocks, from the MC68000 manual). Some of these examples have been taken from a Dr. Dobb's Journal article by Joe Barnhart, 'Forth and the Motorola 68000', DDJ 83 (1983) 18-26.

MacForth does not conform to any of the 'pure' definitions here, but (cf. Fig. 2) is a mixture of a base-offset direct word and base-offset indirect long threaded code (with an additional branch required).

You see that in all the examples given the inner interpreter requires considerable overhead to 'set up' the forth words for execution.

Additional overhead is created by the execution of the small piece of machine code starting at the CFA. In MacForth this is a TRAP instruction (38 cycles) plus three lines of assembly code that the trap vector points to (32 cycles). Counting everything up, execution of one colon-defined Forth word requires 100 or 120 cycles under MacForth, depending whether or not the token table has to be accessed. In a system like NEON, the CFA does not contain a trap, but a JSR to a routine that sets up the inner interpreter, with a similar amount of time involved.

One solution around this problem is to code time-critical things in assembly language, so that the whole definition is made up of executable 68000 code, starting with the code field. I have given an example for this in MT V1#9 by defining macro words that compile inline code into a definition. The speedup was of the order of 50% for MacForth. Of course, if you write inline macros, you are not really creating threaded code, but linear code, and the size of the definition tends to get rather large.

Still, there is another way to create threaded, i.e. hierarchical, code that does away with the needs for an 'inner interpreter' loop completely, since the threaded code will be directly executable 68000 machine language. This sounds strange, but is actually very simple to achieve. The keyword is 'subroutine threading'. As you saw from the inline code example, we can do without a 'code field' by just starting to execute our definition at the very beginnning, provided it is all bona fide 68000 code. Now, all we have to do is to persuade the Forth system not to compile a list of addresses into a colon definition, but a list of JSR (addr). This way, we don't need the inner interpreter at all, Forth instructions are 68000 instructions; the subroutine return stack, in this case, is the normal 68000 stack, pointed to by A7 (Mach 1 uses still a different stack for loop indices, more about that later).

The speed increase over the previous examples is considerable, with only 18 cycles of overhead for the JSR instruction if the jump is taken within the segment, or 30 cycles for one more .JMP if a jump table has to be used to branch to a different segment. This is one-fifth to one-third of the overhead of any of the other implementations given on the next page in Fig. 3.

Actually, subroutine threading gives us a Forth compiler that creates native 68000 code, which won't even need a runtime Forth nucleus, but will nicely execute by itself.

Drawbacks of this strategy: A7 cannot be the Forth data stack pointer any longer, some other register has to be used. This does not really create a problem as long as one stays within Forth; the 68000 couldn't care less which stack pointer you use for passing arguments between routines. Toolbox routines, however, expect their arguments on the A7 stack. Their calling sequence, therefore, becomes a little bit more complicated - and more time consuming - in a subroutine-threaded Forth implementation, since all the arguments have to be transferred from the data stack to the A7 stack first, then any result has to be swapped back from the A7 stack to the data stack. Also, since toolbox calling should be transparent to the user, the Forth compiler should know about the parameters that all the toolbox traps expect and create the 'glue' code automatically.

Subroutine-threading, considering all the pros and cons, is certainly one of the more elegant ways to implement Forth. Given a good compiler, the user will see no difference to a 'classical' Forth, except that the resulting code runs faster by a factor of (approximately) two. The 68000, with its ability to use any address register as a stack pointer, is especially suited for this kind of approach.

Such were my thoughts at the end of last year, and I had started doing some experiments implementing my own Forth on the Mac, which was going to be subroutine threaded. That was when I received a letter which described a system that did all the things that I just mentioned, and some more. With the effect that I realised that perhaps in a year or two my own efforts would get me to where others had gone already. Too bad.

The letter came from a company called Palo Alto Shipping, and they had developed a system called Mach1, a subroutine threaded Forth-83, multi-tasking above all. After a long introduction, let us jump right in the middle of things with a description what Mach1 is and what it can do.

The Mach1 system

With my evaluation package I got one disk and a deceptively small handbook. Opening the latter revealed a phrase on page IV that I wish to see on more software products: All rights reserved. Nothing more, no legalese for which you need to take courses to understand, just a simple copyright notice as you would find it in any serious textbook. Being a strong adherent of Jerry Pournelle's policy concerning Silly Licensing Agreements, I found this part of the manual extremely sympathetic. Later, it is explicitly said that you may make as many backup copies as you like and use them - one at a time only - on any machine you like. Good.

The handbook gives a short introduction to Forth and refers beginners to Starting Forth for a good tutorial. The system is Forth-83, with some additions. Of course, these 'additions' are where the fun comes in, so here are the main features that make Mach1 comfortable:

-program loading from normal text files, no blocks.

-local variables in a local stack frame created by the LINK/UNLK instruction.

-a floating point package for access to the SANE library. Unfortunately, no fast 32-bit floating point (only Fortran has it so far).

-words that give easy access to the MacinTalk speech synthesizer, which comes with the system.

-vectored input/output (e.g., if you really want, you could use the system from a serial terminal).

-an assembler that recognizes standard Motorola syntax 68000 code,

-a debugger/disassembler/decompiler that can be used like Macsbug and recognizes almost the same commands, but at the same time decompiles Forth words.

-the possibility to save the system with all definitions that you added so that you can quit and resume work without having to reload everything. This is called 'snapshot' in MacForth and 'workspace' in Mach1.

-a very easy way to create stand-alone 'turnkey' applications which contain a stripped-down version of the Forth runtime system and are not too large. These applications may be sold without any additional licensing fees.

-true multi-tasking with an arbitrary number of tasks.

The toolbox is accessed through the word CALL which (on compile) looks up the necessary parameter setup for the trap, swaps them from the data stack (A6) to the subroutine stack (A7) and calls the trap. Three traps cannot be called, UnlodeSeg and LodeSeg - this is understandable because of the heavy use of the segment loader by Mach itself - and TECalText, which I wasn't able to figure out why.

For the creation and management of windows, controls, and menus, sufficient support is given through high-level words in the Mach1 system. There is a 'templates' menu with which you can define a window's characteristics in a dialog box, and the Forth commands to create that window will be automatically copied to the clipboard; same for menus and controls. Like in NEON, you may load from the clipboard.

Mach1 is packaged with the switcher and MDS Edit. You're supposed to set up the system in a way that you can switch between Edit and Mach1 for development. I found myself pretty soon trashing Edit and using the Editor desk accessory for writing programs, which overall is faster, though Edit does a better job of formatting.

Multi-tasking under Mach1

Before we get to the program example, let's take a quick look at the multi-tasking features of Mach1. You will soon see that multi-tasking is so essential in this system that the style of programming in Mach1 will be quite different from other Forths.

Under Mach1, you may define tasks. Each task has some private stack and user variable space assigned. Tasks are arranged in an endless chain, with each task containing a pointer to the next task in the chain at the base of its user variable area. Just below this pointer is a word that contains the status of the task, SLEEP or WAKE. This word actually is executable 68000 code; if the task is aSLEEP, the code looks like

4EF9 XXXX XXXX   JMP next task,

if it is aWAKE, it is

 4E40  ----  ----TRAP #0 .

So tasks that are dormant will just be skipped by a jump to the base address of the next task; active ones execute the trap instruction, which falls into a routine that restores the task's register file and continues where it quit the last time. Any task will execute until the word PAUSE is encountered, at which point the local registers will be saved and execution transferred to the next task.

PAUSE can be compiled into a task's code at strategic points, for example in a loop that is part of a lengthy calculation. Also some built-in words (mainly I/O, like EMIT) contain a PAUSE. The requirement for the task to call the scheduler, instead of the scheduler switching tasks automatically, is the only thing that distinguishes Mach1 from a full implementation of multi-tasking. So multi-user would be not practical, because if one user entered into a pauseless loop, the others would be cut off; but for single-user multi-tasking this system works very well.

I have written a background task that acts as a printer spooler for this month's example. One of the problems with Mach1 is multiple file handling; each task can only handle one open file at a time (from how I understood the manual). Therefore, multi-tasking comes in automatically if one wants to work with several open files (I've heard that this is done sometimes...).

Another drawback is that the file handling routines cannot accept a file name string as an external parameter; it is compiled into the definition that calls the file routine. Therefore, when a new spool file is opened, my program resorts to a kludge, storing a unique 4 digit decimal number into the appropriate position in the word's code (which I was able to find with the debugger).

The example below creates a task, spool_task, that executes spool in the background. This is an infinite loop that looks for the numbers of spool files contained in a list; if the list is non-empty (list.busy), it will open the spool file with that number and print it until it encounters a zero byte (end-of-file for a text file). The spool file is not deleted, so your disk might fill up rather quickly; reason is that there is no predefined word for deletion of files in Mach1, and I was too lazy to write one.

If you have Mach1, load this program, create a second active window, like the "Skylight" window in the example on page vi of the manual. Then say new_spool in the main window, which will create a new spool file, open it and redirect the output to both the file and the window. Click the second window and do the same there. Now you can create output in one of the windows, e.g. by ?FREE or WORDS, let the thing go and do some output in the other window. When you close_spool in any window, the background task will automatically start printing the spool file while you can continue doing (almost) whatever you like in one of the windows.

Sieve benchmark

The famous Erastothenes Sieve (see back issues of this magazine, e.g. V1#9) executes in 12.8 sec (in my hands). This is almost twice as fast as MacForth or NEON (both slightly above 20 sec) and only twice as slow as compiled C of average quality.

Assembler

The Assembler contained in Mach1 accepts standard Motorola syntax. Words defined under Mach1 may be used in the operand field of assembly statements, e.g. in a JSR. One important feature is the automatic creation of inline code; when you say MACH after defining a word, the compiler will insert the code comprising the word into other definitions, instead of compiling a JSR to the code's address. Simple words like DUP and SWAP are defined this way; when you disassemble a definition containing these words, you will find no JSRs, but code that does DUP and SWAP directly.

Summary

This review, by no means complete, should have given you a flavor of Mach1 and what can be done with it. I am impressed of the speed of the code created; an easy-to-use assembler and debugger and the MACH inline code feature help speeding up the system even more. For speed considerations, a fast 32-bit floating point package would be desirable.

The multi-tasking feature is so easy to use that one often splits up programs into tasks when one wouldn't have thought about doing so in other languages. Due to the intrinsically high speed (subroutine threading) the scheduler also doesn't create too much overhead, if one uses PAUSE with consideration.

The vocabularies that come with the system are rather small compared with other systems; nevertheless, full Forth-83 has been implemented and all the necessary things are there. With the MACH macro feature, many of the 'combined' words such as IC! are not needed anymore. The existing set of words is explained very thoroughly in the manual, going into depths of the assembly coding where necessary. Dictionary structures, memory maps, task structures all are explained in detail.

Access to the Macintosh toolbox routines is convenient, and the template facility to set up standard windows and controls comes in very handy. There is even a special kind of scroll bar that will automatically resize with the window.

Stand-alone application may be created very easily, spanning several code segments if necessary. The applications need no run time kernel or library to run, which is nice (in fact, I consider it almost essential). The right to sell Mach1-created applications is contained in the purchasing price of $49.95.

Still, there are some bugs in the system. The debugger will work only once with the debugging switch, using the switch a second time crashed the system (in my hands). Also, some instructions are not disassembled correctly. Those bugs (and probably others that I haven't found) are nothing more and nothing less than one would expect for a freshly released system and I expect them to be fixed soon (upgrade charge $5).

The file handling system is not as convenient as it could be. Multi-tasking makes up for many of the inconveniencies, but not for all of them. A reasonable set of open/close/position/delete routines that accept stack input should be included. Also, the TECaltext trap should be accessible through the normal CALL mechanism. Finally, a fast floating point package would make a real nice addition.

All in all, I can highly recommend Mach1 for both Forth beginners and developers. What intrigues me most is that with a rather small vocabulary one has a system at hand that makes near optimal use of the Macintosh's special features and is fast, too. One last recommendation to Palo Alto Shipping: 20 instead of 2 pages of index would improve the handbook dramatically.


( Mach 1 background printer spooler )
( © J. Langowski / MacTutor Feb. 1986 )

only mac also i/o also forth

: wkg_spool working" spool.####" ;
: cre_spool  create" spool.####" ;
: use_spool   using" spool.####" ;

  variable filecount
  variable printlist 400 vallot
  variable print.ptr
  190 user active.spool 

: file#  filecount  @ ;
: print# print.ptr  @ 4 - @ ;
: incr.file  file#  1+ filecount  ! ;
: incr.print print.ptr @ 4 + print.ptr ! ;
: decr.print print.ptr @ 4 - print.ptr ! ;
: reset.file  1 filecount  ! ;
: reset.print printlist print.ptr ! ;
: add.spool ( n - ) 
    printlist print.ptr @  
 do i 4 - @ i ! -4 +loop  incr.print 
    printlist ! ;
: list.busy print.ptr @ printlist <> ; 

11 constant ##offset 
( offset from beginning of file routine to file name )

: wkg_spool# ( n - ) <# # # # # #>  ( four digits )
  ['] wkg_spool ##offset + swap  cmove 
 ( change file name, append number )
    wkg_spool ;
: cre_spool# ( n - ) <# # # # # #> 
  ['] cre_spool ##offset + swap  cmove  cre_spool ;
: use_spool# ( n - ) <# # # # # #> 
  ['] use_spool ##offset + swap  cmove  use_spool ;

: new_spool 
 incr.file 
 file# cre_spool# file# wkg_spool#  
 file# active.spool !
 ." Your spool file is spool." file# . cr
 file output 12 emit ( form feed )
 ." SPOOL FILE # " file# . cr
 console file + output 
;
: close_spool console output close-file 
 active.spool @ add.spool ;

: print_file 0 
 begin dup virtual c@ dup while emit 1+ repeat 
 2drop ;
 
: print_spool 
    list.busy if
        print# decr.print use_spool#  disk 4 + w@
        not if print_file close-file  
 else 10 call sysbeep then 
    then ;
      
400 1000 background spool_task
spool_task build
hex
: spool activate 
    1cc0a mode2  comm2 output
    begin print_spool pause again
;   
decimal
reset.file reset.print
spool_task spool
 

Community Search:
MacTech Search:

Software Updates via MacUpdate

Shadow Blade: Reload guide - How to hack...
Shadow Blade: Reload is the kind of action-platformer that would have happily sucked up hours of your time on a console a few years back.Now, you can take it with you wherever you go, and its mobile conversion is not too shabby at all. To help you... | Read more »
Tomb of the Mask guide - How to increase...
Tomb of the Mask is a great endless arcade game from Happymagenta in which quick reflexes and a persistent attitude can go a long way toward earning a top score. Check out these tips to see if you can give yourself an edge on the leaderboards. [... | Read more »
Smooth Operator! (Games)
Smooth Operator! 1.0 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0 (iTunes) Description: Smooth Operator is a weird, weird two-player kissing game. Squeeze in for 2 player fun on a single iPad, creating awkward... | Read more »
Sinless: Remastered (Games)
Sinless: Remastered 1.0 Device: iOS Universal Category: Games Price: $1.99, Version: 1.0 (iTunes) Description: | Read more »
_PRISM Guide - How to solve those puzzle...
_PRISM is a rather delightful puzzle game that’s been tailor made for touch screens. While part of the fun is figuring things out as you go along, we thought we’d offer you a helping hand at getting in the right mindset. Don’t worry about messing... | Read more »
Fractal Space (Games)
Fractal Space 1.3.1 Device: iOS Universal Category: Games Price: $.99, Version: 1.3.1 (iTunes) Description: Live the memorable experience of Fractal Space, a unique first person adventure & puzzle game by Haze Games! Will you... | Read more »
Set off on an adventure through the Cand...
Like match three puzzlers? If so, Jelly Blast, the innovative iOS and Android game which launched last year, is worth a look. Jelly Blast sees you head off on an epic adventure through the Candy Kingdom with your friends Lily, Mr. Hare, and Mr.... | Read more »
Ellipsis - Touch. Explore. Survive. (...
Ellipsis - Touch. Explore. Survive. 1.0 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0 (iTunes) Description: | Read more »
Abzorb (Games)
Abzorb 1 Device: iOS Universal Category: Games Price: $2.99, Version: 1 (iTunes) Description: Abzorb is a tilt game about consuming orbs by getting as close as you can to them without touching. By tilting your mobile device, collect... | Read more »
A Short Tale (Games)
A Short Tale 1.0 Device: iOS Universal Category: Games Price: $3.99, Version: 1.0 (iTunes) Description: It’s been years since I lost Ben, so long yet it doesn’t seem long enough. I never thought I’d return here, to where everything... | Read more »

Price Scanner via MacPrices.net

cb Hardcase – Handmade and Premium Protective...
Baden-Baden, Germany based company cb innovations has introduced the new cb Hardcase for iPhone. Featuring fine Italian Premium leather that makes for a unique look and feel, the cb Hardcase... Read more
Sale! B&H Photo offers 12-inch Retina Mac...
B&H Photo has 12″ Retina MacBooks on sale for $300 off MSRP for a limited time. Shipping is free, and B&H charges NY tax only: - 12″ 1.1GHz Gray Retina MacBook: $999 $300 off MSRP - 12″ 1.... Read more
App Annie Reveals Future of the App Economy:...
App Annie, a San Francisco based mobile app data and insights platform, has launched its first comprehensive app economy forecast. This new offering will provide brands, agencies, investors and app... Read more
Apple restocks Certified Refurbished Mac mini...
Apple has restocked Certified Refurbished 2014 Mac minis, with models available starting at $419. Apple’s one-year warranty is included with each mini, and shipping is free: - 1.4GHz Mac mini: $419 $... Read more
What iPad Pro Still Needs To Make It Truly Pr...
I love my iPad Air 2. So much that I’m grudgingly willing to put up with its compromises and limitations as a production tool in order to take advantage of its virtues. However, since a computer for... Read more
21-inch 3.1GHz 4K on sale for $1399, $100 off...
B&H Photo has the 21″ 3.1GHz 4K iMac on sale $1399 for a limited time. Shipping is free, and B&H charges NY sales tax only. Their price is $100 off MSRP: - 21″ 3.1GHz 4K iMac (MK452LL/A): $... Read more
Apple price trackers, updated continuously
Scan our Apple Price Trackers for the latest information on sales, bundles, and availability on systems from Apple’s authorized internet/catalog resellers. We update the trackers continuously: - 15″... Read more
Save up to $240 with Apple Certified Refurbis...
Apple is now offering Certified Refurbished 12″ Retina MacBooks for up to $240 off the cost of new models. Apple will include a standard one-year warranty with each MacBook, and shipping is free. The... Read more
Apple refurbished 13-inch Retina MacBook Pros...
Apple has Certified Refurbished 13″ Retina MacBook Pros available for up to $270 off the cost of new models. An Apple one-year warranty is included with each model, and shipping is free: - 13″ 2.7GHz... Read more
Apple refurbished Time Capsules available for...
Apple has certified refurbished Time Capsules available for $120 off MSRP. Apple’s one-year warranty is included with each Time Capsule, and shipping is free: - 2TB Time Capsule: $179, $120 off - 3TB... Read more

Jobs Board

Lead Engineer *Apple* OSX & Hardware -...
Lead Engineer Apple OSX & Hardware **Job ID:** 3125919 **Full/Part\-Time:** Full\-time **Regular/Temporary:** Regular **Listed:** 2016\-02\-10 **Location:** Cary, Read more
*Apple* Mobile Master - Best Buy (United Sta...
Job Title Apple Mobile Master **Brand** Best Buy **Job Description** **What does a Best Buy Apple Mobile Master do?** At Best Buy, our mission is to leverage the Read more
Infrastructure Engineer - *Apple* /Mac - Rem...
…part of a team Requires proven problem solving skills Preferred Additional: Apple Certified System Administrator (ACSA) Apple Certified Technical Coordinator (ACTC) Read more
Lead Engineer - *Apple* OSX & Hardware...
Lead Engineer - Apple OSX & Hardware **Job ID:** 3125919 **Full/Part\-Time:** Full\-time **Regular/Temporary:** Regular **Listed:** 2016\-02\-10 **Location:** Cary, Read more
Simply Mac *Apple* Specialist- Service Repa...
Simply Mac is the largest premier retailer of Apple products in the nation. In order to support our growing customer base, we are currently looking for a driven Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.