TweetFollow Us on Twitter

Multi-tasking
Volume Number:2
Issue Number:4
Column Tag:Threaded Code: Mach 1

A Multi-tasking Forth System

By Jörg Langowsk, EMBL, C/O I.L.L., Grenoble, Cedex, France, MacTutor Editorial Board

"Subroutine threaded Forth - the Mach 1 system"

This time I am going to introduce to you a new Forth-83 system that I recently got to review. This system is implemented in a somewhat different way compared with MacForth, NEON or other Forths, so let me digress a little and make some comments about Forth implementations on 68000 based systems.

As most of this column's readers know, 'threaded code' means program code that is organized in a very hierarchical manner; a 'main program' or colon definition consists of a list of words that describe other lower-level definitions that in turn are lists of even lower level definitions and so on ... until one arrives at the definitions of the very lowest level that are directly executable machine code (See Figure 1 below).

How do Forth systems implement this structure? The definition of every Forth word, colon definition, variable, etc. contains one short piece of executable machine code at the very beginning. This is called the code field and determines in which way the information following it is handled. For example, in a colon definition in MacForth, the code field contains a TRAP #F (V2.0) or TRAP #E (V2.4) instruction that jumps to code that in turn:

-pushes the instruction pointer (A3) on the return stack,

-sets the instruction pointer to the address of the first word of the new definition to be executed,

-jumps to the NEXT routine which executes the tokens that follow the code field.

Mac Forth is an example of 'token threaded Forth' which means that the numbers following the code field are not valid addresses of other Forth words, but 'tokens' that are to be converted into addresses by some algorithm. In Mac Forth, positive token values are interpreted as offsets from a base pointer (A4), while negative token values are negative offsets from the top of application space (A5) into a token table, where the 32-bit address of the Forth word is found. The token table is used to handle programs that exceed the 32 K address space that can be handled through the first method of addressing.

The advantage of MacForth's and similar structures is that we need only 16 bits for each word compiled into a definition. The disadvantage is the limited address space (32K) for the simple A4-indexed addressing mode and the need for a token table to convert tokens into addresses that lie outside the 32K addressing region. Time is lost on each NEXT loop by having to test whether the token is negative or positive, then eventually by accessing the token table ( See Figure 2 ).

The Forth implementation that was used to create NEON does not need a token table since each word uses 32 bits, but more space is used in the compiled definitions that way.

Fig. 3 gives a summary of some possible addressing modes for Forth interpreters with their execution times (in system clocks, from the MC68000 manual). Some of these examples have been taken from a Dr. Dobb's Journal article by Joe Barnhart, 'Forth and the Motorola 68000', DDJ 83 (1983) 18-26.

MacForth does not conform to any of the 'pure' definitions here, but (cf. Fig. 2) is a mixture of a base-offset direct word and base-offset indirect long threaded code (with an additional branch required).

You see that in all the examples given the inner interpreter requires considerable overhead to 'set up' the forth words for execution.

Additional overhead is created by the execution of the small piece of machine code starting at the CFA. In MacForth this is a TRAP instruction (38 cycles) plus three lines of assembly code that the trap vector points to (32 cycles). Counting everything up, execution of one colon-defined Forth word requires 100 or 120 cycles under MacForth, depending whether or not the token table has to be accessed. In a system like NEON, the CFA does not contain a trap, but a JSR to a routine that sets up the inner interpreter, with a similar amount of time involved.

One solution around this problem is to code time-critical things in assembly language, so that the whole definition is made up of executable 68000 code, starting with the code field. I have given an example for this in MT V1#9 by defining macro words that compile inline code into a definition. The speedup was of the order of 50% for MacForth. Of course, if you write inline macros, you are not really creating threaded code, but linear code, and the size of the definition tends to get rather large.

Still, there is another way to create threaded, i.e. hierarchical, code that does away with the needs for an 'inner interpreter' loop completely, since the threaded code will be directly executable 68000 machine language. This sounds strange, but is actually very simple to achieve. The keyword is 'subroutine threading'. As you saw from the inline code example, we can do without a 'code field' by just starting to execute our definition at the very beginnning, provided it is all bona fide 68000 code. Now, all we have to do is to persuade the Forth system not to compile a list of addresses into a colon definition, but a list of JSR (addr). This way, we don't need the inner interpreter at all, Forth instructions are 68000 instructions; the subroutine return stack, in this case, is the normal 68000 stack, pointed to by A7 (Mach 1 uses still a different stack for loop indices, more about that later).

The speed increase over the previous examples is considerable, with only 18 cycles of overhead for the JSR instruction if the jump is taken within the segment, or 30 cycles for one more .JMP if a jump table has to be used to branch to a different segment. This is one-fifth to one-third of the overhead of any of the other implementations given on the next page in Fig. 3.

Actually, subroutine threading gives us a Forth compiler that creates native 68000 code, which won't even need a runtime Forth nucleus, but will nicely execute by itself.

Drawbacks of this strategy: A7 cannot be the Forth data stack pointer any longer, some other register has to be used. This does not really create a problem as long as one stays within Forth; the 68000 couldn't care less which stack pointer you use for passing arguments between routines. Toolbox routines, however, expect their arguments on the A7 stack. Their calling sequence, therefore, becomes a little bit more complicated - and more time consuming - in a subroutine-threaded Forth implementation, since all the arguments have to be transferred from the data stack to the A7 stack first, then any result has to be swapped back from the A7 stack to the data stack. Also, since toolbox calling should be transparent to the user, the Forth compiler should know about the parameters that all the toolbox traps expect and create the 'glue' code automatically.

Subroutine-threading, considering all the pros and cons, is certainly one of the more elegant ways to implement Forth. Given a good compiler, the user will see no difference to a 'classical' Forth, except that the resulting code runs faster by a factor of (approximately) two. The 68000, with its ability to use any address register as a stack pointer, is especially suited for this kind of approach.

Such were my thoughts at the end of last year, and I had started doing some experiments implementing my own Forth on the Mac, which was going to be subroutine threaded. That was when I received a letter which described a system that did all the things that I just mentioned, and some more. With the effect that I realised that perhaps in a year or two my own efforts would get me to where others had gone already. Too bad.

The letter came from a company called Palo Alto Shipping, and they had developed a system called Mach1, a subroutine threaded Forth-83, multi-tasking above all. After a long introduction, let us jump right in the middle of things with a description what Mach1 is and what it can do.

The Mach1 system

With my evaluation package I got one disk and a deceptively small handbook. Opening the latter revealed a phrase on page IV that I wish to see on more software products: All rights reserved. Nothing more, no legalese for which you need to take courses to understand, just a simple copyright notice as you would find it in any serious textbook. Being a strong adherent of Jerry Pournelle's policy concerning Silly Licensing Agreements, I found this part of the manual extremely sympathetic. Later, it is explicitly said that you may make as many backup copies as you like and use them - one at a time only - on any machine you like. Good.

The handbook gives a short introduction to Forth and refers beginners to Starting Forth for a good tutorial. The system is Forth-83, with some additions. Of course, these 'additions' are where the fun comes in, so here are the main features that make Mach1 comfortable:

-program loading from normal text files, no blocks.

-local variables in a local stack frame created by the LINK/UNLK instruction.

-a floating point package for access to the SANE library. Unfortunately, no fast 32-bit floating point (only Fortran has it so far).

-words that give easy access to the MacinTalk speech synthesizer, which comes with the system.

-vectored input/output (e.g., if you really want, you could use the system from a serial terminal).

-an assembler that recognizes standard Motorola syntax 68000 code,

-a debugger/disassembler/decompiler that can be used like Macsbug and recognizes almost the same commands, but at the same time decompiles Forth words.

-the possibility to save the system with all definitions that you added so that you can quit and resume work without having to reload everything. This is called 'snapshot' in MacForth and 'workspace' in Mach1.

-a very easy way to create stand-alone 'turnkey' applications which contain a stripped-down version of the Forth runtime system and are not too large. These applications may be sold without any additional licensing fees.

-true multi-tasking with an arbitrary number of tasks.

The toolbox is accessed through the word CALL which (on compile) looks up the necessary parameter setup for the trap, swaps them from the data stack (A6) to the subroutine stack (A7) and calls the trap. Three traps cannot be called, UnlodeSeg and LodeSeg - this is understandable because of the heavy use of the segment loader by Mach itself - and TECalText, which I wasn't able to figure out why.

For the creation and management of windows, controls, and menus, sufficient support is given through high-level words in the Mach1 system. There is a 'templates' menu with which you can define a window's characteristics in a dialog box, and the Forth commands to create that window will be automatically copied to the clipboard; same for menus and controls. Like in NEON, you may load from the clipboard.

Mach1 is packaged with the switcher and MDS Edit. You're supposed to set up the system in a way that you can switch between Edit and Mach1 for development. I found myself pretty soon trashing Edit and using the Editor desk accessory for writing programs, which overall is faster, though Edit does a better job of formatting.

Multi-tasking under Mach1

Before we get to the program example, let's take a quick look at the multi-tasking features of Mach1. You will soon see that multi-tasking is so essential in this system that the style of programming in Mach1 will be quite different from other Forths.

Under Mach1, you may define tasks. Each task has some private stack and user variable space assigned. Tasks are arranged in an endless chain, with each task containing a pointer to the next task in the chain at the base of its user variable area. Just below this pointer is a word that contains the status of the task, SLEEP or WAKE. This word actually is executable 68000 code; if the task is aSLEEP, the code looks like

4EF9 XXXX XXXX   JMP next task,

if it is aWAKE, it is

 4E40  ----  ----TRAP #0 .

So tasks that are dormant will just be skipped by a jump to the base address of the next task; active ones execute the trap instruction, which falls into a routine that restores the task's register file and continues where it quit the last time. Any task will execute until the word PAUSE is encountered, at which point the local registers will be saved and execution transferred to the next task.

PAUSE can be compiled into a task's code at strategic points, for example in a loop that is part of a lengthy calculation. Also some built-in words (mainly I/O, like EMIT) contain a PAUSE. The requirement for the task to call the scheduler, instead of the scheduler switching tasks automatically, is the only thing that distinguishes Mach1 from a full implementation of multi-tasking. So multi-user would be not practical, because if one user entered into a pauseless loop, the others would be cut off; but for single-user multi-tasking this system works very well.

I have written a background task that acts as a printer spooler for this month's example. One of the problems with Mach1 is multiple file handling; each task can only handle one open file at a time (from how I understood the manual). Therefore, multi-tasking comes in automatically if one wants to work with several open files (I've heard that this is done sometimes...).

Another drawback is that the file handling routines cannot accept a file name string as an external parameter; it is compiled into the definition that calls the file routine. Therefore, when a new spool file is opened, my program resorts to a kludge, storing a unique 4 digit decimal number into the appropriate position in the word's code (which I was able to find with the debugger).

The example below creates a task, spool_task, that executes spool in the background. This is an infinite loop that looks for the numbers of spool files contained in a list; if the list is non-empty (list.busy), it will open the spool file with that number and print it until it encounters a zero byte (end-of-file for a text file). The spool file is not deleted, so your disk might fill up rather quickly; reason is that there is no predefined word for deletion of files in Mach1, and I was too lazy to write one.

If you have Mach1, load this program, create a second active window, like the "Skylight" window in the example on page vi of the manual. Then say new_spool in the main window, which will create a new spool file, open it and redirect the output to both the file and the window. Click the second window and do the same there. Now you can create output in one of the windows, e.g. by ?FREE or WORDS, let the thing go and do some output in the other window. When you close_spool in any window, the background task will automatically start printing the spool file while you can continue doing (almost) whatever you like in one of the windows.

Sieve benchmark

The famous Erastothenes Sieve (see back issues of this magazine, e.g. V1#9) executes in 12.8 sec (in my hands). This is almost twice as fast as MacForth or NEON (both slightly above 20 sec) and only twice as slow as compiled C of average quality.

Assembler

The Assembler contained in Mach1 accepts standard Motorola syntax. Words defined under Mach1 may be used in the operand field of assembly statements, e.g. in a JSR. One important feature is the automatic creation of inline code; when you say MACH after defining a word, the compiler will insert the code comprising the word into other definitions, instead of compiling a JSR to the code's address. Simple words like DUP and SWAP are defined this way; when you disassemble a definition containing these words, you will find no JSRs, but code that does DUP and SWAP directly.

Summary

This review, by no means complete, should have given you a flavor of Mach1 and what can be done with it. I am impressed of the speed of the code created; an easy-to-use assembler and debugger and the MACH inline code feature help speeding up the system even more. For speed considerations, a fast 32-bit floating point package would be desirable.

The multi-tasking feature is so easy to use that one often splits up programs into tasks when one wouldn't have thought about doing so in other languages. Due to the intrinsically high speed (subroutine threading) the scheduler also doesn't create too much overhead, if one uses PAUSE with consideration.

The vocabularies that come with the system are rather small compared with other systems; nevertheless, full Forth-83 has been implemented and all the necessary things are there. With the MACH macro feature, many of the 'combined' words such as IC! are not needed anymore. The existing set of words is explained very thoroughly in the manual, going into depths of the assembly coding where necessary. Dictionary structures, memory maps, task structures all are explained in detail.

Access to the Macintosh toolbox routines is convenient, and the template facility to set up standard windows and controls comes in very handy. There is even a special kind of scroll bar that will automatically resize with the window.

Stand-alone application may be created very easily, spanning several code segments if necessary. The applications need no run time kernel or library to run, which is nice (in fact, I consider it almost essential). The right to sell Mach1-created applications is contained in the purchasing price of $49.95.

Still, there are some bugs in the system. The debugger will work only once with the debugging switch, using the switch a second time crashed the system (in my hands). Also, some instructions are not disassembled correctly. Those bugs (and probably others that I haven't found) are nothing more and nothing less than one would expect for a freshly released system and I expect them to be fixed soon (upgrade charge $5).

The file handling system is not as convenient as it could be. Multi-tasking makes up for many of the inconveniencies, but not for all of them. A reasonable set of open/close/position/delete routines that accept stack input should be included. Also, the TECaltext trap should be accessible through the normal CALL mechanism. Finally, a fast floating point package would make a real nice addition.

All in all, I can highly recommend Mach1 for both Forth beginners and developers. What intrigues me most is that with a rather small vocabulary one has a system at hand that makes near optimal use of the Macintosh's special features and is fast, too. One last recommendation to Palo Alto Shipping: 20 instead of 2 pages of index would improve the handbook dramatically.


( Mach 1 background printer spooler )
( © J. Langowski / MacTutor Feb. 1986 )

only mac also i/o also forth

: wkg_spool working" spool.####" ;
: cre_spool  create" spool.####" ;
: use_spool   using" spool.####" ;

  variable filecount
  variable printlist 400 vallot
  variable print.ptr
  190 user active.spool 

: file#  filecount  @ ;
: print# print.ptr  @ 4 - @ ;
: incr.file  file#  1+ filecount  ! ;
: incr.print print.ptr @ 4 + print.ptr ! ;
: decr.print print.ptr @ 4 - print.ptr ! ;
: reset.file  1 filecount  ! ;
: reset.print printlist print.ptr ! ;
: add.spool ( n - ) 
    printlist print.ptr @  
 do i 4 - @ i ! -4 +loop  incr.print 
    printlist ! ;
: list.busy print.ptr @ printlist <> ; 

11 constant ##offset 
( offset from beginning of file routine to file name )

: wkg_spool# ( n - ) <# # # # # #>  ( four digits )
  ['] wkg_spool ##offset + swap  cmove 
 ( change file name, append number )
    wkg_spool ;
: cre_spool# ( n - ) <# # # # # #> 
  ['] cre_spool ##offset + swap  cmove  cre_spool ;
: use_spool# ( n - ) <# # # # # #> 
  ['] use_spool ##offset + swap  cmove  use_spool ;

: new_spool 
 incr.file 
 file# cre_spool# file# wkg_spool#  
 file# active.spool !
 ." Your spool file is spool." file# . cr
 file output 12 emit ( form feed )
 ." SPOOL FILE # " file# . cr
 console file + output 
;
: close_spool console output close-file 
 active.spool @ add.spool ;

: print_file 0 
 begin dup virtual c@ dup while emit 1+ repeat 
 2drop ;
 
: print_spool 
    list.busy if
        print# decr.print use_spool#  disk 4 + w@
        not if print_file close-file  
 else 10 call sysbeep then 
    then ;
      
400 1000 background spool_task
spool_task build
hex
: spool activate 
    1cc0a mode2  comm2 output
    begin print_spool pause again
;   
decimal
reset.file reset.print
spool_task spool
 

Community Search:
MacTech Search:

Software Updates via MacUpdate

Firefox 37.0 - Fast, safe Web browser. (...
Firefox offers a fast, safe Web browsing experience. Browse quickly, securely, and effortlessly. With its industry-leading features, Firefox is the choice of Web development professionals and casual... Read more
Arq 4.11 - Online backup to Google Drive...
Arq is super-easy online backup for the Mac. Back up to your own Google Drive storage (15GB free storage), your own Amazon Glacier ($.01/GB per month storage) or S3, or any SFTP server. Arq backs up... Read more
MacFamilyTree 7.3.4 - Create and explore...
MacFamilyTree gives genealogy a facelift: it's modern, interactive, incredibly fast, and easy to use. We're convinced that generations of chroniclers would have loved to trade in their genealogy... Read more
Yummy FTP 1.10.2 - FTP/SFTP/FTPS client...
Yummy FTP is an FTP + SFTP + FTPS file transfer client which focuses on speed, reliability and productivity. Whether you need to transfer a few files or a few thousand, schedule automatic backups, or... Read more
VueScan 9.5.08 - Scanner software with a...
VueScan is a scanning program that works with most high-quality flatbed and film scanners to produce scans that have excellent color fidelity and color balance. VueScan is easy to use, and has... Read more
Iridient Developer 3.0.1 - Powerful imag...
Iridient Developer (was RAW Developer) is a powerful image conversion application designed specifically for OS X. Iridient Developer gives advanced photographers total control over every aspect of... Read more
Air Video Server HD 2.1.0 - Stream video...
Air Video Server HD streams videos instantly from your computer on your iPhone, iPad, iPod touch or Apple TV. No need to worry about converting or transferring files. We took everything that was... Read more
Duplicate Annihilator 5.7.5 - Find and d...
Duplicate Annihilator takes on the time-consuming task of comparing the images in your iPhoto library using effective algorithms to make sure that no duplicate escapes. Duplicate Annihilator... Read more
BusyContacts 1.0.2 - Fast, efficient con...
BusyContacts is a contact manager for OS X that makes creating, finding, and managing contacts faster and more efficient. It brings to contact management the same power, flexibility, and sharing... Read more
Capture One Pro 8.2.0.82 - RAW workflow...
Capture One Pro 8 is a professional RAW converter offering you ultimate image quality with accurate colors and incredible detail from more than 300 high-end cameras -- straight out of the box. It... Read more

2K Announces WWE 2K, Mobile's First...
It seems like this month has been pretty big for wrestling. First Wrestlemania, then 2K has announces that they're releasing  WWE 2K for iOS. It's a simulation-based WWE game where you'll get to play with several WWE superstars such as John Cena, ... | Read more »
How the Apple Watch Could Change the Fac...
The Apple Watch is still a ways out, but my previous musings on the wearable’s various features got me thinking: what might it be like a year after launch? Two years? Five years? What if it becomes a symbiotic part of the iOS framework to the point... | Read more »
Pie In The Sky: A Pizza Odyssey (Games)
Pie In The Sky: A Pizza Odyssey 1.0 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0 (iTunes) Description: A game about delivering pizza. In space. | Read more »
Chosen Gives Hopeful Singers, Songwriter...
If YouTube videos and reality TV shows like The Voice have taught us one thing, it’s that there are a lot of people out there who are anxious to show the world their talents. And if they’ve taught us a second thing, it’s that there’s an almost... | Read more »
Android's Popular OfficeSuite Now A...
Once only available for Android devices, OfficeSuite has finally landed on the app store. The Mobile Systems app lets you view, edit, create, and share Word, Excel, and PowerPoint documents as well as convert them to/from PDFs. It's touted as being... | Read more »
Warhammer: Arcane Magic is Coming Soon,...
Turbo Tape Games has announced that they're joining forces with Games Workshop to bring the turn-based strategy board game, Warhammer: Arcane Magic, to life on the iOS. | Read more »
Fast & Furious: Legacy's Creati...
| Read more »
N-Fusion and 505's Ember is Totally...
| Read more »
These are All the Apple Watch Apps and G...
The Apple Watch is less than a month from hitting store shelves, and once you get your hands on it you're probably going to want some apps and games to install. Fear not! We've compiled a list of all the Apple Watch apps and games we've been able to... | Read more »
Appy to Have Known You - Lee Hamlet Look...
Being at 148Apps these past 2 years has been an awesome experience that has taught me a great deal, and working with such a great team has been a privilege. Thank you to Rob Rich, and to both Rob LeFebvre and Jeff Scott before him, for helping me... | Read more »

Price Scanner via MacPrices.net

Adobe Brings Powerful Layout-Design Capabilit...
Adobe today announced the availability of Adobe Comp CC, a free iPad app that enables rapid creation of layout concepts for mobile, Web and print projects. With Comp CC, designers can rough out and... Read more
Apple offering refurbished 27-inch 5K iMacs f...
The Apple Store is offering Apple Certified Refurbished 27″ 3.5GHz 5K iMacs for $2119 including free shipping. Their price is $380 off the price of new models, and it’s the lowest price available for... Read more
16GB iPad mini on sale for $199, save $50
Walmart has 16GB iPad minis (1st generation) available for $199.99 on their online store, including free shipping. Their price is $50 off MSRP. Online orders only. Read more
New 128GB MacBook Airs on sale for $50 off MS...
 B&H Photo has 128GB 11″ and 13″ 2015 MacBook Airs on sale today for $50 off MSRP including free shipping plus NY sales tax only: - 11″ 1.6GHz/128GB MacBook Air (Model #MJVM2LL/A): $849 $50 off... Read more
13-inch 2.6GHz Retina MacBook Pro (refurbishe...
The Apple Store has Apple Certified Refurbished 13″ 2.6GHz/128GB Retina MacBook Pros available for $979 including free shipping. Original MSRP for this model was $1299. Read more
Save up to $600 with Apple refurbished Mac Pr...
The Apple Store is offering Apple Certified Refurbished Mac Pros for up to $600 off the cost of new models. An Apple one-year warranty is included with each Mac Pro, and shipping is free. The... Read more
Samsung Galaxy S 6 and Galaxy S 6 edge U.S. P...
Samsung Electronics America, Inc. has announced the Galaxy S 6 and Galaxy S 6 edge will be available in the U.S. beginning April 10, with pre-orders being accepted now. “We have completely reimagined... Read more
13-inch 2.5GHz MacBook Pro (refurbished) avai...
The Apple Store has Apple Certified Refurbished 13″ 2.5GHz MacBook Pros available for $829, or $270 off the cost of new models. Apple’s one-year warranty is standard, and shipping is free: - 13″ 2.... Read more
Save up to $80 on iPad Air 2s, NY tax only, f...
 B&H Photo has iPad Air 2s on sale for $80 off MSRP including free shipping plus NY sales tax only: - 16GB iPad Air 2 WiFi: $469.99 $30 off - 64GB iPad Air 2 WiFi: $549.99 $50 off - 128GB iPad... Read more
iMacs on sale for up to $205 off MSRP
B&H Photo has 21″ and 27″ iMacs on sale for up to $205 off MSRP including free shipping plus NY sales tax only: - 21″ 1.4GHz iMac: $1019 $80 off - 21″ 2.7GHz iMac: $1189 $110 off - 21″ 2.9GHz... Read more

Jobs Board

DevOps Software Engineer - *Apple* Pay, iOS...
**Job Summary** Imagine what you could do here. At Apple , great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring Read more
*Apple* Retail - Multiple Positions (US) - A...
Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, you're also the Read more
Sr. Technical Services Consultant, *Apple*...
**Job Summary** Apple Professional Services (APS) has an opening for a senior technical position that contributes to Apple 's efforts for strategic and transactional Read more
Lead *Apple* Solutions Consultant - Retail...
**Job Summary** Job Summary The Lead ASC is an Apple employee who serves as the Apple business manager and influencer in a hyper-business critical Reseller's store Read more
*Apple* Pay - Site Reliability Engineer - Ap...
**Job Summary** Imagine what you could do here. At Apple , great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.