Column Tag: Threaded Code
Adding Record Structures to Forth
By Jörg Langowski, EMBL, c/o I.L.L., Grenoble, Cedex, France, MacTutor Editorial Board
Records with local field names
Data representation is an area that many Forth dialects neglect. Basic Forth-83 does not even provide simple one- and two-dimensional matrices, nor does it support more complex types of data such as Pascal records or C structs. These latter forms of data representation play an important role in Toolbox programming, since many traps expect pointers to records as parameters.
This month's column was prompted by a letter I received through BITNET from a reader who was wondering how to handle such data structures in Forth:
"I posted the following article to the USENET, but got little in the way of a response. Any help you can give will be much appreciated. By the way, I know that the rectangle definitions given below are inaccurate for the Mac, but I was trying to be machine independent in posting to the Forth language newsgroup.
From postnews Thu Jun 12 15:23:34 1986
Subject: Defining a structure in FORTH?
I am very much a novice FORTH programmer, and I don't even have a good textbook to go by. I recently purchased a FORTH for my Macintosh at home (MACH1, distributed by the Palo Alto Shipping Co.), and would like some advice. Professionally I do a lot of work with LISP, and I would like to implement something similar to a `DEFSTRUCT' package in FORTH. In other words, I'd like to be able to do something like:
RIGHT 2 ]ENDSTRUCT
Which would automatically define the following:
8 CONSTANT RECTANGLE-SIZE
: RECTANGLE-TOP@ ( a - n ) @ ;
: RECTANGLE-TOP! ( n a - ) ! ;
: RECTANGLE-LEFT@ ( a - n ) 2 + @ ;
: RECTANGLE-LEFT! ( n a - ) 2 + ! ;
: RECTANGLE-BOTTOM@ ( a - n ) 4 + @ ;
: RECTANGLE-BOTTOM! ( n a - ) 4 + ! ;
: RECTANGLE-RIGHT@ ( a - n ) 6 + @ ;
: RECTANGLE-RIGHT! ( n a - ) 6 + ! ;
: MAKE-RECTANGLE ( whatever code
necessary to allocate 8 bytes of variable storage and assign a dictionary
entry to the word which follows. This I guess would be implementation
specific. ) ;
While I'm sure that this could be done by defining 'DEFSTRUCT[' so that it constructs all of the necessary dictionary headers etc. at the bit and byte level, this would doubtless be complicated and not very portable. I wonder then, if there is a higher level method of defining such a beast? Any help (even "no that can't be done") would be appreciated."
--Bruce Florman florman@rand-unix.ARPA
Since I think the question put forward by Bruce Florman is of general interest to Macintosh Forth programmers, I'll try to show how such data structures may be implemented in MacForth or Mach2.
Structures in MacForth (CSI method)
MacForth (Kernel 2.4) provides a simple and effective way to implement structure definitions. A structure definition is a way to assemble information about a data structure (the lengths of the various fields and the total length of the structure). Example:
20 string: ^description
defines the data structure testrec with four fields, date, time, flag, and description. testrec is not a defining word. When executed, it merely leaves on the stack the length of the structure that is going to be defined; this number can then be used to allot an appropriate number of bytes in the dictionary. So, creation of a testrec would be done like:
create myrec testrec allot
The words that are used to access the fields, ^date, ^time, ^flag, and ^description, simply add an offset to the number on top of the stack. If this number is the address of a valid structure, like myrec,
myrec ^flag
would indeed yield the address of the flag field in myrec. [Note that the circumflex in front of the field names is purely a MacForth convention; you could name the fields as you like.]
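The offset-adding idea can be sketched in standard Forth roughly as follows. This is not CSI's implementation: offset-field is a hypothetical defining word, and the sizes of the date, time, and flag fields are assumptions chosen for illustration.

```forth
\ Sketch in standard Forth; offset-field is a hypothetical name,
\ not a MacForth word, and the field sizes below are assumed.
: offset-field ( offset size "name" -- offset' )
  create over , +          \ remember this field's offset, advance by its size
  does> ( addr -- addr' )
  @ + ;                    \ run time: add the stored offset to the address

0                          \ running offset starts at zero
  4 offset-field ^date
  4 offset-field ^time
  2 offset-field ^flag
 20 offset-field ^description
constant testrec           \ total length of the structure

create myrec testrec allot \ one structure in the dictionary
myrec ^flag                \ leaves the address of the flag field
drop
```

Note that ^date, ^time, and so on end up as ordinary dictionary words here, which is exactly the global-name property discussed next.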
This solution is beautifully simple and greatly improves the readability of your program text if you are working with lots of structured data types. There is one drawback, however: the field name definitions are global to the program and therefore violate the conventional definition of a Pascal record, in which field names are always local to the structure.
This means you have to exercise a lot of discipline when you work with structures defined in this way. When a field operator executes, nothing checks whether the address on top of the stack is really the address of a structure, so bugs that leave unexpected values on the stack are harder to detect. Furthermore, since all the field names are global, the same name cannot be used in several different structure definitions in different contexts.
Therefore, I'd like to present an alternative to CSI's implementation of structures which uses local field names. This is slower during compilation, since every structure definition will have its own local dictionary that has to be searched, but in most cases has the same speed during execution. It offers the additional advantage that by a very simple modification, a rudimentary NEON-like class behavior may be built in.
Record definition with local field names
From now on, we'll call the type of data structure dealt with a record, to emphasize the similarity with Pascal records. A record definition will be a template from which an arbitrary number of instances of this record can be built (note that this already strongly resembles NEON's terminology). Each instance will consist of a reference to its template and the data fields as defined in the template (Fig. 1).
A record definition (Listing 1) then consists of:
- the word :record, which sets up a defining word for the instances and initializes the stack for the field name definitions following;
- field name definitions (>long, >word, etc.), which add names to the record template and store (after the name) the length of the data field and its position within the record;
- ;record, which closes the definition, stores a 16-bit zero and the total length of the record at the end of the template, and checks for completeness of the definition.
An example definition is given at the end of Listing 1.
Run-time behavior of records
The run-time behavior of a record template defined through :record is given by the word do.record. This word scans the list of field names in the record template and creates a new instance of the record with a pointer to the template in its first four bytes and space for the data fields following it.
The run-time behavior of the record instance is just to place its base address (the pointer to the template) on the stack. Access to the record fields is provided through ^field, which expects an address of a record instance and a string address on the stack. ^field will search the record template for the field name and leave the (absolute) field address on the stack or abort with an error message if the string does not match any field name in that particular record.
The operator ^ is provided for readability; executing
r1 ^ date
will give the same result as executing
r1 " date" ^field.
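To make the template and instance mechanism concrete, here is a minimal sketch in standard Forth (it relies on PARSE-NAME, COMPARE and 2R@). It is not the Mach2 code of Listing 1: all word names are hypothetical, fields are measured in cells rather than bytes, and the template stores a cell-sized offset and size after each name.

```forth
\ Sketch only; names and layout are assumptions, not Listing 1.
: fname, ( c-addr u -- )         \ append a counted name to the template
  dup c,  0 ?do dup i + c@ c, loop  drop  align ;

: skip-name ( entry -- a )       \ step over the name to the offset cell
  count + aligned ;

: rfield ( offset size "name" -- offset' )
  parse-name fname,              \ field name, local to this template
  over , dup ,                   \ its offset, then its size
  + ;

: new-instance ( template "name" -- )
  create dup ,                   \ first cell points back to the template
  begin dup c@ while             \ walk the entries to the zero terminator
    skip-name 2 cells +
  repeat
  char+ aligned @ allot ;        \ reserve the data area

: record: ( "name" -- offset )
  create 1 cells                 \ data starts after the template pointer
  does> new-instance ;

: end-record ( offset -- )
  0 c, align  1 cells - , ;      \ terminator, then length of the data area

: field-addr ( inst c-addr u -- addr )
  2>r dup @                      \ inst entry
  begin dup c@ while
    dup count 2r@ compare 0= if  \ name matches
      skip-name @ +  2r> 2drop exit
    then
    skip-name 2 cells +          \ try the next entry
  repeat
  2r> 2drop 2drop
  true abort" no such field in this record" ;

: ^ ( inst "name" -- addr )  parse-name field-addr ;

record: point
  1 cells rfield x
  1 cells rfield y
end-record

point p1                         \ p1 is an instance of point
7 p1 ^ y !
p1 ^ y @ .                       \ prints 7
```

The names x and y never enter the Forth dictionary; they exist only inside the point template, so another record could reuse them freely.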
So far, we have only talked about the execution-time behavior of records. However, most of the time one would want to compile references to record fields into Forth definitions rather than execute them directly. For inclusion into Forth definitions, one way is to write
: test1 [ r1 ^ date ] literal ....... ;
which compiles the address of the date field of r1 into the definition as a literal. If a run-time reference to an arbitrary record is to be made, one can either write
: test2 ( record addr -- addr of date field )
" date" ^field ;
which also checks at runtime for the validity of the date reference (something like 'late binding'), or, for faster execution, one writes
: test3 ( record addr -- addr of date field )
[ r1 dup ^ date swap - ] literal + ;
which assumes that the record address passed at run time refers to a record of the same type as r1. But in that case, CSI's structure definition is, of course, equivalent and easier to read.
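The difference between the late-bound and the precomputed variant can be sketched in standard Forth as follows. The word find-date-offset is a hypothetical stand-in for a field-name search, and the record layout is assumed; only the use of [ ... ] literal is the point here.

```forth
\ Sketch only: find-date-offset stands in for a (slow) name search
\ and simply returns an assumed offset of one cell.
: find-date-offset ( -- n )  1 cells ;

create r1 3 cells allot          \ assumed layout: template pointer + 2 cells

: date-late  ( rec -- addr )     \ looks the offset up on every call
  find-date-offset + ;

: date-early ( rec -- addr )     \ offset computed once, at compile time
  [ find-date-offset ] literal + ;

r1 date-late  r1 date-early  = . \ prints -1 (true): same address either way
```

The early-bound version pays for the search only while the definition is compiled, at the price of trusting that every record passed to it has the same layout.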
From record to class definitions - using record fields as vectors
A simple, again very rudimentary, implementation of a NEON-class-like structure can be obtained using the record definition given here. If the data contained in a >long field (let's say with the field name print) is the cfa of a Forth word, writing
r1 ^ print @ execute (Mach2) or
r1 ^ print @ make.token execute
will execute the word that the print field of r1 points to. (In MacForth, one might also reserve a >word field and store a token there, then say r1 ^ print @ execute).
Vectors within records are very similar to methods associated with objects. Of course, method inheritance from superclasses has not been implemented here, so the resemblance to 'real' object oriented languages is not very strong.
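As an illustration of the vector idea, here is a small sketch in standard Forth. The one-cell "records" and all word names are hypothetical, and the standard execution token obtained with ' is used in place of a Mach2 cfa or a MacForth token.

```forth
\ Sketch only; layout and names are assumptions.
: print-circle ( -- ) ." circle" cr ;
: print-square ( -- ) ." square" cr ;

create shape1 1 cells allot      \ a one-field "record": the print vector
create shape2 1 cells allot

' print-circle shape1 !          \ store execution tokens in the print fields
' print-square shape2 !

: send-print ( rec -- )  @ execute ;   \ fetch the vector and run it

shape1 send-print                \ prints circle
shape2 send-print                \ prints square
```

The caller of send-print does not need to know which word a given record carries, which is the method-like behavior described above.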
Some extensions to Mach2 for MacForth compatibility
At the beginning of Listing 1 I have included definitions of some MacForth words that are missing from Mach2: =cells, needed, and -string. The last of these, a string comparison operator, has been implemented in two different ways, as =string and -string; in both cases, the top two stack items are string addresses, and the flag returned is 0 if the strings are equal and 1 if not. =string uses the IUMagIDString routine from the international utilities package, which does a better job of comparing name strings that contain umlauts, diacritical marks, etc., but is slower. -string uses the _Cmpstring trap, which is much faster and is the one recommended for applications like this one.
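For readers without the Toolbox traps at hand, the behavior of these helper words can be approximated in standard Forth. The sketch below is an assumption-laden stand-in, not the trap-based code: -string-sketch is a hypothetical name, and COMPARE is case-sensitive, which may differ from what the Toolbox routines do.

```forth
\ Sketch only; -string-sketch is a hypothetical name.
: -string-sketch ( c-addr1 c-addr2 -- flag )   \ 0 if equal, 1 if not
  count rot count compare 0<> 1 and ;

: =cells ( n -- n' )  dup 2 mod + ;   \ as in Listing 1: round up to even

: demo ( -- )
  c" date" c" date" -string-sketch .     \ prints 0
  c" date" c" time" -string-sketch .     \ prints 1
  5 =cells . ;                           \ prints 6
demo
```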
MacForth Plus - no more Level 1,2,3
Readers of the CSI newsletter might have received an announcement of their latest update to MacForth by the time this is in print. Anyway, I'll tell you a few things about it that I was told in a letter at the time I wrote this (June).
MacForth Plus, to be released at the end of August, will supersede all previous versions and levels of MacForth. Its version of the kernel will "...execute considerably faster than K2.4. It will execute faster than Mach1."
Normal text file editing will be supported, as well as block file editing.
Multitasking, which was undocumented, but in principle possible with MacForth, will be fully supported in MacForth Plus.
The documentation will combine the Level 1, 2 and 3 information, as well as the new features, in one single manual.
Stand-alone applications can be produced by a built-in turnkey mechanism.
The upgrade from Level 2 will be available for a (scheduled) upgrade fee of $49, which includes the manual. Level 3 users will receive a free upgrade.
This sounds very interesting. I hope I'll soon have a test copy to write about.
Listing 1: record structures in Forth
( Structures, Mach-2 version
Adding a structure compiler to Forth. JL 26.6.86.
This file defines a Pascal-like 'record' structure;
a record is a template for instances of the structure.
myrec r1 \this creates an instance r1 of myrec whose fields
may be accessed through r1 ^ field1 etc..
for 'late binding' usage, the word ^field is provided)
only forth also assembler
( some MacForth definitions that Mach1 is missing )
: =cells dup 2 mod + ;
: needed depth 1- > abort" NEEDED- not enough stack items" ;
count rot count rot swap
count rot count swap
( do.record, creating one instance of a record ) ( 062686 jl )
: do.record ( addr of master | -- )
create dup , ( first 4 bytes: pointer to the template )
begin dup c@ dup while ( nonzero count byte: another field name )
1+ =cells 4 + + ( next field in template )
repeat
drop 2+ w@ ( total length stored here ) allot
does> ( nothing special )
;
( :record ;record and friends) ( 062686 jl )
: :record create 13579 4 does> do.record ;
: ;record 2 needed
0 w, ( end of list) w, ( total length )
13579 = 0= abort" ;record without :record" ;
: put.fieldname ( append next input word to template as counted string )
32 word here over c@ 1+ dup =cells allot cmove ;
( field defining words ) ( 062686 jl )
: >long ( addr | addr+4)
put.fieldname dup w, 4 w, 4 + ;
: >word ( addr | addr+2)
put.fieldname dup w, 2 w, 2+ ;
: >byte ( addr | addr+1)
put.fieldname dup w, 1 w, 1+ ;
: >bytes ( addr \ n | addr+n )
put.fieldname over w, dup w, + ;
( ^field, addressing a field within a record ) ( 062686 jl )
: ^field ( addr name | address of field )
over @ ( addr name master )
begin 2dup -string while ( no match )
dup c@ 1+ =cells 4 + + ( skip name, offset and length, as in do.record )
dup c@ 0= ( end of list )
abort" RECORD- specified field does not exist"
repeat
( match found )
dup c@ 1+ =cells + w@ ( start within record )
swap drop + ( address of field )
;
( ^ ) ( 062686 jl )
: ^ 32 word ^field ;
( example of a record structure ) ( 062686 jl )
30 >bytes description