Column Tag: FILE SAVING, OPENING
I/O Completion Routines
By Steve Brecher
Application I/O requests are ultimately performed by low-level operating system trap routines: _Read, _Write, _Control, _Status, _Open, _Close. The program passes the address of an I/O parameter block to the trap routine. The parameter block specifies the device driver (e.g., Sony floppy driver, serial driver) and the particulars of the request such as (for _Read or _Write) the number of bytes and the address of the program's buffer.
The operating system passes the parameter block on to the device driver; or, if the driver is currently busy, the parameter block is put onto the driver's queue of pending work.
Mac I/O operations can be either synchronous or asynchronous. A synchronous operation will be completed before control is returned to the invoking program. An asynchronous operation will be initiated -- started upon by the driver, or put into its queue -- and then control will be returned to the program regardless of whether the operation is yet complete.
Asynchronous I/O is the heart of multi-tasking; indeed, the only rationale for multi-tasking is the overlapping of I/O and computation. Programs can implement a primitive form of multi-tasking within themselves by using a multi-buffer I/O scheme. For instance, a program can request that a buffer full of data be written, and then -- before that write request is completed -- immediately start filling another buffer with data.
Completion routines can be a convenient mechanism for a program to manage multi-buffered I/O. The completion routine is called asynchronously with respect to the rest of the program; it can mark the buffer associated with the just-completed I/O request as free (written) or ready for processing (read). If another buffer is ready to be read into or written from, the completion routine can initiate the next I/O request.
Completion routines can also be used to hide low-level I/O details from the mainline logic of a program. I attempted to use the Mac's completion routine facility for that purpose in a program doing input from the modem port at high baud rates. The program started off by issuing an asynchronous read request for one byte, and specifying a completion routine. The completion routine looked like this:
-- Put the received byte into the program's input buffer;
-- Issue a _Status call to find out whether the serial input driver has more data in its own buffer;
-- If the driver has more data available, issue a _Read to transfer it to the program's input buffer;
-- Re-issue the one-byte asynchronous _Read.
The idea was for the completion routine to handle the low-level details of managing input into the program's buffer. The mainline of the program would then be freed from concern with those details, and could just take data from the buffer as it became available.
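The four steps of that completion routine can be sketched in C against a mock driver. This is only an illustration of the intended flow, not Toolbox code: the Port structure and the functions bytes_available, read_now and on_byte_received are hypothetical stand-ins for the driver's buffer, the _Status call, the _Read call and the completion routine, and re-issuing the one-byte read is modeled as setting a flag.

```c
#include <stddef.h>
#include <string.h>

enum { DRIVER_CAP = 64, INPUT_CAP = 256 };

/* Hypothetical model: the driver's own buffer plus the program's input buffer. */
typedef struct {
    unsigned char driverBuf[DRIVER_CAP];
    size_t        driverLen;     /* bytes waiting inside the driver */
    unsigned char input[INPUT_CAP];
    size_t        inputLen;      /* bytes delivered to the program */
    int           readPending;   /* 1 while a one-byte read is outstanding */
} Port;

/* Stands in for the _Status call: how many bytes does the driver hold? */
size_t bytes_available(const Port *p)
{
    return p->driverLen;
}

/* Stands in for a _Read the driver can satisfy at once: move up to n bytes. */
size_t read_now(Port *p, size_t n)
{
    if (n > p->driverLen)
        n = p->driverLen;
    if (n > INPUT_CAP - p->inputLen)
        n = INPUT_CAP - p->inputLen;
    memcpy(p->input + p->inputLen, p->driverBuf, n);
    memmove(p->driverBuf, p->driverBuf + n, p->driverLen - n);
    p->driverLen -= n;
    p->inputLen += n;
    return n;
}

/* The completion routine: the four steps, in order. */
void on_byte_received(Port *p, unsigned char byte)
{
    size_t more;

    if (p->inputLen < INPUT_CAP)
        p->input[p->inputLen++] = byte;   /* 1: store the received byte */
    more = bytes_available(p);            /* 2: ask the driver for its count */
    if (more > 0)
        read_now(p, more);                /* 3: pull over whatever is waiting */
    p->readPending = 1;                   /* 4: re-issue the read (a flag here) */
}
```

In this model the mainline only ever looks at input and inputLen, which is the separation of concerns the scheme was after.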
But this scheme didn't work. Following is an explanation of why it didn't, which relates to a severe limitation in the Mac's implementation of completion routines.
Asynchronous I/O is requested by setting bit 10 ($0400) in the trap value. There are two ways for a program to detect the completion of asynchronous I/O. When an I/O operation is initiated, the ioResult field of the parameter block is set to 1; when the operation completes, ioResult is set to 0 (if no error) or to a negative error code. So the program may test ioResult to determine whether the operation is complete. Or, the program may specify the address of an I/O completion routine in the ioCompletion field of the parameter block. If ioCompletion is not nil (zero), then when the operation is complete the OS will JSR to the address contained in ioCompletion.
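The two ways of detecting completion can be modeled in a few lines of C. This is a sketch under stated assumptions: ParamBlock is a hypothetical, cut-down stand-in for the real parameter block (only ioResult and ioCompletion come from the text; doneCount exists purely for illustration), and start_async and finish_request merely imitate what the OS does to those two fields.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a Device Manager parameter block. */
typedef struct ParamBlock ParamBlock;
typedef void (*CompletionProc)(ParamBlock *pb);

struct ParamBlock {
    short          ioResult;      /* 1 = in progress, 0 = done, < 0 = error */
    CompletionProc ioCompletion;  /* NULL means no completion routine */
    int            doneCount;     /* illustration only: completion calls seen */
};

/* Model of initiating an asynchronous request: mark it in progress. */
void start_async(ParamBlock *pb)
{
    pb->ioResult = 1;
}

/* Model of the OS finishing the request with the given result code. */
void finish_request(ParamBlock *pb, short result)
{
    pb->ioResult = result;
    if (pb->ioCompletion != NULL)
        pb->ioCompletion(pb);     /* the "JSR to ioCompletion" */
}

/* Method one: the program polls ioResult. */
int is_complete(const ParamBlock *pb)
{
    return pb->ioResult <= 0;
}

/* Method two: a completion routine; this one only counts its calls. */
void count_completion(ParamBlock *pb)
{
    pb->doneCount++;
}
```

A program using the first method would loop on is_complete between other work; one using the second would store count_completion (or something more useful) in ioCompletion before starting the request.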
If bit 10 of the trap word is clear -- a synchronous I/O -- the Device Manager will clear ioCompletion before initiating the I/O; and the OS will test ioResult in a tight loop, waiting for it to become zero or negative before returning from the trap. Hence, you don't need to clear ioCompletion prior to a synchronous I/O; but for asynchronous I/O you must either clear ioCompletion or set it to the address of a completion routine.
When a completion routine is entered, register D0.W contains the ioResult value (and the condition codes reflect a TST of its value); A0 contains the parameter block address; 4(SP) -- just above the return address -- contains a pointer to the driver's Device Control Entry (DCE); and registers D0-D3/A0-A3 may be altered by the completion routine. The I/O queue element (a.k.a. the parameter block) will have been unlinked from the driver's queue. And -- here's the catch -- interrupts may or may not be disabled.
The OS initiates an I/O operation by calling the appropriate entry point of the device driver. Let's consider, for example, a read request to the serial input driver. The OS calls the serial input driver's Prime (read) routine. There are two possibilities: either the request will be completed immediately by the Prime routine, or it will be completed later as the result of an interrupt.
If the number of characters in the driver's buffer is equal to or greater than the number specified in the I/O request, the Prime routine will fulfill the request immediately by transferring characters to the caller's ioBuffer, and then jump to the OS ioDone routine, which will call the completion routine. Interrupts are enabled at entry to the completion routine, and it can therefore do anything that any other part of an application can do. In this case, when the completion routine is entered the original _Read trap has not yet returned and no other application code has been executed. Therefore the effect is similar to that of the application executing the following:
If the completion routine initiates another I/O request to the same driver, specifying the same ioCompletion (i.e., itself), that request may also complete immediately, in which case the completion routine will be re-entered.
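The re-entry is easy to see in a small C model. This is a sketch, not driver code: MockDriver, async_read_one and completion are hypothetical names, and the driver is reduced to a count of bytes already sitting in its buffer. Each read that the model can satisfy at once calls the completion routine before returning, so the nesting depth grows by one for every buffered byte.

```c
/* Hypothetical model of a driver with some bytes already buffered. */
typedef struct {
    int buffered;   /* bytes already waiting in the driver's buffer */
    int depth;      /* current nesting of completion-routine calls */
    int maxDepth;   /* deepest nesting observed */
    int pending;    /* 1 once a read is left outstanding */
} MockDriver;

void completion(MockDriver *d);

/* Model of a one-byte asynchronous read with a completion routine. */
void async_read_one(MockDriver *d)
{
    if (d->buffered > 0) {
        d->buffered--;      /* Prime satisfies the request immediately... */
        completion(d);      /* ...so completion runs before the read returns */
    } else {
        d->pending = 1;     /* otherwise the request waits for an interrupt */
    }
}

/* Completion routine that re-issues the same request, naming itself. */
void completion(MockDriver *d)
{
    d->depth++;
    if (d->depth > d->maxDepth)
        d->maxDepth = d->depth;
    async_read_one(d);      /* may complete at once and re-enter this routine */
    d->depth--;
}
```

A completion routine written this way therefore has to be re-entrant, and its stack use depends on how much data the driver happens to be holding.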
If there are insufficient characters in the driver's buffer to satisfy a read request, the Prime routine will return to the OS, which will return from the trap. Later, the driver's received-character interrupt service will determine that the request is complete, and jump to the OS ioDone routine, which will call the completion routine. Now, however, the completion routine is entered with interrupts disabled; in effect the completion routine is part of the interrupt service.
The completion routine can do no significant processing, because that would hold off interrupts for too long a time -- for example, subsequent incoming serial data might be lost. This virtually excludes the possibility of initiating a new I/O operation from within the completion routine. Also, the completion routine (at interrupt level) cannot do anything that would alter the heap configuration -- the interrupt might have occurred while a handle was being dereferenced.
Since there is no point to doing asynchronous I/O unless it is assumed that the request will not complete immediately, the programmer must assume that a serial I/O completion routine will be executed at interrupt level. This makes the completion routine facility rather useless for serial I/O. The routine could set a flag indicating the completion -- but such a flag is already available in ioResult.
Wish list item... To make completion routines useful, the Mac OS would have to be enhanced to implement what the DEC PDP-11 operating systems refer to as fork processes. Fork processes are serialized and executed synchronously after all (possibly nested) interrupts have been dismissed, but before control is returned to the point at which the first interrupt occurred. This enables completion routines to be interruptible, and gives them time to do useful work.
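The fork idea can be sketched in C as a queue of deferred routines. This is a minimal model of the behavior described, not the PDP-11 mechanism and not anything the Mac OS provides: ForkQueue, fork_request, interrupt_enter, interrupt_exit and bump are all hypothetical names. Interrupt-level code only queues work; the queued routines run one at a time once the outermost interrupt is dismissed. A real implementation would also have to mask interrupts briefly around the queue manipulation, which this single-threaded sketch omits.

```c
#include <stddef.h>

enum { FORK_CAP = 16 };

typedef void (*ForkProc)(void *arg);

/* Hypothetical queue of deferred ("fork") routines. */
typedef struct {
    ForkProc proc[FORK_CAP];
    void    *arg[FORK_CAP];
    size_t   head, count;
    int      interruptDepth;  /* nesting level of interrupts being serviced */
    int      draining;        /* 1 while queued routines are being run */
} ForkQueue;

/* Called at interrupt level: only queue the work. Returns 0 if full. */
int fork_request(ForkQueue *fq, ForkProc proc, void *arg)
{
    size_t tail;

    if (fq->count == FORK_CAP)
        return 0;
    tail = (fq->head + fq->count) % FORK_CAP;
    fq->proc[tail] = proc;
    fq->arg[tail] = arg;
    fq->count++;
    return 1;
}

void interrupt_enter(ForkQueue *fq)
{
    fq->interruptDepth++;
}

/* On leaving the outermost interrupt, run the queued routines in order. */
void interrupt_exit(ForkQueue *fq)
{
    if (--fq->interruptDepth > 0 || fq->draining)
        return;               /* still nested, or already being drained */
    fq->draining = 1;
    while (fq->count > 0) {
        ForkProc proc = fq->proc[fq->head];
        void    *arg  = fq->arg[fq->head];
        fq->head = (fq->head + 1) % FORK_CAP;
        fq->count--;
        proc(arg);            /* runs serialized, outside interrupt service */
    }
    fq->draining = 0;
}

/* Example fork routine: increments the int that arg points to. */
void bump(void *arg)
{
    ++*(int *)arg;
}
```

With something like this, a completion routine entered at interrupt level would call fork_request and return at once, leaving the real work to the queued routine.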
MS BASIC programmers: Stuff it!
The QuickDraw StuffHex routine is a fast way to convert lengthy hex machine language data to binary code in MS BASIC programs. The trick is to prefix each string of hex digits with a character whose ASCII value is equal to the number of hex digits: this character is thus a length byte, making the string a Pascal string which can be passed to StuffHex. Note that the address of the string data is not given by the address of the BASIC string variable. The string variable is actually a data structure containing information about the string, including its address. The address can change each time the string is assigned or read into. We get the address of the string data by PEEKing into the string variable. Example:
Machine language interface to StuffHex:
SH%(0)=&H245F : SH%(1)=&HA866
Array that gets the binary machine
Declare all scalars before getting array
CODEPTR!=0! : HEXLINE$="" : STRINGPTR!=0!
Get addresses of arrays and the string
Read lines of hex data, convert to binary:
Get address of first byte of string data
(the length byte):
Convert the line's hex data to binary:
Adjust pointer into CODE! array for next
data (if any):
The ASCII value of the first character
of each string must be equal to the
number of characters in the rest of the
string. Thus the number of hex digits
in each string must be at least 32
(ASCII space) and no greater than
126 (ASCII ~) so that the length is a
displayable ASCII character. But the
length must be other than 34 -- 34 is
the ASCII code for quotation mark!
Note the following DATA line has 36 hex
digits, and that the first character of the
string is $, i.e., CHR$(36).
More lines of hex data would go here
Empty string marks end of data:
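The length-byte trick and the conversion itself can be illustrated in portable C. This is a sketch, not the QuickDraw routine: stuff_hex is a hypothetical stand-in that mimics what the text says StuffHex does with a Pascal string, and valid_line_length encodes only the limits stated above (32 through 126, excluding 34). Rejecting an odd digit count or a non-hex character is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Can n hex digits be announced by a displayable, quotable length character? */
int valid_line_length(size_t n)
{
    return n >= 32 && n <= 126 && n != 34;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Hypothetical stand-in for StuffHex: pstr[0] is the length byte, followed
 * by that many hex digits. Writes the binary bytes to dest and returns the
 * number written, or -1 on an odd digit count or a non-hex character.
 */
long stuff_hex(unsigned char *dest, const unsigned char *pstr)
{
    size_t digits = pstr[0];

    if (digits % 2 != 0)
        return -1;
    for (size_t i = 0; i < digits; i += 2) {
        int hi = hex_value((char)pstr[1 + i]);
        int lo = hex_value((char)pstr[2 + i]);
        if (hi < 0 || lo < 0)
            return -1;
        dest[i / 2] = (unsigned char)(hi * 16 + lo);
    }
    return (long)(digits / 2);
}
```

A line of 36 hex digits prefixed with "$" is a valid input, because "$" is character 36 and so doubles as the Pascal length byte.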
Reports from Miss Elaine E.
Due to a QuickDraw bug, DrawText in srcCopy mode will erase four or five character positions after the position of the last character in the string. It's OK to use srcXor if no character position will be overwritten; but if you need srcCopy, a workaround is to set the right side of the port's clipRegion to the right edge of the last character before calling DrawText. If the font is not monospace, this implies calling TextWidth first to find out the screen width of the string to be drawn.
The QuickDraw ScrollRect routine will be slowed by a factor of 3 or 4 if you include the borders of the GrafPort in the rect that is scrolled. Make sure the rect passed to ScrollRect is inset at least a couple of pixels from the port's border(s).
Thanks to Steve Hanna for this tip on the alternate screen buffer... To have QuickDraw draw in the alternate screen buffer, change the BaseAddr field of the GrafPort to $72700. When you use QD calls with that GrafPort as the current port, the drawing will be done in the alternate screen buffer. To flip the display between the main and alternate screen buffers, complement bit 6 of VIA buffer A, which is mapped to address $EFFFFE. (Steve Hanna also notes that the low-order 3 bits of buffer A control sound volume [0..7]. Bit 7 is tied to the SCC -- see below.) The following instruction will flip the display to the other screen buffer:
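The same operation, expressed in C rather than as a 68000 instruction, amounts to toggling one bit. This is a minimal sketch that works on a copy of the byte, not on the hardware register; the constant and function names are hypothetical, and only the bit positions come from the text above.

```c
#include <stdint.h>

enum {
    VIA_SCREEN_BIT  = 0x40,  /* bit 6: selects main or alternate screen buffer */
    VIA_VOLUME_MASK = 0x07   /* low-order 3 bits: sound volume, 0..7 */
};

/* Return the buffer A value with the screen-select bit complemented. */
uint8_t flip_screen_buffer(uint8_t via_a)
{
    return (uint8_t)(via_a ^ VIA_SCREEN_BIT);
}

/* Extract the sound volume from a buffer A value. */
uint8_t sound_volume(uint8_t via_a)
{
    return (uint8_t)(via_a & VIA_VOLUME_MASK);
}
```

On the machine itself the byte would be read from and written back to $EFFFFE; flipping twice restores the original display.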
Thanks to Dennis Brothers for pointing out that the MacsBug HD (heap display) command is a way to find your program in memory -- the first CODE resource in the heap is most likely the first (or only) segment of your program.
The reason the ROM serial driver doesn't support input flow control is that a D0 was coded where a D3 should have been; a two-bit error. And the reason mouse interrupts will be lost if you close the ROM serial driver (without immediately opening the RAM serial driver) is that the ROM serial driver neglects to finish its cleanup by setting the master interrupt enable bit in the SCC chip -- a one-word omission from a table. (Mouse movement signals come in through the DCD pins of the SCC and generate SCC external status interrupts.)
If you want to program the SCC yourself, ask your local Zilog office for a copy of the Z8030/Z8530 SCC Serial Communications Controller Technical Manual. The memory-mapped addresses of the SCC are in the MDS equate file SysEquates.Txt. The SCC WAIT/REQUEST (asserted low) pin state is brought over to bit 7 of VIA (6522 chip) buffer A, which is mapped to memory address $EFFFFE. The serial driver's Open routine configures the SCC to assert WAIT/REQUEST when an input character is available in the SCC buffer. This enables the Sony floppy driver to feed incoming data to the serial driver's input buffer while the Sony driver has disabled interrupts. When the byte at $EFFFFE is positive (bit 7 clear), the Sony driver fetches the character waiting in the SCC and calls code in the serial driver interrupt service routine.
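The "positive byte" test is just a check of bit 7. A short C sketch, with hypothetical function names and operating on a copy of the byte rather than on the VIA itself, shows the two equivalent ways of phrasing it:

```c
#include <stdint.h>

/* Nonzero when bit 7 is clear, i.e. WAIT/REQUEST is asserted (low). */
int scc_char_waiting(uint8_t via_a)
{
    return (via_a & 0x80) == 0;
}

/* The same test phrased as a sign check: the byte, read as signed, is >= 0.
   Assumes the usual two's-complement conversion to int8_t. */
int scc_char_waiting_signed(uint8_t via_a)
{
    return (int8_t)via_a >= 0;
}
```

Note that "positive" here follows the 68000 sense of the N flag being clear, so a byte of zero also counts.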
Inside Macintosh says of the serial status ctsHold flag (Serial Driver, p. 13), "If output has been suspended because the hardware handshake has been negated, ctsHold will be nonzero." The conditional clause could be misleading: provided that the output driver has been opened, ctsHold always reflects the state of the CTS (asserted low) pin of the SCC (pin 7 of the DB-9 connector) regardless of the status (or existence) of any output request. If CTS is asserted, ctsHold is zero.
Need a quick (and dirty!) test of whether any OS events are pending? Address $014C contains the OS event queue header (a pointer); if it's nil (zero), the queue is empty.
Think you have a good (homemade) backup of Macintosh Pascal? Make sure you can click the mouse 101 times in the source edit window. On the 101st click, Mac Pascal goes to the disk to check that the master is there; if it doesn't like what it finds, it abruptly quits to the Finder (trashing any of your unsaved work). In my book, this qualifies as a worm and, since it's not documented, is tasteless at best and unethical at worst. If a publisher wants to frustrate users who attempt to back up a product or use it on a hard disk without inserting the master, the program should either quit at the outset or put up a dialog box demanding the master. The more experienced the programmer, the more violent his aversion to being forced to use distribution media in production work.
Consulair Mac C users who want all string constants to be compiled in Pascal format (length-byte prefix) can include the following near the top of the source file:
If there's no semantic difference between pre-incrementing and post-incrementing a variable in your Mac C program (i.e., you can choose either ++i or i++), use pre-increment -- it generates more efficient code. Same applies to decrements. If you use post-inc(dec)rement, the generated code will inc(dec)rement the variable in memory, then offset that operation by dec(inc)rementing it in a register in preparation for its previous value being used in an expression -- even if it's not so used.
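A small C example shows when the two forms are interchangeable and when they are not. The function names are hypothetical, and nothing here measures or demonstrates the code-generation difference, which is specific to the compiler discussed; it only shows where the choice is free.

```c
/* The increment's value is never used, so ++i and i++ are interchangeable. */
long sum_pre(const int *v, int n)
{
    long s = 0;
    int i;

    for (i = 0; i < n; ++i)
        s += v[i];
    return s;
}

long sum_post(const int *v, int n)
{
    long s = 0;
    int i;

    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Here the value is used, so the two forms differ and cannot be swapped. */
int take_post(int *i)
{
    return (*i)++;   /* yields the old value, then increments */
}

int take_pre(int *i)
{
    return ++(*i);   /* increments, then yields the new value */
}
```

Loop counters like the one in sum_pre are the typical case where pre-increment can be chosen freely.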
A Mac C update is expected to be available this spring with floating point, register variables, structure assignments, and slicker code generation. Look for the Green Hills C compiler under the Apple name on the Mac later this year -- it's being ported from the Lisa along with the Workshop. I haven't used it, but it's reported to generate slick code.