May 93 - Porting to OODLs
Porting to OODLs
At some point a programmer interested in object-oriented dynamic languages (OODLs) will ask, "Why not implement this application in an OODL?" In this article we consider some of the costs and benefits of OODL development and discuss how to decide whether to port your application. The primary reason for considering a port to an object-oriented dynamic language is that you expect development and maintenance to be easier. Most likely your customers, given two implementations of the same software with the same apparent characteristics, will not care what language was used to develop it.
In deciding whether to port to an OODL (or to use one to develop a new program) you must consider various costs and benefits. Briefly, you must assess whether the gains in development and maintainability appropriately balance any costs associated with the OODL you are considering. Different object-oriented dynamic environments impose varying costs and confer varying benefits.
There are certain benefits that OODLs' advocates most often stress when comparing them to more traditional languages. One such benefit is superior expressive power; practically speaking, superior expressive power means the ability to express more computation in less code. Another commonly cited benefit is automatic memory management: dynamically allocated memory is both allocated and freed automatically, without the programmer's intervention. Advanced development environments, which provide numerous services to improve programmer productivity, are also high on the list of OODL benefits. Finally, a favorite topic of OODL advocates, one whose benefits are perhaps difficult to adequately convey, is the runtime flexibility afforded by dynamic typing and introspective functions. Each of these benefits imposes certain costs, and so critics of OODLs sometimes argue that they are not necessarily beneficial. Let's consider each feature separately.
Expressive power
Expressive power is a measure of the power of a programming language to express computational work clearly, succinctly, and generally. A language with great expressive power is one in which it is easy to describe an algorithm abstractly and in terms that are easy to understand. Object-oriented dynamic languages are designed specifically with expressive power in mind. It is often easier in OODLs than in more conventional languages to express general algorithms, such as a function that sorts any type of collection using an ordering test passed as a parameter, or an intersection test that accepts any two geometric objects, or a general pattern-matcher. An argument against such expressive power is that the high degree of abstraction inherent in highly expressive languages can conceal the computational cost of an operation, and it is true that a programmer needs a good understanding of the costs of a language's abstract operations in order to appropriately optimize a program. In high-level languages these costs are not always obvious.
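To make the sorting example concrete, here is a minimal sketch in Python, used here only as a convenient present-day dynamic language. The names `sort_by` and `comes_before` are our own inventions for illustration; they do not come from any of the products discussed in this article.

```python
from functools import cmp_to_key


def sort_by(items, comes_before):
    """Return a sorted list built from any iterable collection.

    `comes_before(a, b)` is a caller-supplied ordering test (a hypothetical
    name) that returns True when `a` should precede `b`.
    """
    def compare(a, b):
        if comes_before(a, b):
            return -1
        if comes_before(b, a):
            return 1
        return 0

    return sorted(items, key=cmp_to_key(compare))


# The same function sorts a tuple of numbers and a set of strings.
numbers = sort_by((3, 1, 2), lambda a, b: a < b)
words = sort_by({"pear", "fig", "banana"}, lambda a, b: len(a) < len(b))
```

One short definition covers every collection type and every ordering, which is the kind of generality the paragraph above describes. Note that nothing in the call reveals how much work the sort performs.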
Automatic memory management
OODLs usually incorporate automatic memory management schemes, often referred to as garbage-collection, or GC. Garbage-collection makes the programmer's life much easier: so much so that it's hard to appreciate unless you have worked with it. In a system with garbage-collection many common memory errors simply never occur; they are impossible. For example, a program with automatic memory management never encounters a dangling pointer and can never fail to free a block of memory that is no longer in use.
The most common argument against garbage-collection is that it makes the life of the programmer easier at the expense of the end user, imposing on the user the cost in CPU time of garbage-collection overhead. In fact, empirical evidence suggests otherwise; present-day garbage-collected systems often spend less time in memory-management operations than do programs whose dynamic memory management is done by hand. This unexpected result is largely because modern garbage-collectors maintain strict allocation disciplines that make allocation and deallocation extremely fast; the memory management system can rely on certain invariants that it knows about to make memory operations fast. By contrast, common library operations in conventional languages, such as C's malloc() and free(), are normally much slower because they cannot make the same assumptions about the layout of the heap that a garbage-collected system can. For example, a system with stop-and-copy garbage collection can allocate an object by simply incrementing an index into free space. Most malloc() implementations must conduct a search of free areas to find one into which the requested block can fit.
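The difference between the two allocation styles can be sketched with a toy model. The following Python classes are illustrative assumptions of ours, not the code of any real collector or malloc() implementation: `BumpAllocator` models allocation by incrementing an index into free space, and `FirstFitAllocator` models a search through a list of free areas.

```python
class BumpAllocator:
    """Toy model of allocation in a copying collector's free space."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0  # index of the first unused cell

    def allocate(self, n):
        if self.next_free + n > self.size:
            raise MemoryError("free space exhausted; a real system would collect here")
        address = self.next_free
        self.next_free += n  # the whole allocation is one addition
        return address


class FirstFitAllocator:
    """Toy model of an allocator that searches a list of free areas."""

    def __init__(self, free_blocks):
        # Each entry is [start, length]; the list stands in for a fragmented heap.
        self.free_blocks = [list(block) for block in free_blocks]
        self.blocks_examined = 0

    def allocate(self, n):
        for block in self.free_blocks:
            self.blocks_examined += 1
            start, length = block
            if length >= n:
                # Carve the request off the front of the first block that fits.
                block[0] = start + n
                block[1] = length - n
                return start
        raise MemoryError("no free block is large enough")


bump = BumpAllocator(size=1024)
first = bump.allocate(16)
second = bump.allocate(32)

fragmented = FirstFitAllocator([(0, 8), (100, 8), (200, 8), (300, 64)])
address = fragmented.allocate(32)
```

In this sketch the bump allocator does the same small amount of work for every request, while the first-fit allocator has to examine every free area that is too small before it finds one that fits.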
There are other costs associated with automatic memory management. Perhaps the most notorious is the gc-pause, or gc-wait, so called because the garbage-collector must interrupt processing while it runs, to ensure that the heap is stable while it frees unused memory. With a well-designed garbage-collector such pauses should be very brief, and it is usually possible to schedule collections at times the user is unlikely to notice. In some systems you can use an incremental garbage collector that distributes collection operations over the continuous operation of the program, though such collectors tend to be less efficient overall than those that pause.
One issue that needs careful attention is the use of garbage-collected OODLs in real-time systems: you must be sure that you can guarantee response time, and that means knowing when collection will happen and how long it will take in the worst case. There are techniques for controlling garbage-collection in such environments, however. The easiest among them is that you ensure that no dynamic allocation takes place during a critical loop. You might, for example, preallocate data structures that get partially filled during time-critical operations, and execute garbage collection explicitly at safe times.
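As a rough illustration of this pattern, the sketch below uses Python's standard `gc` module. It is only an analogy: in CPython the `gc` module controls the cycle collector alone, and ordinary objects are still allocated and released by reference counting inside the loop, so this shows the shape of the technique rather than a real-time guarantee. `BUFFER_SIZE`, `samples`, and `critical_loop` are hypothetical names.

```python
import gc

BUFFER_SIZE = 1000  # hypothetical capacity chosen for the example

# Preallocate the structure that the time-critical code will fill.
samples = [0.0] * BUFFER_SIZE


def critical_loop(readings):
    """Store readings in the preallocated buffer without growing it."""
    count = 0
    for value in readings:
        if count == BUFFER_SIZE:
            break  # buffer full; stop rather than allocate more room
        samples[count] = value  # overwrite an existing slot
        count += 1
    return count


gc.disable()  # no automatic collections during the critical section
try:
    filled = critical_loop(range(100))
finally:
    gc.enable()

# Collect explicitly at a time when a pause does no harm.
unreachable = gc.collect()
```

The buffer is partially filled during the critical section, and the collector runs only at the point the program chooses.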
Advanced development environments
Object-oriented and dynamic languages, especially Smalltalk and Lisp, are widely known for the rich development environments associated with them. Of course, it is one thing to praise a development environment and quite another to praise a programming language, but it is no accident that some of the best work in development environments has been done in these languages. Smalltalk and Lisp share certain features that have facilitated the creation of their environments. Both languages are designed for incremental, interactive development. Both languages can treat source code as program data. Both languages can be extended by defining new facilities that become part of the development system's runtime environment. Both include introspective features with which programmers can interactively examine program data.
In short, both Smalltalk and Lisp provide good support for the development of programming tools. It is therefore no surprise that some of the earliest advances in such tools should have come from Smalltalk and Lisp programmers. We have inherited from those programmers inventions such as interactive tracing and stepping, inspectors for examining runtime data, cross-referencing of source changes and dependencies, graphic display of class hierarchies and call trees, debuggers that can inspect and change variables on the stack and restart a halted computation from a user-selected stack frame, and so on. These facilities came to exist in large part because Smalltalk and Lisp programmers realized that they could easily implement them; they didn't have to wait for someone else to design a separate utility program that could read their source files. Instead, they could make small extensions to their development systems' runtimes, accumulating and improving those changes over time until they had developed facilities of great power.
All programmers have benefited from these inventions; they are gradually becoming standard parts of the development systems for conventional languages. OODL programmers are not standing still, however, and their languages still have the built-in support that encourages good tool development.
The cost of these many useful tools has historically been that they are so tightly bound to the language runtime that it is impossible to separate the tools from the application. As a result, programs developed in Smalltalk and Lisp have traditionally been very large because they included with them the entire development system. This close integration of the application with the development system is actually beneficial for some in-house developers because they can easily examine a running application to determine the cause of a program error. It is clearly inappropriate, however, for most commercial development. OODL designers have realized that they must support a delivery model that is more in line with the needs of commercial developers and are beginning to release OODL products that separate the development environment from the application. When you are considering whether to move to OODL development you need to find out the minimum size of an application developed with the systems you are evaluating.
Flexibility
Perhaps hardest to explain of the commonly described benefits of OODL development, the flexibility of an object-oriented dynamic language is nevertheless one of its most appealing features. OODL flexibility is made up of equal parts of the other listed benefits; it grows out of the synergy among expressive power, automatic memory management, and an advanced development environment. Because an OODL includes a library of utility classes, and because the interface to the library is defined using very powerful, general abstractions, and because the development environment is designed to support fast interactive development, you can quickly and easily get a data structure built and a piece of code running. Because you can call all of your routines interactively, using data structures that you can build interactively, you can quickly and easily test your designs. Because of the fast turnaround time of an interactive development environment and the power of abstract, polymorphic protocols, you can switch data representations quickly and easily. Because you can run your program, or just subsystems or even individual routines, interactively, without ever leaving the development system, you can find an error quickly and use the debuggers, steppers, inspectors, and other tools to identify the exact nature of the problem. Once you have found a problem the incremental compiler and interactive environment make it easy to correct the problem and test the change.
The hidden cost of this development flexibility is that it can play upon the programmer's love of new features and tempt us to do more than we should: to add more features because it's easy, to try to generalize algorithms beyond reasonable utility because we can, to add ever more elaborate programming utilities because the environment supports them, and so on. Less flexible environments impose a sort of discipline, if only a crude one. With a more flexible and powerful development system more of the discipline lies with the programmer.
Aside from the matter of whether the benefits are in themselves costly, there are other costs associated with OODLs. The most obvious is that a programmer who switches to an OODL must learn the language and its idioms. Regardless of the real benefits to be had from OODL development, you will need to consider the time it will take for programmers to become familiar with a new language and development environment before deciding that a change is appropriate. Most good programmers can adapt to the superficial differences of syntax and a new user interface in a week or two, but you should expect new OODL programmers to be adjusting to dynamic features and programming idioms for some months. It can be extremely helpful to have at least one contributor, respected by the programming team, with solid experience in dynamic language development.
Aside from the cost of changing environments, each OODL imposes some minimum RAM requirement because of the runtime system that is part of every application. That minimum size varies from one language implementation to another. Some products, such as MacScheme, Object Logo, and Prograph, generate applications as small as 100 to 500 Kbytes and require perhaps 200 Kbytes to 1 megabyte to run. Others, such as various Common Lisp and Smalltalk products, demand a megabyte or more of disk space and anywhere up to four or five megabytes of RAM. In defense of OODLs with large minimum sizes, we can say that adding application features usually increases the application size only slowly because built-in library code provides so much functionality and is nearly always designed with general reusability in mind. Frequently the growth curve as features are added is significantly flatter in an OODL application than in an equivalent application written in a more conventional language. For a given OODL there is usually a level of application complexity at which a C or Pascal implementation equals or exceeds the size of the same application written in the OODL.
Before deciding whether to port your application to an OODL be sure to weigh the gains to be had from the change of language against the costs associated with various candidates, the cost of converting, and the costs of development and delivery in the chosen language.
When and why to port
How will you decide whether to port development to an object-oriented dynamic language? Not every project would benefit from porting. Let's take a look at several considerations that might make OODL development an attractive option.
Complex data management problems
One of the best reasons to switch to an object-oriented dynamic language is that you need to support complicated, dynamically-managed data structures. OODLs share a high level of support for the procedural and data abstractions necessary to manage such structures. For example, Smalltalk, CLOS, Dylan, Self and Prograph all have good support for complex data abstractions and models of data that rely on abstract objects, not on memory addresses and explicit representations. Because most OODLs provide automatic memory management, they reduce memory management problems to those associated with choosing appropriate data structures and ensuring they are populated correctly; all issues of disposal are eliminated. The more significant your dynamic memory management problems (that is, the more complex your dynamic data structures), the more you are likely to gain from OODL development.
Robustness a high priority
OODLs can help improve an application's robustness in several ways. For example, certain classes of memory errors are impossible in a runtime environment with automatic memory management. Bus errors are almost unheard of in OODL development, except among programmers who use foreign function facilities to call code written in more traditional languages.
Many dynamic languages, such as Common Lisp, Dylan, and Smalltalk, provide library classes for handling exceptions, greatly simplifying the task of managing errors and other exceptional conditions. Formally defined exception-handlers encapsulate program control so that your application can invoke condition-handlers and non-local exits in a safe, structured way, and with much less work than building your own exception-handling systems from scratch.
Runtime type-checking makes it easier to catch certain classes of error during development. In statically-typed languages a program that compiles is presumed to be free of type errors, but, in fact, runtime type errors can still occur and can be disastrous. Dynamic languages can catch runtime type errors and, using their exception-handling features, signal such errors to the programmer. Using a built-in exception-handling mechanism you can implement handlers that prevent crashes even in the presence of serious program errors and protect your users from system crashes and data losses.
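A small sketch may make this concrete. It is written in Python, whose built-in exceptions stand in for the condition systems of the languages named above; `average_length` and `report` are hypothetical functions invented for the example.

```python
def average_length(items):
    """Average length of the items; assumes every item supports len()."""
    return sum(len(item) for item in items) / len(items)


def report(items):
    """Hypothetical top-level handler that turns program errors into messages."""
    try:
        return f"average length {average_length(items):.1f}"
    except TypeError as error:
        # The runtime detected a value of the wrong type and signaled it.
        return f"bad data: {error}"
    except ZeroDivisionError:
        return "no data"


good = report(["ab", "cdef"])
bad = report(["ab", 7])  # 7 has no length; the runtime raises TypeError
empty = report([])
```

The type error surfaces as a signaled condition that the handler can catch, rather than as a crash or as silently corrupted data.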
If runtime robustness, especially safety from hard crashes, is a high priority then it might be worthwhile to consider switching to OODL development. As an example of the trade-offs involved in choosing OODL development over traditional languages, one developer involved in application testing reported that an application implemented in C crashed unexpectedly under low memory conditions; the same application ported to a Lisp variant slowed to a crawl in the same low-memory conditions as more and more time was spent in garbage collection, but it never crashed or lost data.
Quick development a priority
Sometimes a product's development time or time-to-market outweighs all other considerations. When this is the case it makes sense to choose the system in which most progress can be made in the least time. It will not always be true that an object-oriented dynamic language provides a faster path to completion than other alternatives, but OODL features can often significantly shorten development time, especially when a large proportion of your application code is similar to the classes provided in an OODL library. Lisp-like OODLs such as CLOS and Dylan, for example, are especially well suited for developing compilers, interpreters, pattern-matchers, parsers, representations of human knowledge, and recursive data structures generally. Smalltalk class libraries usually provide excellent support for processes, collection and table management, and graphic presentation, and have proven excellent substrates for simulations, event-handling, and networking applications. If you know of an OODL class library that closely matches a major subsystem of your application, then OODL development may be a very attractive option.
Flexibility of representation a priority
OODLs have in common a high level of support for abstract data types. In combination with the rich libraries of classes and utilities provided with most such languages this abstract type support makes it easy to try various representations for data and choose the best one for a particular application. In addition, using polymorphism the programmer can define protocols that are independent of the particular representations of a group of types, and so use different representations within the same application depending on the particular needs of the situation. If it is important that your programmers be able to try or use many different data representations then an OODL will be an attractive option.
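Here is a minimal sketch of a protocol that is independent of representation, written in Python. The classes `ListSet` and `BitSet` and the function `fill_with_evens` are hypothetical, invented for this example; they show two representations of a set of small non-negative integers behind one protocol.

```python
class ListSet:
    """Set of integers stored as a list of its members."""

    def __init__(self):
        self._items = []

    def add(self, n):
        if n not in self._items:
            self._items.append(n)

    def __contains__(self, n):
        return n in self._items

    def __len__(self):
        return len(self._items)


class BitSet:
    """Set of small non-negative integers stored as bits of one integer."""

    def __init__(self):
        self._bits = 0

    def add(self, n):
        self._bits |= 1 << n

    def __contains__(self, n):
        return bool((self._bits >> n) & 1)

    def __len__(self):
        return bin(self._bits).count("1")


def fill_with_evens(empty_set, limit):
    """Client code written against the protocol, not a representation."""
    for n in range(0, limit, 2):
        empty_set.add(n)
    return empty_set


from_list = fill_with_evens(ListSet(), 10)
from_bits = fill_with_evens(BitSet(), 10)
```

Because `fill_with_evens` relies only on the shared protocol, you can switch representations, or use both in one program, without touching the client code.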
An idiom developed in the Lisp community and often used in OODL-based applications, data-directed programming provides a good architectural model for extensible software systems. A data-directed system is one in which control is organized in a tabular fashion, with execution selected by the type of input data. Obviously, the polymorphic features of object-oriented languages help support data-directed programming by simplifying the task of selecting the code to execute. An application designed in data-directed style is much easier to extend than one in which control is hard-coded, whether you want to add features to a new version or provide the user the ability to add new features incrementally.
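The tabular organization of control can be sketched in a few lines of Python. The table `handlers`, the decorator `handles`, and the `describe` functions are hypothetical names chosen for this example.

```python
handlers = {}  # the dispatch table: type of input -> procedure to run


def handles(kind):
    """Register the decorated function as the handler for `kind`."""
    def register(function):
        handlers[kind] = function
        return function
    return register


@handles(int)
def describe_int(value):
    return f"integer {value}"


@handles(str)
def describe_str(value):
    return f"text of {len(value)} characters"


def describe(value):
    """Select the code to execute by looking up the type of the input."""
    handler = handlers.get(type(value))
    if handler is None:
        return "unknown input"
    return handler(value)


# Extending the system means adding a table entry; `describe` is unchanged.
@handles(list)
def describe_list(value):
    return "list of " + ", ".join(describe(item) for item in value)
```

Calling `describe([1, "hi"])` consults the table three times: once for the list and once for each element. Supporting a new kind of input takes one more registered handler and no edits to the existing control code.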
Performance requirements match characteristics of an OODL well
The conventional wisdom that languages like Smalltalk and Lisp are necessarily too big and too slow for commercial development is mistaken, but there are real costs associated with OODL environments, and you should be aware of them.
If you have real-time requirements or are concerned that your customers will be annoyed by pauses for garbage-collection then you should ensure that the system you choose has acceptable pause times and provides some degree of programmer control over garbage collection.
Programs written in OODLs need not be slow, but you will probably need at least one programmer who understands specifically how to make OODL code run fast. This need is somewhat more of an issue in OODL development than with conventional languages because the expressive power of an OODL often conceals just how much computational work is necessary to accomplish a given task. In addition, the uniformity of a good expressive abstraction can sometimes conceal the different algorithms or orders of complexity involved in specific operations. OODL programmers must understand the semantics of the language's operations well enough to choose and apply them judiciously.
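As an illustration of a uniform abstraction that hides different costs, consider membership testing in Python, where the same `in` operator works on every collection. The class `Counted` and the function `membership_cost` below are hypothetical helpers that count equality comparisons; the sketch assumes CPython's built-in list and set.

```python
class Counted:
    """Wrapper that counts how many equality comparisons are performed."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


def membership_cost(collection, target):
    """Return (found, number of comparisons) for `target in collection`."""
    Counted.comparisons = 0
    found = target in collection
    return found, Counted.comparisons


items = [Counted(n) for n in range(1000)]
as_list = items
as_set = set(items)
target = Counted(999)  # equal to the last element, but a distinct object

found_in_list, list_cost = membership_cost(as_list, target)
found_in_set, set_cost = membership_cost(as_set, target)
```

Both expressions read identically in the source, yet the list is searched element by element while the set is probed by hash. A programmer who knows this chooses the representation accordingly.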
The commercial development of OODL systems for personal computers is still at a fairly early stage, and some environments still have relatively expensive disk and RAM requirements. Be sure that you know the minimum and average requirements for both the development system itself and for typical applications. Minimum sizes for applications developed using object-oriented dynamic languages range from about 100K in the case of Lightship Software's MacScheme+Toolsmith to around 450K for Prograph from TGS Systems, to between 1 and 3 megabytes for applications developed with Apple Computer's Macintosh Common Lisp or SmalltalkAgents from Quasar Knowledge Systems. SmalltalkAgents can also use a two-megabyte runtime kernel separate from the application itself; in this case the application can be as small as about 50K, with the kernel taking up another two megabytes or so. This strategy is similar to that used by applications that require Apple's QuickTime or QuickDraw GX extensions.
Choosing a porting strategy
Once you decide that you should move development of an application to an OODL platform you must choose a development strategy that is appropriate to your project. Not all projects are best implemented by redesigning them for a new language. There are several less drastic options open to you.
Dynamic libraries for static languages
You can incorporate a library that implements OODL features into your conventional development environment. If you have a sufficiently knowledgeable engineer in your group you can develop your own OODL library in C or Pascal, or you can get library code either from commercial vendors or from freeware archives. For example, Stepstone offers a version of its Objective-C preprocessor and libraries that work with MPW C; Objective-C is an object-oriented extension to the C language, but more importantly, it supports a dynamic runtime system (though without automatic memory management) and a library of reusable abstract classes. Chestnut Software offers a product called The Dynamic Programming Library for C++, which implements garbage-collection and other dynamic-language features as a set of C++ library classes. Similarly, Drasch Computer Software sells CLISP, a set of Lisp data structures and functions implemented as a linkable C library.
There are also several free dynamic language libraries, including DEC's Scheme-to-C, Oliver Laumann's Extension Language Kit, and, perhaps most interesting for Macintosh programmers, John Wainwright's Objects In C. Objects In C, or OIC, is available from the Berkeley Macintosh Users' Group, and is on their PD-ROM collection of freeware. It is a portable C library that implements garbage-collection and an object model very similar to that of the Common Lisp Object System. The Macintosh version of the library is delivered as a THINK C project.
Linking static-language code in dynamic applications
Often the most difficult issue in deciding whether to switch to OODL development is what to do with a substantial body of code developed in a more conventional language. Happily, many OODL environments provide the option of continuing to exploit your conventional libraries by linking them to your OODL code. Facilities that support the reuse of conventional library code are usually called foreign function interfaces or user external interfaces. Most major OODL systems support them, but if you have a body of C or Pascal code that you want to continue to use then you should find out the details about foreign function support before investing in a particular OODL system. You should also be aware that in most cases the OODL runtime system cannot manage data structures allocated by foreign libraries, and so you will have to allocate and free them the conventional way.
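For a sense of what a foreign function interface looks like in practice, here is a sketch using Python's standard `ctypes` module to call the C library. It is an illustration only, not the interface of any product discussed in this article, and it assumes a POSIX system on which the C runtime can be located.

```python
import ctypes
import ctypes.util

# Locate the C runtime. If find_library returns None, CDLL(None) uses the
# symbols already loaded into the process (an assumption that holds on
# POSIX systems, not on Windows).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the foreign signatures so arguments and results convert correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

# Calling existing C code from the dynamic language.
length = libc.strlen(b"dynamic")

# Memory obtained from the foreign library is invisible to the garbage
# collector, so the program must release it the conventional way.
block = libc.malloc(32)
if not block:
    raise MemoryError("malloc failed")
try:
    ctypes.memmove(block, b"foreign\0", 8)
    copied = ctypes.string_at(block)
finally:
    libc.free(block)
```

The explicit `free` call is the point of the example: the runtime manages its own objects automatically, but the block allocated by the C library must be freed by hand.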
Direct ports
A direct port is a simple recoding of a program in a new language. For small- to moderate-sized projects a direct port is often the simplest way to move from conventional languages to an OODL. Often the new program will be more succinct and somewhat better designed through a combination of the abstraction features provided by the OODL and the revealing effect of simply reading through all the code.
There are two disadvantages that make a direct port less than ideal, particularly for very small or very large applications: first, recoding the same algorithms the same way in an object-oriented language is almost certainly not the best use of the features of the language; second, it may be faster and easier for a knowledgeable programmer to recode the application from scratch, using the previous implementation as a specification, and perhaps reusing some of the old code through a foreign function interface, than to try to duplicate the old code in the new language.
Nevertheless, there are advantages to direct ports. For one thing, a new OODL programmer need not understand the subtleties or style of the new language to make a direct port; it is enough to learn the syntax and the semantics of basic operations and control structures. It can be particularly appropriate to port a small program in preparation for porting a larger one because the small port can help you understand the facilities available in the new language.
Redesigning for dynamic languages
It is not always necessary or desirable to redesign an application for implementation in a new language, but fairly often the very considerations that led you to switch to OODL development may influence you to redesign part or all of your application as well. For example, the best choice of data representation may be significantly different in the presence of automatic memory management. Or, as another example, you may be able to take advantage of highly optimized library code by changing a representation to use a standard library class.
Aside from performance improvements, the OODL's support for procedural and data abstraction may provide an opportunity to improve the API of one or more subsystems, or to reorganize in terms of well-established conventions. Such reorganizations can be very helpful in improving a program's maintainability.
Finally, in the course of re-thinking the design of an application to best exploit the power of an OODL you will often discover generalizations that greatly increase the flexibility of your application, or that improve its maintainability. You may even notice features that are easy to add in the new language that you wouldn't have considered implementing in the old one.
Choosing a language
Between the decades of effort researchers have devoted to improving implementation techniques for object-oriented and dynamic languages and the renewed interest in OODLs over the past few years, the prospects for commercial OODL development are better than they have ever been. There are a variety of languages and environments available to the commercial developer. You should first consider your reasons for moving to OODL development and the constraints on the size and performance of your application, and then examine the available development tools with these considerations in mind.
Here we briefly describe just a few of the most prominent OODL environments that are available for the Macintosh. Each product listed below implements a programming language and development environment that supports interactive development, incremental compilation, and automatic memory management. Each one provides debugging and other development tools as part of the environment. Each language also supports object-oriented programming, though MacScheme does not provide classes or inheritance.
Static languages with dynamic libraries
- CLISP is a library of C functions that add Lisp features to any ANSI C application.
$349 from Drasch Computer Software, 187 Slade Road, Ashford, Conn. 06728, (203) 429-3817, fax (203) 429-3817
- The Dynamic Programming Library is a C++ library that implements garbage-collection, symbolic programming, and other Lisp-like features in C++.
Chestnut Software, Inc., 2 Park Plaza, Boston, MA 02116 (617) 542-9222, fax (617) 542-9220
- OIC is a portable C library that implements garbage-collection and a CLOS-like object model with classes, multiple-inheritance, and generic functions in ANSI C.
Free from the Berkeley Macintosh Users' Group (BMUG) (415) 849-9114
Object-oriented dynamic languages for the Macintosh
- Macintosh Common Lisp is Apple Computer's implementation of the Common Lisp language, and includes the Common Lisp Object System, an object-oriented dynamic language.
$495 from APDA, Apple Computer, Inc., 20252 Mariani Avenue MS 33G, Cupertino, CA 95014 (800) 282-2732 (USA) or (800) 637-0029 (Canada) (408) 562-3910
- Objectworks/Smalltalk is the ParcPlace implementation of the Smalltalk language and development environment, a descendant of the original Xerox Smalltalk-80. It supports platform-independent application development and delivery on a variety of UNIX platforms as well as Macintosh and Microsoft Windows 3.0. Objectworks applications use Macintosh-style windows, but the user-interface and graphics sacrifice some standard Macintosh features in order to provide platform-independent functionality.
$3500 from ParcPlace Systems, 999 E. Arques Avenue, Sunnyvale, CA 94086 (408) 481-9090, fax (408) 481-9095
- Smalltalk/V Mac is Digitalk's implementation of the Smalltalk language and development environment. The class library differs somewhat from the ParcPlace Smalltalk implementation, and the support for the Macintosh user interface is closer to Macintosh standards.
$199.95 from Digitalk, Inc., 9841 Airport Blvd., Los Angeles, CA 90045 (213) 645-1082 AppleLink: DIGITALK
- SmalltalkAgents is a new implementation of Smalltalk for the Macintosh. Its implementors have taken pains to provide strong support in the language for standard Macintosh data structures and functionality, supplying classes that support all the standard Macintosh resource and other data types. Quasar Knowledge Systems, the developer of SmalltalkAgents, provides a delivery option in which a small application (perhaps 50 to 100K) is supported by a large (about 2 megabytes) runtime kernel. This approach has the advantage that more than one SmalltalkAgents application can run without significantly increasing RAM use.
$495 (introductory price) from Quasar Knowledge Systems, Bethesda, MD 20814
- Object Logo is an object-oriented implementation of the Logo programming language. Although Logo is usually thought of as a language for teaching programming concepts to children, it is in fact a fully-functional dynamic programming language, capable of application development. Object Logo adds the concepts of classes and methods to the Logo language, and delivers an object-oriented dynamic language with incremental compilation, Macintosh toolbox support, and the ability to generate stand-alone applications.
$195 from Paradigm Software, Inc., P.O. Box 2995, Cambridge, Mass. 02238 (617) 576-7675, fax (617) 576-7670.
- Prograph is a new programming language based equally on object-oriented and dataflow concepts. A program in Prograph is a set of class and method definitions, in which methods are defined in terms of dataflow and transformations of data. In addition, Prograph uses a graphical, iconic syntax for representing programs. Prograph is an OODL with automatic memory management, dynamic typing, and a rich library of classes designed to support Macintosh development.
$495 from TGS Systems, 2745 Dutch Village Road, Suite 200, Halifax, Nova Scotia, Canada B3L 4G7 (902) 455-4446, AppleLink: TGS.SYSTEMS
- Quintus MacProlog++ is a Prolog compiler and development system that extends the Prolog language with support for class hierarchies and method definitions. MacProlog++ runs on top of Quintus MacProlog, which includes graphical browsers, inspectors, and debuggers and an incremental compiler.
$595 for MacProlog and $495 for MacProlog++ from Quintus Corp., 2100 Geng Road, Palo Alto, CA 94303, (415) 813-3800, fax (415) 494-7608.
- MacScheme+Toolsmith is an implementation of the Scheme programming language. Scheme is a dialect of Lisp developed at MIT in the 1970s. MacScheme includes an idiosyncratic object system with message-passing and polymorphism, but without classes or inheritance. The MacScheme+Toolsmith development system includes full support for the Macintosh Toolbox, an editor, compiler, and debugging tools that are somewhat more rudimentary than those of most other products listed here, but it is one of the most space-efficient OODL systems, able to generate applications of less than 100K in size. MacScheme provides no support for foreign functions, but does include support for procedures written in machine code.
$495 from Academic Computing Specialists, 2015 East 3300 South, Salt Lake City, Utah 84109 (801) 484-3923, fax (801) 441-5015.
OODL systems listed that support foreign functions
- Macintosh Common Lisp
- Smalltalk/V Mac
- Quintus MacProlog++
Regardless of whether you decide to port a particular application, you may want to examine one or more OODL development systems. Object-oriented dynamic languages are appropriate now for certain classes of application, particularly those in which flexibility, maintainability, runtime robustness, or time-to-market are more important than code size. As machines continue to grow faster and more capacious, and as applications developed with conventional languages grow larger, the advantages of OODLs in fast development and easy redesign grow more compelling. With newer implementations reducing the RAM footprint and further improving execution speed, the time will probably come before too much longer when an object-oriented dynamic language is the right choice for developing your application.