September 93 - Reflection
Many Object-Oriented Dynamic Languages (OODLs) provide support for a facility called reflection. Reflection in programming is the process of using programming language operations to examine parts of the program and make decisions about what to do on the basis of the information gathered. Examples of reflection in programs include asking an object what its class is, asking a class what its superclasses are, and asking an object whether it responds to a particular message.
More traditional static languages such as Object Pascal and C++ provide no support for reflection. Such support would be inconsistent with the minimalist philosophy that the static languages have regarding the language runtime; reflection requires that support for certain language-level features be present in the program at runtime, and static languages avoid mechanisms that require such support because of the overhead associated with them.
Reflection, therefore, is naturally more commonly supported in programs written in dynamic languages, and particularly in OODLs. Reflective operations serve natural and useful purposes in object-oriented programs, and so OODLs such as Smalltalk, CLOS, and Dylan have always provided reflective operations as part of their standard semantics. More recently, reflection has become a subject of research as programming-language theorists try to understand how much and what kind of support for reflection is practical and useful.
A language that supports reflection provides operations on language elements that return information useful to a running program. The program can then use the returned information to control its execution. Information provided by reflective operations might include the data type of an object, the name, superclasses, and defined methods of a class, the methods defined as specializations of a generic function, and so on.
A common use of reflection is in providing programming tools. It should be obvious that a language in which a program can query runtime data structures about their structure makes it easier to write tools such as browsers, inspectors, and editors. As an example, it is quite a bit easier to query an object for its list of superclasses than to write a tool that searches and maintains a database of header definitions in order to present an object's superclass list to the programmer. Commonly used OODL development environments provide tools that make extensive use of reflective facilities.
We will consider several other examples of the use of reflection in the following sections.
Examples of Reflection
Reflection is a somewhat abstract idea; it may be hard to see what concrete benefits it offers a working programmer. Like many unfamiliar features of dynamic programming languages, reflection is probably best explained by presenting examples. We'll use the reflective facilities of the languages Smalltalk and Dylan to illustrate the uses of reflection.
Smalltalk programmers have several reflective facilities available to them. The message respondsTo: aSymbol asks an object if it has a method whose name is aSymbol. The message isMemberOf: aClass asks whether an object is an instance of the specified class. The message isKindOf: aClass asks the receiver, a class object, whether it is a subclass of the specified class.
One way to enhance the configurability of an application or system is to package various parts of its behavior in separate objects. A Smalltalk program can use Smalltalk's reflective facilities to support a delegation scheme for distributing func-tionality among objects. When an object receives a request for a service it does not implement, it can pass the request to another object. This re-routing of requests is called delegation. Reflective facilities can simplify the implementation of a delegation scheme by making the calling logic simple. Let's examine several alternatives that would enable us to provide a flexible software module, and see what advantages reflection can offer.
Suppose you want to provide a terminal emulation object that offers programmers the ability to incorporate terminal or message windows into their applications. You know the general requirements of terminal windows, and in fact they are fairly stereotypical, but you also know that the range of applications in which programmers might want to use them is large and so you expect that users of your terminal object will want to customize its behavior for their own uses.
One solution that is fairly widely used is to heavily parameterize the behavior of the object. The Macintosh Toolbox and X Windows are both examples of systems in which the behavior of every major component is parameterized by a large number of configurable options. Problems with this approach are that, first, it is difficult to foresee the needs of users well enough to provide adequate parameterization for all reasonable uses, and, second, that the large number of options tends to make such components confusing and difficult to use correctly. There are simply so many parameter values to specify in a well-parameterized example of this kind of component that it's hard to get all the values right for any particular use.
Another approach that solves many of the problems of the parameterized approach is to provide an abstract class that encapsulates the most general useful behavior and permits its users to customize its functionality by writing subclasses. This approach has great advantages, but it also has some problems. In particular, a user of such a class may end up overriding enough of the class definition such that he or she has to rewrite substantial parts of the class just to make minor additions to its behavior. In addition, this approach some-times leads to unfortunate proliferation of specialized subclasses, most of which make only minor additions or changes to their ancestors. Users of static object-oriented frameworks can appreciate how difficult it is to design sets of classes that interoperate adequately with user-defined subclasses.
In some cases problems like these can be addressed by designing classes to use delegates that implement additional behavior. In our terminal emulation example, we could design our object to check for the existence of a delegate and, if it finds it, to call methods that we provide as hooks for extensibility. For example, our terminal emulation object could call a post-initialization method to let the delegate know that it started up and was initialized, character-entered and other editing methods to give the delegate a chance to preprocess input and to respond to changes in the text buffer, line condition methods to permit the delegate to handle status changes in the connection to a communications or interprocess communications stream, and so on.
A scheme like this can make our terminal object very flexible at somewhat the same cost as the parameterization strategy. We can make it much easier to use, however, using reflective facilities. If we start up the terminal object without a delegate then it can omit all its messages to the delegate. Alternatively, we could define all the delegate methods on the object nil, making them no-ops. The object nil is the value of uninitialized instance variables, and so method calls with no delegate present would do nothing.
When a delegate object is present we can use respondsTo: to query it before sending each delegate message, ensuring that we only send those messages that are implemented by the delegate. With this approach we enable users of our terminal object to extend its behavior by implementing only those methods that define the needed behavior; they don't need to provide definitions for any other methods, because when respondsTo: returns false the terminal object skips the corresponding message.
In this case we have used reflection to extend the capability of a general purpose terminal object, while keeping its API as simple as possible. Notice that if we use the delegation strategy along with Smalltalk's reflective facilities we can ensure that the API is as simple as a terminal object with no extensibility in the case where we want to use it just as it is. In the cases where we want to extend the behavior of the object the API only becomes as complex as we need for the behavior we want to modify. We never need to see any part of the API except what we want to use, which is a considerably simpler approach than that of the parameterized object, and even simpler than the abstract superclass strategy, since we can use the terminal class directly in many cases rather than needing to define a concrete subclass. Note that the delegation approach also permits delegates to centralize related behaviors that are used by several objects in an application; many related objects that need to provide extensibility can share a single delegate object at runtime.
Dylan and CLOS
Dylan, a new OODL, uses an object model that is more like that of the Common Lisp Object System (CLOS) than Smalltalk. In the Dylan and CLOS object systems, classes may inherit from several ancestors and may select methods using several objects at once to discriminate among alternatives. Instead of message-passing, Dylan and CLOS rely on generic functions, whose behaviors are determined by the classes of their parameters. In these languages, then, we use reflective functions, rather than reflective messages, to gather information about the data and functions in a program.
Dylan includes reflective functions for examining objects, classes, and functions. You can determine the class of an object by calling object-class, and can test whether an object is an instance of a particular class using instance?. The function direct-superclasses returns a sequence of all the classes that the specified class inherits from directly, and all-superclasses returns a sequence of all classes from which it inherits.
Because Dylan uses generic functions instead of message-passing, polymorphic operations are selected by one or more parameters to a function, not by a message receiver. As a result, there is no precise equivalent in Dylan to the Smalltalk respondsTo: message. Instead, Dylan provides the function find-method. When you pass the name of a function and a list of classes to find-method, it returns the method associated with the function that corresponds to the given list of classes. For example, the following code returns the Dylan method that adds two instances of <small-integer>:
(find-method binary+ (list <small-integer> <small-integer>))
The Dylan function find-method can be used in the same way as the Smalltalk respondsTo: message; if there is no method that corresponds to a particular list of classes then it returns the Boolean false value, enabling the programmer to conditionalize code in the same way as with the Smalltalk mechanism. Dylan, however, treats functions and methods as first-class data values, and so can provide some additional reflective functions. The function generic-function-methods returns a sequence of all the methods defined on a generic function. method-specializers returns a sequence of the specializers for a particular method. A method's specializers are the objects used to select the method when its generic function is called; in most cases this sequence will consist of classes, but in Dylan it is possible to specialize a method on an individual instance, and so method-specializers may in some cases return objects that are not classes. function-arguments returns three values: the number of arguments required by the specified function, a Boolean that indicates whether the function can accept additional optional arguments, and a (possibly empty) sequence of keywords used as keyword parameters, or Boolean false if the function does not accept keyword parameters. applicable-method? returns true if there is a method defined on a function that corresponds to a specified set of sample arguments, and sorted-applicable-methods returns all methods that could apply to a set of sample arguments.
We can imagine an event-handling system similar to that of the Macintosh, but which provides support for extended sets of events using the reflective capabilities of Dylan. If the event system were implemented in an object-oriented fashion then we would expect the event queue to be filled with objects of various event classes. We handle events by removing them from the queue and passing them to a generic event handler which accomplishes the same purpose as the Macintosh event handler. We could define an additional queue of handlers that would store event-handling functions. When the generic handler gets an event it can extract the parameter information from the event and search the handler queue for a function that is applicable to the event parameters; it can determine whether a particular handler is applicable by using the applicable-method? or sorted-applicable-methods function.
The event-processing loop might look something like the following Dylan code (oversimplified, of course, for clarity's sake). First we define the function that finds an appropriate handler:
(define-method find-event-handler (type time
(any? (method (handler)
(applicable-method? (object-class type)
The function any? returns the first element of the sequence *event-handler-queue* for which applicable-method? returns a true value. Our function therefore searches the event-handler queue, returning the first handler that is appropriate to the event parameters that we pass to it.
Following is the actual event loop:
(define-method event-loop ()
(for ((continue #t (test-whether-to-continue)))
((not continue) 'done)
(bind ((current-event (pop-event-queue))
(handler (find-event-handler event-type
(apply handler current-event)
(apply *default-event-handler* current-event)))))
Our event loop uses the special iterating form for to establish an infinite loop that exits when something sets the variable continue to a false value. The bind form creates the local variable current-event to hold the first item in the event queue, and then binds local variables to the parameters returned by the event-params function. It then binds handler to the result of a call to our find-event-handler function. Finally, if find-event-handler found an appropriate handler then the if form applies that handler to the current event; otherwise it applies the default system-defined handler.
An event system like this one could be quite flexible and extensible; programmers would have a great deal of freedom to define new event types and handlers, and applications could enable users to add new event types to be handled by application-defined handlers or user scripts.
Dylan's object system was heavily influenced by the design of CLOS, and CLOS provides much the same reflection support as Dylan, although under different function names. In fact, CLOS goes further than Dylan in providing facilities for redefining class and function structure. For example, the CLOS generic function change-class enables the programmer to dynamically change the class of an instance, and the reinitialize-instance function re-executes the object's initialization code.
Browsers and Inspectors
Browsers and inspectors, programming tools that were invented for use with LISP and Smalltalk environments, have become familiar tools to programmers working in static languages such as C and Pascal. They are very helpful in managing complex programs and in identifying problems in working code.
Reflection provides a straightforward language-level facility for examining program structures in the runtime. Browsing and inspecting tools for C and Pascal usually use separately maintained metadata files such as symbol tables and interfaces, but because OODLs can return useful information from a query about object classes, inheritance relationships, defined methods, and so on, browsers and inspectors can simply query the runtime data structures in place. In addition, it's easy to make OODL inspectors and browsers interact directly with the data structures, providing live updating and editing in place (such facilities are also possible in static languages, although, if your programs run under a system that protects memory, such as UNIX, then you may have to link special libraries to your application in order to get support for editing in place).
In the previous sections we discussed in some detail some examples of how you might use reflection to advantage. In this section we will just briefly consider a few more examples of how reflection can be useful.
One practical use of reflection is in a programming technique called data-directed programming: data-directed programs structure the flow of control in terms of tables of alternative executions. The program selects a particular execution path based upon the data it receives as input. Programs that map input data to output data for example, can use data-directed techniques to organize execution in terms of the type of data object that appears on input. By using reflective facilities to identify the input objects and organizing execution as a table whose entries are keyed on the type of object a programmer can simplify the runtime logic of the program. In addition, data-directed programs are often easier to maintain and extend than programs that use more explicit control logic because the programmer can simply add or change table entries in order to modify the program or add new features.
Another practical use of reflection is in the implementation of error-handling code. OODLs often provide good support for error-handling in the form of signaling or condition systems. These facilities make it easy to properly handle the transfer of control in the case of an error or exception, but there is still the problem of what to do once control has been transferred out of the context in which the exception occurred. Reflective facilities can help by providing a way to examine the relevant data. Using reflective facilities an error-handler can determine the type of an object and other information about it. Depending on the specific nature of the reflective facilities, it may also be able to determine whether the function in which the exception occurred was the correct one for the data provided as input, or other such relevant information. Using reflection, an error-handling system can provide a programmer with substantial information about how an error occurred and in what state it left the runtime. The error-handling system can also have more flexibility in handling errors in a delivered product so that, even when errors occur, they are less likely to cause the program to fail. In this way reflective capabilities can help programmers to deliver more bulletproof applications.
System designers and application designers whose applications provide interoperation and extensibility features must concern themselves with the public points of entry offered to other programs and systems. These public points of entry may be regarded as protocols by analogy with network protocols or with protocols in object-oriented languages such as Smalltalk.
Reflective features in a programming language offer another level of protection for a running system in that it can use them to enforce its protocols. Rather than relying upon a client's adherence to the published protocols, a system with reflective capabilities can actually query its clients to test whether they actually obey the protocols. If a protocol consists of such and such functions operating on such and such types then the system can enforce its protocol by asking clients what their types are and whether they implement the required functions.
In addition, a system could support several alternative protocols for reasons of compatibility or convenience. In this case the reflective facilities provide a way for the system to determine which protocol is needed by a client and to ensure that the client actually supports it.
The cost of reflection
Of course, support for reflection must cost something. Whenever we add capabilities to a software system we must pay either in space or in CPU cycles or both. In the case of reflection, we pay in both ways. The cost might be high or low, depending upon how you look at it; most importantly, perhaps, your assessment of the cost will depend upon whether you need other features of dynamic languages besides reflection. Much of the runtime support used by reflective operations is also used by other OODL features such as automatic memory management and dynamic polymorphic dispatching. If you have concluded that you need those features then reflection is very cheap. If, on the other hand, you have no need of those other features then the price of reflection may seem very high.
Reflective operations must have information to work with, and, as we would expect in a dynamic language, that information must be stored in the runtime. For example, Smalltalk's respondsTo: methods must have some place to look to determine whether the symbol parameter names a method on the receiving object. This fact implies that the runtime must store tables that map symbols to methods. Similarly, Dylan's direct-superclasses function must have some place to look to determine the classes that a specified class inherits from, and must even distinguish direct superclasses from indirect.
As it turns out, the semantics of Smalltalk and Dylan require that classes are first-class data objects anyway, and so the use of either language implies that there will be class objects in the runtime. If you are using one of the languages already then the fact that reflective operations need such class objects is a happy coincidence. In such cases the price of reflection is very low, though you might not choose them just for their reflective capabilities.
The requirement that the runtime include such metainformation about the program may seem like extravagance to those accustomed to static languages. Parsimonious C programmers might be reluctant to admit to a need for any significant runtime overhead at all, but most authors of significant applications will probably agree that a certain amount of such overhead is inevitable anyway in any non-trivial application. The interesting question for a developer is whether
the particular overhead for a particular language or library is justified in context of a particular application.
As you can see, determining the exact cost of reflective capabilities is not very easy, because they use runtime features that are present in OODLs anyway. If we removed the overhead of features that support reflection then we would be removing substantial parts of the OODL languages along with them. The only way, then, to measure the real cost of reflection is to compare the size and speed of a given application written in a reflective OODL to that of a comparable one written in a language like C or Pascal. In general we can say that if you expect your application to weigh in at around 10 to 100 Kbytes and to be absolutely blazing fast then the price of reflection is probably too high. If, on the other hand, you expect your application to be in the range of 400K to two megabytes, and you expect the performance to depend more on correct choice
of algorithms than on the absolute number of instructions in each function, then languages that support reflection will probably be realistic options. The area in-between is more problematic; the question of whether to use reflective languages will come down to a tradeoff between absolute execution speed and program reliability.
You might decide that for a particular application reflective capabilities simply cost too much in terms of space and CPU time. In such cases it may be worth your while to approximate reflective features by implementing some portion of reflection as a supplement to your application code. Depending on your requirements your solution might vary from a few functions that work on a certain data type to an elaborate and fairly complete reflection facility (although if you find yourself implementing something at this scale then you might consider whether it would have been better just to use a language with reflective facilities built in).
Your first decision will be whether to implement full reflection or just a simple subset tailored to a particular use. We will assume here that you want to implement a subset of full reflective capabilities because if you need full reflection it is much easier just to use an OODL for your project.
One reasonable approach to providing just enough reflection for an application is to implement a set of classes that represent the data types central to your application and include in them support for reflective capabilities. For example, if you are working in C++ then you can implement a class, call it MReflector, that supports reflective operations much like those of Smalltalk. You can use MReflector as a mixin, taking advantage of C++ multiple inheritance support to combine MReflector with any other inheritance relationships your classes need. (A mixin is a class that implements a small, specific bit of functionality, designed to supplement other classes by being added to their inheritance graphs. The concept originated in the Flavors object-oriented programming system for Lisp, and continues to be widely used in CLOS and other Lisp-like object systems). MReflector can support the three Smalltalk reflective operations, respondsTo:, isMemberOf:, and isKindOf:, but you will have to ensure that objects that inherit from MReflector are initialized with the appropriate values so that your versions of these operations return the right values. Instances of each reflective class will need to have access to a list of the identifiers that make up its public protocol so that your version of respondsTo: can test whether a given identifier is in the protocol. They will also need access to a list of superclass identifiers so that your versions of isMemberOf: and isKindOf: can test whether the an instance belongs to the specified class or whether a class inherits from another class.
Once you have support for these operations built into an MReflector class you may have the feeling that you have the best of both worlds: you have reflective capabilities, but you don't have to pay the overhead of an OODL. Just to keep things in perspective, however, you should remember, first of all, that the difference in space and speed cost between C++ and modern OODLs is measured in small percentages rather than orders of magnitude for most large programming projects. Secondly, although you have an approximation of reflective capabilities, it works only on those classes that inherit from MReflector , and you have to manage initialization of the instance explicitly or else the reflective operations will return incorrect results. In an OODL the reflective facilities work correctly on all objects automatically.
Objective-C, an object-oriented extension to C, provides the same three reflective messages provided by Smalltalk. Although not a fully dynamic object-oriented language, Objective-C implements some useful features of OODLs, while retaining most of the efficiency of C. The design of Objective-C is one good example of a way to provide select parts of the features of an OODL when you need or prefer to work in a static language. Because the reflective operations are defined on Object, the common ancestor of all Objective-C classes, Objective-C avoids the problem of reflection that works only on select classes.
The Nextstep application framework, which was written in Objective-C, makes extensive use of a delegation mechanism like that we described in the Smalltalk section. By building a flexible delegation mechanism into the application framework using Objective-C's reflective features, the Nextstep engineers made it easy for third-party developers to incrementally change and enhance the framework's standard behavior.
This article presents only a brief introduction to a few simple but useful reflective facilities. Reflection in programming languages is a whole field of study for programming language theorists. Reflection is acknowledged to be a useful tool for working programmers, but it isn't clear yet how much reflection is practical; some experimental languages go to extremes, making reflection central to the design of most programming idioms. It will be a while before the results of such research directly affect most working programmers, but in the meantime we can take advantage of some reflective capabilities where they are appropriate. OODL programmers, especially, can benefit from the built-in support for reflection offered by object-oriented dynamic languages.