July 93 - James Coplien’s Advanced C++
Advanced C++ is an excellent book for the C++ programmer who is interested in object-oriented programming.
It covers the traditional issues of software design and reuse in an object-oriented programming language. But Coplien goes further, taking the model of more dynamic languages and applying it to C++. He shows how one can design conventions in C++ that support incremental development, garbage collection, runtime polymorphism, and dynamic object loading. The book is filled with a diversity of ideas and topics, programming techniques and idioms: food for thought for the interested reader.
One is confronted with a myriad of C++ books today. They sit on the bookshelf in the Computer Languages section of the bookstore, and most of them look the same. This problem seems to plague all technical books, especially those on programming languages. How can one evaluate one of many books about an unfamiliar language? This is the dilemma I was confronted with last year at the bookstore. After selecting Stroustrup's C++ book (the new edition), I wanted to have another reference book, so I picked one called Advanced C++ by James Coplien. After all, most language books seem to stop teaching the language just as it becomes interesting, so even if Coplien's book was slightly advanced, it would be a good choice. Besides, my employer was buying.
Later, as I read the book, I found that I became completely lost during the fifth chapter. Obviously this book would require a more thorough reading. Coplien's book is about C++, but it teaches object-oriented programming technique as well. With C++, Coplien ventures into many of the aspects of experimental, dynamic languages such as Smalltalk and Self. By carefully reading and re-reading the chapters, often pausing to ponder over the examples, I began to comprehend them. As I learned C++, chapters of the book began to make more sense. This is one characteristic of a good book: it grows with the reader.
Chapter 1, "Introduction," introduces the design philosophy of C++ as well as a short history. Coplien explains how C++ was conceived and how it grew over the years as it was used. Object-oriented programming has grown with C++, and there has always been a tension between keeping the language simple and adding new features. For example, rather than add new keywords, many of the existing ones, such as static or virtual, were reused. This kept the number of keywords low, but added to the complexity of the compiler and made the language harder for programmers to learn. Another tension in C++ is that of providing for the programmer's needs with language features without restricting the implementation of the language or the compiler.
Chapter 2, "Data Abstraction and Abstract Data Types," is a class-oriented introduction to C++. Classes, constructors and destructors, initialization, scoping, const, and pointers to members are introduced in this chapter. Many of the differences between C and C++ that are not strictly object-oriented in nature are reserved for Appendix A.
Chapter 3, "Concrete Data Types," describes how to write objects that behave as built-in data types. Scoping and access control, the semantics of overloading, operators, and type conversions are discussed. Coplien also introduces the idea of reference counting along with several implementations: using handle classes or counted pointer objects. Bootstrapping reference counting to an existing class is also covered. The orthodox canonical form, introduced in the beginning of the chapter, is a commonly used format for objects in C++.
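To make the two ideas concrete, here is a minimal sketch of a reference-counted handle class written in the orthodox canonical form (default constructor, copy constructor, assignment operator, destructor). It is not taken from the book: the names String, StringRep, and use_count are my own illustrative assumptions, and it uses modern C++ syntax rather than the 1992 dialect.

```cpp
#include <cassert>
#include <cstring>

// Shared representation: owns the characters and counts the handles using it.
// (Hypothetical name; not Coplien's code.)
class StringRep {
public:
    explicit StringRep(const char* s)
        : count(1), text(new char[std::strlen(s) + 1]) {
        std::strcpy(text, s);
    }
    ~StringRep() { delete[] text; }
    StringRep(const StringRep&) = delete;             // the rep itself is never copied
    StringRep& operator=(const StringRep&) = delete;

    int count;   // number of String handles sharing this rep
    char* text;
};

// Handle class in orthodox canonical form.
class String {
public:
    String() : rep_(new StringRep("")) {}                       // default constructor
    String(const char* s) : rep_(0) {
        assert(s != 0);
        rep_ = new StringRep(s);
    }
    String(const String& other) : rep_(other.rep_) {            // copy constructor
        ++rep_->count;                                          // share, do not duplicate
    }
    String& operator=(const String& other) {                    // assignment operator
        ++other.rep_->count;   // increment first so self-assignment is safe
        release();
        rep_ = other.rep_;
        return *this;
    }
    ~String() { release(); }                                    // destructor

    const char* c_str() const { return rep_->text; }
    int use_count() const { return rep_->count; }

private:
    // Drop this handle's claim on the rep; the last handle out deletes it.
    void release() {
        assert(rep_->count > 0);
        if (--rep_->count == 0) delete rep_;
    }
    StringRep* rep_;
};
```

Copying a String only bumps a counter, so objects of this class can be passed and returned by value as cheaply as a built-in type, which is the point of the chapter.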
Chapter 4, "Inheritance," is an explanation of single inheritance. Some of the semantics of object inheritance, such as the order of instantiation and initialization, member access (public/protected/private), pointer conversion, and passing parameters to base class constructors are considered. The concept of a virtual function is reserved for chapter 5; instead Coplien implements a set of classes with type selector fields. This provides a good contrast with the next chapter, where the same example is rewritten using virtual functions to eliminate the type member variable and corresponding case statements.
Chapter 5, "Object-Oriented Programming," a long chapter, introduces many new ideas. This is where the terminology becomes complicated. Coplien will introduce an idea with its term, such as a virtual function, and will go on to build even more abstract ideas from it. He spends several sections explaining the idea of a virtual function and its purpose, including some explanation of the runtime penalty for invoking one. Many other C++ books will gloss over this idea with a half-baked example. Advanced C++ treats the topic thoroughly and is careful to explain the implications and restrictions of virtual function calls.
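The contrast between the two chapters can be sketched as follows. This is my own small illustration, not the book's example: TaggedShape, Shape, Square, and Triangle are assumed names. The first half uses a type selector field and a switch; the second half replaces both with a virtual function.

```cpp
#include <cassert>

// Style 1 (as in Chapter 4): a type selector field plus a switch statement.
struct TaggedShape {
    enum Kind { SQUARE, TRIANGLE };
    Kind kind;
    double base;
    double height;   // unused for squares
};

inline double area(const TaggedShape& s) {
    switch (s.kind) {
    case TaggedShape::SQUARE:   return s.base * s.base;
    case TaggedShape::TRIANGLE: return 0.5 * s.base * s.height;
    }
    assert(false && "unknown kind");   // every new kind must be added to the switch
    return 0.0;
}

// Style 2 (as in Chapter 5): the selector and the switch disappear;
// the call is dispatched through the object's virtual function table.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

class Triangle : public Shape {
public:
    Triangle(double base, double height) : base_(base), height_(height) {}
    double area() const override { return 0.5 * base_ * height_; }
private:
    double base_;
    double height_;
};
```

In the second style a new shape is added by writing a new class; no existing function has to be edited. The price is one indirect call per invocation, which is the runtime penalty Coplien discusses.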
Virtual destructors (why they are useful) and scoping are included. Coplien then uses the idea of a virtual function to introduce the idea of a pure virtual function as a mechanism to create an abstract base class. The Envelope/Letter idiom is reintroduced as a powerful way to extend the idea of a single class. By using two classes, one that contains and controls access to the other, more dynamic functionality can be achieved. It is reimplemented using the new concepts of inheritance and delegation using virtual functions.
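A minimal sketch of the Envelope/Letter arrangement, as I understand it, looks like this. The class names (Number, RealNumber, ComplexNumber) and the clone helper are my own assumptions, and the code is not hardened for production use. The envelope is also the base class of its letters, and every envelope operation is delegated to the letter through a virtual function.

```cpp
#include <cassert>
#include <string>

// Envelope: the only class the user names. It is also the base class of
// its letters, so every operation can be forwarded through virtual calls.
class Number {
public:
    Number(double re, double im);                     // picks a letter at runtime
    Number(const Number& other) : rep_(other.clone()) {}
    Number& operator=(const Number& other) {
        Number* fresh = other.clone();                // copy first: safe for self-assignment
        delete rep_;
        rep_ = fresh;
        return *this;
    }
    virtual ~Number() { delete rep_; }

    // Envelope versions delegate to the letter; letters override them.
    virtual std::string kind() const { return letter().kind(); }
    virtual double real() const { return letter().real(); }

protected:
    Number() : rep_(nullptr) {}                       // letters hold no nested letter
    virtual Number* clone() const { return letter().clone(); }

private:
    const Number& letter() const {
        assert(rep_ != nullptr);
        return *rep_;
    }
    Number* rep_;
};

// Letters: the concrete representations hidden inside the envelope.
class RealNumber : public Number {
public:
    explicit RealNumber(double v) : value_(v) {}
    std::string kind() const override { return "real"; }
    double real() const override { return value_; }
protected:
    Number* clone() const override { return new RealNumber(value_); }
private:
    double value_;
};

class ComplexNumber : public Number {
public:
    ComplexNumber(double re, double im) : re_(re), im_(im) {}
    std::string kind() const override { return "complex"; }
    double real() const override { return re_; }
protected:
    Number* clone() const override { return new ComplexNumber(re_, im_); }
private:
    double re_;
    double im_;
};

// The envelope constructor examines its arguments and chooses the letter.
inline Number::Number(double re, double im)
    : rep_(im == 0.0 ? static_cast<Number*>(new RealNumber(re))
                     : static_cast<Number*>(new ComplexNumber(re, im))) {}
```

Because the letter can be replaced behind the envelope, an object can appear to change its class after an assignment, something plain C++ objects cannot do.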
The chapter continues by giving several implementations of virtual constructors. There is also an extension to the Envelope/Letter idiom to allow variable-sized objects, and one for delegation by overloading the "->" operator. Functors, objects that behave like functions (by overloading the "()" operator), are introduced with a good example, and the chapter ends with a discussion of the uses and pitfalls of multiple inheritance. This is a good introduction to some of the problems created by allowing multiple inheritance in a flexible language such as C++. Casting a pointer to a multiply-inherited object can, in some situations, change the value of the pointer. This is a potentially nasty surprise for those native C programmers who are accustomed to loose pointer casting conventions.
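Two of these points are easy to show in a few lines. The sketch below is mine, with assumed names (RunningSum, Persistent, Drawable, Widget): first a functor, then a function that checks that the two base-class views of one multiply-inherited object live at different addresses.

```cpp
#include <cassert>

// A functor: an object that is called like a function through operator()
// and can carry state between calls.
class RunningSum {
public:
    RunningSum() : total_(0) {}
    int operator()(int x) { total_ += x; return total_; }
private:
    int total_;
};

// Two unrelated, non-empty base classes and a class that inherits from both.
struct Persistent { int id;    Persistent() : id(0) {} };
struct Drawable   { int color; Drawable() : color(0) {} };
struct Widget : Persistent, Drawable { int size; Widget() : size(0) {} };

// Returns true when the two base-class pointers to the same Widget differ.
// Both bases hold data, so they cannot occupy the same address inside the
// Widget; the compiler adjusts the pointer during the conversion.
inline bool base_views_differ(Widget& w) {
    Persistent* as_persistent = &w;   // implicit conversions, no cast needed
    Drawable* as_drawable = &w;
    // static_cast applies the reverse adjustment and recovers the Widget.
    assert(static_cast<Widget*>(as_persistent) == &w);
    assert(static_cast<Widget*>(as_drawable) == &w);
    return static_cast<void*>(as_persistent) != static_cast<void*>(as_drawable);
}
```

A C-style habit of passing such a pointer through void* and casting it straight to the other base type skips the adjustment, which is exactly the surprise described above.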
Chapter 6, "Object-Oriented Design," introduces some of the theory involved in object-oriented programming. Object relationships and iconic representations for them are shown. There is an excellent discussion of some of the problems of using inheritance to solve type problems. Some designers will use inheritance not because it implies a type relationship, but to gain functionality for an unrelated object. Others will use inheritance with homonymic types, types that share a common set of operations but differ semantically (usually one has more restricted functionality than the other). Coplien distinguishes between proper and improper uses of inheritance, with good examples. Inheritance with addition and inheritance with cancellation are both considered. Coplien also brings up the topic of public data in objects, and relates it to the problems of inheritance with class independence. At the end of the chapter are rules of thumb for subtyping and inheritance.
Chapter 7, "Reuse and Objects," covers some of the software issues surrounding code reuse. Coplien stresses that object-oriented systems must be consciously designed for reuse. Four code reuse methods, from preprocessor macros to templates, are proposed. Often a system that is designed with reuse as its highest priority cannot fulfill the basic goals that were intended, so it is important to consider reuse, but perhaps not as the primary factor. At the end of the chapter are some code reuse generalizations. Reuse is not solely an object-oriented programming issue. In fact, code reuse is more dependent on good documentation and indexing facilities than the language used. A CASE system can enhance reusability for any large software project. While C++ provides several reuse mechanisms, such as inheritance, templates, and macros, reuse must be designed into the class hierarchy. Sometimes reuse and functionality are in conflict with one another. Also, good software engineers should consider the level of generality that the system will support. Class libraries are perhaps the best example of a system designed with reuse as a principal goal. Stroustrup is quoted as saying "Code must be usable before it can be reusable."
Chapter 8, "Programming with Exemplars in C++," introduces the idea of an Exemplar. Exemplars allow more dynamic programming by making the class constructor private (or protected) and providing a "make" virtual function with a global "exemplar" object. To create a new object, one applies "make" to the exemplar. By using a virtual member function that is not a constructor, one can invoke make on any object to clone it, when only a pointer (possibly to a base class) is available. The exemplar can be either a global pointer or a class variable (static member). Exemplars can be used to simulate virtual constructors by having an Abstract Base Exemplar examine the data and return the appropriate new object. By moving all member function signatures to the base class as virtual functions (Inclusion Polymorphism) one can handle objects by using pointers to their base classes. This simulates a more dynamic style of object referencing. Frame-based programming, in which messages are dispatched to subclasses using a single "doit" function, is more dynamic but is subject to performance problems and requires a good error recovery subsystem.
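The basic exemplar arrangement can be sketched as below. This is an illustration under my own assumptions, not the book's code: the Shape hierarchy, the exemplar() accessor (a function-local static rather than a global object), and the make_like helper are all hypothetical names.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual Shape* make() const = 0;         // stands in for a constructor
    virtual std::string name() const = 0;
protected:
    Shape() {}
};

class Circle : public Shape {
public:
    // The one pre-built object that all other Circles are made from.
    static const Shape& exemplar() {
        static const Circle instance;
        return instance;
    }
    Shape* make() const override { return new Circle; }
    std::string name() const override { return "circle"; }
private:
    Circle() {}                              // clients cannot say "new Circle"
};

class Square : public Shape {
public:
    static const Shape& exemplar() {
        static const Square instance;
        return instance;
    }
    Shape* make() const override { return new Square; }
    std::string name() const override { return "square"; }
private:
    Square() {}
};

// Creates a fresh object of the same dynamic type as *prototype, given
// nothing but a base-class pointer. The caller owns the result.
inline Shape* make_like(const Shape* prototype) {
    assert(prototype != nullptr);
    return prototype->make();
}
```

Since the constructors are private, every object is created by applying make to an existing object, and the code doing so never needs to name the concrete class.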
Inclusion Polymorphism is a good idea, but requires more software support to be feasible. Putting the member functions of subclasses in a common class makes a system inflexible and requires substantial recompilation when an interface is changed, making incremental development difficult. Coplien is aware of these problems and gives some program administration tips to help. A large project would require either strict conventions or a source preprocessor. It is important to remember that any software system in C++ will require style and use conventions. As one pushes C++ into the realm of the dynamic, these conventions become important. Abstract base exemplars need some information about their subclasses to implement virtual constructors. The idea of an Autonomous Generic Constructor allows abstract base exemplars to keep a list of their subclasses and return the first one that can successfully construct itself from the input data.
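The registration scheme described in the last two sentences might be sketched like this. It is a simplified illustration of the idea rather than Coplien's implementation; Value, IntValue, WordValue, ExemplarTag, parse, and install_exemplars are names I have assumed.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

class Value {
public:
    virtual ~Value() {}
    // An exemplar tries to build an object from the text; nullptr means "not mine".
    virtual Value* make(const std::string& text) const = 0;
    virtual std::string kind() const = 0;

    // Generic constructor: ask each registered exemplar in turn and
    // return the first object that constructs successfully.
    static Value* parse(const std::string& text) {
        assert(!registry().empty());         // exemplars must be installed first
        for (const Value* exemplar : registry()) {
            if (Value* made = exemplar->make(text)) return made;
        }
        return nullptr;
    }

protected:
    struct ExemplarTag {};                   // selects the exemplar constructor
    Value() {}
    explicit Value(ExemplarTag) { registry().push_back(this); }

private:
    static std::vector<const Value*>& registry() {
        static std::vector<const Value*> list;
        return list;
    }
};

class IntValue : public Value {
public:
    static void install() { static const IntValue exemplar{ExemplarTag{}}; (void)exemplar; }
    Value* make(const std::string& text) const override {
        if (text.empty() || text.size() > 9) return nullptr;   // keeps std::stoi in range
        for (char c : text)
            if (!std::isdigit(static_cast<unsigned char>(c))) return nullptr;
        return new IntValue(std::stoi(text));
    }
    std::string kind() const override { return "int"; }
private:
    explicit IntValue(ExemplarTag tag) : Value(tag), value_(0) {}
    explicit IntValue(int v) : value_(v) {}
    int value_;
};

class WordValue : public Value {
public:
    static void install() { static const WordValue exemplar{ExemplarTag{}}; (void)exemplar; }
    Value* make(const std::string& text) const override {
        if (text.empty()) return nullptr;
        for (char c : text)
            if (!std::isalpha(static_cast<unsigned char>(c))) return nullptr;
        return new WordValue(text);
    }
    std::string kind() const override { return "word"; }
private:
    explicit WordValue(ExemplarTag tag) : Value(tag) {}
    explicit WordValue(const std::string& w) : word_(w) {}
    std::string word_;
};

// Registration order is the order in which exemplars are consulted.
inline void install_exemplars() {
    IntValue::install();
    WordValue::install();
}
```

The base class knows nothing about its subclasses at compile time; a new subclass only has to register an exemplar, which is what makes the scheme suitable for incremental development.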
Chapter 9, "Emulating Symbolic Language Styles with C++," describes how, by using Envelope/Letter classes with Exemplars to create what Coplien calls the symbolic canonical form, one can create a system that supports incremental development, dynamic loading and reloading, and garbage collection. Coplien provides system-dependent code that works on a Sun (SunOS 4.0, it seems) with the AT&T USL C++ Release 3 compiler. To support dynamic loading in C++ with objects, one needs procedures to load new virtual functions and to change object formats. All loadable functions must be virtual, because when the new one is loaded, one edits the virtual-function table (vtbl) and replaces the pointer to the old function with that of the new. One also needs a way of indexing the vtbl, so that the slot for each virtual member function is known. Loading an object with a new data format is more difficult. It requires that one keep a list of all objects and apply a conversion function to them (called cutover) when the new object is loaded. There is also some trickiness with the order in which the operations are performed, and one must ensure that cutover is applied only once for each object. Because the overhead to support these mechanisms can be cumbersome, one would only want to implement a system this way if the dynamic properties are important. Unfortunately, dynamic loading is implementation-dependent.
Coplien also shows how to implement garbage collection using a combined mark-and-sweep and Baker's algorithm (semispace copying). The algorithm requires each class to allocate from a memory pool and each object to have mark and in-use bits. When free memory becomes low, first the objects in the exemplar's master list are marked, then the memory pool is scanned and the unmarked objects in use are reclaimed. The disadvantage of this system is the fixed-size memory pools. The chapter ends with an implementation of multi-methods using the symbolic canonical form.
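The mark-and-sweep half of that scheme can be sketched over a fixed-size pool as follows. This is a simplified illustration of my own, with assumed names (Slot, Pool, collect); it leaves out the Baker copying step entirely, and the list of live slots passed to collect stands in for the exemplar's master list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One fixed-size cell of the pool, carrying the two bookkeeping bits.
struct Slot {
    bool in_use;
    bool marked;
    int payload;
    Slot() : in_use(false), marked(false), payload(0) {}
};

class Pool {
public:
    // The pool never grows, so Slot pointers stay valid for its lifetime.
    explicit Pool(std::size_t capacity) : slots_(capacity) {}

    // Hands out the first free slot, or nullptr when the pool is exhausted.
    Slot* allocate(int payload) {
        for (Slot& s : slots_) {
            if (!s.in_use) {
                s.in_use = true;
                s.marked = false;
                s.payload = payload;
                return &s;
            }
        }
        return nullptr;
    }

    // Mark phase: flag every slot still referenced.
    // Sweep phase: scan the pool and reclaim slots that are in use but unmarked.
    // Returns the number of slots reclaimed.
    std::size_t collect(const std::vector<Slot*>& live) {
        for (Slot* s : live) {
            assert(s != nullptr && s->in_use);
            s->marked = true;
        }
        std::size_t reclaimed = 0;
        for (Slot& s : slots_) {
            if (s.in_use && !s.marked) {
                s.in_use = false;
                ++reclaimed;
            }
            s.marked = false;            // reset for the next collection
        }
        return reclaimed;
    }

private:
    std::vector<Slot> slots_;
};
```

The sketch also shows the drawback mentioned above: once every slot is live, allocate can only fail, because the pool cannot be enlarged.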
Chapter 10, "Dynamic Multiple Inheritance," is a short chapter about implementing multiple inheritance using pointers to base classes and delegation. Apparently this was one of the methods used before language support for multiple inheritance was added to C++.
Chapter 11, "Systemic Issues," is a collection of topics not directly related to the other chapters. There is a short discussion of modules, frameworks, and software libraries. "Dynamic System Design" is a section on the use of objects in a multithreaded environment and the design issues this raises, such as error recovery, scheduling, separate name spaces, and inter-object communication. This is a good summary of the software design idioms that are useful in a multithreaded or multiprocessor context.
Every object-oriented language has its own terminology. One of the difficulties in learning C++ is the vocabulary. Coplien makes frequent use of the terms specific to C++, but often mentions those used in other languages. It seems that he invents some of his own as well. Understanding the terminology is critical to understanding the ideas he presents. As a result, one quickly becomes lost by casually reading or "skimming" this book. Each chapter builds on the ideas and idioms of the previous chapters, and often will make reference to or reimplement idioms introduced earlier.
A major theme of Advanced C++ is the conflict of static typing with dynamic programming. Coplien is interested in how a C++ programmer can write code to simulate valuable features of dynamic object-oriented languages such as Self, Smalltalk, or CLOS. For example, the idea of a virtual constructor is prevalent throughout the book. A virtual constructor is an object constructor that evaluates the data provided and returns the appropriate object. Thus what type of object is created is not known until runtime. To accomplish this, a base class must have some information about its derived classes. Coplien shows several ways to implement the idea of a virtual constructor in C++, including Envelope/Letter classes and Abstract Base Exemplars.
One aspect of the book I like is that Coplien does not lock himself into one mode of solving problems. In the chapter on code reuse (chapter 7), he presents four different mechanisms to achieve software reuse. Examples are used well in this book. They are short enough to fit in a few pages, yet complex enough to warrant careful reading. The text makes interesting comments on the examples. Some programming books do not go into detail about the tradeoffs involved in the coding. Coplien usually explains what the limitations of his examples are. In the beginning of the book Coplien writes C and its equivalent C++ code side-by-side to show the usefulness of the C++ extensions. This works well to show the usefulness of classes over structs as well as constructors and destructors over C init/destroy functions. Longer examples are relegated to the ends of chapters. At the end of the book are several complete programs; however, none are too long or complicated to type in and run.
Coplien uses his appendices to introduce code samples and for short tangential topics that are interesting but not related directly to object-oriented programming in C++. For example, Appendix D demonstrates some of the problems with bitwise copy of objects, and explains why member-by-member copying (sometimes called "deep" copying) is not always the correct solution.
Appendix A, "C in a C++ Environment," covers converting C programs to a style more like C++. Some of it involves the conversions necessary to take traditional Kernighan and Ritchie C to ANSI-C. (C++ is almost ANSI-C compliant.) It also covers how to use const, interfacing with C libraries, sharing header files between the two languages, and how names and object data formats are represented in a C environment. This appendix is a good summary of some of the non-Object-Oriented aspects of C++.
Appendix C, "Reference Return Values from Operators," clarifies the concept of a Reference, especially as a return value from an operator. References often confuse novice C++ programmers who are accustomed to pointers in C. Appendix F, "Block-Structured Programming in C++," explains how to write a C++ program using blocks, or scopes, such as in a Pascal or Modula-2 program.
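On the subject of Appendix C, a small sketch shows why operators return references. The Triple class is a hypothetical example of mine, not one from the book: returning int& from operator[] lets the caller assign through the result, and returning *this by reference from operator+= lets calls be chained on the same object.

```cpp
#include <cassert>
#include <cstddef>

class Triple {
public:
    Triple() { data_[0] = data_[1] = data_[2] = 0; }

    // Returns a reference to the element itself, so "t[1] = 7" stores into t.
    int& operator[](std::size_t i) {
        assert(i < 3);
        return data_[i];
    }
    // Read-only access on const objects returns a copy instead.
    int operator[](std::size_t i) const {
        assert(i < 3);
        return data_[i];
    }
    // Returns the object itself, not a copy, so "(t += 1) += 2" updates t twice.
    Triple& operator+=(int delta) {
        for (std::size_t i = 0; i < 3; ++i) data_[i] += delta;
        return *this;
    }

private:
    int data_[3];
};
```

Had operator[] returned a plain int, the assignment would not compile; had operator+= returned a Triple by value, the second += in a chain would silently modify a temporary.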
A few weeks ago I found myself again in the computer science section of the bookstore. As I browsed through a software magazine, I noticed a book review, which stated that "Advanced C++ is a classic, a must-have on the shelf next to Stroustrup's C++ Primer or the Annotated C++ Reference Manual (ARM)".
I concur. For the serious C++ programmer the book is a must-have. It is full of interesting ideas and clever techniques that extend the power of the language. The only problem I see is that using any of these techniques for a large system will require some sort of preprocessing to aid in the generation of the support for each class. The section on writing dynamically loadable (and reloadable) objects demonstrates the programming skill and depth of understanding of the author. James Coplien is well-versed not only in C++ and its implementation, but in Object-Oriented theory and practice. I highly recommend this book.