March 93 - Solution-based Modeling: A Methodology for the Rest of Us
Jeff Alger and Neal Goldstein
This is the first part of a two-part article, much of which is excerpted from a forthcoming book by Jeff Alger and Neal Goldstein. Copyright © 1993 by Jeff Alger and Neal Goldstein. All rights reserved.
An interesting piece of historical trivia: 30 years ago, three out of four software projects failed. Either the project was never completed or the original objectives were not substantially met. A frightening contemporary fact: despite three decades of supposed progress in software engineering tools and methods, three out of four software projects today fail.
Three years ago, this and a number of other sour observations about the state of software methods led us to reconsider the entire subject from basic principles. The result was Solution-Based Modeling (SBM), a methodology quite different from the pack in both philosophy and practice. Our philosophy is that methodologies should concentrate on what most determines whether a software project succeeds or fails: communications and decision-making. This obvious perspective is often lost in the roar of technologies, dogma and personalities. We observe that people are actually pretty good problem-solvers before we get in their way, so we first study how people best work when left to their own devices, then we figure out how to let them work that way. From these simple principles the entire method springs.
We described this approach in our book, Developing Object-Oriented Software for the Macintosh: Analysis, Design and Programming, which, by the way, isn't really specific to the Macintosh despite the title. We have also taught seminars on SBM around the U.S. and in Europe and are now starting on a new, updated book on the subject. This article explores the problems of communications and decision-making in software projects followed by a brief overview of the methodology and its companion notational system, the Visual Design Language (VDL). The second part, to appear in the May FrameWorks, will explore the development process in more depth and conclude with a discussion of tools soon to be released to support the use of SBM and VDL.
What's the problem?
Before plunging into details of the method and notation, more background on the problems is in order. Though most people have a gut feeling that these are important issues, the true nature of the problems is often misunderstood.
Remember the childhood game of "Telephone"? Arrange people at a party into a circle. Whisper some complex story into the ear of one person, who turns and repeats it in a whisper to the next person, and so on. The last person tells out loud what she heard, and the room erupts with laughter. Seldom is the final message anywhere close to the original (see Figure 1).
We know, even as children, that this is the way communications work. Why, then, are we surprised when the same thing happens in software projects? Communications are typically extremely poor in software projects. One study concluded that at best we retain about 50% of the information gathered during a software project. And that was IBM developing software for NASA, two organizations completely fearless when it comes to investing in paperwork. The rest of us do, on average, considerably worse.
There are two specific problems with communications in software projects: distortion and loss of information. Distortion occurs whenever information is conveyed or translated, even just in the act of reading. Loss occurs under the same circumstances, but also when some piece of information is not written down.
A parent told us recently of a daughter who came home excitedly from kindergarten and said, "Mommy, Mommy, I learned how to write in school today." "That's wonderful," replied the mother. "What did you write?" "I don't know. I don't know how to read yet." The media we use to document software projects are, to put it kindly, not people-centered. Comprehension and retention rates for textual documents in general are awful, on the order of 10%-20%, and rates for typical graphical notations aren't much better. Think about that for a minute: if you sit down and read a requirements document, you've undoubtedly missed or forgotten four-fifths of the material. What's the point?
A graphical notation is not an automatic solution. When we teach our methodology seminars we run a simple test to drive this point home. We divide up the class into three, randomly selected groups, then show each group one of three figures for exactly seven seconds. They then have sixty seconds to attempt to reconstruct what they saw. Those who see the second version, which is in our own Visual Design Language (VDL) notational system, do far and away the best. Second, however, are those who see the textual version. Trailing far behind are those who see the third, sloppy graphic version, typically recalling little, if any, of what they saw. A bad visualization can be worse than none at all.
Little wonder that one often hears the lament, "But that's not what I meant" coming from an end user or product marketing. What we have in software development is precisely what one would predict, given the methods in widespread use.
One can identify three specific problems in the way decisions are made in software projects.
- Decisions aren't made, or a decision is put off so long that only one option remains by default.
- Decisions are made by the wrong people.
- Decisions are made in the absence of useful information.
This last problem is particularly ironic, for what is often lacking is not information but timeliness. For example, features are often cut following an estimation phase, which, in turn, follows an analysis phase. What's wrong with this picture? Analysis typically consumes 50% of total development costs. This means you are throwing out work that is already half done! Decisions are being made during analysis in the absence of critical information: Are we on, ahead of, or behind schedule overall? When it arrives, the estimate may or may not be accurate, but it doesn't matter; it has arrived too late to do any practical good.
Estimation is just one example of decision-making processes that are seriously out of kilter in most software projects. Valuation, setting priorities, balancing competing input from inside and outside the team, and dealing with changing objectives and business environments are all areas in which traditional methods simply don't deliver in real projects. A related symptom of decision-making problems is a tendency of project teams, including management, to refuse to face reality. The problem is not just that decisions are poorly made, but that the process by which decisions are made is poorly understood.
How People Work
What They Don't Teach in 'Methodologies 101'
One of the most remarkable characteristics of the major software methodologies is their consistency in basic assumptions about how people produce good results. Consistently wrong. Here is a limited sample:
- People produce good results working top-down. That is, starting from a high-level, abstract description, then systematically decomposing into finer and finer levels of granularity.
- People produce good results by dividing a problem into distinct, non-overlapping modules.
- People produce good results working linearly, that is, finishing one task before starting the next.
- People communicate well through text.
Of course, most people know instinctively that these are suspect, yet we continue to manage software projects on these assumptions. Here, by contrast, is what cognitive scientists know about the creative design process as the result of solid research:
- People do not work top-down, even when asked to do so. They also do not work bottom-up. Instead, they concentrate on what is known as the "basic level." Below and above this level communications become progressively more difficult as cultural and individual differences become more and more dominant and thought processes become fuzzier.
- People do not partition problems into non-overlapping pieces but rather perceive situations in terms of many overlapping gestalts.
- People work non-linearly. They jump from one task or topic to another, juggling several things in parallel and transferring results of work in one area to work in other areas.
- People communicate poorly through text as a rule.
And we would add the following:
- People work the way they want to, then make it look like they worked in the way they were expected to.
Let's draw the logical inference here. If we try to force people to work in an unnatural fashion, such as top-down or linearly, they won't, but they will translate the way they really worked into some other form of documentation. This introduces (surprise!) distortion and loss of information. Artificial processes also lead to poor decision-making since they tend to take away leverage we use in everyday life. Finally, processes that require unlearning what Nature taught us about life require high outlays for training and years of experience, not the stuff of extended teams drawn from across an organization. Little wonder that there is such widespread cynicism about software methodologies, particularly among the brightest and most creative!
Historically, it is easy to see how things got to be such a mess. Software development methods started with programmers organizing their own source code, then systematically moved out into the worlds of design, analysis, business modeling and strategic planning. The direction is the opposite of what a Martian coming to study our methods might expect. If the fundamental problems have to do with extended teams and communications, one might expect to start by studying those kinds of people, not programming languages, then proceed gradually in the exact opposite direction until the results converge on working code. This is exactly what we have attempted to do in Solution-Based Modeling: reconsider software development methods by starting from principles of cognitive science, communications, and team management, then derive implementation strategies from them.
Overview of SBM
Solution-Based Modeling and its companion notational system, the Visual Design Language, were developed to map well to the way creative people work best while still allowing management to estimate, schedule and maintain control. The objective is to improve the success rate of projects, chiefly through improving communications and decision-making.
There are three basic components of SBM:
- An architecture that structures analysis, design and programming information into a series of well-defined models.
- A notation, VDL, that can be used to define and express the models with minimal loss and distortion of information.
- A process for overall development that pays attention to the needs of all parties to the project, from end users to management to engineering to quality assurance and documentation. This includes modeling technology that helps teach people how to build robust, complete models, alone or in work groups, and continuous quality assurance structures: algorithms and metrics that make quality software a consequence of the overall process, rather than the result of a separate step.
A fourth component, tools, is under development by SBM International and others. We will briefly discuss what those tools will look like in the second part of this article in the May FrameWorks.
There is growing acceptance in the software engineering community today of the so-called "multiple models" approach: dealing with a single subject by building many overlapping models. In SBM, we create a series of models of the following kinds:
- Business models describe the business system that surrounds the computer system or program.
- Conceptual models are idealized, or deliberately simplified, models of the content and interface of the program.
- Run-time models describe the running program and ignore implementation details about how the source code is structured.
- Implementation models collectively are the implementation, expressed as a combination of source code and design diagrams.
We call these four kinds of models planes and name them the Business, Technology, Execution and Implementation planes, respectively. Within each plane, we use the term region to describe one model of that kind. Figure 2 shows the overall four-plane, eleven-region model.
The Business Plane describes the overall business system, including the program under development, its users, other collaborative programs and computers, and in many cases machinery, files, and other physical entities. The idea is to produce a high-fidelity model, one that faithfully describes the real world. To do this, we use real-world objects, their responsibilities and attributes, and relationships among all three. Figure 3 shows the basic element and relationship types used in this plane.
Our standard for what is and is not a real-world object is, if you can point to it and give it a name, it is a legitimate real-world object. That sounds simplistic, but in fact it is a very useful working definition. Thus, a user would certainly be an object, but "Time" would not (you can't point to it).
To get started, you might want to think of the term responsibility as synonymous with the word "behavior." That is, these are the things that the objects do. The term "responsibility," however, is more useful in business models, as the focus should be on the desired outcomes or objectives rather than the behaviors.
Real-world objects are described with their real-world responsibilities, data content and relationships. A Mouse object that is physically manipulated by someone is a good real-world object, but a Mouse object that is expected to move itself is not. Many objects have no real behaviors. Take an Employment Record object, for example, which is just a piece of paper with information on it. It is certainly important to the business model, but what are its "responsibilities"? There aren't any because it doesn't do anything. It must be described in terms of its data content (see Figure 4 on page 58).
The Provides Data relationship here shows that the Payroll Clerk uses the data from the form, which in turn just sits there doing nothing. In addition to attributes and the Provides Data relationship, we have two additional relationships, Provides Algorithm and Manipulated By, used to show relationships with non-behavioral objects. Provides Data, Provides Algorithm and Manipulated By are collectively called procedural relationships.
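To make the distinction concrete, here is a minimal sketch (in modern Python, with hypothetical names) of a behavior-less object whose role is purely to provide data, alongside the object that actually does the work:

```python
from dataclasses import dataclass

# An Employment Record has data content but no behaviors of its own.
# (EmploymentRecord and PayrollClerk are illustrative names, not part of SBM.)
@dataclass
class EmploymentRecord:
    employee_name: str
    hourly_rate: float
    hours_worked: float

# The Payroll Clerk holds the behavior; the record merely "Provides Data."
class PayrollClerk:
    def compute_pay(self, record: EmploymentRecord) -> float:
        # The clerk reads the record's attributes; the record does nothing.
        return record.hourly_rate * record.hours_worked

record = EmploymentRecord("Pat", hourly_rate=20.0, hours_worked=10.0)
print(PayrollClerk().compute_pay(record))  # 200.0
```

The point of the sketch is where the responsibility lives: in the clerk, never in the piece of paper.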
Among objects that have behaviors, there are three behavioral relationships: Collaborates, Creates and Destroys. Collaboration occurs when one object gets assistance from another. Certain objects such as reports can be created and occasionally destroyed. There are also structural relationships between objects: Whole/Part, Containment, and Membership. Membership relationships allow objects to be grouped in arbitrary ways into categories, such as a Vehicle category. Figure 5 shows the use of these relationship types.
The two regions of the Business Plane are called the Reference Model and the Solution Model. Each is a real-world model of the business system surrounding the program under development. The Reference Model describes the way that system functions today and concentrates on identifying and explaining exactly what problems are to be solved by the introduction of the new program. The Solution Model describes how that system should function once the new program is in use. Its focus is on business objectives to be achieved.
Describing these models is not as easy as it sounds. In particular, there is the age-old modeling problem of knowing when to stop modeling. For example, in analyzing a payroll department, you might determine that the supervisor loves opera. Does that belong in your model? Probably not. But in a project under tight deadlines where most decisions of scope are not nearly so clear, what method is to be used to decide consistently and efficiently? The answer is called the frame. The frame is similar to the border of a jigsaw puzzle, providing a way to capture scope. There are two frames, the Reference Frame and the Solution Frame, consisting of bulleted text items (see Figure 6).
The Reference Frame contains three sections:
- Problems to be solved by the project.
- Constraints that have led the business to function the way it does.
- Topics the project must address.
The problem and constraint sections are work products; the topic list is a tool to use in organizing the work, but is not itself a work product. The Solution Frame contains three sections:
- Objectives the project is expected to achieve.
- Constraints that limit what sorts of solutions are permissible.
- Topics, but now with a focus on the solution, not the problem.
Now, a detail is relevant if it helps explain some bullet point of the frame, irrelevant otherwise. And the model is complete when all bullet points of the frame have been adequately described in the model.
The two models, Reference and Solution, are connected using three correlation relationships: Implements, Replaces, and Same As (see Figure 7). These document what has changed to turn the Reference Model into the Solution Model. Every responsibility in the Reference Model must be accounted for in the Solution Model through some combination of these relationships. The correlation relationships between the Reference and Solution Models are collectively called the Impact Analysis, one of the most important work products of business modeling in SBM. Impact Analysis relationships serve several purposes:
- The Reference Model is easy to gain agreement on, since it is based on observable circumstances of today. The Solution Model - indeed, the rest of the project - is based on hypothetical assumptions. The Impact Analysis connects them.
- Some people think best in before-and-after snapshots, as with the Reference and Solution Models. Others, however, think in terms of change.
- The Impact Analysis accounts for the details of the transition, such as changing lines of authority or retraining requirements. It also makes sure nothing in the current environment is left out. This is taking the Hippocratic Oath of Software Development: "First do no harm."
- If one attaches dollar amounts or other quantities to the Impact Analysis relationships, they can be summed to yield an overall value for the project. This, in itself, can be the single most important result of business modeling.
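The valuation arithmetic in that last point is simple enough to sketch. Here is an illustrative example (the relationship entries and dollar figures are invented for the sketch):

```python
# Attach a dollar value to each Impact Analysis relationship
# (Implements, Replaces, Same As), then sum to value the project.
impact_analysis = [
    ("Implements", "automated payroll run", 120_000.0),  # annual savings
    ("Replaces",   "manual check printing",  45_000.0),
    ("Same As",    "quarterly tax filing",        0.0),  # unchanged, no impact
]

project_value = sum(value for _, _, value in impact_analysis)
print(project_value)  # 165000.0
```

Because every Reference Model responsibility must be accounted for, the sum covers the whole business system, not just the features someone happened to remember.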
It may seem from this discussion that business modeling in SBM is targeted to in-house applications, but this architecture is, if anything, even more important to developers of commercial products. Consider not one Reference Model, but rather one per competitor or segment of the market. Now, the Impact Analysis becomes a powerful tool of competitive positioning by demonstrating the respective economic benefits of the two products, theirs and the one you are developing. Even comparison to the "null" case, no competing product in use, yields information valuable in positioning and pricing the product.
The Technology Plane contains three regions: Content, User Interface and, for many programs, Environment. Each correlates to the Solution Model of the Business Plane.
The Content Model is the interface-free description of the program, its underlying data and algorithms. It is somewhat akin to describing a car as containing a motor, gas pump, and gas tank, without going into details of ignition timing or the hydraulics of brake fluids. Psychologists call this an "idealized" model, one that is deliberately simplified in order to describe the whole in some specific context. The Content Model contains objects, responsibilities, categories, attributes and relationships, but there are some important differences between these objects and the real-world objects of the Business Plane.
- The objects don't exist in the real world. We make them up to suit the purposes of our program.
- The procedural relationships Provides Data, Provides Algorithm and Manipulated By do not apply to this plane.
- Through metaphor, we can make it easier for people to understand the design by borrowing names from real-world objects, but the metaphor follows the design, not the other way around.
The combination of these differences goes a long way toward explaining where many apparently weird concepts of object-oriented design come from. The expert, without breaking a sweat, may postulate into existence a Ball object that bounces itself, Employee objects that compute their own pay, or Transaction objects that post themselves. These all borrow names from the real world, but that is where the similarity ends. They are metaphors, not real-world objects.
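A sketch makes the inversion visible. In the real world the pay computation belongs to a clerk; in the Content Model we are free to hand it to the metaphor object itself (Python used for illustration; the Employee name and rate attributes are hypothetical):

```python
# A metaphor object in the Content Model. Unlike its real-world namesake,
# this Employee computes its own pay -- the behavior is assigned to suit
# the purposes of the program, not to mirror reality.
class Employee:
    def __init__(self, hourly_rate: float, hours_worked: float):
        self.hourly_rate = hourly_rate
        self.hours_worked = hours_worked

    def compute_pay(self) -> float:
        return self.hourly_rate * self.hours_worked

print(Employee(20.0, 10.0).compute_pay())  # 200.0
```

The metaphor follows the design: we chose to give Employee this responsibility, then borrowed the familiar name.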
Each responsibility of the program object in the Solution Model must have at least one Implements relationship going down to the Content Model. Each responsibility in the Content Model must either be on the receiving end of such an Implements relationship or collaborate with a responsibility, directly or through intermediaries, that does. In this way, we assure that the two models are consistent with one another and complete in terms of each other.
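That completeness rule is mechanical enough to check. Here is a small, hypothetical sketch of such a check: responsibilities are plain strings, Implements links map upstream responsibilities to downstream ones, and grounding propagates along Collaborates links:

```python
# Check the two-way completeness rule between an upstream model (e.g. the
# Solution Model's program object) and a downstream model (e.g. Content).
def check_models(upstream, downstream, implements, collaborates):
    # Rule 1: every upstream responsibility needs an Implements link down.
    missing_up = [r for r in upstream if r not in implements]

    # Rule 2: a downstream responsibility is "grounded" if it receives an
    # Implements link, or collaborates (directly or through intermediaries)
    # with one that does.
    grounded = set()
    for targets in implements.values():
        grounded.update(targets)
    changed = True
    while changed:
        changed = False
        for a, b in collaborates:
            if b in grounded and a not in grounded:
                grounded.add(a)
                changed = True
    missing_down = [r for r in downstream if r not in grounded]
    return missing_up, missing_down

up = ["track payroll"]
down = ["compute pay", "store records"]
impl = {"track payroll": ["compute pay"]}
collab = [("store records", "compute pay")]
print(check_models(up, down, impl, collab))  # ([], [])
```

Empty lists on both sides mean the two models are consistent and complete in terms of each other; anything left over is unaccounted-for work or dead weight.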
The User Interface Model is the flip side of the Content Model: the content-free description of the program. It is akin to describing a box as having a switch and a light bulb, such that when the switch is thrown one way, the bulb lights up and when the switch is thrown the other way, the bulb goes out (see Figure 8).
No mention is made here of what connects the two, simple wiring or a complete home utility management computer system. It is simple cause and effect, with connecting mechanisms left out. Notice the use of a storyboard, in addition to the normal VDL notation. User Interface Models consist almost entirely of diagrams with this three-part form: pictures showing snapshots of the user interface over time, a verbal narration, and VDL-based descriptions of what that portion of the program is doing at the bottom, lined up vertically with the snapshots.
The User Interface Model correlates to collaborations of user and program in the Solution Model. That is, for every interaction of user and program in the Solution Model, we show the specific buttons, menus, and windows used by the user to carry out that interaction.
The Environment Model accounts for interactions with other collaborative programs, computer networks, external databases, and other specialized devices. For example, one of our clients connects laboratory microscopes to their Macintosh; in the Environment Model is a Microscope object. These present the ideal interface to the rest of the program and do not merely mimic the published interface of the physical device. The Environment Model correlates to interactions of the program object and device or other program elements in the Solution Model.
The clear separation of content, interface and environment corresponds well to the sorts of conceptual models people form in everyday problem-solving. We tend to switch easily from usage (interface) to structural (content) to context (environment) models, but to focus on only one at a time.
The Execution Plane has the same three regions as the Technology Plane: Content, User Interface and Environment. We now call them "architectures" rather than "models" to reflect the fact that they are no longer idealized, but are detailed descriptions of the running program. The symbology changes slightly here to emphasize the technical detail involved, from curves to hard angles (see Figure 9).
Program objects are really the same concept as real-world or conceptual objects, but are drawn differently for perceptual reasons. In place of responsibilities, we now use the term "method" and add full calling protocols. Specifically, where before we had only names for responsibilities, we now add inputs and outputs, together with their respective data types. Likewise, we add data types to attributes.
It is often necessary to show state changes in the running program. These take two forms: creation and destruction of objects and changing values of attributes. The former are shown by the Creates and Destroys relationships. The latter are shown by adding values to the attribute name and type (see Figure 10).
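In code, a full calling protocol and a changing attribute look something like this (an illustrative sketch; the Account object and its balance attribute are invented for the example):

```python
# An Execution Plane-style object: the method carries a full calling
# protocol -- typed input, typed output -- and the typed attribute's
# changing values represent state changes in the running program.
class Account:
    def __init__(self, balance: float):
        self.balance: float = balance   # attribute with type and value

    def deposit(self, amount: float) -> float:
        """Input: amount (float). Output: the new balance (float)."""
        self.balance += amount
        return self.balance

acct = Account(balance=100.0)   # creation (Creates relationship)
print(acct.deposit(25.0))       # 125.0 -- the attribute's value has changed
del acct                        # destruction (Destroys relationship)
```

Where before we would have drawn only a named responsibility, we can now read off exactly what goes in, what comes out, and what state is touched.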
Another difference is the absence of categories. In their place, we use abstractions. An abstraction is a collection of objects that share the properties of the abstraction, specifically methods and attributes. When one asserts that an abstraction has a given method, that is shorthand for asserting separately that each and every one of the members of the abstraction has that method. Those familiar with object-oriented programming will recognize immediately the similarities to classes, but abstractions are actually quite a bit more general and not limited by language- or even implementation-specific considerations such as the lack of multiple inheritance. Those implementation details apply only to the source code. Abstractions allow one to ignore such details for a while and instead concentrate on the running program, independent of how that ultimately gets mapped into code.
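One way to approximate an abstraction in code, with the caveat that language mechanisms like abstract base classes are narrower than the SBM notion, is a shared declaration that every member must honor (names here are hypothetical):

```python
from abc import ABC, abstractmethod

# Asserting that the abstraction has a method is shorthand for asserting,
# separately, that each and every member has that method.
class Drawable(ABC):
    @abstractmethod
    def draw(self) -> str: ...

class Circle(Drawable):
    def draw(self) -> str:
        return "circle"

class Square(Drawable):
    def draw(self) -> str:
        return "square"

# Every member of the abstraction honors the shared method.
shapes = [Circle(), Square()]
print([s.draw() for s in shapes])  # ['circle', 'square']
```

An abstraction in SBM need not map one-to-one onto a class like this; how the sharing is ultimately realized is left to the Implementation Plane.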
Objects of the Execution Plane are designed to conform to architectural paradigms such as Model-View-Controller (MVC). That is, they are decomposed into special-purpose objects, and methods are assigned to objects in conformance with basic architectural principles.
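As a rough illustration of that decomposition (the counter example is invented, and real MVC frameworks add much more machinery), one conceptual object splits into three special-purpose ones:

```python
class CounterModel:                  # holds the data
    def __init__(self):
        self.count = 0

class CounterView:                   # renders the data
    def render(self, model: CounterModel) -> str:
        return f"Count: {model.count}"

class CounterController:             # interprets actions, updates the model
    def __init__(self, model: CounterModel):
        self.model = model
    def increment(self):
        self.model.count += 1

model = CounterModel()
controller = CounterController(model)
controller.increment()
print(CounterView().render(model))  # Count: 1
```

Each method lands on the object the paradigm says should own it: state changes in the model, presentation in the view, interpretation in the controller.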
The Execution Plane also contains classes of whatever class library or libraries you intend to use. We model these as collaborators of the program objects, since there are many ways to implement the relationship in code, ranging from straight inheritance to various forms of delegation and aggregation (see Figure 11).
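The reason we model library classes as collaborators rather than committing to inheritance is visible in code. Here is a hedged sketch of two of the possible realizations (TList is a stand-in for whatever class library is in use, not a real API):

```python
# A stand-in for a library collection class.
class TList:
    def __init__(self):
        self._items = []
    def add(self, item):
        self._items.append(item)
    def count(self) -> int:
        return len(self._items)

# Realization 1: straight inheritance.
class OrderListInherited(TList):
    pass

# Realization 2: delegation -- hold the library object and forward to it.
class OrderListDelegating:
    def __init__(self):
        self._list = TList()
    def add(self, item):
        self._list.add(item)
    def count(self) -> int:
        return self._list.count()

for orders in (OrderListInherited(), OrderListDelegating()):
    orders.add("order-1")
    print(orders.count())  # 1
```

Both satisfy the same collaboration in the Execution Plane; the choice between them is deferred to the Implementation Plane.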
Finally, the Execution Plane connects all three regions into a single model. That is, we show how objects are created and destroyed and how they interact across region boundaries.
Every responsibility and object in the Technology Plane must correlate downward using an Implements relationship to the Execution Plane. Every responsibility and object in the Execution Plane must either correlate upward or collaborate with something that does.
Finally, the Implementation Plane contains class hierarchies; the Object-Oriented Technobabble (OOTB) of polymorphism, derivation, and other terms used in modern OO languages; and the source code for all methods. Again, there are three regions, Content, User Interface and Environment and, again, the set of all three forms a single integrated model. Because of the level of rigor used in the first three planes, the implementation is not as significant a part of the overall architecture as it is in most software projects, object-oriented or not. In a sense, source code is a "mere implementation detail." Most of the work, in fact, is in optimizing the implementation to best reuse code across classes or across projects.
We have developed recommended notational conventions for the common inheritance situations (see Figure 12):
- The subclass simply inherits a method,
- The subclass completely overrides a method, and
- The subclass overrides a method but calls the inherited version from within the override.
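In source code the three situations look like this (a minimal sketch; Document and Report are invented names, and the return strings merely mark which version ran):

```python
class Document:
    def save(self) -> str:
        return "base save"
    def print_page(self) -> str:
        return "base print"
    def close(self) -> str:
        return "base close"

class Report(Document):
    # 1. save() is simply inherited -- nothing to write.

    # 2. print_page() is completely overridden.
    def print_page(self) -> str:
        return "report print"

    # 3. close() overrides, but calls the inherited version from within.
    def close(self) -> str:
        return "report close after " + super().close()

r = Report()
print(r.save())        # base save
print(r.print_page())  # report print
print(r.close())       # report close after base close
```

Each situation gets its own notational convention in VDL so a reader can tell at a glance which of the three is in play.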
A diagram in VDL is called a scenario. A scenario is not just a pretty picture, but in fact embodies much of the cognitive machinery that is SBM. Most of the symbols in the VDL notational system have already been introduced, but a few remain. First is the scenario itself. This can be represented by a 3D folder-like symbol inside another scenario, as in Figure 13.
There are two forms of scenario, static and dynamic (sometimes called "temporal"). In a static scenario, no assertion is made about the order in which things happen. This has been the case in most of the examples so far. In a dynamic scenario, time flows strictly from left to right. Timelines, alternating dashed lines, visually distinguish dynamic from static scenarios. Iteration is shown in a similar fashion to musical notation. Notice in Figure 14 where the new object appears. It is lined up vertically in the spot in the execution where it is created. Similarly, its timeline ends before the right-hand margin is reached, indicating destruction. One can supplement this with creation and destruction relationships, but the timing of creation and destruction are implicit even without such relationships.
Branching timelines can be used, generally at the level of scenarios rather than individual objects (see Figure 15).
Beyond the raw symbology of VDL lie several important, but non-obvious, properties. VDL was designed from the ground up to facilitate communications across a project team, or even from one team to another. It was developed in concert with graphic designers and tested to maximize comprehension and retention of information, even by people with little or no background in objects, SBM or VDL. This is due to visual cues to meaning embodied in the symbols and the way they are used. For example, the 3D look conveys the idea of "things" far before you have to explain the meaning of the scenario. Separation into layers conveys spatial relationships and centrality. Some of the symbols contain subtle hints to meaning. For example, there is a good deal of evidence that people visualize any sort of communication or message as a substance flowing through a conduit. The collaboration symbol, an arrow passing between parallel lines, is a shorthand for an arrow passing through a 3D cylindrical pipe. The collaboration symbol was arrived at after a period of testing to balance ease of drawing with comprehension and retention.
In some cases VDL has been designed to lead to self-correcting behavior. The responsibility symbol, just one angled line, is the easiest to draw precisely because it is the most commonly used symbol and the most important. Abstractions, at the other end of the scale, are deliberately the most clumsy to draw since their overuse tends to lead to poor design decisions.
VDL makes no presumptions about the meanings of size, stroke weight, or color, leaving those available as attention-directing devices.
Many concepts common in the technology of object-oriented programming are deliberately missing from VDL as the result of comprehension and retention testing. Some concepts are better expressed by simply writing notes in the margin than through graphic symbology, while still other concepts turn out not to be that important to the outcome of projects. The distinction between a concrete and an abstract class is a good example of the latter; that distinction is a clumsy attempt to describe in a language syntax a much deeper design issue.
Another sharp distinction between VDL and most other notational systems is its flexibility. VDL more closely resembles a natural than a formal language. New idioms for usage are discovered all the time, especially among the relationships.
VDL scenarios should comfortably fit on a single sheet of paper (no 75% LaserWriter reductions, please) and commonly include only two or three objects and a small handful of responsibilities and relationships. Each scenario should be a gestalt, a whole concept greater than the sum of its parts, and there is a straightforward test for this: if you can assign a scenario a simple title, it is likely a gestalt. This leads one to create lots of simple, overlapping (e.g., sharing objects) scenarios, precisely in line with the cognitive principles we spoke of earlier.
Document control is essential in any software method. Figure 16 shows the complete form of a VDL scenario. This includes, in addition to the diagram itself,
- A document number, including the version,
- The date created or last modified,
- The author(s),
- The region, if not explicitly part of the scenario,
- A title, and
- A "To Do" list.
Frames are identical in form, except that in place of the VDL diagram are the bulleted lists of text.
To Be Continued
In the May FrameWorks the second half of this article will pick up where this one left off. So far, we have discussed the sorts of problems SBM and VDL are well-suited to solve, the overall architecture of projects, and the notational system we use to capture and explain the contents of the architecture. Next, we will explore the process used to build the models and manage projects, describe the tools being developed to support SBM and VDL, and consider where SBM and VDL can go from this starting point.
SBM International provides consulting, training and tools that support the use of SBM. For more information, contact the authors at (415) 424-9400, FAX (415) 857-0198, or at 3124 South Court, Palo Alto, CA 94306.