
Sep 96 Factory Floor
Volume Number: 12
Issue Number: 9
Column Tag: From The Factory Floor

Andreas Hommel, Compiler Architect

By Dave Mark

This month’s interview is with Andreas Hommel, one of the original minds behind Metrowerks’ compiler architecture. (See “A Little CodeWarrior History”, MacTech Magazine 12.7 [July 1996] 61-64, where John McEnerney recalls being shown Metrowerks’ newly acquired C compiler which “a guy named Andreas Hommel in Hamburg had been writing as a hobby”.) You’ll meet a pretty interesting person and, at the same time, learn a thing or two about the compilation process.

Dave: How did you hook up with Metrowerks?

Andreas: I got interested in compiler construction while I was still at university. I was writing computer games, and most C compilers didn’t really produce very good code. Also, I liked ANSI C a lot, but back then, Mac compilers didn’t really conform to this standard. So at one point, I decided that it would be fun to write my own compiler in my spare time. Two or three years later, I had my own little IDE and an ANSI C compiler with some C++ extensions.

I was about to finish my CS degree, and I had a good desktop publishing job offer in Hamburg, but I really liked working on my compiler project, so I decided to give it a try. I sent out a bunch of demo disks to some Macintosh compiler-related companies. A few months later, Greg Galanos called me and we started talking about technical details of the compiler and how we could come together. After a couple of Transatlantic phone calls, Greg invited me to come to Montreal to meet with him and Jean Belanger. They spoke about the incredible opportunities for a compiler company, given Apple’s pending transition to the PowerPC chip. We also talked a lot about all the technical aspects of the compiler and how it could be changed to support another code generator and a Pascal front-end. A week later, we had signed a contract.

The next 6 months were pretty busy. I still had some work to do for my old job, I had to finish and defend my thesis, and I had to start moving the compiler towards C++ for Metrowerks.

Dave: For folks who’ve never written a compiler, can you describe the compilation/link process?

Andreas: The compiler transforms each individual source file (or “translation unit”, to be technically correct) into an object file. The functions, procedures, and variables in a source file are transformed into code and data in the object file. The code in an object file is not executable because it usually contains references to code or data in other object files; these references have yet to be resolved by the linker. The unresolved references are also stored in the object file. The CodeWarrior IDE actually hides much of this process, because it stores all the object files in the project file, so you don’t see them on your hard drive (if you use the CW MPW tools, you will actually generate individual object files). The object file also stores symbolic information that is used by the Source Debugger to map source code to executable code and to find variables and their types. If you are interested in this, and you know a little bit of Assembly Language, you can use the Disassemble command (in the Project menu), which will generate a pretty complete object file dump for a particular source file.

The linker then takes all the object files (a library file is basically just an object file), resolves the external code and data references, and generates an executable file from that. On the Macintosh, the linker also merges your application with the resource data and generates a SYM file that is used by the debugger.
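To make the idea of unresolved references concrete, here is a minimal C++ sketch of the bookkeeping a linker performs. The ObjectFile structure and the findUnresolved function are invented for illustration; they are assumptions and say nothing about CodeWarrior's actual object file format.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of an object file: the symbols it
// defines and the external symbols it still refers to.
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;     // code and data this unit provides
    std::set<std::string> unresolved;  // references left for the linker
};

// Returns (sorted) every reference that no object file defines.
// An empty result means all references could be resolved.
std::vector<std::string> findUnresolved(const std::vector<ObjectFile>& objects) {
    std::set<std::string> allDefined;
    for (const ObjectFile& obj : objects)
        allDefined.insert(obj.defined.begin(), obj.defined.end());

    std::set<std::string> missing;
    for (const ObjectFile& obj : objects)
        for (const std::string& ref : obj.unresolved)
            if (allDefined.count(ref) == 0)
                missing.insert(ref);
    return std::vector<std::string>(missing.begin(), missing.end());
}
```

If one unit defines main and refers to draw and gScore, and a second unit defines only draw, findUnresolved reports gScore, which is the situation in which a real linker would issue an undefined-symbol error.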

Dave: What about the actual compilation process?

Andreas: The compiler itself can be grouped into several phases. A user typically doesn’t notice these individual phases, and in fact most compilers do not strictly execute one phase after the other, but this logical grouping really makes it a lot easier to implement a compiler.

The first phase is the Lexical Analyzer or Scanner. This part of the compiler “looks” at your source code and splits it into individual tokens. A token is a small lexical element. For example, in C, operators such as '+', '--', '->', and keywords such as for and while are individual tokens. Identifiers, numbers and strings are also tokens, and they carry attributes. For example, "123" is a numerical token with the attribute/value 123. A lexical analyzer for C and C++ is quite complicated, because it usually also implements the preprocessor, so it has to take care of source file inclusion (#include), macro expansions (#define) and conditional compilation (#if). All this is hidden from the remaining parts of the compiler, and the CodeWarrior lexical analyzer transforms a source file into a uniform stream of tokens.
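The token idea can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: the Token structure and the tokenize function are invented names, only two keywords and three two-character operators are recognized, and the preprocessor is left out entirely.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

enum class TokenKind { Keyword, Identifier, Number, Operator };

struct Token {
    TokenKind kind;
    std::string text;  // spelling in the source
    long value;        // attribute; only meaningful for Number tokens
};

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            std::string text = src.substr(start, i - start);
            tokens.push_back({TokenKind::Number, text, std::stol(text)});
        } else if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            std::string text = src.substr(start, i - start);
            bool keyword = (text == "for" || text == "while");
            tokens.push_back({keyword ? TokenKind::Keyword : TokenKind::Identifier, text, 0});
        } else {
            // Longest match first: try a two-character operator, then one character.
            std::string two = src.substr(i, 2);
            if (two == "--" || two == "->" || two == "++") {
                tokens.push_back({TokenKind::Operator, two, 0});
                i += 2;
            } else {
                tokens.push_back({TokenKind::Operator, std::string(1, src[i]), 0});
                ++i;
            }
        }
    }
    return tokens;
}
```

Feeding it the text while (n-- > 123) p->x yields ten tokens, among them the keyword while, the operator --, and a Number token whose value attribute is 123.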

The next phase is the Syntax Analyzer or Parser. Most computer languages (including C, C++, Pascal and Java) have a grammar which is a set of rules that describes which token sequences form a legal program. The parser makes sure that the stream of tokens conforms to these rules. For example, the rule for a while statement is:

<iteration-statement>: while ( <expression> ) <statement>

<iteration-statement> is the name of this grammar rule. while, '(' and ')' are tokens (terminal symbols); <expression> and <statement> are called non-terminal symbols, which means they have to be expanded by other grammar rules until only tokens remain. The parser transforms the stream of individual tokens into another data structure (usually a syntax tree) that is used by the remaining phases.
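One common way to turn such a rule into code is recursive descent, where each non-terminal becomes a function. The sketch below is an assumption-laden toy, not a description of how CodeWarrior's parser is written: the Node and Parser names are invented, tokens are plain strings, and an expression is cut down to a single identifier or number.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string kind;  // "while", "expr-stmt" or "operand"
    std::string text;  // token spelling, used by operands
    std::vector<std::unique_ptr<Node>> kids;
};

class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : toks_(std::move(tokens)) {}

    // <statement>: while ( <expression> ) <statement>  |  <expression> ;
    std::unique_ptr<Node> parseStatement() {
        auto node = std::make_unique<Node>();
        if (peek() == "while") {
            next();
            expect("(");
            node->kind = "while";
            node->kids.push_back(parseExpression());
            expect(")");
            node->kids.push_back(parseStatement());
        } else {
            node->kind = "expr-stmt";
            node->kids.push_back(parseExpression());
            expect(";");
        }
        return node;
    }

    bool atEnd() const { return pos_ == toks_.size(); }

private:
    // <expression>: one identifier or number (deliberately tiny).
    std::unique_ptr<Node> parseExpression() {
        std::string tok = next();
        if (tok.empty() || tok == "while" ||
            !std::isalnum(static_cast<unsigned char>(tok[0])))
            throw std::runtime_error("expected expression, got '" + tok + "'");
        auto node = std::make_unique<Node>();
        node->kind = "operand";
        node->text = tok;
        return node;
    }

    std::string peek() const { return pos_ < toks_.size() ? toks_[pos_] : ""; }
    std::string next() {
        std::string tok = peek();
        if (pos_ < toks_.size()) ++pos_;
        return tok;
    }
    void expect(const std::string& want) {
        std::string got = next();
        if (got != want)
            throw std::runtime_error("expected '" + want + "', got '" + got + "'");
    }

    std::vector<std::string> toks_;
    std::size_t pos_ = 0;
};
```

The token sequence while ( n ) x ; produces a "while" node with two children (the condition and the body), whereas a sequence missing the opening parenthesis is rejected with an exception.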

The next phase is the Semantic Analyzer. The parser makes sure that a program conforms to the grammar, but it doesn’t catch any other types of error. For example, “1=2;” is a syntactically correct C assignment expression statement that is semantically incorrect, because you cannot assign 2 to 1. So this phase makes sure that all identifiers (such as variables) have been declared and that operand types in expressions match.
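The “1=2;” case can be sketched as a check that walks a small expression tree. Everything here (the Expr structure, the helper functions, the wording of the messages) is hypothetical, and real type checking is reduced to two rules: the left side of an assignment must be a variable, and every variable must have been declared.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Expr {
    enum Kind { Number, Variable, Assign };
    Kind kind = Number;
    std::string name;                // for Variable
    std::unique_ptr<Expr> lhs, rhs;  // for Assign
};

std::unique_ptr<Expr> number() { return std::make_unique<Expr>(); }

std::unique_ptr<Expr> variable(std::string name) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Variable;
    e->name = std::move(name);
    return e;
}

std::unique_ptr<Expr> assign(std::unique_ptr<Expr> lhs, std::unique_ptr<Expr> rhs) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Assign;
    e->lhs = std::move(lhs);
    e->rhs = std::move(rhs);
    return e;
}

// Appends one message per semantic error found in the tree.
void check(const Expr& e, const std::set<std::string>& declared,
           std::vector<std::string>& errors) {
    switch (e.kind) {
    case Expr::Number:
        break;
    case Expr::Variable:
        if (declared.count(e.name) == 0)
            errors.push_back("undeclared identifier '" + e.name + "'");
        break;
    case Expr::Assign:
        assert(e.lhs && e.rhs);  // the parser guarantees both operands exist
        if (e.lhs->kind != Expr::Variable)
            errors.push_back("left side of '=' is not assignable");
        check(*e.lhs, declared, errors);
        check(*e.rhs, declared, errors);
        break;
    }
}
```

With only x declared, the tree for 1=2 produces the "not assignable" error, x=2 passes cleanly, and y=2 is flagged because y was never declared.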

Now we have a legal program and all we have to do is generate code from it. One could generate code directly from a syntax tree, but usually a compiler generates an intermediate code representation. For example, CodeWarrior uses a tree-based intermediate code (IR tree) that is very close to a syntax tree but has a lot of additional information about types. In fact, the CodeWarrior compiler folds syntactic analysis, semantic analysis and intermediate code generation together, so basically the parser also checks the semantics and generates an IR tree. This really speeds up the whole process. This IR tree is really the key to our compiler technology. All code generation is based on this tree. This makes our front-ends (C/C++/Pascal) and back-ends (68K, MIPS, PPC, x86) interchangeable.
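Why a shared tree makes the two halves interchangeable can be shown with a toy model. The IRNode layout, the BackEnd interface and both output formats below are assumptions made up for this sketch; they do not reflect CodeWarrior's real IR or any real instruction set.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical IR node: an expression tree that also carries a type.
struct IRNode {
    enum Op { Const, Add, Mul };
    Op op = Const;
    std::string type = "int";  // type information travels with the node
    long value = 0;            // for Const
    std::unique_ptr<IRNode> left, right;
};

std::unique_ptr<IRNode> constant(long v) {
    auto n = std::make_unique<IRNode>();
    n->value = v;
    return n;
}

std::unique_ptr<IRNode> binary(IRNode::Op op, std::unique_ptr<IRNode> l,
                               std::unique_ptr<IRNode> r) {
    auto n = std::make_unique<IRNode>();
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Every back-end consumes the same tree; only the emitted text differs.
struct BackEnd {
    virtual ~BackEnd() = default;
    virtual std::string emit(const IRNode& n) const = 0;
};

// Toy stack-machine target with invented mnemonics.
struct StackBackEnd : BackEnd {
    std::string emit(const IRNode& n) const override {
        if (n.op == IRNode::Const) return "push " + std::to_string(n.value) + "\n";
        return emit(*n.left) + emit(*n.right) + (n.op == IRNode::Add ? "add\n" : "mul\n");
    }
};

// Second toy target that prints the tree in prefix form.
struct PrefixBackEnd : BackEnd {
    std::string emit(const IRNode& n) const override {
        if (n.op == IRNode::Const) return std::to_string(n.value);
        const char* name = (n.op == IRNode::Add) ? "add" : "mul";
        return std::string("(") + name + " " + emit(*n.left) + " " + emit(*n.right) + ")";
    }
};
```

The front-end builds the tree for 2 + 3 * 4 once; either back-end can then be pointed at it without the front-end knowing which one is in use.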

This IR tree is then passed to the individual back-ends where it is used to generate the actual 68K, PPC, MIPS, or x86 code. The back-ends are all pretty different, but they all transform the IR tree into machine instruction sequences, allocate memory and registers for variables, and generate an object file. All back-ends also do some machine-level optimizations (peephole optimizations) that replace instructions with better ones or remove redundant instructions.
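A peephole pass of the kind mentioned above can be sketched as a single scan over an instruction list. The instruction syntax is invented, the three rewrite rules are chosen only for illustration, and the sketch assumes a machine where condition flags do not matter, so treat it as a hedged example rather than anything a real back-end does verbatim.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Instructions are plain strings in a made-up assembly syntax.
std::vector<std::string> peephole(const std::vector<std::string>& code) {
    std::vector<std::string> out;
    for (const std::string& ins : code) {
        // Redundant instruction: adding zero changes nothing
        // (assuming flags are irrelevant on this toy machine).
        if (ins == "add r1, 0") continue;

        // "push X" immediately followed by "pop X" cancels out.
        if (!out.empty() && ins.rfind("pop ", 0) == 0 &&
            out.back() == "push " + ins.substr(4)) {
            out.pop_back();
            continue;
        }

        // Better instruction: multiplying by two becomes a shift.
        if (ins == "mul r1, 2") {
            out.push_back("shl r1, 1");
            continue;
        }

        out.push_back(ins);
    }
    return out;
}
```

Given push r1, pop r1, mul r1, 2, add r1, 0, ret, the pass returns just shl r1, 1 followed by ret.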

We have an IR-level optimizer that removes redundant parts from this tree and does some other basic things to optimize branches. We also have a high-level IR optimizer that is currently used in the x86 compiler, but this optimizer will eventually also be used in the MIPS or 68K compilers. The PPC back-end is a little special because most of its optimization is done in the back-end. This makes sense, because the PPC’s RISC instructions are very simple, so you can do a lot of high-level optimizations (such as loop optimizations and common subexpression elimination) with finer control on the actual machine level and get better results. This would be very hard to do for a CISC processor like the 68K with all its complex instructions and addressing modes.
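Removing redundant parts from a tree can be as simple as folding constants and dropping identity operations. The following sketch assumes a hypothetical Node type and handles only addition and multiplication; it shows the general technique, not the rules of CodeWarrior's optimizer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Node {
    enum Op { Const, Var, Add, Mul };
    Op op = Const;
    long value = 0;    // for Const
    std::string name;  // for Var
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> constant(long v) {
    auto n = std::make_unique<Node>();
    n->value = v;
    return n;
}

std::unique_ptr<Node> variable(std::string name) {
    auto n = std::make_unique<Node>();
    n->op = Node::Var;
    n->name = std::move(name);
    return n;
}

std::unique_ptr<Node> binary(Node::Op op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

bool isConst(const Node& n, long v) { return n.op == Node::Const && n.value == v; }

// Bottom-up simplification: children first, then the node itself.
std::unique_ptr<Node> simplify(std::unique_ptr<Node> n) {
    if (n->op == Node::Const || n->op == Node::Var) return n;
    n->left = simplify(std::move(n->left));
    n->right = simplify(std::move(n->right));

    // Constant folding: both operands are known at compile time.
    if (n->left->op == Node::Const && n->right->op == Node::Const) {
        long l = n->left->value, r = n->right->value;
        return constant(n->op == Node::Add ? l + r : l * r);
    }
    // Identity operations: x + 0 and x * 1 are just x.
    if (n->op == Node::Add && isConst(*n->right, 0)) return std::move(n->left);
    if (n->op == Node::Mul && isConst(*n->right, 1)) return std::move(n->left);
    return n;
}
```

Applied to (x * 1) + (2 * 3), the tree shrinks to x + 6, so the back-end never sees the multiplication at all.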

Dave: With that in mind, what did your original development environment look like, from a technical perspective?

Andreas: It had pretty much all the basic functionality you need to write programs in C. It had a project window, a multi-window text editor, find and replace, and some simple Preference dialogs. It even had some nice little features such as function popups and multiple access paths, but a lot of major features such as multi-language support, plug-in compiler support, collapsible project views, syntax coloring, split-pane editing, multiple-pane Preference dialogs, a tool bar, and tool-server support, have been added to it since then.

Dave: How would you compare your original compiler architecture with the current CodeWarrior architecture?

Andreas: The original compiler and linker didn’t support a plug-in interface, and everything had to be linked into the IDE. I always had multiple back-end support in mind, and I was always using the IR tree. However, when John McEnerney started writing the PPC back-end, I had to clean up the front-end/back-end interface, and we had to change a few other minor things. There were also some changes in the 68K back-end, to support some Pascal-specific features such as sets and nested local variables.

There have been a lot of changes in the front-end. The original compiler had some basic support for C++ classes and function overloading, so a lot of stuff had to be added since then. I also had to change quite a few things to support multiple platforms (Mac OS, x86, MIPS). Recently, I had to make some additions to the IR to support zero runtime overhead exception handling. This also required changes in the back-ends.

Dave: What special work do you have to do in the front-end to enable multiple platform support? For example, what did you have to do to CodeWarrior to make sure it would support x86, MIPS, and 680x0 code generation?

Andreas: There are a number of areas where a front-end needs to be aware of back-end requirements. For example, a C struct’s member layout is done in the front-end, so you have to be aware of the data-member alignment of the target architecture. Another problem has to do with integral and floating-point type differences. For example, a long double is an 80-bit type on 68K but a 64-bit type on the PPC. In the same vein, the x86 uses a different byte ordering (little-endian) than the 68K (big-endian). Most of the functions that deal with these issues have been isolated into target-specific front-end files.
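Both issues, member alignment and byte order, can be illustrated with two small functions. The names layOut and encode32 are invented, and the alignment caps passed in below (2 and 4) are illustrative parameters rather than documented values for any CodeWarrior target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Member { std::size_t size; std::size_t align; };

struct Layout {
    std::vector<std::size_t> offsets;  // one offset per member
    std::size_t totalSize;             // including trailing padding
};

// maxAlign models a target rule that caps member alignment.
Layout layOut(const std::vector<Member>& members, std::size_t maxAlign) {
    Layout result{{}, 0};
    std::size_t offset = 0, structAlign = 1;
    for (const Member& m : members) {
        std::size_t a = std::min(m.align, maxAlign);
        assert(a != 0);
        offset = (offset + a - 1) / a * a;  // round up to the member's alignment
        result.offsets.push_back(offset);
        offset += m.size;
        structAlign = std::max(structAlign, a);
    }
    // Pad the end so that arrays of the struct stay aligned.
    result.totalSize = (offset + structAlign - 1) / structAlign * structAlign;
    return result;
}

// Stores a 32-bit value in the byte order the target expects.
std::vector<std::uint8_t> encode32(std::uint32_t v, bool bigEndian) {
    std::vector<std::uint8_t> bytes(4);
    for (int i = 0; i < 4; ++i) {
        int shift = bigEndian ? 8 * (3 - i) : 8 * i;
        bytes[i] = static_cast<std::uint8_t>(v >> shift);
    }
    return bytes;
}
```

For a struct with a 1-byte, a 4-byte and a 2-byte member, a cap of 2 gives offsets 0, 2, 6 and a size of 8, while a cap of 4 gives offsets 0, 4, 8 and a size of 12. The same source declaration therefore has two different memory images, which is why the front-end has to ask the target before laying out the struct.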

There are also some language-specific issues. For example, both Apple and Microsoft have their own C and C++ language extensions, and the front-end needs to support all of them. One of the biggest problems is C++. There are many very powerful features in C++ like multiple inheritance, polymorphism, exception handling and runtime type identification. The ANSI C++ Standard defines how all these features work, but it does not define how they should be implemented, so every compiler vendor has their own implementation. For example, there are many ways to implement virtual function calls or to allocate base classes in a derived class hierarchy. We always had our own implementation for this, but now, with the x86 compiler, we also have to conform to Microsoft’s standard class layout. We’re still targeting full compatibility, and this requires more work in the front-end as we go forward.

Dave: Beginning with CW8, CodeWarrior offered support for zero runtime overhead exception handling. First, what is zero runtime overhead exception handling? Second, what advantages does it offer?

Andreas: C++ exception handling requires that all local class variables that have been constructed between a try and a throw be destroyed before an exception is handled in a catch block. This process is called stack unwinding. Our first exception implementation was based on a pseudo-setjmp/longjmp, and a linked list of all active local stack objects needing destruction. This was relatively easy to implement in the front-end, and it required no changes in any of the back-ends. So we were able to support exceptions in all our back-ends at the same time.

However, this implementation requires some runtime overhead. For example, each local class object that needs destruction has to be registered when it is constructed and unregistered when it is destroyed. Also, the setjmp call that was required for every try block was very expensive on the PPC, because it had to save all processor registers in a local buffer. Finally, this implementation had to modify a global variable, so it did not work very well with threads.
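The costs described here are easier to see in a hand-written imitation of such a scheme. The sketch below is an assumption throughout: CleanupRecord, TryFrame and the function names are invented, and it uses only trivially destructible locals so that calling longjmp stays well defined in C++. It is not the code CodeWarrior generated.

```cpp
#include <cassert>
#include <csetjmp>

// One record per live local object that needs cleanup.
struct CleanupRecord {
    void (*destroy)(void* object);
    void* object;
    CleanupRecord* next;
};

// One frame per active try block.
struct TryFrame {
    std::jmp_buf target;         // saved registers: the expensive part
    CleanupRecord* cleanupMark;  // head of the list when the try was entered
    TryFrame* outer;
};

// Global state: this is what makes the scheme awkward with threads.
static CleanupRecord* gCleanupList = nullptr;
static TryFrame* gTryFrames = nullptr;
static int gDestroyedCount = 0;

void registerObject(CleanupRecord* r) { r->next = gCleanupList; gCleanupList = r; }
void unregisterObject(CleanupRecord* r) { assert(gCleanupList == r); gCleanupList = r->next; }
void destroyResource(void* object) { ++*static_cast<int*>(object); }

[[noreturn]] void throwException() {
    TryFrame* frame = gTryFrames;
    assert(frame != nullptr);
    // Stack unwinding: destroy everything registered since the try began.
    while (gCleanupList != frame->cleanupMark) {
        CleanupRecord* r = gCleanupList;
        gCleanupList = r->next;
        r->destroy(r->object);
    }
    gTryFrames = frame->outer;
    std::longjmp(frame->target, 1);
}

void worker(bool shouldThrow) {
    CleanupRecord rec{destroyResource, &gDestroyedCount, nullptr};
    registerObject(&rec);         // paid on every construction
    if (shouldThrow) throwException();
    unregisterObject(&rec);       // paid on every normal destruction
    rec.destroy(rec.object);
}

// Returns true if the "catch block" ran.
bool runGuarded(bool shouldThrow) {
    TryFrame frame;
    frame.cleanupMark = gCleanupList;
    frame.outer = gTryFrames;
    gTryFrames = &frame;
    if (setjmp(frame.target) == 0) {  // paid on every try, even if nothing throws
        worker(shouldThrow);
        gTryFrames = frame.outer;
        return false;
    }
    return true;
}
```

Note that runGuarded(false) still performs the setjmp, the registration and the unregistration even though no exception is ever thrown, which is exactly the overhead at issue.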

Our zero runtime overhead implementation no longer uses setjmp or a linked list. Instead, the stack unwinding process is done by examining the processor’s stack and using a pretty complex exception table that tells the exception handler how to locate local variables and how to destroy them. So there is no runtime overhead required for saving registers or registering local objects. Throwing an exception is actually a little bit slower because the stack unwinder is a lot more complicated, but usually exceptions are really exceptional, so code that never throws pays nothing and your application runs much faster overall. Another advantage is that this exception model doesn’t have to modify any global variables, so it is really thread-safe.

The only disadvantage is that the stack unwinder can only unwind functions that have an entry in the exception table. The current Mac OS doesn’t have any exception tables, so you can no longer throw an exception in an OS callback function and catch it in your main program, which was possible (but not recommended) in the previous implementation.
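A table-driven unwinder can be imitated with a simulated call stack. In this sketch the table format, the use of plain numbers as return addresses and the names TableEntry and unwind are all assumptions; a real exception table is far more detailed. It does show both properties discussed above: nothing is recorded while the program runs normally, and a frame without a table entry stops the unwinder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical table entry, produced at compile time for a range of code.
struct TableEntry {
    std::size_t pcStart, pcEnd;       // half-open code range [pcStart, pcEnd)
    std::vector<std::string> locals;  // objects to destroy when unwinding here
    bool hasHandler;                  // range lies in a try with a matching catch
};

struct UnwindResult {
    bool caught = false;
    std::vector<std::string> destroyed;
};

const TableEntry* lookup(const std::vector<TableEntry>& table, std::size_t pc) {
    for (const TableEntry& e : table)
        if (pc >= e.pcStart && pc < e.pcEnd) return &e;
    return nullptr;
}

// callStack holds simulated return addresses, innermost frame first.
UnwindResult unwind(const std::vector<TableEntry>& table,
                    const std::vector<std::size_t>& callStack) {
    UnwindResult result;
    for (std::size_t pc : callStack) {
        const TableEntry* entry = lookup(table, pc);
        if (entry == nullptr) return result;  // no table data: cannot go on
        result.destroyed.insert(result.destroyed.end(),
                                entry->locals.begin(), entry->locals.end());
        if (entry->hasHandler) {
            result.caught = true;
            return result;
        }
    }
    return result;
}
```

With entries for the ranges 100 to 200 and 300 to 400, a stack of 150, 350 unwinds cleanly to the handler. Insert an address such as 999 between them, standing in for an OS callback with no table entry, and the unwinder gives up before reaching the catch.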

Dave: ANSI C and ANSI C++ are two different languages, yet there is a single front-end to handle both. How does this work?

Andreas: ANSI C++ was derived from ANSI C, and most of its new features are really add-ons. Almost all ANSI C features are also supported in C++, so I was able to support both languages from the same front-end by disabling C++ features depending on the state of a global variable. There are some syntactic and semantic differences, but I was able to code around those with some “if-else” statements. This really makes adding features or fixing bugs a lot easier, because everything will only have to be done once. What’s also really interesting is that our compiler architecture allows us to treat both C and C++ with the same front-end, making the transition from C to C++ much easier because it’s the same compiler. Just flip a switch and write your code.
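The global-switch approach might look something like the following in miniature. The Language enum, the gLanguage variable and the keyword lists are hypothetical and much shorter than any real compiler would need; the point is only that one code path serves both languages.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class Language { C, CPlusPlus };

// Hypothetical global that selects the language being compiled.
static Language gLanguage = Language::C;

// One keyword test for both languages; C++ additions are switched
// on or off by the global instead of living in a second front-end.
bool isKeyword(const std::string& word) {
    static const std::set<std::string> common = {
        "for", "while", "if", "else", "return", "struct"};
    static const std::set<std::string> cppOnly = {
        "class", "template", "namespace", "try", "catch", "throw", "virtual"};
    if (common.count(word) != 0) return true;
    return gLanguage == Language::CPlusPlus && cppOnly.count(word) != 0;
}
```

In C mode the word class is an ordinary identifier, so a C program may use it as a variable name; after setting gLanguage to Language::CPlusPlus the same function reports it as a keyword.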

Dave: When you first started on your compiler, work on a C++ standard had just begun (with the publication of the Annotated C++ Reference Manual, or ARM). How has this process evolved?

Andreas: The ARM was the only really useful C++ language reference when I started. This book was used as the base document for the ANSI C++ standard. It has a lot of gray areas, and the template chapter is really vague, but it has some useful sections that explain how certain C++ features like virtual functions can be implemented. Unfortunately, those sections have been removed from the standard.

The current ANSI C++ draft is the size of a phone book. Many features (namespaces, RTTI, bool/true/false, a complete C++ library) have been added to the original definition, and there are three or four revisions every year. It is very hard to keep track of all these changes, and I think it will take another two years until there is a final ANSI C++ standard.

 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved.