Dec 99 Factory Floor

Volume Number: 15 (1999)
Issue Number: 12
Column Tag: From the Factory Floor

Debugging AltiVec

by Richard Atwell, ©1999 by Metrowerks, Inc., all rights reserved

Compile and debug your AltiVec code with CodeWarrior

About The Factory Floor

The Factory Floor started as a means of providing a Metrowerks presence in MacTech where you could read about the state of CodeWarrior and get to know the people behind it.

Dave Mark has been writing for the Factory Floor for about as long as I can remember. He's interviewed most of the engineers at Metrowerks and covered pretty much everything we've done since the introduction of the Power Macintosh. Dave's done a remarkable job month after month without taking a vacation so we're going to give him a break from writing for the first time in four years.

This month I'm going to show you what we've done to the debugger to support AltiVec code, but first I'll update you on the progress of the IDE and share news about future updates.

Factory Updates

In the last article I explained how our new update strategy works. The next update should have shipped by the time this article appears in print.

Factory Update 5.3 includes updated compilers, an updated version of MSL, and the latest IDE that contains bug fixes in the integrated debugger and incorporates all the changes since the last release. The last time we updated the integrated debugger was when we shipped Pro5 back in July.

The update planned to follow 5.3 will include carbonized versions of MSL from our compiler/library team and a carbonized version of PowerPlant from Greg Dow, both based on the latest sources. The Pro5 reference CD contained pre-release versions of a carbonized MSL and a carbonized version of PowerPlant based on the last Pro4 update, 1.9.3. If you can't wait for the next update, that release is a good place to see what's changed, although it's a little out of date with respect to the latest Carbon SDK from Apple.

In addition, we'll have a fully Carbonized IDE release that includes compilers and a single-machine debugging solution for Mac OS X. You can catch up on the remote debugging news in the August 1999 issue. You'll not only get a single IDE that runs on Mac OS 7-9 and Mac OS X, but one with new features as well.

AltiVec Support

If you're not aware of what AltiVec can do, be sure to read Tom Thompson's excellent article on AltiVec (MacTech July 1999). AltiVec is vector technology that leapfrogs current system performance obstacles, and it is built into every shipping desktop Power Macintosh.

Aside from several debugger bug fixes, the latest IDE includes complete support for debugging AltiVec code. The Pro5 IDE wasn't well suited for AltiVec debugging because the support was incomplete. AltiVec debugger support was originally added to IDE 3.3 but, because of shipping schedules, the enhancements never completely made it over to IDE 4.0 in time for Pro5.

We've fixed this in the latest release with updates to both the PowerPC compilers and the integrated debugger. There is also a new version of the MetroNub Plugin that handles AltiVec, and an AltiVec-enabled MetroNub extension that provides the support for older debuggers that I promised in a previous article.

Let's take a tour of the new features. I'll assume that you've read Tom Thompson's article and are familiar with the AltiVec basics.

AltiVec Compiler Panel

AltiVec differs from other vector implementations because a high level programming interface is available using C/C++. Take a look at the PPC Processor panel that is available from your project target settings.


Figure 1. PPC Processor Panel.

We've added a new target processor called AltiVec. For now, this is the only G4-class PowerPC processor option, and it affects the scheduler in the compiler if you have "Schedule Instructions" enabled. The actions taken by the compiler when using this setting are based on the functional design of the G4.

You'll probably be re-writing a small portion of your program to take advantage of AltiVec but you'll still want to support the older non-AltiVec processors out there as well.

The "AltiVec Programming Model" option will enable the extensions in the compiler that deal with AltiVec additions to the C/C++ language as well as to changes in the PowerPC ABI.

In order to help you write your code, when this option is on, __VEC__ is pre-defined by the compiler. You can use this to conditionalize your code. For example:

void GraphicsEngine(void)
{
	#if __VEC__
		// AltiVec programming model is on: use the vectorized renderer
		AltiVecEnhanced_SceneRendering();
	#else
		// plain PowerPC fallback
		PPC_SceneRendering();
	#endif
}
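
Keep in mind that the #if test is resolved at compile time, so a given binary ends up with one path or the other. If a single binary has to run on both G4 and older processors, one possible approach is to pick the routine at run time through a function pointer. The following is only a sketch under stated assumptions: scale_scalar, scale_vector, choose_scale and cpu_has_altivec are hypothetical names, the detection routine is stubbed out to report "no AltiVec", and the vector path is a placeholder that reuses the scalar loop.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*scale_fn)(float *data, size_t count, float factor);

/* Plain PowerPC path: one element per iteration. */
void scale_scalar(float *data, size_t count, float factor)
{
    size_t i;

    assert(data != NULL || count == 0);
    for (i = 0; i < count; i++)
        data[i] *= factor;
}

/* Placeholder for an AltiVec path. A real version would be compiled with
   the AltiVec programming model on; this stand-in reuses the scalar loop. */
void scale_vector(float *data, size_t count, float factor)
{
    scale_scalar(data, count, factor);
}

/* Hypothetical run-time check, stubbed to report "no AltiVec". */
int cpu_has_altivec(void)
{
    return 0;
}

/* Pick the routine once, then call through the pointer everywhere else. */
scale_fn choose_scale(void)
{
    return cpu_has_altivec() ? scale_vector : scale_scalar;
}
```

A caller would fetch the pointer once at startup with choose_scale() and use it for every call afterwards, so the check is not repeated in the inner loop.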

Over the years we've modified the compiler to support new C++ features and the like, and the traditional way to wrap your code so that older compilers won't try to compile it is to check the compiler version.

#if (__MWERKS__ >= 0x2300)
	// Use the built-in compiler feature
#else
	// fake it using a workaround
#endif

A good example of this is how bool support is handled in Apple's ConditionalMacros.h. For AltiVec, the compiler defines __ALTIVEC__ to indicate that it can generate AltiVec code. You probably won't have to use this as often as __VEC__, so just know that it's there if you need it.

The "AltiVec Programming Model" option allows you to enable two other sub-options. The first one, "Generate VRSAVE Instructions", tells the compiler that you want it to take care of informing the OS which vector registers should be saved across context switches. A context switch occurs every time your application is swapped out for another application. The G4 vector unit has 32 128-bit registers, and saving all of them can take numerous CPU cycles. VRSave lets you mark the specific registers that need to be saved before a context switch, preserving the state of the vector unit. Enabling this option lets the compiler do the bookkeeping for you. You might want to turn it off if you are writing hand-tooled assembly using our PPCAsm plugin to write your AltiVec code.
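
To make the bookkeeping concrete: VRSAVE is a 32-bit register with one bit per vector register, and in PowerPC bit numbering the most significant bit corresponds to v0. Here is a minimal sketch of the arithmetic involved. The function name vrsave_mask is hypothetical, and the code only computes a mask value; it does not touch the real register or reflect how the compiler actually emits its VRSAVE code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a VRSAVE-style mask from a list of vector register numbers (0..31).
   PowerPC numbers bits from the most significant end, so v0 maps to
   0x80000000 and v31 maps to 0x00000001. */
uint32_t vrsave_mask(const int *regs, size_t count)
{
    uint32_t mask = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        assert(regs[i] >= 0 && regs[i] <= 31);
        mask |= UINT32_C(0x80000000) >> regs[i];
    }
    return mask;
}
```

A routine that uses only v0, v1 and v31 would end up with the mask 0xC0000001, so only those three registers need to be saved.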

The second option, "Auto Vectorize (Ignored, TBI)", isn't available yet. This option was intended to let the compiler vectorize scalar code automatically, but we haven't had time to implement it yet.

The last addition for AltiVec is "Store Static Vector Data in TOC", and you have to enable "Store Small Static Data in TOC" to enable this option. If you choose this option your vector data will be placed in the fragment's TOC, which has the effect of reducing the number of memory reads needed to access the data, but the trade-off is that you lose some precious space in the TOC. Because there's a 64K TOC limit in the current compiler, you'll have to decide whether you can trade off the space for speed.

AltiVec Language Enhancements

The enhancements to the C/C++ language come in the form of the vector keyword. You may recognize vector as the C++ library's (from the former STL) sequential container. The Motorola designers who extended the C/C++ language to handle AltiVec didn't think programmers would be mixing AltiVec code and C++ in the same file so they decided to reuse this identifier as a new keyword in C.

Although this was awkward for our compiler writers to implement, you can now mix C, C++, function-level assembly and inline assembly all within the same file using the CodeWarrior compiler.

Vector variables are declared using these new types. All vectors are 128 bits wide and are composed of smaller scalar elements that are opaque in the sense that you can't directly access the elements inside.

vector [ unsigned | signed | bool ] int [*]
vector [ unsigned | signed | bool ] short
vector [ unsigned | signed | bool ] char
vector float
vector pixel

You may have seen code written with vector unsigned long. The int notation is preferred to the long or long int forms that are deprecated in the final programming interface specification from Motorola. Here's a simple vector declaration:

	vector unsigned int x = (vector unsigned int) (1, 2, 3, 4);

Notice that we have to cast the initializer. If you don't do this you will get a syntax error from the compiler. We can also use a shortcut if we want to initialize all the elements to the same value (called a splat):

	vector unsigned int x = (vector unsigned int) (1);

AltiVec code is written using intrinsic functions. These are the C equivalents of the assembly language instructions available in the assembler but take vector variables as parameters and return results in vector variables. Here's how to XOR two vector variables and store the result in one:

	vector unsigned int a = (vector unsigned int) (255);
	vector unsigned int b = (vector unsigned int) (-1, 0, -1, 0);

	a = vec_xor(a, b);

	- result -

	0xFFFFFF00 0x000000FF 0xFFFFFF00 0x000000FF
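
If you want to check a result like this by hand, you can model the four 32-bit elements with an ordinary array. The following is a portable sketch, not AltiVec code: vec4u, model_splat and model_vec_xor are hypothetical names, and the loop simply applies XOR element by element, the way vec_xor does across the register. With a holding 255 in every element and b holding (-1, 0, -1, 0), each element paired with -1 (all bits set) becomes 0xFFFFFF00 and each element paired with 0 stays 0x000000FF.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for a vector unsigned int: four 32-bit elements. */
typedef struct { uint32_t e[4]; } vec4u;

/* Fill every element with the same value, like the splat initializer. */
vec4u model_splat(uint32_t value)
{
    vec4u v;
    int i;

    for (i = 0; i < 4; i++)
        v.e[i] = value;
    return v;
}

/* Element-by-element XOR, mirroring what vec_xor does across the register. */
vec4u model_vec_xor(vec4u a, vec4u b)
{
    vec4u r;
    int i;

    for (i = 0; i < 4; i++)
        r.e[i] = a.e[i] ^ b.e[i];
    return r;
}

/* Rerun the XOR example with the model and verify each element. */
void check_xor_example(void)
{
    vec4u a = model_splat(255);
    vec4u b = { { 0xFFFFFFFFu, 0, 0xFFFFFFFFu, 0 } };   /* (-1, 0, -1, 0) */
    vec4u r = model_vec_xor(a, b);

    assert(r.e[0] == 0xFFFFFF00u && r.e[1] == 0x000000FFu);
    assert(r.e[2] == 0xFFFFFF00u && r.e[3] == 0x000000FFu);
}
```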

AltiVec adds 162 new PowerPC instructions and the compiler supports an intrinsic function for each one so you can do all your programming from C/C++.

In addition to the language changes there are 2 new pragmas supported by the compiler:

	#pragma altivec_model on | off
	#pragma altivec_vrsave on | off | allon

The first one is equivalent to the "AltiVec Programming Model" option in the PPC Processor panel.

The second one allows source code level control of the VRSave code generation that was described earlier. The allon option is like on except that all registers are saved instead of just the ones determined necessary by the compiler.

AltiVec MSL and Runtimes

To complement the changes in the compiler, we've modified MSL in several ways to support the AltiVec programming model.

printf and scanf both take a new control string option for formatting. The options %vl, %vh and %v break up the vector into 4, 8 or 16 elements respectively for printing and scanning. As with the other control strings, you need to follow these options with a conversion character to display hex or decimal. For example, this code yields the following result:

	vector unsigned int val = (vector unsigned int) (0x2A);

	printf(" hex: 0x%vlX\n dec: %vhd\n", val, val);

	- output -

	hex: 0x2A 2A 2A 2A
	dec: 0 42 0 42 0 42 0 42
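
The decimal line may look surprising at first: the same 16 bytes that hold four words of 0x0000002A read as eight halfwords alternating between 0x0000 and 0x002A. The sketch below models that carving in portable C. It is not the MSL implementation; pack_words_be and format_elements are hypothetical names, the bytes are laid out in the big-endian order the PowerPC uses, and elements are treated as unsigned for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Store four 32-bit values into 16 bytes in big-endian order, the byte
   order the PowerPC uses for a vector unsigned int in memory. */
void pack_words_be(uint8_t out[16], const uint32_t words[4])
{
    int i;

    for (i = 0; i < 4; i++) {
        out[4 * i + 0] = (uint8_t)(words[i] >> 24);
        out[4 * i + 1] = (uint8_t)(words[i] >> 16);
        out[4 * i + 2] = (uint8_t)(words[i] >> 8);
        out[4 * i + 3] = (uint8_t)(words[i]);
    }
}

/* Format the 16 bytes as 16 / elem_size unsigned decimal elements separated
   by spaces. elem_size is 4 (like %vl), 2 (like %vh) or 1 (like %v). */
void format_elements(char *dst, size_t dst_size,
                     const uint8_t bytes[16], int elem_size)
{
    size_t used = 0;
    int i, j;

    assert(elem_size == 1 || elem_size == 2 || elem_size == 4);
    assert(dst_size > 0);
    dst[0] = '\0';
    for (i = 0; i < 16; i += elem_size) {
        uint32_t value = 0;

        for (j = 0; j < elem_size; j++)
            value = (value << 8) | bytes[i + j];
        used += (size_t)snprintf(dst + used, dst_size - used, "%s%lu",
                                 i ? " " : "", (unsigned long)value);
        if (used >= dst_size)   /* output was truncated; stop here */
            break;
    }
}
```

Packing four copies of 0x2A and formatting with an element size of 4 gives "42 42 42 42", while an element size of 2 gives "0 42 0 42 0 42 0 42", which matches the dec: line above.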

Further additions include vector versions of the C memory allocators: vec_malloc(), vec_calloc(), vec_realloc(), and vec_free(). These routines return a 16-byte aligned address. This is very important for both correctness and performance of the vector unit, so be sure to use these with vector variables instead of the normal forms.
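
If you're curious how an allocator can guarantee 16-byte alignment, here is one common technique sketched on top of plain malloc(): over-allocate, round the address up to the next multiple of 16, and stash the original pointer just in front of the aligned block so it can be freed later. This is not how MSL implements vec_malloc(); aligned16_alloc and aligned16_free are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Return a 16-byte aligned block of at least size bytes, or NULL. */
void *aligned16_alloc(size_t size)
{
    unsigned char *raw;
    uintptr_t addr;
    void **slot;

    if (size > SIZE_MAX - 15 - sizeof(void *))
        return NULL;                /* request would overflow */

    /* Room for the payload, up to 15 bytes of padding, and a saved pointer. */
    raw = malloc(size + 15 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Skip past the pointer slot, then round up to a multiple of 16. */
    addr = ((uintptr_t)raw + sizeof(void *) + 15) & ~(uintptr_t)15;
    slot = (void **)addr - 1;
    *slot = raw;                    /* remember what malloc really returned */

    assert((addr & 15) == 0);
    return (void *)addr;
}

/* Release a block obtained from aligned16_alloc(). */
void aligned16_free(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

In real AltiVec code you would simply call vec_malloc() and vec_free(); the sketch is only meant to show why a pointer from the ordinary allocator can't be assumed to be suitable for vector loads and stores.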

There are no special versions of the MSL C.PPC.Lib or MSL C++.PPC.Lib libraries that you are required to use with AltiVec targets but you do have to select the correct runtime library.

The AltiVec programming model places some extra restrictions on the PowerPC ABI. One such restriction involves 16-byte alignment of data in memory. For example, stack frames have to be created and destroyed with this restriction so we've had to provide vector implementations of setjmp() and longjmp(). Changes to our C++ exception handler to support AltiVec were also required.

You'll need the correct version of the runtime routines depending on the AltiVec code generation setting. If you use any of the libraries on the left, be sure to use the AltiVec equivalent in your AltiVec targets.

	MSL MPWCRuntimePPC.Lib    ->  MSL MPWCRuntimeAltiVec.Lib
	MSL RuntimePPC++.DLL      ->  MSL RuntimeAltiVec++.DLL
	MSL RuntimePPC.DLL        ->  MSL RuntimeAltiVec.DLL
	MSL RuntimePPC.Lib        ->  MSL RuntimeAltiVec.Lib
	MSL StdCRuntimePPC.Lib    ->  MSL StdCRuntimeAltiVec.Lib


Figure 2. Example Project Window.

That briefly summarizes what's changed in the compiler and the libraries. I've only covered the basics here and you can read about all the details in the "AltiVec Technology Programming Interface Manual" from Motorola. Apple's website has a good introduction to AltiVec and they've provided some sample code for you as well.

AltiVec Register Window

Debugging your AltiVec code is just as easy as debugging any other code using CodeWarrior.

Vector variables can be stored in vector registers, or they can be stored in memory as local or global variables. You can use the register keyword to ask the compiler to keep a variable in a register but, just as with any other variable, it's up to the compiler whether to honor the request.

To make the viewing and manipulation of vector registers possible, a new register window was designed, and it is available from the Window menu. Because vector registers are 128 bits wide, the AltiVec Register window was created as a resizable window so that screen real estate can be preserved as much as possible, and a scrollbar has been included so you can navigate within the window at any window size.

Unlike the General and FPU Register windows, this window has a zoom control so you can maximize and minimize the window to examine the contents of all registers with a single click. If the window grows so that part of it would draw off-screen, the window repositions itself within the borders of the monitor so the entire window is visible. Clicking zoom again will restore the original window position and scroll the window to the last register you were looking at.


Figure 3. AltiVec Register Window.

Because vector variables really just contain multiples of scalar elements, each register is broken into four 32-bit quantities. This is useful for viewing elements as ints, shorts, chars and floats. When a vector register changes value you'll notice that only the scalar element that changed is drawn in red instead of the whole register. This helps you notice bugs in your code as you vectorize your existing algorithms and compare the results to the output the originals generate.

Finally, there is a contextual menu available that you can activate to change the display format from hex to float. When you do this for any scalar element all elements change at once as a convenience for you.
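
Switching the display from hex to float doesn't change the register contents; it only reinterprets each 32-bit element as a single-precision value. A minimal sketch of that reinterpretation in C, assuming float is a 32-bit IEEE 754 type (bits_to_float is a hypothetical helper, not a debugger API):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits. */
float bits_to_float(uint32_t bits)
{
    float f;

    assert(sizeof f == sizeof bits);   /* assumes a 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For example, an element shown as 0x3F800000 in hex reads as 1.0 in float format, and 0xC0000000 reads as -2.0.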

AltiVec Stack Crawl Window

Motorola states that vector variables are opaque structures and access to individual elements inside is forbidden but that's not very useful for debugging purposes.

Displaying vector variables as 16-byte quantities was awkward, so we decided to display variables as structs with array-like labels. Take a look at the stack crawl window and you'll see that vector variables in registers are denoted with a ®VRx label in the right column, while an address indicates that the variable is stack-based.


Figure 4. Stack Crawl Window.

Normally variables in registers aren't labeled with the register name, but for vector variables we do so because they are treated as structs in the debugger and structs can't fit within registers.

When disclosed, vector variables reveal the scalar elements that make up the variables. When the variable changes only the elements that change will hilite in red just as other array or struct elements hilite when changed. You can also edit elements one at a time which is much easier than having to insert text into a 16-byte hex value.

You can convert any of the elements using the Data menu to display the elements of a vector as hex, signed values or other convertible formats. Because some vector variables have 16 elements inside, we've automated the formatting of all elements at once whether they are in registers or on the stack. You can also reformat elements using the contextual menu.

Vector variables can be thought of as arrays of scalar elements, but if you remember, I called them structs earlier. This is an important distinction because you can't use array notation to access vector variables in expressions.

For expressions and conditional breakpoints you need to think of them as structs. Here's an example that will stop at the breakpoint when the first element of result equals 69.


Figure 5. Conditional Breakpoints using an expression.

The [0] is the label of the first element inside the vector variable result. If you find this odd you're probably not alone because this isn't valid C/C++ syntax but at least we have a way to use vector variables in expressions.

Development Tips

Once you learn the intrinsic operators, you can concentrate on vectorizing the bottlenecks in your program instead of dealing with the awkwardness of assembly language. Optimizing AltiVec code is another article by itself, maybe even a book, but here are a few tips to consider.

Always look for register spillage. This occurs when there are too many variables and too few registers to store them in. Spilled registers are stored on the stack, and it's well known that programs run faster when computation is done entirely in the registers.

Look for dependencies that reduce the usage of the multiple functional units within the G4 processor. Try to interleave usage of the permute unit with the integer or floating point units. For example:

Permute, multiply, permute, xor, permute, add, etc.
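
The same idea can be seen in ordinary scalar code. In the sketch below, the first routine forms one long dependency chain because every add needs the previous sum, while the second keeps four independent sums going so that several operations can be in flight at once. This is only an illustration of breaking dependencies, not AltiVec code; the function names are hypothetical, and whether it pays off depends on the compiler and the processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One long dependency chain: every add needs the previous sum. */
uint32_t sum_chained(const uint32_t *data, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < count; i++)
        sum += data[i];
    return sum;
}

/* Four independent chains that can be in flight at the same time. */
uint32_t sum_interleaved(const uint32_t *data, size_t count)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i;

    assert(data != NULL || count == 0);
    for (i = 0; i + 4 <= count; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    for (; i < count; i++)      /* leftover elements */
        s0 += data[i];
    return s0 + s1 + s2 + s3;
}
```

Both routines return the same total; only the shape of the dependencies differs.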

You can easily double your performance using this technique. Lastly, always double-check your alignment. AltiVec register loads and stores must always be 16-byte aligned in memory so if you don't abide by the rules your code won't be correct.
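
Part of what makes alignment bugs hard to spot is that the vector load and store instructions ignore the low four bits of the address instead of faulting, so a misaligned pointer quietly reads or writes the enclosing 16-byte block. A small sketch of the check and of the address the hardware would actually use follows (is_aligned16 and vector_effective_address are hypothetical helper names):

```c
#include <assert.h>
#include <stdint.h>

/* True when ptr sits on a 16-byte boundary. */
int is_aligned16(const void *ptr)
{
    return ((uintptr_t)ptr & 15) == 0;
}

/* Address a vector load or store would actually use: low four bits cleared. */
uintptr_t vector_effective_address(uintptr_t address)
{
    uintptr_t ea = address & ~(uintptr_t)15;

    assert(ea <= address && address - ea < 16);
    return ea;
}
```

A pointer at address 0x1004, for instance, would be treated as 0x1000, so an assert on is_aligned16() in your debug builds is a cheap way to catch the problem early.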

Summary

Being able to write vectorized code in C/C++ and debug it with a source level debugger is a huge improvement over other vector implementations like Intel's MMX, SSE and AMD's 3DNow!.

Because the CodeWarrior compiler has full C/C++ support for the AltiVec programming model, you only have to use assembly if you want to. For example, all of Photoshop's AltiVec support was written entirely in C; no assembly.

This capability makes it a lot easier to integrate and reuse code as well as debug and maintain your programs than if you had to write and debug entirely in assembly.

Good luck with your AltiVec endeavors and come visit our booth at MacWorld SF in January.

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.