TweetFollow Us on Twitter

Efficient 68030
Volume Number:8
Issue Number:5
Column Tag:Efficient coding

Efficient 68030 Programming

Optimizing your code to run faster on the 68030

By Mike Scanlin, MacTutor Regular Contributing Author

So you want to optimize your code to run faster on the 68030? Well, good for you. This article will help. Most of these optimizations apply to running on the 68020/040, too, and in no case will any of these optimizations make your code run slower on a non-030 machine (like the 68000). These optimizations are not language specific, although most of the examples given here are in Think C and assembly.

The two most important things you need to know in order to make your program run fast on a 68030 are: (1) How to intelligently align your code and data and (2) How to use the data and instruction caches effectively. Once you understand a few basic attributes of the 030 you can do a few things to your code without much work to ensure that you have optimal cache use and alignment.

How much difference will it make, you ask? Using these techniques you can save about 15% on your subroutine calling overhead, 25% to 50% on some of your 32-bit memory accesses (reading or writing longs, Ptrs, Handles, etc), 60% on your NewHandleClear calls and more. Taken in sum these things are likely to make a noticeable difference in your code’s performance when running on an 030.

The single biggest gain you can realize in your programs when running on an 030 is to make sure your 4-byte variables (longs, Ptrs, Handles, etc) are on 4-byte memory boundaries (i.e. their address is evenly divisible by 4). To understand why this makes a difference you need to know a little bit about how the data cache works.

THE DATA CACHE

The 68030 has a 256 byte data cache where it stores copies of the most recently accessed memory locations and values. If your code reads the same memory location twice within a short period of time, it is very likely that the 68030 can get the value at that location from its cache (which is in the processor itself) rather than from actual memory. This reduces the time it takes to execute the instruction accessing the memory location because it eliminates the external bus cycle that the processor would normally use to go out into memory and fetch the value that the instruction needs.

Of course, it’s not quite as simple as that. There are a couple of other details to know. The 256 byte data cache is actually broken up into 16 separate caches of 16 bytes each. When the processor is trying to fetch the value at a given memory location it first checks if it is in one of its 16 data caches. If it is, then it uses the value and everything is cool. If it’s not, then it goes out into memory and fetches the value along with the 16 or 32 bytes surrounding the value. If x is the memory location requested, then the processor fetches all the bytes from location ((x >> 4) << 4) to ( (x + sizeof(fetch)) >> 4) << 4) + 15), where sizeof(fetch) is 1, 2 or 4 depending on if this is a byte, word or long access. Basically, it fetches enough 16 byte “buckets” to completely cover the request (and each bucket begins at an address divisible by 16). The reason why more than one 16-byte bucket might be necessary is because the first part of the requested value could be at the end of one bucket and the second part could be at the beginning of the next bucket. An example is the low memory global GrayRgn, whose address is $9EE - the first half (hi-word) exists in the bucket from $9E0 to $9EF and the second half (lo-word) exists in the bucket from $9F0 to $9FF. So, assuming GrayRgn isn’t in the data cache to begin with, the instruction:

Move.L  GrayRgn,A0

will cause 32 bytes to be read in from memory (4 of which the processor actually uses to process the Move.L instruction). At the end of the instruction, two of the 16 data caches are filled with the values from $9E0 to $9FF. The reason why the processor does this is because it is predicting that the code is going to want the bytes near the most recently requested memory location. Although the first access to a 16-byte bucket is expensive, subsequent accesses to the same bucket are free (in terms of time to fetch the data). This actually helps routines like BlockMove quite a bit (it’s 46% faster for non-overlapping moves because of it). And it’s not quite as sick as you might expect to read in 16 or 32 bytes from memory because the processor uses a special “data burst” mode to do it (which runs somewhat parallel with other stuff the processor is doing).

The other detail to know about the 030 data cache is that it’s a write-through cache. That means that if you change the value at a given location and that location is already in the data cache, then both the cached value and the real memory value are updated. This way you can never have stale data in the cache (i.e., the cache always agrees with actual memory and vice versa). (Note: That’s somewhat of a lie. It is actually possible to get stale data in an 030 data cache if you have a DMA card writing to main memory at interrupt time, but I assume that if you’re doing that then you already know what’s going on. Most people won’t ever have to worry about this detail. Check out TN #261: “Cache As Cache Can” for more info on this.)

The last little thing to know about alignment is that independent of the 16-byte buckets the data cache uses, it is almost always faster to access a 4-byte quantity on a 4-byte boundary than on a 2-byte boundary (the only time it’s not is if the entire 4 bytes are already in the data cache). How much difference does it make? In the case of reading a 4-byte value it is 34% faster if the value is long-aligned and in the case of writing a 4-byte value it is 25% faster if the value is long-aligned (and if the non-aligned value crosses a 16-byte bucket boundary then it’s 52% faster for reads and 25% faster for writes; notice there is no penalty, other than the normal non-aligned penalty, for writing to a 4-byte value that crosses a 16-byte boundary because write operations don’t cause 32-byte reads to happen).

So, the take home message is that you should always put your 4 byte quantities (longs, Ptrs, Handles, etc) at addresses that are divisible by 4 (or, to be more general, put your n-byte quantities on n-byte boundaries). At the time the Macintosh was originally conceived this was not a design goal because several of the low memory globals are 4-byte quantities at odd-word addresses. In fact, a few live at the worst possible location for a 4-byte quantity: those addresses ending in $E that will cause 32-byte reads when they are accessed: FCBSPtr ($34E), GrayRgn ($9EE), MinStack ($31E), ROMBase ($2AE) and WMgrPort ($9DE). There’s nothing we can do about those, but there is something we can do about your code.

ALIGNING YOUR VARIABLES AND DATA

The basic idea of aligning your variables is to keep all of your 4-byte variables/structs together and then have all of your 2-byte and 1-byte variables/structs follow. If you know that the first variable you allocate is long-word aligned and all the ones that follow it are multiples of 4-bytes in length, then you know that they are all going to be long-word aligned (likewise, if the first one is off then they are all going to be off).

Here is an example of some variables, some of which are guaranteed to not be aligned:

inti;
long  x;
char  c[2];
Ptrp;

Because i and c are 2-byte quantities, you know that one of x or p is going to be long-word aligned and the other one is not going to be long-word aligned, no matter where the compiler allocates them (assuming that they are all allocated next to each other in the order they’re declared).

Here’s a regrouping that is better:

/* 1 */

long  x;
Ptrp;
inti;
char  c[2];

Now, as long as you know that x is long-word aligned, you know that p is long-word aligned, too. Forcing long-word alignment of variables comes in three parts: globals, locals and data structures.

Before we get to the specific details, I should mention that the C language doesn’t specify the location or order of declared variables; it’s implementation dependent. Therefore, it’s more important that you understand what is desirable and why it is desirable. Then, you can consult your compiler manual or, more likely, write some test programs to see what it’s doing with your variables. In a perfect world you wouldn’t have to be concerned with alignment - it would be a compiler option. Until that time, read on.

ALIGNING GLOBAL VARIABLES

Because of the fact that the Mac memory manager always allocates memory on 4-byte boundaries (Ptrs and dereferenced handles always end in 0, 4, 8 or C) you know that the heap is going to start and end on a long-word boundary. Further, because you know that the jump table is a multiple of 8 bytes you can assume that register A5 will point to an address divisible by 4. This means that as long as you allocate your globals in chunks of 4 bytes you will have long-word alignment for them. The best way to do this is to allocate your biggest variables first, then your next biggest, and so on, on down to your smallest. Just make sure that you have an amount of each one that divides by four.

Good global allocation:

/* 2 */

/* First, the 4*n byte quantities */
 Rect r;

/* Now list the 4 byte quantities */
 Handle h;
 long x;
 char *p;

/* Now list the 2 bytes quantities */
 int  i, j, k;
 int  dummyInt1; /* need this for padding */

/* And finally the 1 byte quantities */
 char ch;
 Booleanb;
 int  dummyInt2; /* need this for padding */

Note that each section is a multiple of 4 bytes long. This is important because the Think C 5.0 compiler/linker allocates globals in backwards order from how they’re declared. If these are your only global variables:

long  x;
inti;

then x is at $FFFA(A5) and i is at $FFFE(A5). But if you put a dummyInt in there like this:

/* 3 */

long  x;
inti;
intdummyInt;

then x becomes long-word aligned at $FFF8(A5) (which is the goal).

The only gotcha with globals has to do with the QuickDraw (QD for short) globals which aren’t a multiple of 4 bytes long (they’re 206 bytes long). You need to know if your compiler allocates them before or after your own globals and maybe add a dummyInt either before or after all of your globals to compensate for the unaligned QD globals. Think C users don’t really have to worry about this problem because the Think compiler allocates the QD globals after the application globals (so the QD globals might be unaligned but there’s generally nothing you can do about that; at least they won’t affect your application globals’ alignment).

ALIGNING YOUR LOCAL VARIABLES

Aligning local variables is trickier because they are allocated on the stack. It requires that the stack be long-word aligned on entry to your routine, which you can’t always control (especially if you’re a callback proc). However, there is a little we can do to make this a more likely event.

First, we need to make sure the stack is aligned as soon as our code begins to execute. The easiest way to do this is by not having any local variables in main() and by adding a little code that makes sure the stack is aligned after launching by adjusting it by 2 if necessary. The following inline functions are used for this purpose:

/* 4 */

void AlignStack(void) = {
 0x200F,//Move.L SP,D0
 0x0240,0x0003,  //And  #0x0003,D0
 0x6604,//Bne.S  @1
 0x7004,//Moveq  #4,D0
 0x554F,//Subq #2,SP
 0x3F00 // @1  Move D0,-(SP)
};

void UnAlignStack(void) = {
 0xDED7,//Add  (SP),SP
};

If you need to, make main a shell proc that calls your “real” main, MyMain, so that you don’t have any local variables in main():

/* 5 */

void main() {
 AlignStack();
 MyMain();
 UnAlignStack();
}

If you do this, then you know for sure that MyMain will be called with an aligned stack. Now the goal is to make MyMain’s stack frame (as well as every other stack frame in the entire program) aligned - make sure you declare variables from largest to smallest and make sure that the total size of the stack frame (including parameters) is a multiple of 4 bytes (in addition, if you’re using Pascal then you need to include the size of the function result since it’s allocated on the stack). The following inline function is useful to put at the front of each function to make sure the stack is aligned on entry. It will cause a hang (branch-to-self instruction) if it detects an unaligned stack (in which case you know the caller of that function needs work):

/* 6 */

void CheckStackAlignment(void) = {
 0x2F00,//Move.L D0,-(SP)
 0x200F,//Move.L SP,D0
 0x0240,0x0003,  //And  #0x0003,D0
 0x66FE,// @1  Bne.S @1
 0x201F,//Move.L (SP)+,D0
};

Here’s an example program with all of these macros in use:

/* 7 */

void main() {
 AlignStack();
 MyMain();
 UnAlignStack();
}

void MyMain() {
 long x;
 int  i;
 int  d; // needed for alignment

#if DEBUG
 CheckStackAlignment();
#endif
 x = 5;
 i = MyProc(x);
}

int MyProc(long x) {
#if DEBUG
 CheckStackAlignment();


#endif
 return(x);
}

Not only are you ensuring that x is long-word aligned in this example but, because the stack is aligned the whole time you are getting another benefit: faster subroutine calls. The reason is because the processor can push and pop the return address (a 4-byte value) to an aligned location on the stack. The overall gain of an aligned stack on a Jsr/Bsr and Rts pair is 14% to 16% faster than the same Jsr/Bsr and Rts on a non-long-word aligned stack. Don’t forget, too, that if the routine does a Movem.L to save/restore the registers it uses on the stack that the Movem.L instructions will be much faster on a long-aligned stack. Note: This assumes that your compiler is not doing delayed stack clean up. If you are coding for speed and using these alignment techniques, then you need to turn off that compiler option. It is called “Defer & combine stack adjusts” in Think C (make sure it’s unchecked).

Note that you don’t want to put CheckStackAlignment in callback procs you pass to the ROMs (like filterProcs or ioCompletion routines) or in VBL tasks because you can’t control what the ROM or interrupt put on the stack before you are called. You’ll just have to live with whatever alignment you get at the time you’re called. You can, however, do something about the alignment of the routines called by your callback proc if you put an AlignStack/UnAlignStack in your callback proc. That way, only the callback proc’s parameters and local variables will be potentially unaligned - all procs that it calls will be aligned.

There is one more detail relative to functions: You should design your functions so that, in C, the parameters should add up to a multiple of 4 bytes and, in Pascal, the parameters plus the size of the function result should add up to a multiple of 4 bytes (because in Pascal the function result is on the stack and in C it’s in a register). Otherwise your carefully crafted stack frames won’t start at an aligned address. It is worth it to increase the size of an int parameter to long or even to add a dummy int parameter in order to maintain this multiple-of-4-bytes rule. If this is your original function:

int MyProc(int i) { return(i); }

then changing it to:

/* 8 */

int MyProc(long i) { return(i); }

will cause it to execute 17% faster when called from an aligned-stack caller. Changing 
it to:

/* 9 */

int MyProc(int i, int dummy) { return(i); }

with a calling sequence of:

/* 10 */

i = MyProc(i, 0);

will cause it to execute 9% faster when called from an aligned-stack caller.
There are three things to keep in mind when adding up the sizes of parameters: (1) 
One-byte parameters (Booleans, chars, bytes, etc) count as 2 bytes each because they are 
pushed on the stack as 16-bits each in order to maintain word-alignment on the stack (or 
else you’d get an address error on the 68000), (2) pointers to any size parameter (like 
char *) are 4 bytes each and (3) declare your parameters in the correct order. If you’re 
working in C, then declare them in order from smallest to largest (the opposite of what 
you do with global and local variables) - that way the 4-byte values (which are pushed 
first) will be long-aligned (assuming the stack is aligned at the time of the subroutine 
call). Usually, a C compiler will push parameters in order from right to left (because of 
functions like printf that take a variable number of parameters, the first parameter needs 
to be pushed last) so you want you’re biggest (4 byte) parameters at the end of the parameter 
list. Here are some examples of bad (unaligned) C prototypes:

void MyProc1(long x, int i);
int MyProc2(int i, long x, char c);
Handle MyProc3(Handle h, long x, int i);

Better (aligned) versions of these would be:

/* 11 */

void MyProc1(int i, long x);
int MyProc2(char c, int i, long x);
Handle MyProc3( int i, long x, Handle h);

Note that the size of the return result doesn’t matter in C. Also, you could do even better than these by making the sum of the sizes of the parameters a multiple of 4 bytes, as was mentioned above (make ints into longs or add dummy ints for filler; MyProc1 and MyProc3 need this). If you’re unsure which way your C compiler pushes the parameters or if you want to write portable code that compiles on two different C compilers that do it differently, then just make sure all of your non-4-byte parameters are in one part of the parameter list and that you have a multiple of 4 bytes worth of them. Then it won’t matter which way the compiler pushes them - the 4-byte ones (which are what we really care about) are guaranteed to be aligned (as is the stack when the routine that is called starts to execute).

In Pascal, it’s a little different. Parameters are pushed from left to right and only after space has been allocated for the function result (if there is one). So, you want to declare your parameters from largest to smallest in this case (the opposite of what you’d do in C). Here are the same prototypes in aligned Pascal:

{ 12 }

PROCEDURE MyProc1(x: LongInt; i: INTEGER);
FUNCTION MyProc2(i: INTEGER; x: LongInt; c: Char): INTEGER;
FUNCTION MyProc3(h: Handle; x: LongInt; i: INTEGER): Handle;

Notice that in Pascal the size of the return result matters. In MyProc2 we make the first parameter an int to make up for the function result INTEGER; that way x (the LongInt) will be aligned when it is pushed on the stack. If the return result is a 4-byte quantity (like the Handle in MyProc3), then you can forget about it and just list the parameters from biggest to smallest. Don’t forget that in Pascal the goal is to make the sum of all the parameters plus the function result be a multiple of 4 bytes (which is not true for any of the above). As in C, make INTEGERs into LongInts or add dummy INTEGERs if you need to.

ALIGNING DATA STRUCTURES

What about custom data structures? Where do they fit in? Well, the answer is simple: in order to have the least maintenance possible with all of this long-word alignment stuff, make sure all of your structs are multiples of 4 bytes long. Add a dummy field or two to the end if you need to, just make sure all structs are multiples of 4 bytes long. That way, if you declare them as a global or as a local, you will know they are multiples of 4 bytes long, and you can forget about them and put them in any section without ruining your alignment.

Bad (unaligned) data structure design:

typedef struct {
 int  i;
 long j;
} MyStruct;

Good (aligned) data structure design:

/* 13 */

typedef struct {
 long j;
 int  i;
 int  dummyInt; /* force the length to a multiple of 4 */
} MyStruct;

Some might say that the dummyInt is an unnecessary waste of space. I would argue that it is a good thing because it ensures that MyStruct is a multiple of 4 bytes, and it might come in useful some day when you need a couple extra bytes in MyStruct - think of dummyInt as unused and reserved for future expansion.

But what about system data structures? Yes, some of them are a problem. Take BitMap, for example:

struct BitMap {
    Ptr baseAddr;
    short rowBytes;
    Rectbounds;
};

There are two problems with it: (1) Bounds starts at an odd-word address (even though it’s really an array of 4 ints, someone might reference the first half as topLeft and it’ll be an unaligned 4-byte reference) and (2) The size is not a multiple of 4 bytes. Unfortunately, we can’t do anything about (1) but, we can do something about (2). Whenever you have a system data structure in your globals or locals and you’re not sure if it is aligned or no,t then you should add this filler variable to take up the extra space:

/* 14 */

BitMap  b;
char  filler1[4-(sizeof(BitMap)%4)];

The worst case is where the struct in question is already aligned. In that case, the filler will be 4 bytes long (when what you’d really like is to have it be zero bytes long). Still, if you’re coding for speed, then a few extra bytes are worth it (better yet, you could manually check to see if the size of the struct is aligned and then not put in the filler in that case).

INSTRUCTION CACHE

The 68030 has a 256 byte instruction cache where it stores instructions (code). It works much like the data cache in that there are 16 caches of 16 bytes each. When an instruction is fetched, the 16 or 32 bytes surrounding it are fetched into the instruction cache, too. Like the other caches, you can turn off the instruction cache if you want to. I haven’t mentioned this fact before because it is hardly ever a good idea. (If you are so motivated, though, you would use the Movec D0,CACR instruction to enable/disable/clear the caches; this is a priveledged instruction, though, so don’t ship any products that use it.).

The goal when writing small loops is to have everything contained within 242 bytes. You can’t assume all 256 bytes of the instruction cache unless your routine begins at an address divisible by 16; the worst case is that your code begins at an address that ends in $E, thus the 242 byte number (2 bytes from the first bucket plus 15 other complete buckets). The time difference between a fully cached function vs. the same function that isn’t fully cached is 24%. So, you can reduce the time of the entire loop by 24% if you can make it fit in 242 bytes or less (including the size of any routines called from within the loop, of course). Note that, in general, unrolling loops is a good way to speed them up (since a short branch not taken is faster than any size branch taken) but, make sure you don’t unroll them to be larger than 242 bytes.

If you have a IIci with a cache card or a IIfx, then you won’t notice this 24% performance gain; it only applies to the standard 030 machines: SE/30, Classic II, IIx, IIcx, IIsi, IIci (no card), PB 140 and PB 170. The IIci’s cache card and IIfx’s built in cache consist of 2048 16-byte caches (32K total) that can be used for data or instructions.

The 68020, by the way, only has a 64-byte instruction cache, which is 16 caches of 4 bytes each. If you want to make sure that your tight loops run well on that processor, then make sure they are 62 bytes or smaller. The 020 does not have a data cache.

The 68040 has 256 instruction caches of 16 bytes each, or 4K worth of instruction caches. It also has a similar 4K data cache.

INTERLEAVED WRITE OPERATIONS

Because memory write operations are write-through, the second of two successive write operations must wait until the first has finished before it can execute. Knowing this, you can use the delay

to your advantage to streamline the pipelining process. The idea is

to avoid two memory writes in a row in cases where you can.

Consider the code you might have at the top of a routine, where you save some registers and then set up some initial values:

Movem.L D0-D3,-(SP)
Moveq #x,D0
Moveq #y,D1
Moveq #w,D2
Moveq #z,D3

You can make this code execute 15% faster on a IIfx (it’s 19% faster on a IIci) if you interleave the memory writes with non-memory reads and writes:

;15

Move.L  D3,-(SP)
Moveq #z,D3
Move.L  D2,-(SP)
Moveq #w,D3
Move.L  D1,-(SP)
Moveq #y,D1
Move.L  D0,-(SP)
Moveq #x,D0

Of course, there is a tradeoff. This faster version takes 4 extra bytes. But if you’re 
coding for speed, it’s worth it.
SPECIAL INSTRUCTIONS
The 68030 supports some addressing modes and instructions that are not implemented 
on the 68000 (they do work on the 020/040, though). If you use these, then your program 
will not run on a 68000. If you have a tight loop that you’d like to run faster on 020s, 
030s and 040s, then consider modifying it to use these instructions. Be sure to do a run-time 
check for which CPU you’re running on (using Gestalt with the gestaltProcessorType selector) 
and only call routines that use these instructions if you’re running on the appropriate CPU 
(using a set of procPtrs that you install when you start up your app is a good idea). Rather 
than give you a dry listing of the new instructions, I’ll give a couple of examples of where 
the common ones might come into play when programming the Mac. If you really want to 
know more, then go get a book on the 030 (if you call Motorola they’ll probably send you 
a Programmer’s Guide for free: 408/749-0510).
Double dereferencing of handles can now be done in one instruction with a new addressing 
mode. If you want to put the first byte of a handle in A0 into D0, you would do this on the 
68000:

 Move.L (A0),A0
 Move.B (A0),D0

Now you can use this instruction instead (which is 35% faster):

;16

 Move.B ([A0]),D0

Of course, if you’re going to need the dereferenced handle for more than this one instruction, then you’re better off keeping the dereferenced handle in a register rather than using this addressing mode to repeatedly redereference it.

You can add an offset after the first dereference, too. If you want the 10th byte of the handle in A0, you used to do this:

 Move.L (A0),A0
 Move.B 9(A0),D0

On the 68020 (and 030/040) you can do this:

;17

 Move.B ([A0],9),D0

Or, suppose your handle points to a list of 8 byte elements, Rects, let’s say, and suppose you have a zero-based index of the Rect you want in D0 and suppose you want the botRight of that Rect in D1. On the 68000 you would do something like this:

 Move.L (A0),A0
 Asl  #3,D0
 Move.L botRight(A0,D0),D1

But now you can do this in a much more glamourous instruction:

;18

 Move.L ([A0],D0*8,botRight),D1

which is 22% faster and 2 bytes smaller than the old way (and, oh, so much more glamourous). There is another new addressing mode, too, which is nearly as glamourous, and has the general form of ([bd,An,Xn.SIZE*SCALE],od). Geez, I can see why people are moving towards RISC these days - these instructions are getting a bit complex

A new instruction, Extb.L, was added that allows you to sign extend a byte to a long. On the 68000 you have to use two instructions, Ext.W and Ext.L, to get this effect.

Bit fields were added, too. They’re really cool if you’re working with bitmapped graphics. It will be worth your time to go read about them if you do a lot of graphics work.

DON’T USE ‘CLEAR’

The NewPtrClear and NewHandleClear traps are a lot slower than they need to be. This is for two reasons: (1) because they use the Clr instruction (which does a memory read before setting the memory to zero; thus filling the data cache with useless data) and (2) because they only clear one byte at a time. The inner loop that they both use is something like this:

@1 Clr.B(A0)+
 Dbra D0,@1

By substituting this loop instead:

;19

 Moveq  #0,D1
@1 Move.B D1,(A0)+
 Dbra D0,@1

you get a 19% speed improvement. Furthermore, by clearing longs at a time (and dealing with the end as a special case) you can clear memory 65% faster than the Clr.B loop method. If you do a lot of NewPtrClears or NewHandleClears you probably want to write your own MyNewPtrClear and MyNewHandleClear that do this.

ASSEMBLY TRIVIA

Who knows what the slowest instruction on the 68030 is (other than Reset)? If you guessed something to do with division you’re wrong. If, on the other hand, you guessed:

;20

Movem.L  ([d32,An],Xn,d32),A0-A7/D0-D7

then you’re right. Weighing in at about 200 cycles you should avoid this instruction wherever possible (Quick! Go change your source!).

I would like to thank Chris B, Dolf S and John A for help in preparing this article. It’s always nice to have people around who can tell you precisely how confused you are. I would also like to thank Timo B for the use of his computer. It’s always nice to have someone around who has more hardware than you do.

 
AAPL
$95.60
Apple Inc.
-2.55
MSFT
$43.16
Microsoft Corpora
-0.42
GOOG
$571.60
Google Inc.
-15.82

MacTech Search:
Community Search:

Software Updates via MacUpdate

OneNote 15.2 - Free digital notebook fro...
OneNote is your very own digital notebook. With OneNote, you can capture that flash of genius, that moment of inspiration, or that list of errands that’s too important to forget. Whether you’re at... Read more
iStat Menus 4.22 - Monitor your system r...
iStat Menus lets you monitor your system right from the menubar. Included are 8 menu extras that let you monitor every aspect of your system. Some features: CPU -- Monitor cpu usage. 7 display... Read more
Ember 1.8 - Versatile digital scrapbook....
Ember (formerly LittleSnapper) is your digital scrapbook of things that inspire you: websites, photos, apps or other things. Just drag in images that you want to keep, organize them into relevant... Read more
OmniPlan 2.3.6 - Robust project manageme...
With OmniPlan, you can create logical, manageable project plans with Gantt charts, schedules, summaries, milestones, and critical paths. Break down the tasks needed to make your project a success,... Read more
Command-C 1.1.1 - Clipboard sharing tool...
Command-C is a revolutionary app which makes easy to share your clipboard between iOS and OS X using your local WiFi network, even if the app is not currently opened. Copy anything (text, pictures,... Read more
Knock 1.1.7 - Unlock your Mac by knockin...
Knock is a faster, safer way to sign in. You keep your iPhone with you all the time. Now you can use it as a password. You never have to open the app -- just knock on your phone twice, even when it's... Read more
Mellel 3.3.6 - Powerful word processor w...
Mellel is the leading word processor for OS X and has been widely considered the industry standard since its inception. Mellel focuses on writers and scholars for technical writing and multilingual... Read more
LibreOffice 4.3.0.4 - Free Open Source o...
LibreOffice is an office suite (word processor, spreadsheet, presentations, drawing tool) compatible with other major office suites. The Document Foundation is coordinating development and... Read more
Freeway Pro 7.0 - Drag-and-drop Web desi...
Freeway Pro lets you build websites with speed and precision... without writing a line of code! With it's user-oriented drag-and-drop interface, Freeway Pro helps you piece together the website of... Read more
Drive Genius 3.2.4 - Powerful system uti...
Drive Genius is an OS X utility designed to provide unsurpassed storage management. Featuring an easy-to-use interface, Drive Genius is packed with powerful tools such as a drive optimizer, a... Read more

Latest Forum Discussions

See All

Dawn of the Immortals Review
Dawn of the Immortals Review By Jennifer Allen on July 31st, 2014 Our Rating: :: RESPECTABLE EXPLORATIONUniversal App - Designed for iPhone and iPad Dawn of the Immortals might not re-invent the wheel, but it does tweak it a little... | Read more »
80 Days Review
80 Days Review By Jennifer Allen on July 31st, 2014 Our Rating: :: EPIC ADVENTUREUniversal App - Designed for iPhone and iPad A fantastic and fascinating re-envisioning of the classic novel by Jules Verne, 80 Days is a delightful... | Read more »
Battleheart Legacy Guide
The world of Battleheart Legacy is fun and deep; full of wizards, warriors, and witches. Here are some tips and tactics to help you get the most enjoyment out of this great game. | Read more »
Puzzle Roo Review
Puzzle Roo Review By Jennifer Allen on July 31st, 2014 Our Rating: :: PUZZLE-BASED TWISTUniversal App - Designed for iPhone and iPad A different take on the usual block dropping puzzle game, Puzzle Roo is quite pleasant.   | Read more »
Super Crossfire Re-Release Super Crossfi...
Super Crossfire Re-Release Super Crossfighter Coming Soon, Other Radiangames Titles Go 50% Off Posted by Ellis Spice on July 31st, 2014 [ | Read more »
Hexiled Review
Hexiled Review By Rob Thomas on July 31st, 2014 Our Rating: :: HEX SELLSUniversal App - Designed for iPhone and iPad In space, no one can hear you… spell? Hexiled is a neat concept for a word scramble puzzle, but it doesn’t go too... | Read more »
Summoners War: Sky Arena Passes 10 Milli...
Summoners War: Sky Arena Passes 10 Million Installs! Posted by Jessica Fisher on July 31st, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »
Deep Loot Review
Deep Loot Review By Jennifer Allen on July 31st, 2014 Our Rating: :: DIVE DEEPUniversal App - Designed for iPhone and iPad Dive deep in this fun explore-em-up that’s a little grind heavy but ultimately quite entertaining.   | Read more »
Despicable Me: Minion Rush is One Year O...
Despicable Me: Minion Rush is One Year Old, Gets its Biggest Update Yet Posted by Jennifer Allen on July 31st, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »
Fish & Shark Review
Fish & Shark Review By Jordan Minor on July 31st, 2014 Our Rating: :: FLAPPY FISHUniversal App - Designed for iPhone and iPad Fish & Shark’s beauty is only scale deep.   | Read more »

Price Scanner via MacPrices.net

Save up to $130 on an iPad mini with Apple re...
The Apple Store has Certified Refurbished 2nd generation iPad minis with Retina Displays available for up to $130 off the cost of new models, starting at $339. Apple’s one-year warranty is included... Read more
iPad Cannibalization Threat “Overblown”
Seeking Alpha’s Kevin Greenhalgh observes that while many commentators think Apple’s forthcoming 5.5-inch panel iPhone 6 will cannibalize iPad sales, in his estimation, these concerns are being... Read more
Primate Labs Releases July 2014 MacBook Pro P...
Primate Labs’ John Poole has posted Geekbench 3 results for most of the new MacBook Pro models that Apple released on Tuesday. Poole observes that overall performance improvements for the new MacBook... Read more
Apple Re-Releases Bugfixed MacBook Air EFI Fi...
Apple has posted a bugfixed version EFI Firmware Update 2.9 a for MacBook Air (Mid 2011) models. The update addresses an issue where systems may take longer to wake from sleep than expected, and... Read more
Save $50 on the 2.5GHz Mac mini, plus free sh...
B&H Photo has the 2.5GHz Mac mini on sale for $549.99 including free shipping. That’s $50 off MSRP, and B&H will also include a free copy of Parallels Desktop software. NY sales tax only. Read more
Save up to $140 on an iPad Air with Apple ref...
Apple is offering Certified Refurbished iPad Airs for up to $140 off MSRP. Apple’s one-year warranty is included with each model, and shipping is free. Stock tends to come and go with some of these... Read more
$250 price drop on leftover 15-inch Retina Ma...
B&H Photo has dropped prices on 2013 15″ Retina MacBook Pros by $250 off original MSRP. Shipping is free, and B&H charges NY sales tax only: - 15″ 2.3GHz Retina MacBook Pro: $2249, $250 off... Read more
More iPad Upgrade Musings – The ‘Book Mystiqu...
Much discussed recently, what with Apple reporting iPad sales shrinkage over two consecutive quarters, is that it had apparently been widely assumed that tablet users would follow a two-year hardware... Read more
13-inch 2.5GHz MacBook Pro on sale for $999,...
Best Buy has the 13″ 2.5GHz MacBook Pro available for $999.99 on their online store. Choose free shipping or free instant local store pickup (if available). Their price is $100 off MSRP. Price is... Read more
Save up to $300 on an iMac with Apple refurbi...
The Apple Store has Apple Certified Refurbished iMacs available for up to $300 off the cost of new models. Apple’s one-year warranty is standard, and shipping is free. These are the best prices on... Read more

Jobs Board

*Apple* Retail - Multiple Positions (US) - A...
Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, you're also the Read more
Sr. Product Leader, *Apple* Store Apps - Ap...
**Job Summary** Imagine what you could do here. At Apple , great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring Read more
Sr Software Lead Engineer, *Apple* Online S...
Sr Software Lead Engineer, Apple Online Store Publishing Systems Keywords: Company: Apple Job Code: E3PCAK8MgYYkw Location (City or ZIP): Santa Clara Status: Full Read more
Sr Software Lead Engineer, *Apple* Online S...
Sr Software Lead Engineer, Apple Online Store Publishing Systems Keywords: Company: Apple Job Code: E3PCAK8MgYYkw Location (City or ZIP): Santa Clara Status: Full Read more
*Apple* Solutions Consultant (ASC) - Apple (...
**Job Summary** The ASC is an Apple employee who serves as an Apple brand ambassador and influencer in a Reseller's store. The ASC's role is to grow Apple Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.