
Jun 94 Challenge
Volume Number:10
Issue Number:6
Column Tag:Programmers’ Challenge
!seealso: "May 94 Challenge" " Jul 94 Challenge"

Programmers’ Challenge

By Mike Scanlin, MacTech Magazine Regular Contributing Author

Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.

The rules

Here’s how it works: Each month there will be a different programming challenge presented here. First, you must write some code that solves the challenge. Second, you must optimize your code (a lot). Then, submit your solution to MacTech Magazine (formerly MacTutor). A winner will be chosen based on code correctness, speed, size and elegance (in that order of importance) as well as the postmark of the answer. In the event of multiple equally desirable solutions, one winner will be chosen at random (with honorable mention, but no prize, given to the runners up). The prize for the best solution each month is $50 and a limited edition “The Winner! MacTech Magazine Programming Challenge” T-shirt (not to be found in stores).

In order to make fair comparisons between solutions, all solutions must be in ANSI compatible C (i.e., don’t use Think’s Object extensions). Only pure C code can be used. Any entries with any assembly in them will be disqualified (except for those challenges specifically stated to be in assembly). However, you may call any routine in the Macintosh toolbox you want (i.e., it doesn’t matter if you use NewPtr instead of malloc). All entries will be tested with the FPU and 68020 flags turned off in THINK C. When timing routines, the latest version of THINK C will be used (with ANSI Settings plus “Honor ‘register’ first” and “Use Global Optimizer” turned on) so beware if you optimize for a different C compiler. All code should be limited to 60 characters wide. This will aid us in dealing with e-mail gateways and page layout.

The solution and winners for this month’s Programmers’ Challenge will be published in the issue two months later. All submissions must be received by the 10th day of the month printed on the front of this issue.

All solutions should be marked “Attn: Programmers’ Challenge Solution” and sent to Xplain Corporation (the publishers of MacTech Magazine) via “snail mail” or preferably, e-mail - AppleLink: MT.PROGCHAL, Internet:, CompuServe: 71552,174 and America Online: MT PRGCHAL. If you send via snail mail, please include a disk with the solution and all related files (including contact information). See page 2 for information on “How to Contact Xplain Corporation.”

MacTech Magazine reserves the right to publish any solution entered in the Programming Challenge of the Month. Authors grant MacTech Magazine the non-exclusive right to publish entries without limitation upon submission of each entry. Copyrights for the code are retained by the author.


Being able to factor quickly is an important part of breaking secret codes, I mean, writing cool Mac games. This month’s challenge, therefore, is to factor a 64-bit number into the two primes that were multiplied together to produce it.

The prototype of the function you write is:

/* 1 */
void Factor64(lowHalf, highHalf,
 prime1Ptr, prime2Ptr)
unsigned long lowHalf;
unsigned long highHalf;
unsigned long *prime1Ptr;
unsigned long *prime2Ptr;

highHalf and lowHalf are the 64-bit input number split into two pieces (bit zero of lowHalf is bit 0 of the input number and bit 31 of highHalf is bit 63 of the input number). The input number is guaranteed to be the product of two primes, each of which is 32 bits or less. Your routine will store one prime at *prime1Ptr and the other one at *prime2Ptr (in either order).
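To make the interface concrete, here is a minimal sketch of a straightforward (and deliberately unoptimized) solution. It is not a contest entry: it assumes a C99 compiler with 64-bit integers (`uint64_t` from `<stdint.h>`), which THINK C did not offer, and it uses plain trial division rather than anything clever. The fixed-width parameter types stand in for the `unsigned long` values of the prototype above.

```c
#include <stdint.h>

/* Naive sketch: join the two halves, then try 2 and every
   odd divisor up to the square root. The first divisor
   found is the smaller prime; the quotient is the other. */
void Factor64(uint32_t lowHalf, uint32_t highHalf,
              uint32_t *prime1Ptr, uint32_t *prime2Ptr)
{
    uint64_t n = ((uint64_t)highHalf << 32) | lowHalf;
    uint64_t d;

    if (n % 2 == 0) {
        *prime1Ptr = 2;
        *prime2Ptr = (uint32_t)(n / 2);
        return;
    }
    /* d <= n / d avoids overflowing d * d */
    for (d = 3; d <= n / d; d += 2) {
        if (n % d == 0) {
            *prime1Ptr = (uint32_t)d;
            *prime2Ptr = (uint32_t)(n / d);
            return;
        }
    }
}
```

Because the input is guaranteed to be a product of two primes of 32 bits or less, the loop always finds the smaller prime before it runs out of candidates. In the worst case that takes about two billion trial divisions, which is exactly what a competitive entry has to avoid.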

Remember, solutions must be in C to qualify for entry into the Challenge, but assembly versions might get mentioned if they’re wicked fast. Also, if anyone has a nice routine for factoring even larger numbers (like, say, 256-bit numbers) into their prime factors and wouldn’t mind sharing it with MacTech readers, then send it on in. The best one might get published along with the winning solution.


The competition for the Swap Blocks challenge was unusually tough. There were several very high quality entries. Congratulations to Bill Karsh (Chicago, IL) for winning with the fastest entry. It was only last month that I declared Bob Boonstra (Westford, MA) the Programmer Challenge Champion for having the most first-place showings, but now he and Bill are tied for that elusive title (with three wins each). Jorg Brown (San Francisco, CA) deserves praise for his second place showing. His code size was just over half of Bill’s winning solution and was nearly as fast.

Here are the code sizes and times for two different tests. The first time test was for random size inputs (according to the distribution stated in the problem). The second time test was for blocks that were roughly, but not exactly, equal in size (again, with the given distributions but with both sizes coming from the same size category). Numbers in parens after a person’s name indicate how many times that person has finished in the top 5 places of all previous Programmer Challenges, not including this one:

Name                  time 1  time 2  code size

Bill Karsh (3)           170     219        642
Jorg Brown               174     242        366
Jim Lloyd                209     408       1642
Lorn Olsen               239     350        670
Ted Krovetz              243     247         88
Stepan Riha (6)          243     347        452
Bob Boonstra (8)         247     443        480
Jeffry Spain             248     397        234
Greg Landweber (1)       264     491        300
Martin Weiss             281     601        210
Christopher Suley        299     321        110
Dave Darrah              299     681        284
Ernst Munter             315     414        632
Xan Gregg                340    1260        484
Michael Anderson         359     942        156
Allen Stenger (5)        393     436        156
Michael Panchenko        409     465         82
Danny Stevenson          449     583        424
Eric Bennett             493    1478        284
Arnold Woodworth         595     729        206
Bob Boonstra             212     418        400


The Swap Blocks problem is really a multi-byte rotate problem. Think about it this way: if you had a 32-bit register and you wanted to swap the low 7 bits with the upper 25 bits, you could just rotate it 7 bit positions to the right. The rotate instruction is like a SwapBits operation where size1 + size2 always equals 32.
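The register analogy can be checked with a few lines of C. This is only an illustrative sketch: the function name RotateRight32 is made up here, and the fixed-width type comes from C99’s `<stdint.h>` rather than anything THINK C provided.

```c
#include <stdint.h>

/* Rotate x right by n bit positions (n is taken mod 32).
   The low n bits end up on top and the upper 32 - n bits
   end up at the bottom: the two fields trade places. */
uint32_t RotateRight32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

For example, rotating 0x00000055 right by 7 moves the low 7 bits to the top and gives 0xAA000000, while rotating 0xFFFFFF80 right by 7 moves the upper 25 bits to the bottom and gives 0x01FFFFFF.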

Almost everyone who entered used a variant of this observation. The fifth place entry by Ted Krovetz (Santa Cruz, CA) illustrates it nicely:

/* 2 */
void SwapBlocks (void *p1, void *p2,
 void *swapPtr, ulong size1,
 ulong size2, ulong swapSize)
{
 long *lp1 = (long *)p1;
 long *lp2 = (long *)p2;
 ulong s1 = size1 >> 2;
 ulong s2 = size2 >> 2;
 ulong count;
 long temp, *tempp1, *tempp2;

 do {
  if (s1 < s2) {
   /* swap block 1 with the tail of block 2 */
   count = s1;
   tempp1 = lp1;
   s2 -= s1;
   tempp2 = lp2 + s2;
  }
  else {
   /* swap the head of block 1 with block 2 */
   count = s2;
   tempp1 = lp1;
   tempp2 = lp2;
   lp1 += s2;
   s1 -= s2;
  }
  do {
   temp = *tempp1;
   *(tempp1++) = *tempp2;
   *(tempp2++) = temp;
  } while (--count);
 } while (s1);
}

Because Bill’s winning solution is so general purpose and macro-ized it is not the easiest code to read (although I commend his generality in making a useful piece of reusable and portable code). He has compile-time flags that let you build a large fast version (over 600 bytes, which was the version timed) or a small slower version (less than 100 bytes). And you can optionally change the 4 byte alignment assumption into a 2 byte or 1 byte alignment assumption (by redefining AtomSize).

I used Think C’s preprocessor command to see what all those #defines would boil down to. The core swap code for those cases where you can’t use the temporary swap space (because it’s too small) ends up looking like this:

/* 3 */
switch( (short)q ) {
case 0:
 while( --nS ) {
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 7:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 6:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 5:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 4:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 3:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 2:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
case 1:
 q = *pL;
 *pL++ = *pR;
 *pR++ = q;
 } /* end while */
}; /* end switch */

This illustrates some interesting loop unrolling syntax that’s possible in C. As the code shows, it’s legal to spread a while statement over several case labels in a switch statement, which nicely solves the problem of “How do you handle the remainder?” when you unroll a loop 8 times. In this example q is numTimesToSwap mod 8 and nS is the number of passes through the loop: numTimesToSwap divided by 8, plus one because the while test pre-decrements nS. So if numTimesToSwap is 10 then q is 2 and nS is 2. When the switch statement is executed it will branch to case 2, which does 2 swaps and then loops back to the top of the while loop. It runs through one set of 8 swaps and then stops. Pretty cool syntax.
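For readers who want to try the construct in isolation, here is a self-contained sketch of the same switch-into-a-loop idea. The names SwapUnrolled, rem, passes and SWAP1 are invented for this illustration and do not appear in Bill’s source; the pass count is set to the number of full passes plus one so that the pre-decrementing while test comes out right.

```c
#include <stddef.h>

/* Swap count longs between a and b, unrolled 8 times.
   The switch jumps into the middle of the loop body to
   take care of the count % 8 leftover swaps first. */
void SwapUnrolled(long *a, long *b, size_t count)
{
    long t;
    size_t rem = count & 7;            /* leftover swaps  */
    size_t passes = (count >> 3) + 1;  /* full passes + 1 */

#define SWAP1 t = *a; *a++ = *b; *b++ = t

    switch (rem) {
    case 0:
        while (--passes) {
                SWAP1;
    case 7:     SWAP1;
    case 6:     SWAP1;
    case 5:     SWAP1;
    case 4:     SWAP1;
    case 3:     SWAP1;
    case 2:     SWAP1;
    case 1:     SWAP1;
        }
    }
#undef SWAP1
}
```

With count equal to 10, rem is 2 and passes is 2: the switch enters at case 2 and performs two swaps, the while test drops passes to 1 and runs one full set of eight, and the next test drops it to 0 and exits.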

Here’s Bill’s winning solution:


Response to Apr 94 MacTech Programmer's Challenge.

by Bill Karsh

Object: Exchange contents of two adjacent memory blocks.

Redirection: This is an interesting problem, but what would make this guy really useful? As stated, the blocks for the challenge are 4i bytes long and start on 4j aligned addresses. These are special circumstances which apply to Memory Manager blocks, and then, only on 68020 or later cpu's. Memory blocks on the 68000 are merely even aligned and even length. Further, this could be a word processor tool for swapping runs of bytes, but we would have to relax the alignment and size restrictions even further to arbitrary address and length since we would almost always be pointing to characters interior to a handle.

I have written the routine to give its best performance, subject to a specified minimum enforced alignment and atom size (smallest unit to move). This is controlled at compile time by:

/* 4 */
typedef long  Atom, for len = 4i, addr = 4j,
typedef short Atom, for len = 2i, addr = 2j,
typedef Byte  Atom, for len = any, addr = any.

Note - due to an ancient law of portability, preprocessor directives are not allowed to compare enums, types, sizeof()s or anything else that has machine dependency hidden in it. This means you have to #define the AtomSize manually. This is needed to select the proper performance crossover points for that type.
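One way to guard against a hand-written AtomSize drifting out of step with the Atom typedef is a compile-time check. The sketch below is an assumption-laden illustration rather than part of Bill’s code: the name AtomSizeMatches is hypothetical, and Atom is declared as a C99 `int32_t` because `long` is not 4 bytes on every modern compiler the way it was in THINK C.

```c
#include <stdint.h>

typedef int32_t Atom;   /* stand-in for THINK C's 4-byte long */
#define AtomSize 4      /* still has to be written by hand    */

/* The preprocessor cannot evaluate sizeof, but the compiler
   can: this array type gets a negative (illegal) size, and
   compilation fails, if the two definitions ever disagree. */
typedef char AtomSizeMatches[
    (sizeof(Atom) == AtomSize) ? 1 : -1];
```

The #if tests on AtomSize keep working as before; the typedef only adds a safety net and generates no code.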

But wait there’s more... You might not tolerate a 644 byte dedicated word swapper in your text editor, but a 96 byte one might fit. We handle that.

You can tailor the routine to your requirements for execution speed vs. code size by setting the JobMode constant according to this table:

JobMode   Buffers  MonsterCopies  MonsterSwaps

Smallest  No       No             No
Small     No       No             Yes
Fast      Yes      No             Yes
Fastest   Yes      Yes            Yes

- billKarsh

/* 5 */
#pragma options( honor_register, !assign_registers )

#define Smallest                0
#define Small                   1
#define Fast                    2
#define Fastest                 3

User Selectable Parameters

/* 6 */
#define JobMode                 Fastest
#define Verify_p1_LowerThan_p2  0

Sorry, you must #define your chosen Atom’s size by hand. The preprocessor won’t accept sizeof operators. Yuck! The XOvers below vary according to this size, so we have to know it.

/* 7 */
typedef long Atom;
#define AtomSize 4

#if JobMode >= Fast
#define UseBuffer 1
#endif
#if JobMode == Fastest
#define MonsterCopy 1
#endif
#if JobMode >= Small
#define MonsterSwap 1
#endif

#define Lo3B 0x00ffffff

#if AtomSize == 4
#define FwdXOver            144
#define BckXOver            120
#define SwpXOver            44
#elif AtomSize == 2
#define FwdXOver            48
#define BckXOver            44
#define SwpXOver            32
#else
#define FwdXOver            24
#define BckXOver            20
#define SwpXOver            12
#endif

#define FwdOp                                        \
 *dst++ = *src++

#define BckOp                                        \
 *--pR = *--pL

#define SwpOp                                        \
 q     = *pL;                                       \
 *pL++ = *pR;                                       \
 *pR++ = q

#define Cases3_1( op )                               \
 case 3:     op;                                    \
 case 2:     op;                                    \
 case 1:     op

#define Cases7_1( op )                               \
 case 7:     op;                                    \
 case 6:     op;                                    \
 case 5:     op;                                    \
 case 4:     op;                                    \
 Cases3_1( op )

#define CalcPasses( bits )                           \
 nS /= sizeof(Atom);                                \
 q = nS & ((1 << bits) - 1);                        \
 nS >>= bits;                                       \
 ++nS

#define Monster( op, cases )                         \
 switch( (short)q ) {                               \
  case 0:                                           \
   while( --nS ) {                                  \
    op;                                             \
    cases( op );                                    \
   }                                                \
 }

#if MonsterCopy == 1
#define CopyInc( dst, src, n )                      \
 nS = n;                                            \
 if( nS > FwdXOver ) {                              \
  _CopyInc(                                         \
   (Atom*)(dst), (Atom*)(src), nS );                \
 }                                                  \
 else {                                             \
  pL = (Atom*)(dst);                                \
  pR = (Atom*)(src);                                \
  do { *pL++ = *pR++; } while(nS-=sizeof(Atom));    \
 }
#else
#define CopyInc( dst, src, n )                      \
 nS = n;                                            \
 pL = (Atom*)(dst);                                 \
 pR = (Atom*)(src);                                 \
 do { *pL++ = *pR++; } while(nS-=sizeof(Atom))
#endif

#if MonsterCopy == 1
#define CopyDec( dst, src, n )                      \
 nS = n;                                            \
 pR = (Atom*)((Byte*)(dst) + nS);                   \
 pL = (Atom*)((Byte*)(src) + nS);                   \
 if( nS > BckXOver ) {                              \
  CalcPasses( 2 );                                  \
  Monster( BckOp, Cases3_1 );                       \
 }                                                  \
 else {                                             \
  do { BckOp; } while(nS-=sizeof(Atom));            \
 }
#else
#define CopyDec( dst, src, n )                      \
 nS = n;                                            \
 pR = (Atom*)((Byte*)(dst) + nS);                   \
 pL = (Atom*)((Byte*)(src) + nS);                   \
 do { BckOp; } while(nS-=sizeof(Atom))
#endif

#if MonsterSwap == 1
#define Swap                                        \
 if( nS > SwpXOver ) {                              \
  CalcPasses( 3 );                                  \
  Monster( SwpOp, Cases7_1 );                       \
 }                                                  \
 else {                                             \
  do { SwpOp; } while(nS-=sizeof(Atom));            \
 }
#else
#define Swap                                        \
 do { SwpOp; } while(nS-=sizeof(Atom))
#endif

#define MacroMania              true

#if JobMode == Fastest

/* Copy specified number of Bytes from src to dst.
   Addresses are incremented, so src and dst can overlap
   iff dst <= src. */
static void _CopyInc(
 register Atom           *dst,
 register const Atom     *src,
 register unsigned long  nS )
{
 short  q, pad;

 CalcPasses( 3 );
 Monster( FwdOp, Cases7_1 );
}

#endif

void SwapBlocks(
 void           *p1,
 void           *p2,
 void           *swapPtr,
 unsigned long  size1,
 unsigned long  size2,
 unsigned long  swapSize )
{
 register Atom   *pL, *pR, *p0;
 register long   nL, nR, nS, q;
 Boolean         done;
 short           pad;
 if( !(nL = size1) || !(nR = size2) ) return;
 p0 = p1;

If you can safely assume that p1 is always lower or same as p2, define Verify_p1_LowerThan_p2 = 0 (the #if section is not necessary).

If the “1” and “2” in p1 and p2 are simply labels, indicating nothing about position in memory of the blocks, then you must order them by activating the #if section. Define Verify_p1_LowerThan_p2 = 1.

Ordering means comparing addresses, which treats them as 32-bit numbers, no matter the current cpu addressing mode. If GetMMUMode returns true, we are in 32-bit mode - all 32-bits are significant.

In 24-bit mode, when the cpu uses an address to load or store something, it totally ignores the high-byte of the address. The high-byte may be random garbage. In this mode we suppress any garbage before comparing by masking it to zero.
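The effect of the mask is easy to see with plain integers standing in for addresses. In this sketch the function name Below24 is hypothetical and the addresses are passed as C99 `uint32_t` values, which is an assumption made so the example runs anywhere; Lo3B is the same mask the listing defines.

```c
#include <stdint.h>

#define Lo3B 0x00ffffffUL

/* Returns nonzero if address a lies below address b once
   the high byte, which a 24-bit-mode CPU ignores, has been
   masked off. */
int Below24(uint32_t a, uint32_t b)
{
    return (a & Lo3B) < (b & Lo3B);
}
```

With garbage in the high byte, 0xFF001000 compares above 0x00002000 as a raw 32-bit number, yet it refers to the lower location 0x001000, which is what Below24 reports.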

/* 8 */
#if Verify_p1_LowerThan_p2 == 1

 pR = p2;
 if( !GetMMUMode() ) {
  p0 = (Atom*)((long)p0 & Lo3B);
  pR = (Atom*)((long)pR & Lo3B);
 }
 if( pR < p0 ) {
  q  = (long)p0;
  p0 = pR;
  p2 = (Atom*)q;
  q  = nL;
  nL = nR;
  nR = q;
 }

#endif

First, make use of buffer if we can. This is faster in most cases. A notable exception is equal size case which is best done in situ (let drop through).

Compare only the smaller size with buffer. If left is smaller, we can use post-increment addressing which is the faster mode. If right is smaller, use pre-decrement mode. We omit seeing if right-smaller will work with post-increment mode (if left also fits buffer). Preflighting overhead swallows us up very quickly.
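As a rough illustration of the buffer strategy, here is a sketch written with the standard library instead of Bill’s macros. The function name SwapWithBuffer, its signature and its return convention are assumptions made for this example. Note that memmove copes with overlap in either direction, so the post-increment versus pre-decrement choice described above is hidden inside it, and that this sketch uses the buffer for equal sizes too, where the real routine prefers to drop through.

```c
#include <string.h>

/* Swap two adjacent blocks, nL bytes at base followed by
   nR bytes, by parking the smaller one in buf.
   Returns 1 on success, 0 if the smaller block does not
   fit in bufSize bytes. */
int SwapWithBuffer(unsigned char *base,
                   size_t nL, size_t nR,
                   unsigned char *buf, size_t bufSize)
{
    if (nL <= nR) {
        if (nL > bufSize) return 0;
        memcpy(buf, base, nL);        /* save left block  */
        memmove(base, base + nL, nR); /* slide right down */
        memcpy(base + nR, buf, nL);   /* left goes last   */
    } else {
        if (nR > bufSize) return 0;
        memcpy(buf, base + nL, nR);   /* save right block */
        memmove(base + nR, base, nL); /* slide left up    */
        memcpy(base, buf, nR);        /* right goes first */
    }
    return 1;
}
```

When the function returns 0 nothing has been moved, so a caller can fall back to an in-place method.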

/* 9 */
#if UseBuffer == 1

 if( nL < nR ) {
  if( nL <= swapSize ) {
   CopyInc( swapPtr, p0, nL );
   CopyInc( p0, p2, nR );
   CopyInc( (Byte*)p0 + nR, swapPtr, nL );
   return;
  }
 }
 else if( nL > nR ) {
  if( nR <= swapSize ) {
   CopyInc( swapPtr, p2, nR );
   CopyDec( (Byte*)p0 + nR, p0, nL );
   CopyInc( p0, swapPtr, nR );
   return;
  }
 }

#endif

This algorithm always does the job, buffer or not.

Find the smaller block. Swap it immediately into its final place. Now the larger block is in two out-of-order, but contiguous pieces. Wait a minute, this is what we started with! The only differences are: now the sizes are {smaller, larger - smaller}, and the start addresses have to keep up with the new pieces.

We repeat until the two pieces are the same length. In other words, the final swap didn’t break anybody in two. This final swap can involve pieces larger than one Atom each. It depends on whether the smaller evenly divides the larger.
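The sequence of sizes this produces is the subtraction form of Euclid’s algorithm, so the last swap exchanges two pieces whose length is the greatest common divisor of the two original sizes. The helper below is only a sketch of that bookkeeping: FinalSwapSize is a hypothetical name, it moves no memory, and both sizes are assumed to be nonzero.

```c
/* Follow the block sizes through the repeated "swap the
   smaller block into place" steps and return the length
   of the two equal pieces exchanged by the final swap. */
unsigned long FinalSwapSize(unsigned long nL,
                            unsigned long nR)
{
    while (nL != nR) {
        if (nL < nR)
            nR -= nL;   /* left block was placed  */
        else
            nL -= nR;   /* right block was placed */
    }
    return nL;
}
```

For blocks of 12 and 8 bytes the sizes go (12, 8), (4, 8), (4, 4), so the final swap moves 4 bytes each way; for 7 and 3 the process runs all the way down to single units.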

/* 10 */
/* In Situ */
 done = false;

 do {
  pL = p0;
  pR = p2;

  if( nL < nR ) {
   nR = nR - nL;
   pR = (Atom*)((Byte*)pR + nR);
   nS = nL;
  }
  else if( nL > nR ) {
   p0 = (Atom*)((Byte*)pL + nR);
   nL = nL - nR;
   nS = nR;
  }
  else {
   nS = nL;
   done = true;
  }

  Swap;
 } while( !done );
}

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved.