
Jun 94 Challenge
Volume Number:10
Issue Number:6
Column Tag:Programmers’ Challenge

Programmers’ Challenge

By Mike Scanlin, MacTech Magazine Regular Contributing Author

Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.

The rules

Here’s how it works: Each month there will be a different programming challenge presented here. First, you must write some code that solves the challenge. Second, you must optimize your code (a lot). Then, submit your solution to MacTech Magazine (formerly MacTutor). A winner will be chosen based on code correctness, speed, size and elegance (in that order of importance) as well as the postmark of the answer. In the event of multiple equally desirable solutions, one winner will be chosen at random (with honorable mention, but no prize, given to the runners up). The prize for the best solution each month is $50 and a limited edition “The Winner! MacTech Magazine Programming Challenge” T-shirt (not to be found in stores).

In order to make fair comparisons between solutions, all solutions must be in ANSI compatible C (i.e., don’t use Think’s Object extensions). Only pure C code can be used. Any entries with any assembly in them will be disqualified (except for those challenges specifically stated to be in assembly). However, you may call any routine in the Macintosh toolbox you want (i.e., it doesn’t matter if you use NewPtr instead of malloc). All entries will be tested with the FPU and 68020 flags turned off in THINK C. When timing routines, the latest version of THINK C will be used (with ANSI Settings plus “Honor ‘register’ first” and “Use Global Optimizer” turned on) so beware if you optimize for a different C compiler. All code should be limited to 60 characters wide. This will aid us in dealing with e-mail gateways and page layout.

The solution and winners for this month’s Programmers’ Challenge will be published in the issue two months later. All submissions must be received by the 10th day of the month printed on the front of this issue.

All solutions should be marked “Attn: Programmers’ Challenge Solution” and sent to Xplain Corporation (the publishers of MacTech Magazine) via “snail mail” or preferably, e-mail - AppleLink: MT.PROGCHAL, Internet: progchallenge@xplain.com, CompuServe: 71552,174 and America Online: MT PRGCHAL. If you send via snail mail, please include a disk with the solution and all related files (including contact information). See page 2 for information on “How to Contact Xplain Corporation.”

MacTech Magazine reserves the right to publish any solution entered in the Programming Challenge of the Month. Authors grant MacTech Magazine the non-exclusive right to publish entries without limitation upon submission of each entry. Copyrights for the code are retained by the author.

FACTORING

Being able to factor quickly is an important part of breaking secret codes, I mean, writing cool Mac games. This month’s challenge, therefore, is to factor a 64-bit number into the two primes that were multiplied together to produce it.

The prototype of the function you write is:


/* 1 */
void Factor64(lowHalf, highHalf,
 prime1Ptr, prime2Ptr)
unsigned long lowHalf;
unsigned long highHalf;
unsigned long *prime1Ptr;
unsigned long *prime2Ptr;

highHalf and lowHalf are the 64-bit input number split into two pieces (bit zero of lowHalf is bit 0 of the input number and bit 31 of highHalf is bit 63 of the input number). The input number is guaranteed to be the product of two primes, each of which is 32 bits or less. Your routine will store one prime at *prime1Ptr and the other one at *prime2Ptr (in either order).
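To make the interface concrete, here is a minimal baseline sketch, not a competitive entry. It assumes a modern C99 compiler with a native 64-bit integer type (something the THINK C setup in the rules does not offer), uses plain trial division, and uses fixed-width uint32_t parameters in place of unsigned long. The names Factor64Baseline and demo_factor are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical baseline: trial division on a 64-bit value.
   Assumes the input really is the product of two primes
   that each fit in 32 bits, as the challenge guarantees. */
void Factor64Baseline(uint32_t lowHalf, uint32_t highHalf,
                      uint32_t *prime1Ptr,
                      uint32_t *prime2Ptr)
{
    uint64_t n = ((uint64_t)highHalf << 32) | lowHalf;
    uint64_t d;

    if (n % 2 == 0) {            /* 2 is the only even prime */
        *prime1Ptr = 2;
        *prime2Ptr = (uint32_t)(n / 2);
        return;
    }
    /* The smaller prime is at most sqrt(n).  Testing
       d <= n / d avoids overflow in d * d. */
    for (d = 3; d <= n / d; d += 2) {
        if (n % d == 0) {
            *prime1Ptr = (uint32_t)d;
            *prime2Ptr = (uint32_t)(n / d);
            return;
        }
    }
    /* Not reached for valid input. */
    *prime1Ptr = 0;
    *prime2Ptr = 0;
}

/* Usage: split a 64-bit product into halves and factor it. */
void demo_factor(void)
{
    uint64_t n = (uint64_t)65537 * 2147483647u;
    uint32_t p, q;

    Factor64Baseline((uint32_t)n, (uint32_t)(n >> 32),
                     &p, &q);
    assert(p == 65537u && q == 2147483647u);
}
```

Trial division is far too slow when both primes are close to 32 bits, which is exactly why the challenge is interesting.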

Remember, solutions must be in C to qualify for entry into the Challenge but assembly versions might get mentioned if they’re wicked fast. Also, if anyone has a nice routine for factoring even larger numbers (like, say, 256-bit numbers) into their prime factors and wouldn’t mind sharing it with MacTech readers, then send it on in. The best one might get published along with the winning solution.

TWO MONTHS AGO WINNER

The competition for the Swap Blocks challenge was unusually tough. There were several very high quality entries. Congratulations to Bill Karsh (Chicago, IL) for winning with the fastest entry. It was only last month that I declared Bob Boonstra (Westford, MA) the Programmer Challenge Champion for having the most number of first place showings but now he and Bill are tied for that elusive title (with three wins each). Jorg Brown (San Francisco, CA) deserves praise for his second place showing. His code size was just over half of Bill’s winning solution and was nearly as fast.

Here are the code sizes and times for two different tests. The first time test was for random size inputs (according to the distribution stated in the problem). The second time test was for blocks that were roughly, but not exactly, equal in size (again, with the given distributions but with both sizes coming from the same size category). Numbers in parens after a person’s name indicate how many times that person has finished in the top 5 places of all previous Programmer Challenges, not including this one:

Name                      time 1   time 2   code size
Bill Karsh (3)               170      219         642
Jorg Brown                   174      242         366
Jim Lloyd                    209      408        1642
Lorn Olsen                   239      350         670
Ted Krovetz                  243      247          88
Stepan Riha (6)              243      347         452
Bob Boonstra (8)             247      443         480
Jeffry Spain                 248      397         234
Greg Landweber (1)           264      491         300
Martin Weiss                 281      601         210
Christopher Suley            299      321         110
Dave Darrah                  299      681         284
Ernst Munter                 315      414         632
Xan Gregg                    340     1260         484
Michael Anderson             359      942         156
Allen Stenger (5)            393      436         156
Michael Panchenko            409      465          82
Danny Stevenson              449      583         424
Eric Bennett                 493     1478         284
Arnold Woodworth             595      729         206
Bob Boonstra (assembly)      212      418         400

The SwapBlocks problem is really a multi-byte rotate problem. Think about it this way: If you had a 32-bit register and you wanted to swap the low 7 bits with the upper 25 bits you could just rotate it 7 bit positions to the right. The rotate instruction is like a SwapBits operation where size1 + size2 always equals 32.
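The register analogy can be checked with a few lines of C. This is only a sketch: RotateRight32 and demo_rotate are names made up here, and the field values are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by k positions (0 < k < 32). */
static uint32_t RotateRight32(uint32_t x, unsigned k)
{
    return (x >> k) | (x << (32 - k));
}

/* Rotating right by 7 exchanges the low 7 bits with the
   upper 25 bits, like a SwapBits with sizes 25 and 7. */
void demo_rotate(void)
{
    uint32_t low7   = 0x5Au;        /* arbitrary 7-bit field  */
    uint32_t high25 = 0x1ABCDEFu;   /* arbitrary 25-bit field */
    uint32_t x = (high25 << 7) | low7;
    uint32_t r = RotateRight32(x, 7);

    assert(r == ((low7 << 25) | high25));
}
```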

Almost everyone who entered used a variant of this observation. The fifth place entry by Ted Krovetz (Santa Cruz, CA) illustrates it nicely:


/* 2 */
void SwapBlocks (void *p1, void *p2,
                 void *swapPtr, ulong size1,
                 ulong size2, ulong swapSize)
{
    long *lp1 = (long *)p1;
    long *lp2 = (long *)p2;
    ulong s1 = size1 >> 2;
    ulong s2 = size2 >> 2;
    ulong count;
    long temp, *tempp1, *tempp2;

    do {
        if (s1 < s2) {
            count = s1;
            tempp1 = lp1;
            s2 -= s1;
            tempp2 = lp2 + s2;
        }
        else {
            count = s2;
            tempp1 = lp1;
            tempp2 = lp2;
            lp1 += s2;
            s1 -= s2;
        }
        do {
            temp = *tempp1;
            *(tempp1++) = *tempp2;
            *(tempp2++) = temp;
        } while (--count);
    } while (s1);
}

Because Bill’s winning solution is so general purpose and macro-ized it is not the easiest code to read (although I commend his generality in making a useful piece of reusable and portable code). He has compile-time flags that let you build a large fast version (over 600 bytes, which was the version timed) or a small slower version (less than 100 bytes). And you can optionally change the 4 byte alignment assumption into a 2 byte or 1 byte alignment assumption (by redefining AtomSize).

I used Think C’s preprocessor command to see what all those #defines would boil down to. The core swap code for those cases where you can’t use the temporary swap space (because it’s too small) ends up looking like this:


/* 3 */
switch( (short)q ) {
case 0:
    while( --nS ) {
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 7:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 6:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 5:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 4:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 3:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 2:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
case 1:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
    } /* end while */
}; /* end switch */

This illustrates some interesting loop unrolling syntax that’s possible in C. As the code shows, it’s legal to spread a while statement over several case labels in a switch statement, which nicely solves the problem of “How do you handle the remainder?” when you unroll a loop 8 times. In this example q is numTimesToSwap mod 8 and nS is one more than numTimesToSwap divided by 8 (the extra one is there because the while test pre-decrements nS; see the CalcPasses macro in the full listing). So if numTimesToSwap is 10 then q is 2 and nS is 2. When the switch statement is executed it will branch to case 2, which does 2 swaps and then loops back to the top of the while loop. It runs through one set of 8 swaps and then stops. Pretty cool syntax.
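For readers who want to try the construct outside of Bill’s macros, here is a small self-contained sketch of the same switch-into-while pattern. The names SwapUnrolled8 and demo_swap_unrolled are made up for this illustration, and the element count is passed in directly rather than derived from a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Swap n longs between a and b, unrolled 8 times.  The
   switch jumps into the middle of the while body so the
   n mod 8 leftover swaps are done on the first pass. */
void SwapUnrolled8(long *a, long *b, size_t n)
{
    size_t rem    = n & 7;          /* n mod 8 */
    size_t passes = (n >> 3) + 1;   /* +1: test pre-decrements */
    long t;

    switch (rem) {
    case 0:
        while (--passes) {
            t = *a; *a++ = *b; *b++ = t;
    case 7: t = *a; *a++ = *b; *b++ = t;
    case 6: t = *a; *a++ = *b; *b++ = t;
    case 5: t = *a; *a++ = *b; *b++ = t;
    case 4: t = *a; *a++ = *b; *b++ = t;
    case 3: t = *a; *a++ = *b; *b++ = t;
    case 2: t = *a; *a++ = *b; *b++ = t;
    case 1: t = *a; *a++ = *b; *b++ = t;
        }
    }
}

/* Usage: 10 swaps = 2 leftover swaps + one pass of 8. */
void demo_swap_unrolled(void)
{
    long x[10], y[10];
    size_t i;

    for (i = 0; i < 10; i++) {
        x[i] = (long)i;
        y[i] = 100 + (long)i;
    }
    SwapUnrolled8(x, y, 10);
    for (i = 0; i < 10; i++) {
        assert(x[i] == 100 + (long)i);
        assert(y[i] == (long)i);
    }
}
```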

Here’s Bill’s winning solution:

SwapBlocks

Response to Apr 94 MacTech Programmer's Challenge.

by Bill Karsh

Object: Exchange contents of two adjacent memory blocks.

Redirection: This is an interesting problem, but what would make this guy really useful? As stated, the blocks for the challenge are 4i bytes long and start on 4j aligned addresses. These are special circumstances which apply to Memory Manager blocks, and then, only on 68020 or later cpu's. Memory blocks on the 68000 are merely even aligned and even length. Further, this could be a word processor tool for swapping runs of bytes, but we would have to relax the alignment and size restrictions even further to arbitrary address and length since we would almost always be pointing to characters interior to a handle.

I have written the routine to give its best performance, subject to a specified minimum enforced alignment and atom size (smallest unit to move). This is controlled at compile time by:


/* 4 */
typedef long  Atom, for len = 4i, addr = 4j,
typedef short Atom, for len = 2i, addr = 2j,
typedef Byte  Atom, for len = any, addr = any.

Note - due to an ancient law of portability, preprocessor directives are not allowed to compare enums, types, sizeof()s or anything else that has machine dependency hidden in it. This means you have to #define the AtomSize manually. This is needed to select the proper performance crossover points for that type.
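One way to guard the hand-written AtomSize is a compile-time check that does not go through the preprocessor. The sketch below is an illustration, not part of Bill’s entry: AtomSizeCheck and demo_atom_size are made-up names, and int32_t stands in for THINK C’s 32-bit long, since long is wider than 4 bytes on many current systems.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t Atom;   /* stand-in for a 32-bit long    */
#define AtomSize 4      /* must be kept in sync by hand  */

/* Compile-time guard: the array size becomes -1, which is
   a compile error, if AtomSize and sizeof(Atom) disagree. */
typedef char AtomSizeCheck[
    (sizeof(Atom) == AtomSize) ? 1 : -1];

/* The preprocessor can still branch on the plain number. */
#if AtomSize == 4
#define SwpXOver 44
#elif AtomSize == 2
#define SwpXOver 32
#else
#define SwpXOver 12
#endif

void demo_atom_size(void)
{
    assert(sizeof(Atom) == AtomSize);
    assert(SwpXOver == 44);
}
```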

But wait there’s more... You might not tolerate a 644 byte dedicated word swapper in your text editor, but a 96 byte one might fit. We handle that.

You can tailor the routine to your requirements for execution speed vs. code size by setting the JobMode constant according to this table:

JobMode    Buffers   MonsterCopies   MonsterSwaps
Smallest   No        No              No
Small      No        No              Yes
Fast       Yes       No              Yes
Fastest    Yes       Yes             Yes

- billKarsh


/* 5 */
#pragma options( honor_register, !assign_registers )

#defines
#define Smallest                0
#define Small                   1
#define Fast                    2
#define Fastest                 3
User Selectable Parameters

/* 6 */
#define JobMode                 Fastest
#define Verify_p1_LowerThan_p2  0

Sorry, you must #define your chosen Atom’s size by hand. The preprocessor won’t accept sizeof operators. Yuck! The XOvers below vary according to this size, so we have to know it.


/* 7 */
typedef long Atom;
#define AtomSize 4


#if JobMode >= Fast
#define UseBuffer 1
#endif
#if JobMode == Fastest
#define MonsterCopy 1
#endif
#if JobMode >= Small
#define MonsterSwap 1
#endif


#define Lo3B 0x00ffffff


#if AtomSize == 4
#define FwdXOver            144
#define BckXOver            120
#define SwpXOver            44
#elif AtomSize == 2
#define FwdXOver            48
#define BckXOver            44
#define SwpXOver            32
#else
#define FwdXOver            24
#define BckXOver            20
#define SwpXOver            12
#endif

FwdOp
#define FwdOp                                        \
 *dst++ = *src++

BckOp
#define BckOp                                        \
 *--pR = *--pL

SwpOp
#define SwpOp                                        \
 q     = *pL;                                       \
 *pL++ = *pR;                                       \
 *pR++ = q

Cases3_1
#define Cases3_1( op )                               \
 case 3:     op;                                    \
 case 2:     op;                                    \
 case 1:     op

Cases7_1
#define Cases7_1( op )                               \
 case 7:     op;                                    \
 case 6:     op;                                    \
 case 5:     op;                                    \
 case 4:     op;                                    \
 Cases3_1( op )

CalcPasses
#define CalcPasses( bits )                           \
 nS /= sizeof(Atom);                                \
 q = nS & ((1 << bits) - 1);                        \
 nS >>= bits;                                       \
 ++nS

Monster
#define Monster( op, cases )                         \
 switch( (short)q ) {                               \
 case 0:                                          \
 while( --nS ) {                                \
 op;                                          \
 cases( op );                                 \
 }                                              \
 }

CopyInc
#if MonsterCopy == 1
#define CopyInc( dst, src, n )                     \
 nS = n;                                           \
 if( nS > FwdXOver ) {                             \
 _CopyInc(                                       \
  (Atom*)(dst), (Atom*)(src), nS );              \
 }                                                 \
 else {                                            \
 pL = (Atom*)(dst);                              \
 pR = (Atom*)(src);                              \
 do { *pL++ = *pR++; } while(nS-=sizeof(Atom));  \
 }
#else
#define CopyInc( dst, src, n )                      \
 nS = n;                                            \
 pL = (Atom*)(dst);                                 \
 pR = (Atom*)(src);                                 \
 do { *pL++ = *pR++; } while(nS-=sizeof(Atom))
#endif

CopyDec
#if MonsterCopy == 1
#define CopyDec( dst, src, n )                      \
 nS = n;                                            \
 pR = (Atom*)((Byte*)(dst) + nS);                   \
 pL = (Atom*)((Byte*)(src) + nS);                   \
 if( nS > BckXOver ) {                              \
 CalcPasses( 2 );                                 \
 Monster( BckOp, Cases3_1 );                      \
 }                                                  \
 else {                                             \
 do { BckOp; } while(nS-=sizeof(Atom));           \
 }
#else
#define CopyDec( dst, src, n )                      \
 nS = n;                                            \
 pR = (Atom*)((Byte*)(dst) + nS);                   \
 pL = (Atom*)((Byte*)(src) + nS);                   \
 do { BckOp; } while(nS-=sizeof(Atom))
#endif

Swap
#if MonsterSwap == 1
#define Swap                                        \
 if( nS > SwpXOver ) {                              \
 CalcPasses( 3 );                                 \
 Monster( SwpOp, Cases7_1 );                      \
 }                                                  \
 else {                                             \
 do { SwpOp; } while(nS-=sizeof(Atom));           \
 }
#else
#define Swap                                        \
 do { SwpOp; } while(nS-=sizeof(Atom))
#endif


#define MacroMania              true


#if JobMode == Fastest

_CopyInc
Copy specified number of Bytes from src to dst.  Addresses are incremented, 
so src and dst can overlap iff dst <= src.
 
static void _CopyInc(
 register Atom           *dst,
 register const Atom     *src,
 register unsigned long  nS )
{
 short  q, pad;
 
 CalcPasses( 3 );
 Monster( FwdOp, Cases7_1 );
}
#endif

SwapBlocks

void SwapBlocks(
 void           *p1,
 void           *p2,
 void           *swapPtr,
 unsigned long  size1,
 unsigned long  size2,
 unsigned long  swapSize )
{
 register Atom   *pL, *pR, *p0;
 register long   nL, nR, nS, q;
 Boolean         done;
 short           pad;
 
 if( !(nL = size1) || !(nR = size2) ) return;
 
 p0 = p1;

If you can safely assume that p1 is always lower or same as p2, define Verify_p1_LowerThan_p2 = 0 (the #if section is not necessary).

If the “1” and “2” in p1 and p2 are simply labels, indicating nothing about position in memory of the blocks, then you must order them by activating the #if section. Define Verify_p1_LowerThan_p2 = 1.

Ordering means comparing addresses, which treats them as 32-bit numbers, no matter the current cpu addressing mode. If GetMMUMode returns true, we are in 32-bit mode - all 32-bits are significant.

In 24-bit mode, when the cpu uses an address to load or store something, it totally ignores the high-byte of the address. The high-byte may be random garbage. In this mode we suppress any garbage before comparing by masking it to zero.
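The effect of the mask is easy to see in isolation. In this sketch addresses are modeled as plain 32-bit integers, and AddrBelow24 and demo_mask are names made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define Lo3B 0x00ffffffUL

/* Compare two addresses the way a CPU in 24-bit mode
   sees them: only the low three bytes count. */
int AddrBelow24(uint32_t a, uint32_t b)
{
    return (a & Lo3B) < (b & Lo3B);
}

void demo_mask(void)
{
    uint32_t a = 0xA5000010u;   /* garbage in the high byte */
    uint32_t b = 0x00000020u;

    assert(!(a < b));            /* raw compare is fooled  */
    assert(AddrBelow24(a, b));   /* masked compare is not  */
}
```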


/* 8 */
#if Verify_p1_LowerThan_p2 == 1

 pR = p2;
 
 if( !GetMMUMode() ) {
 p0 = (Atom*)((long)p0 & Lo3B);
 pR = (Atom*)((long)pR & Lo3B);
 }
 
 if( pR < p0 ) {
 q  = (long)p0;
 p0 = pR;
 p2 = (Atom*)q;
 
 q  = nL;
 nL = nR;
 nR = q;
 }
#endif

First, make use of the buffer if we can. This is faster in most cases. A notable exception is the equal-size case, which is best done in situ (let it drop through).

Compare only the smaller size with buffer. If left is smaller, we can use post-increment addressing which is the faster mode. If right is smaller, use pre-decrement mode. We omit seeing if right-smaller will work with post-increment mode (if left also fits buffer). Preflighting overhead swallows us up very quickly.
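Stripped of the macros, the buffer strategy looks roughly like the sketch below. It is an illustration under stated assumptions: it works on bytes, uses the standard memcpy and memmove in place of the CopyInc and CopyDec macros, and the names SwapViaBuffer and demo_swap_via_buffer are made up. Unlike the listing that follows, it also sends the equal-size case through the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Exchange adjacent blocks [base, base+nL) and
   [base+nL, base+nL+nR) through a scratch buffer.
   Returns 1 on success, 0 if the smaller block does
   not fit in buf. */
int SwapViaBuffer(unsigned char *base,
                  size_t nL, size_t nR,
                  unsigned char *buf, size_t bufSize)
{
    if (nL <= nR && nL <= bufSize) {
        memcpy(buf, base, nL);         /* park left block   */
        memmove(base, base + nL, nR);  /* slide right down  */
        memcpy(base + nR, buf, nL);    /* left goes last    */
        return 1;
    }
    if (nR < nL && nR <= bufSize) {
        memcpy(buf, base + nL, nR);    /* park right block  */
        memmove(base + nR, base, nL);  /* slide left up     */
        memcpy(base, buf, nR);         /* right goes first  */
        return 1;
    }
    return 0;
}

void demo_swap_via_buffer(void)
{
    unsigned char data[] = "ABCDEFG";
    unsigned char buf[4];

    assert(SwapViaBuffer(data, 3, 4, buf, sizeof buf));
    assert(memcmp(data, "DEFGABC", 7) == 0);
}
```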


/* 9 */
Buffer?
#if UseBuffer == 1

 if( nL < nR ) {
 if( nL <= swapSize ) {
 CopyInc( swapPtr, p0, nL );
 CopyInc( p0, p2, nR );
 CopyInc( (Byte*)p0 + nR, swapPtr, nL );
 return;
 }
 }
 else if( nL > nR ) {
 if( nR <= swapSize ) {
 CopyInc( swapPtr, p2, nR );
 CopyDec( (Byte*)p0 + nR, p0, nL );
 CopyInc( p0, swapPtr, nR );
 return;
 }
 }
#endif

This algorithm always does the job, buffer or not.

Find the smaller block. Swap it immediately into its final place. Now the larger block is in two out-of-order, but contiguous pieces. Wait a minute, this is what we started with! The only differences are: now the sizes are {smaller, larger - smaller}, and the start addresses have to keep up with the new pieces.

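We repeat until the two pieces are the same length. In other words, the final swap didn’t break anybody in two. This can end with pieces larger than one Atom each; it depends on whether the smaller evenly divides the larger. (The sizes shrink just as in Euclid’s subtractive algorithm, so the final piece length is the greatest common divisor of the two original sizes.)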
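The same loop can be sketched without the macros. This version is only an illustration: it works on bytes instead of Atoms, uses a plain inner loop where the real code uses the Swap macro, and the names SwapInPlace and demo_swap_in_place are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Exchange adjacent blocks of nL and nR bytes starting at
   base, using no scratch buffer. */
void SwapInPlace(unsigned char *base, size_t nL, size_t nR)
{
    unsigned char *p0 = base;        /* start of left piece  */
    unsigned char *p2 = base + nL;   /* start of right piece */
    unsigned char *pL, *pR, t;
    size_t nS;
    int done = 0;

    if (nL == 0 || nR == 0) return;

    do {
        pL = p0;
        pR = p2;
        if (nL < nR) {          /* left lands at the far end */
            nR -= nL;
            pR += nR;
            nS = nL;
        } else if (nL > nR) {   /* right lands at the front  */
            p0 = pL + nR;
            nL -= nR;
            nS = nR;
        } else {                /* equal: last swap          */
            nS = nL;
            done = 1;
        }
        while (nS--) {
            t = *pL; *pL++ = *pR; *pR++ = t;
        }
    } while (!done);
}

void demo_swap_in_place(void)
{
    unsigned char data[] = "ABCDEFG";

    SwapInPlace(data, 3, 4);    /* "ABC" and "DEFG" */
    assert(memcmp(data, "DEFGABC", 7) == 0);
}
```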


/* 10 */
In Situ
 done = false;

 do {
 
 pL = p0;
 pR = p2;

 if( nL < nR ) {
 nR = nR - nL;
 pR = (Atom*)((Byte*)pR + nR);
 nS = nL;
 }
 else if( nL > nR ) {
 p0 = (Atom*)((Byte*)pL + nR);
 nL = nL - nR;
 nS = nR;
 }
 else {
 nS = nL;
 done = true;
 }
 
 Swap;
 
 } while( !done );
}
 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved.