
Jun 94 Challenge
Volume Number:10
Issue Number:6
Column Tag:Programmers’ Challenge

Programmers’ Challenge

By Mike Scanlin, MacTech Magazine Regular Contributing Author

Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.

The rules

Here’s how it works: Each month there will be a different programming challenge presented here. First, you must write some code that solves the challenge. Second, you must optimize your code (a lot). Then, submit your solution to MacTech Magazine (formerly MacTutor). A winner will be chosen based on code correctness, speed, size and elegance (in that order of importance) as well as the postmark of the answer. In the event of multiple equally desirable solutions, one winner will be chosen at random (with honorable mention, but no prize, given to the runners up). The prize for the best solution each month is $50 and a limited edition “The Winner! MacTech Magazine Programming Challenge” T-shirt (not to be found in stores).

In order to make fair comparisons between solutions, all solutions must be in ANSI compatible C (i.e., don’t use Think’s Object extensions). Only pure C code can be used. Any entries with any assembly in them will be disqualified (except for those challenges specifically stated to be in assembly). However, you may call any routine in the Macintosh toolbox you want (i.e., it doesn’t matter if you use NewPtr instead of malloc). All entries will be tested with the FPU and 68020 flags turned off in THINK C. When timing routines, the latest version of THINK C will be used (with ANSI Settings plus “Honor ‘register’ first” and “Use Global Optimizer” turned on) so beware if you optimize for a different C compiler. All code should be limited to 60 characters wide. This will aid us in dealing with e-mail gateways and page layout.

The solution and winners for this month’s Programmers’ Challenge will be published in the issue two months later. All submissions must be received by the 10th day of the month printed on the front of this issue.

All solutions should be marked “Attn: Programmers’ Challenge Solution” and sent to Xplain Corporation (the publishers of MacTech Magazine) via “snail mail” or preferably, e-mail - AppleLink: MT.PROGCHAL, Internet: progchallenge@xplain.com, CompuServe: 71552,174 and America Online: MT PRGCHAL. If you send via snail mail, please include a disk with the solution and all related files (including contact information). See page 2 for information on “How to Contact Xplain Corporation.”

MacTech Magazine reserves the right to publish any solution entered in the Programming Challenge of the Month. Authors grant MacTech Magazine the non-exclusive right to publish entries without limitation upon submission of each entry. Copyrights for the code are retained by the author.

FACTORING

Being able to factor quickly is an important part of breaking secret codes, I mean, writing cool Mac games. This month’s challenge, therefore, is to factor a 64-bit number into the two primes that were multiplied together to produce it.

The prototype of the function you write is:


/* 1 */
void Factor64(lowHalf, highHalf,
              prime1Ptr, prime2Ptr)
unsigned long lowHalf;
unsigned long highHalf;
unsigned long *prime1Ptr;
unsigned long *prime2Ptr;

highHalf and lowHalf are the 64-bit input number split into two pieces (bit zero of lowHalf is bit 0 of the input number and bit 31 of highHalf is bit 63 of the input number). The input number is guaranteed to be the product of two primes, each of which is 32 bits or less. Your routine will store one prime at *prime1Ptr and the other one at *prime2Ptr (in either order).
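
For readers who want something to check an optimized entry against, here is a minimal baseline sketch of the interface described above. It is not a competitive solution: it uses plain trial division, and it assumes a compiler with a 64-bit integer type (uint64_t from <stdint.h>), which the THINK C setup described in the rules does not provide. The name Factor64Reference is made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical reference routine, not a contest entry.
   Assumes the input really is a product of two primes
   that each fit in 32 bits, as the challenge guarantees. */
void Factor64Reference(uint32_t lowHalf, uint32_t highHalf,
                       uint32_t *prime1Ptr,
                       uint32_t *prime2Ptr)
{
    /* Glue the two halves back into one 64-bit number. */
    uint64_t n = ((uint64_t)highHalf << 32) | lowHalf;
    uint64_t d;

    if (n % 2 == 0) {            /* 2 is the only even prime */
        *prime1Ptr = 2;
        *prime2Ptr = (uint32_t)(n / 2);
        return;
    }
    /* Try odd divisors up to sqrt(n). Testing d <= n / d
       avoids computing d * d, which could overflow. */
    for (d = 3; d <= n / d; d += 2) {
        if (n % d == 0) {
            *prime1Ptr = (uint32_t)d;
            *prime2Ptr = (uint32_t)(n / d);
            return;
        }
    }
    /* Not reached for valid input; report n as 1 * n. */
    *prime1Ptr = 1;
    *prime2Ptr = (uint32_t)n;
}
```

In the worst case (two primes near 2^32) the loop runs about two billion times, which is exactly the cost a real entry has to avoid.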

Remember, solutions must be in C to qualify for entry into the Challenge but assembly versions might get mentioned if they’re wicked fast. Also, if anyone has a nice routine for factoring even larger numbers (like, say, 256-bit numbers) into their prime factors and wouldn’t mind sharing it with MacTech readers then send it on in. The best one might get published along with the winning solution.

TWO MONTHS AGO WINNER

The competition for the Swap Blocks challenge was unusually tough. There were several very high quality entries. Congratulations to Bill Karsh (Chicago, IL) for winning with the fastest entry. It was only last month that I declared Bob Boonstra (Westford, MA) the Programmer Challenge Champion for having the greatest number of first place showings but now he and Bill are tied for that elusive title (with three wins each). Jorg Brown (San Francisco, CA) deserves praise for his second place showing. His code size was just over half of Bill’s winning solution and was nearly as fast.

Here are the code sizes and times for two different tests. The first time test was for random size inputs (according to the distribution stated in the problem). The second time test was for blocks that were roughly, but not exactly, equal in size (again, with the given distributions but with both sizes coming from the same size category). Numbers in parens after a person’s name indicate how many times that person has finished in the top 5 places of all previous Programmer Challenges, not including this one:

Name                      time 1   time 2   code size

Bill Karsh (3)               170      219         642
Jorg Brown                   174      242         366
Jim Lloyd                    209      408        1642
Lorn Olsen                   239      350         670
Ted Krovetz                  243      247          88
Stepan Riha (6)              243      347         452
Bob Boonstra (8)             247      443         480
Jeffry Spain                 248      397         234
Greg Landweber (1)           264      491         300
Martin Weiss                 281      601         210
Christopher Suley            299      321         110
Dave Darrah                  299      681         284
Ernst Munter                 315      414         632
Xan Gregg                    340     1260         484
Michael Anderson             359      942         156
Allen Stenger (5)            393      436         156
Michael Panchenko            409      465          82
Danny Stevenson              449      583         424
Eric Bennett                 493     1478         284
Arnold Woodworth             595      729         206

Bob Boonstra (assembly)      212      418         400

The SwapBlocks problem is really a multi-byte rotate problem. Think about it this way: If you had a 32-bit register and you wanted to swap the low 7 bits with the upper 25 bits you could just rotate it 7 bit positions to the right. The rotate instruction is like a SwapBits operation where size1 + size2 always equals 32.
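
To make the register analogy concrete, here is a small sketch of a 32-bit rotate in portable C. The function name RotateRight32 is invented for this illustration and the fixed-width type comes from <stdint.h>, which is an assumption about the compiler rather than something the challenge used.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n positions, 0 < n < 32.
   The low n bits end up at the top and the upper 32 - n
   bits slide down, so the two fields trade places. */
uint32_t RotateRight32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}
```

Rotating by 7 exchanges a 7-bit low field with a 25-bit high field in one step; SwapBlocks has to get the same effect on memory blocks far too big for a register.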

Almost everyone who entered used a variant of this observation. The fifth place entry by Ted Krovetz (Santa Cruz, CA) illustrates it nicely:


/* 2 */
/* ulong is not defined in the listing as printed;
   this typedef is assumed. */
typedef unsigned long ulong;

void SwapBlocks (void *p1, void *p2,
                 void *swapPtr, ulong size1,
                 ulong size2, ulong swapSize)
{
    long *lp1 = (long *)p1;
    long *lp2 = (long *)p2;
    ulong s1 = size1 >> 2;      /* sizes in longs */
    ulong s2 = size2 >> 2;
    ulong count;
    long temp, *tempp1, *tempp2;

    do {
        if (s1 < s2) {
            /* swap block 1 with the tail of block 2 */
            count = s1;
            tempp1 = lp1;
            s2 -= s1;
            tempp2 = lp2 + s2;
        }
        else {
            /* swap block 2 with the head of block 1 */
            count = s2;
            tempp1 = lp1;
            tempp2 = lp2;
            lp1 += s2;
            s1 -= s2;
        }
        do {
            temp = *tempp1;
            *(tempp1++) = *tempp2;
            *(tempp2++) = temp;
        } while (--count);
    } while (s1);
}

Because Bill’s winning solution is so general purpose and macro-ized it is not the easiest code to read (although I commend his generality in making a useful piece of reusable and portable code). He has compile-time flags that let you build a large fast version (over 600 bytes, which was the version timed) or a small slower version (less than 100 bytes). And you can optionally change the 4 byte alignment assumption into a 2 byte or 1 byte alignment assumption (by redefining AtomSize).

I used Think C’s preprocessor command to see what all those #defines would boil down to. The core swap code for those cases where you can’t use the temporary swap space (cause it’s too small) ends up looking like this:


/* 3 */
switch( (short)q ) {
    case 0:
        while( --nS ) {
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 7:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 6:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 5:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 4:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 3:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 2:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
    case 1:
            q = *pL;
            *pL++ = *pR;
            *pR++ = q;
        } /* end while */
}; /* end switch */

This illustrates some interesting loop unrolling syntax that’s possible in C. As the code shows, it’s legal to spread a while statement over several case labels in a switch statement, which nicely solves the problem of “How do you handle the remainder?” when you unroll a loop 8 times. In this example q is numTimesToSwap mod 8 and nS is numTimesToSwap divided by 8, plus one (the extra one makes up for the pre-decrement in while( --nS ), which is tested before every full pass). So if numTimesToSwap is 10 then q is 2 and nS is 2. When the switch statement is executed it will branch to case 2 which does 2 swaps and then loops back to the top of the while loop. It runs through one set of 8 swaps and then stops. Pretty cool syntax.
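
The same switch-into-a-loop idiom (commonly known as Duff’s device) can be tried in isolation. The sketch below is not Bill’s code; the names SwapUnrolled, SWAP_ONE, rem and passes are made up for illustration. Note that the pass count starts at the quotient plus one, because the loop test pre-decrements before each full pass.

```c
#include <stddef.h>

/* One swap step; uses the locals t, a and b below. */
#define SWAP_ONE  t = *a; *a++ = *b; *b++ = t

/* Swap n longs between a and b, unrolled 8 times. */
void SwapUnrolled(long *a, long *b, size_t n)
{
    long t;
    size_t rem = n & 7;             /* leftover swaps   */
    size_t passes = (n >> 3) + 1;   /* full passes + 1  */

    switch (rem) {
    case 0:
        while (--passes) {
                SWAP_ONE;
    case 7:     SWAP_ONE;
    case 6:     SWAP_ONE;
    case 5:     SWAP_ONE;
    case 4:     SWAP_ONE;
    case 3:     SWAP_ONE;
    case 2:     SWAP_ONE;
    case 1:     SWAP_ONE;
        }
    }
}
```

With n equal to 10 the switch jumps to case 2, performs 2 swaps, falls off the bottom of the loop body, and the while test then admits exactly one full pass of 8. With n equal to 0 the test fails at once and nothing is swapped.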

Here’s Bill’s winning solution:

SwapBlocks

Response to Apr 94 MacTech Programmer's Challenge.

by Bill Karsh

Object: Exchange contents of two adjacent memory blocks.

Redirection: This is an interesting problem, but what would make this guy really useful? As stated, the blocks for the challenge are 4i bytes long and start on 4j aligned addresses. These are special circumstances which apply to Memory Manager blocks, and then, only on 68020 or later cpu's. Memory blocks on the 68000 are merely even aligned and even length. Further, this could be a word processor tool for swapping runs of bytes, but we would have to relax the alignment and size restrictions even further to arbitrary address and length since we would almost always be pointing to characters interior to a handle.

I have written the routine to give its best performance, subject to a specified minimum enforced alignment and atom size (smallest unit to move). This is controlled at compile time by:


/* 4 */
typedef long  Atom, for len = 4i, addr = 4j,
typedef short Atom, for len = 2i, addr = 2j,
typedef Byte  Atom, for len = any, addr = any.

Note - due to an ancient law of portability, preprocessor directives are not allowed to compare enums, types, sizeof()s or anything else that has machine dependency hidden in it. This means you have to #define the AtomSize manually. This is needed to select the proper performance crossover points for that type.
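
Because AtomSize has to be kept in step with the Atom typedef by hand, a compile-time check can catch a mismatch. The following is a sketch and not part of the submitted code: the typedef name AtomSizeMatches is invented here, and short is used as the Atom only because it is 2 bytes on most compilers.

```c
typedef short Atom;     /* the unit the routine moves   */
#define AtomSize 2      /* must be kept in step by hand */

/* Compile-time check: the array size becomes -1, which
   is illegal, unless sizeof(Atom) equals AtomSize. */
typedef char AtomSizeMatches[
    (sizeof(Atom) == AtomSize) ? 1 : -1];
```

The preprocessor never evaluates the sizeof here; the compiler proper does, which is why this works where an #if would not.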

But wait there’s more... You might not tolerate a 644 byte dedicated word swapper in your text editor, but a 96 byte one might fit. We handle that.

You can tailor the routine to your requirements for execution speed vs. code size by setting the JobMode constant according to this table:

JobMode    Buffers   MonsterCopies   MonsterSwaps

Smallest   No        No              No
Small      No        No              Yes
Fast       Yes       No              Yes
Fastest    Yes       Yes             Yes

- billKarsh


/* 5 */
#pragma options( honor_register, !assign_registers )

#defines
#define Smallest                0
#define Small                   1
#define Fast                    2
#define Fastest                 3
User Selectable Parameters

/* 6 */
#define JobMode                 Fastest
#define Verify_p1_LowerThan_p2  0

Sorry, you must #define your chosen Atom’s size by hand. The preprocessor won’t accept sizeof operators. Yuck! The XOvers below vary according to this size, so we have to know it.


/* 7 */
typedef long Atom;
#define AtomSize 4


#if JobMode >= Fast
#define UseBuffer 1
#endif
#if JobMode == Fastest
#define MonsterCopy 1
#endif
#if JobMode >= Small
#define MonsterSwap 1
#endif


#define Lo3B 0x00ffffff


#if AtomSize == 4
#define FwdXOver            144
#define BckXOver            120
#define SwpXOver            44
#elif AtomSize == 2
#define FwdXOver            48
#define BckXOver            44
#define SwpXOver            32
#else
#define FwdXOver            24
#define BckXOver            20
#define SwpXOver            12
#endif

FwdOp
#define FwdOp                                        \
 *dst++ = *src++

BckOp
#define BckOp                                        \
 *--pR = *--pL

SwpOp
#define SwpOp                                        \
 q     = *pL;                                       \
 *pL++ = *pR;                                       \
 *pR++ = q

Cases3_1
#define Cases3_1( op )                               \
 case 3:     op;                                    \
 case 2:     op;                                    \
 case 1:     op

Cases7_1
#define Cases7_1( op )                               \
 case 7:     op;                                    \
 case 6:     op;                                    \
 case 5:     op;                                    \
 case 4:     op;                                    \
 Cases3_1( op )

CalcPasses
#define CalcPasses( bits )                           \
 nS /= sizeof(Atom);                                \
 q = nS & ((1 << bits) - 1);                        \
 nS >>= bits;                                       \
 ++nS

Monster
#define Monster( op, cases )                         \
    switch( (short)q ) {                             \
        case 0:                                      \
            while( --nS ) {                          \
                op;                                  \
                cases( op );                         \
            }                                        \
    }

CopyInc
#if MonsterCopy == 1
#define CopyInc( dst, src, n )                       \
    nS = n;                                          \
    if( nS > FwdXOver ) {                            \
        _CopyInc(                                    \
            (Atom*)(dst), (Atom*)(src), nS );        \
    }                                                \
    else {                                           \
        pL = (Atom*)(dst);                           \
        pR = (Atom*)(src);                           \
        do { *pL++ = *pR++; }                        \
        while(nS-=sizeof(Atom));                     \
    }
#else
#define CopyInc( dst, src, n )                       \
    nS = n;                                          \
    pL = (Atom*)(dst);                               \
    pR = (Atom*)(src);                               \
    do { *pL++ = *pR++; } while(nS-=sizeof(Atom))
#endif

CopyDec
#if MonsterCopy == 1
#define CopyDec( dst, src, n )                       \
    nS = n;                                          \
    pR = (Atom*)((Byte*)(dst) + nS);                 \
    pL = (Atom*)((Byte*)(src) + nS);                 \
    if( nS > BckXOver ) {                            \
        CalcPasses( 2 );                             \
        Monster( BckOp, Cases3_1 );                  \
    }                                                \
    else {                                           \
        do { BckOp; } while(nS-=sizeof(Atom));       \
    }
#else
#define CopyDec( dst, src, n )                       \
    nS = n;                                          \
    pR = (Atom*)((Byte*)(dst) + nS);                 \
    pL = (Atom*)((Byte*)(src) + nS);                 \
    do { BckOp; } while(nS-=sizeof(Atom))
#endif

Swap
#if MonsterSwap == 1
#define Swap                                         \
    if( nS > SwpXOver ) {                            \
        CalcPasses( 3 );                             \
        Monster( SwpOp, Cases7_1 );                  \
    }                                                \
    else {                                           \
        do { SwpOp; } while(nS-=sizeof(Atom));       \
    }
#else
#define Swap                                         \
    do { SwpOp; } while(nS-=sizeof(Atom))
#endif


#define MacroMania              true


#if JobMode == Fastest

_CopyInc
Copy specified number of Bytes from src to dst.  Addresses are incremented, 
so src and dst can overlap iff dst <= src.
 
static void _CopyInc(
 register Atom           *dst,
 register const Atom     *src,
 register unsigned long  nS )
{
 short  q, pad;
 
 CalcPasses( 3 );
 Monster( FwdOp, Cases7_1 );
}
#endif

SwapBlocks

void SwapBlocks(
 void           *p1,
 void           *p2,
 void           *swapPtr,
 unsigned long  size1,
 unsigned long  size2,
 unsigned long  swapSize )
{
 register Atom   *pL, *pR, *p0;
 register long   nL, nR, nS, q;
 Boolean         done;
 short           pad;
 
 if( !(nL = size1) || !(nR = size2) ) return;
 
 p0 = p1;

If you can safely assume that p1 is always lower or same as p2, define Verify_p1_LowerThan_p2 = 0 (the #if section is not necessary).

If the “1” and “2” in p1 and p2 are simply labels, indicating nothing about position in memory of the blocks, then you must order them by activating the #if section. Define Verify_p1_LowerThan_p2 = 1.

Ordering means comparing addresses, which treats them as 32-bit numbers, no matter the current cpu addressing mode. If GetMMUMode returns true, we are in 32-bit mode - all 32-bits are significant.

In 24-bit mode, when the cpu uses an address to load or store something, it totally ignores the high-byte of the address. The high-byte may be random garbage. In this mode we suppress any garbage before comparing by masking it to zero.


/* 8 */
#if Verify_p1_LowerThan_p2 == 1

    pR = p2;

    if( !GetMMUMode() ) {
        p0 = (Atom*)((long)p0 & Lo3B);
        pR = (Atom*)((long)pR & Lo3B);
    }

    if( pR < p0 ) {
        q  = (long)p0;
        p0 = pR;
        p2 = (Atom*)q;

        q  = nL;
        nL = nR;
        nR = q;
    }
#endif

First, make use of the buffer if we can. This is faster in most cases. A notable exception is the equal-size case, which is best done in situ (we let it drop through).

Compare only the smaller size with buffer. If left is smaller, we can use post-increment addressing which is the faster mode. If right is smaller, use pre-decrement mode. We omit seeing if right-smaller will work with post-increment mode (if left also fits buffer). Preflighting overhead swallows us up very quickly.


/* 9 */
Buffer?
#if UseBuffer == 1

    if( nL < nR ) {
        if( nL <= swapSize ) {
            CopyInc( swapPtr, p0, nL );
            CopyInc( p0, p2, nR );
            CopyInc( (Byte*)p0 + nR, swapPtr, nL );
            return;
        }
    }
    else if( nL > nR ) {
        if( nR <= swapSize ) {
            CopyInc( swapPtr, p2, nR );
            CopyDec( (Byte*)p0 + nR, p0, nL );
            CopyInc( p0, swapPtr, nR );
            return;
        }
    }
#endif
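
For comparison, the three-copy buffer strategy used above can be sketched with the standard library. This is an illustration, not a replacement for the tuned macros: the name SwapViaBuffer is hypothetical, the caller is assumed to supply a scratch buffer at least as large as the smaller block, and memmove stands in for the hand-picked CopyInc/CopyDec direction.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: exchange two adjacent blocks of
   nL and nR bytes starting at base. buf must hold at
   least min(nL, nR) bytes. */
void SwapViaBuffer(void *base, size_t nL, size_t nR,
                   void *buf)
{
    unsigned char *p = (unsigned char *)base;

    if (nL <= nR) {
        memcpy(buf, p, nL);        /* save left block    */
        memmove(p, p + nL, nR);    /* slide right down   */
        memcpy(p + nR, buf, nL);   /* left goes on top   */
    } else {
        memcpy(buf, p + nL, nR);   /* save right block   */
        memmove(p + nR, p, nL);    /* slide left up      */
        memcpy(p, buf, nR);        /* right goes first   */
    }
}
```

Only the smaller block ever passes through the buffer, which is why the listing compares just the smaller size against swapSize.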

This algorithm always does the job, buffer or not.

Find the smaller block. Swap it immediately into its final place. Now the larger block is in two out-of-order, but contiguous pieces. Wait a minute, this is what we started with! The only differences are: now the sizes are {smaller, larger - smaller}, and the start addresses have to keep up with the new pieces.

We repeat until the two pieces are the same length. In other words, the final swap doesn’t break anybody in two. This can end with sizes larger than Atom-Atom. It depends on whether the smaller evenly divides the larger (the final piece size works out to the greatest common divisor of the two original sizes).


/* 10 */
In Situ
    done = false;

    do {

        pL = p0;
        pR = p2;

        if( nL < nR ) {
            nR = nR - nL;
            pR = (Atom*)((Byte*)pR + nR);
            nS = nL;
        }
        else if( nL > nR ) {
            p0 = (Atom*)((Byte*)pL + nR);
            nL = nL - nR;
            nS = nR;
        }
        else {
            nS = nL;
            done = true;
        }

        Swap;

    } while( !done );
}
 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved.