TweetFollow Us on Twitter

Oct 97 Challenge

Volume Number: 13 (1997)
Issue Number: 10
Column Tag: Programmer's Challenge

Programmer's Challenge

by Bob Boonstra, Westford, MA

Who Owns the Zebra?

I'm finishing up this column while at the beach on vacation. The place we're staying is kind of interesting, in that all of the condominiums look exactly alike, except that each has a different color door. Even more interesting, each condominium is occupied by a person of a different nationality, and each person owns a different kind of pet. There are three condominiums, with red, green, and blue doors. They are occupied by an American, a Canadian, and an Australian (but not necessarily in that order). The American vacationer lives in the house with the red door. The person in the house with the blue door owns a dog. The person who owns the cat lives in the middle house. And the house with the green door is immediately to the right of the house with the blue door. So, who owns the zebra?

Well, I really am on vacation, but pets aren't allowed, the doors are all the same color, and I don't know the nationalities of my neighbors. But I did run across a zebra puzzle recently, and it seemed like a good logic problem for the Challenge. In the above example, you can reason through the four clues to rule out most of the 216 possible combinations and conclude that the American owns the zebra. Problem complexity grows rapidly with the number of variables - in a problem with 5 variables, there are more than 24 thousand million (that's 24 billion for Americans) combinations, but the zebra can be found with as few as 14 clues.

Your Challenge this month is to write a program that will reason through a set of clues and provide a solution consistent with all of the clues. The problem will be provided in a stilted syntax - for example, the sample problem above would be given as follows:

  American ISA person
  Canadian ISA person
  Australian ISA person
  redDoor ISA house
  greenDoor ISA house
  blueDoor ISA house
  dog ISA pet
  cat ISA pet
  zebra ISA pet
  person lives_in house
  person owns pet
  American lives_in redDoor
  blueDoor owns dog
  cat IS_LOCATED IN_MIDDLE 
  greenDoor IMMED_RIGHT_OF blueDoor
  SOLVE person owns zebra
  ANSWER person house pet

The nine ISA relations define the variables (person, house, pet) and the values those variables assume in the problem. The next two statements define the relations (lives_in, owns) between selected pairs of variables. The next four statements are the clues describing relations between values of variables, discussed further below. The SOLVE statement defines the question that you are to answer, and the ANSWER statement defines the format that your solution should take.

The prototype of the code you should write is:

void WhoOwnsZebra(
  long problemDimension,    /* number of problem variables */
  long numClues,          /* number of clues provided */
  CStr255 clues[],        /* the clues */
  CStr255 solution[]      /* storage for problemDimension result strings */
);

The problemDimension parameter describes the number of variables in the problem you are to solve (in the example above, problemDimension was 3). The number of clues provided is given as numClues (17 in the example). The solution is to be provided as a sequence of problemDimension n-tuples that form a solution to the problem, where each n-tuple is a sequence of values in the order described by the ANSWER clue. In the example given above, one solution would be:

  Australian blueDoor  dog     
  Canadian  greenDoor cat     
  American  redDoor  zebra   

The clues will consist of a sequence of case-sensitive tokens separated by spaces. The clues will take one of the following forms, where tokens in all caps are reserved words:

  value ISA variable
  variable relation variable
  value relation value
  value IS_LOCATED [AT_LEFT | IN_MIDDLE | AT_RIGHT]
  value [NEXT_TO | IMMED_RIGHT_OF | IMMED_LEFT_OF] value
  SOLVE variable relation value
  ANSWER variable ... variable

The ISA reserved word is used to define the variables in the problem and associate legal values with those variables. The relation statement takes two forms, one that defines a relationship between two variables, and one that associates a value taken by one variable with a value taken by another. These associations are transitive (e.g., if the American lives_in the redDoor house, and the person in the redDoor house owns the zebra, then the American owns the zebra). The relations associate values, and the specific words used to define a relation have no meaning except to make the problem more readable. In addition to the relations defined by the problem, there is a left-to-right ordering of the n-tuples in the solution. The special predefined NEXT_TO, IMMED_RIGHT_OF, and IMMED_LEFT_OF relations provide information about the relative left-to-right ordering of values. The predefined IS_LOCATED relation associates values with three fixed points in the left-to-right ordering: AT_LEFT | IN_MIDDLE (middle position, meaningful only for odd values of problemDimension), and AT_RIGHT (rightmost position).

There may be more than one set of n-tuples that solve the problem, so the solution you report need not be unique, as long as it is consistent with all the clues. Enough clues will be provided to uniquely answer the question that you are asked to SOLVE, and you may use this fact in directing your search. The n-tuples provided in the solution should be provided in left to right order.

There are no memory restrictions on this problem, except that it must run on my 96MB 8500/200. You should deallocate any dynamically allocated memory before returning. This will be a native PowerPC Challenge, using the latest CodeWarrior environment. Solutions may be coded in C, C++, or Pascal.

Now, back to the beach to find that zebra....

Three Months Ago Winner

Congratulations once again to Ernst Munter (Kanata, Ontario) for submitting the fastest solution to the July Disambiguator Challenge. The problem this month was to implement a partial string matching algorithm similar to that used in Apple's QuickView utility, with the problem complicated by the addition of wildcards. Eight people / teams submitted solutions, and five of those worked correctly. Ernst's winning solution was more than 15% faster than the second place solution by the team of Peter Lewis and Eric Gundrum, whose solution was in turn 50% faster than the next solution.

My test cases used two dictionaries of more than 25000 words each, using more than 800 strings to be matched between the two cases. Some of the strings resulted in a single match, while others generated several dozen, and a few as many as 2500. Overall, the test cases generated more than 170000 matches.

There are two key elements to the speed of the winning solution. First, Ernst creates a digest bitmap for each word in the dictionary. The digest indicates whether the word contains a single occurrence of particular characters, multiple occurrences, or no occurrences. The second key feature is the aggregation of these digests into pages of up to 32 words of equal length. The page digests allow groups of words to be eliminated from detailed consideration if the words do not contain the correct characters. The very detailed commentary provides more insight into other optimizations and special cases in the solution.

The table below lists the execution times in seconds for the combined test cases required by each correct entry. The number in parentheses after the entrant's name is the total number of Challenge points earned in all Challenges to date prior to this one.

Name Time Code Data Language
Ernst Munter (266) 1.86 4924 496 C++
Peter Lewis (32) 2.21 2788 455 C++
Eric Gundrum (10) 2.21 2788 455 C++
Ludovoc Nicolle (21) 3.36 2948 72 C
Jonathan Kleid 13.92 3876 16 C++
Randy Boring (37) 22.96 4072 242 C
A. F. errors 2428 220 C++
D. L. errors 4772 32 Pascal
D. H. errors 1068 24 C++

Top 20 Contestants

Here are the Top Contestants for the Programmer's Challenge. The numbers below include points awarded over the 24 most recent contests, including points earned by this month's entrants.

You might notice that one of the entries this month was a team solution. One previous winner was from a team, and in that case I gave the full point award to both members of the team. I've decided that this approach is inappropriate, so I divided the points for second place this month between the two members of the team. I also retroactively split the point award for the previous winning team entry.

You might also notice that our Editor-in-Chief participated in this month's Challenge. Eric participates for the enjoyment. (In fact, during MacHack '97, Peter and Eric worked on their Disambiguator entry as well as on their hacks for the Hack Contest.) To avoid any appearance of conflict of interest, the prize for any Challenge that Eric might win will be awarded to the author of the second-place entry.

RankNamePoints
1. Munter, Ernst 216
2. Gregg, Xan 63
3. Cooper, Greg 54
4. Lengyel, Eric 40
5. Boring, Randy 39
6. Lewis, Peter 37
7. Mallett, Jeff 30
8. Murphy, ACC 30
9. Nicolle, Ludovic 28
10. Larsson, Gustav 27
11. Antoniewicz, Andy 24
12. Picao, Miguel Cruz 21
13. Day, Mark20
14. Higgins, Charles20
15. Slezak, Ken20
16. Studer, Thomas20
17. Gundrum, Eric15
18. Hart, Alan14
19. O'Connor, Turlough 14
20. Karsh, Bill12

There are three ways to earn points: (1) scoring in the top 5 of any Challenge, (2) being the first person to find a bug in a published winning solution or, (3) being the first person to suggest a Challenge that I use. The points you can win are:

1st place 20 points
2nd place 10 points
3rd place 7 points
4th place 4 points
5th place 2 points
finding bug 2 points
suggesting Challenge 2 points

Here is Ernst's winning solution:

Disambiguator.cp
® 1997 Ernst Munter

/*

Problem Statement

Given a list of words and a findString, find all matching words which start with findString. FindString may contain wild cards ? * and +.

An initialization routine can be used to build a private data structure to speed up searching.

The initial word list may or may not be sorted, but each generated match list should be sorted.

Solution Strategy

A string matching engine is needed to evaluate words from the word list against the find string.

In the simplest case, the match list can be created by comparing each word in the word list with the find string, and then sorting match list. This is the procedure I use when there is insufficient storage available (which may happen if the word list is very short).

When there is enough storage available, a private index is derived from the word list. This is based on word length as well as bit maps representing 64-bit "digests" of words.

On searching with a particular find string, only words of at least the known minimum length need be considered.

The find string is also represented with a digest. The digest does not describe a word uniquely, and a final direct comparison is needed to confirm a word for the match list. But the digest representation helps in filtering out many words and groups of words quickly that need not be compared.

Alphabetical sorting is done on each generated match list.

The String Compare Engine

There are two possibilities:
- findString contains wild cards
- or it does not

If it does, string matching will be performed by a virtual machine executing a very simple byte code program; however, if there are no wild cards, an even simpler comparison routine is used.

In any case, the Parser function processes the input string (findString) into a program and an output string. The output string is simply the input string with wild cards removed.

The first value in the program is the minimum length of any string required to match the original findString. The subsequent program bytes are from the set of instructions

  • READ n read n chars without comparing
  • COMPF n compare n chars, on mismatch exit with FAIL
  • COMPR n compare n chars, on mismatch go back (n-1) and retry

The program always ends with SUCCEED

Special single byte forms of READ1 and COMPn (n<=7) optimize for common cases (I hope), and also ensure that the length of the program is guaranteed not to exceed the length of findString. Since strlen(findString) is known to be < 256, a fixed amount of memory can be provided for the program on the CPU stack.

Private Data Organization

All words of the original word list are represented grouped in pages of 32 (max) words of equal length. Words of length > 31 are not differentiated and are lumped in with pages of 31-char words.

Each page contains pointers to 32 words, their individual word digests, and a summarizing page digest.

Word Digest

Two 32-bit words (28 bits used), one for single, the other for multiple character occurrences in a word.

Digest-1 of a wordList word is a 32-bit computer word, each bit indicating the presence or not of a character value in the wordList word.

Bits in digest-2 are set only if 2 or more of a particular character occur in the word.

The hash function to convert a character into a bit position is tabulated in charTable[128] which maps all legal characters into the range 1 to 28. The values in charTable are roughly in order of frequency of occurrence of letters in an English dictionary.

The digest for a find string is based on the non-wild cards in the string, and thus will be a subset of any match word matching the find string.

Pages

The example below shows an excerpt from a typical page based on the dictionary I used for a word list.

Each row represents the word digest for the referenced word at the end of the row.

The last row is the page digest, the bit-OR of the columns.

Note that for reasons explained below, storage of the word digests is by column, that is the bits shown below are stored in 56 ulongs, each column representing the presence single or multiple, of a particular character in all words of the page.

  - single -          - multiple -
_ qjxzkwvfybhmgpudclotransie= QJXZKWVFYBHMGPUDCLOTRANSIE
00000000000000001000010111100000000000000000000000001010 Tunisian
00000000000000001000010111100000000000000000000000000100 sustains
00000000000011001000010111100000000000000000000000000000 humanist
00000000000000011000010111100000000000000000000001000000 pantsuit
00000000000000011000010111100000000000000000000000000100 puissant
10000000000000000100010111100000000000000000000000001000 stand_in
00000000000000100100010111100000000000000000000000001000 standing
00000000010000000010010111100000000000000000000000010000 fanatics
00000000001000000010010111100000000000000000000001000000 sanctity
00000000011000000010010111100000000000000000000000000000 sanctify
00000000000000100010010111100000000000000000000000000100 castings
00000000000000100010010111100000000000000000000001000000 scatting
00000010000000100010010111100000000000000000000000000000 stacking
00000000000010100010010111100000000000000000000000000000 scathing
00000000000000010010010111100000000000000000000000010000 captains
00000000000000000110010111100000000000000000000000010000 antacids
00000000000000000001010111100000000000000000000000000010 initials
00000000000000000001010111100000000000000000000100000100 installs
00000000000000000001010111100000000000000000000000010010 Italians
00000000010000000001010111100000000000000000000000000010 finalist
00000000001000000001010111100000000000000000000000000010 salinity
00000000000000100001010111100000000000000000000000001000 slanting
00000000000000100001010111100000000000000000000100000000 stalling
00000000000000100001010111100000000000000000000000000010 tailings
00000010000000100001010111100000000000000000000000000000 stalking
00000000000100100001010111100000000000000000000000000000 blasting
00000000000100100001010111100000000000000000000000000000 stabling
00000000000000010001010111100000000000000000000000000010 tailspin
00000000000000110001010111100000000000000000000000000000 stapling
00000000000100001001010111100000000000000000000000000000 Istanbul
00000000000000101001010111100000000000000000000000000000 saluting
00000000000000011001010111100000000000000000000000000000 nuptials
page digest:
10000010011111111111010111100000000000000000000101011110

Before assembling the words into pages, a private temporary word list is built, sorted on digest1. This groups words of similar digests into a page, resulting in sparser, more effective bit patterns for the page digests.

Word Filtering

To find matching words we use the page index to skip past all pages containing words of less than the minimum length.

The page index scan then provides a fast screening function by comparing the find digest with each page digest (a copy of which exists in the index entry).

Once a promising page is identified on the basis of its digest, it is scanned to find matching words.

The findString generates a signature, a string of unique hash values of the original findString. To scan the digest bits in a page for a particular findString, the signature values are used to select columns of bits. These are accumulated (bitwise AND). The resulting 32-bit value identifies, by 1-bit positions, the words matching the findString as far as digests go. The accumulation of columns is abandoned as soon as the accumulator reads 0, indicating that no words on the page will match.

If the accumulator pattern is not 0, one or several words may be indicated, but they still need to be confirmed with the alpha-numerical string matcher before being added to the match list.

Sorting

Sorting of the final match list is done using a form of HeapSort, split into its two components, building of the priority queue (my routine Send()) and down sifting (my routine Sort()).

Special Cases

If findString contains ONLY wild cards, there is no need to match any words except for length. Since word pages are already organized by word length, it remains to send all words from all pages of the minimum length given by findString (the number of '+' and '?' symbols).

Optimizations

When there are no wild cards in findString, a simpler and faster alphanumeric comparison is used.

The pageIndex entries contain a copy of the page digest1. This avoids memory access to many pages which evidently contain no matching words.

The hash function is chosen to represent character frequency in order to cluster words in pages more optimally. This feature relies on presorting the private word list accordingly. The program will work correctly if this sort is turned off, resulting in faster initialization, but slower matches.

In my tests, presorting paid off.

There is some amount of code replication for expediency, notably there are two customized copies of heap sort, and two versions of generating word digests. These could probably avoided without a great loss of speed.

A final optimization is the way word signatures are constructed. These are derived from findString and used to scan the columns in the page bit maps. Rather than convert the findString alphanum characters to signature characters (range 1-56) in the order of occurrence in findString, we arrange it so the rarest characters are tested first.

Assumptions

All words, and findString are < 256 characters long.

There are no zero-length words in wordList.

The absolute minimum amount of private storage is 4 bytes. This ensures that non-initialization of the tables can be determined. In this mode, a sequential scan of wordList is made.

For optimal operation, memory available for private data should be at least 256 bytes + about 13 bytes per word. The exact amount of storage needed depends on the length distribution of the words in wordList.

FindString will be modified by Disambiguator.

Static memory of 384 bytes is used for lookup tables.

*/

typedef unsigned long ulong;

void InitDisambiguator(
 const char *const wordList[],  /* words to match against */
 ulong numWords,                /* number of words in wordList */
 void *privStorage,            /* private storage preinitialized to zero */
 ulong storageSize           /* number of bytes of privStorage */
);

ulong /*numMatch*/ Disambiguator(
 const char *const wordList[],  /* words to match against */
 ulong numWords,            /* number of words in wordList */
 void *privStorage,               /* private storage */
 ulong storageSize,           /* number of bytes of privStorage */
 char *findString,           /* string to match, includes wild cards */
 const char *matchList[]     /* return matched words here */
);


#include <stdlib.h> // It seems unnecessary to mention
#include <string.h> // these with CW-Pro

// NOTE:
// Program Tuning Option
// Turning SORTWORDINDEX off reduces initialization time
//    to 1/3 but increases search time by 50 to 80%
// The best setting depends on dictionary, findString patterns 
//    and the number of searches in a test run.

#define SORTWORDINDEX 1

typedef unsigned char Opcode;

// 'const char' is used a lot.
#define CCC const char

Prototypes
static int Parse(char* spec,Opcode program[]);
static int SimpleMatch(CCC* x,char* spec,int minLen);
static int VMMatch(CCC* x,char* spec,Opcode* program);
static int Comp(CCC* ap,CCC* bp);
static ulong MakeSubDigest(
        CCC* xString,ulong* dig2,char sig[]);
static void Sort(CCC* matchList[],long numMatch);
static void Send(CCC* matchList[],CCC* wp,ulong numMatch);

Static allocations
// charTable serves double duty:
//    in parsing, it helps separate wild cards.
//    in digest forming, it provides a sort order.
static char charTable[128] = {
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 1, 2, 3, 4, 5, 6, 7, 8, 9,10, 0, 0, 0, 0, 0, 0,
 0,25,12,19,18,28,10,16,13,27, 4, 7,20,14,23,21,
15, 3,24,26,22,17, 9, 8, 5,11, 6, 0, 0, 0, 0, 1,
 0,25,12,19,18,28,10,16,13,27, 4, 7,20,14,23,21,
15, 3,24,26,22,17, 9, 8, 5,11, 6, 0, 0, 0, 0, 0
};


// The table LSB stores the position of the least
// significant '1' bit in a byte (range 1 to 8),
// a zero-byte reports 0.
// This table is built from nested #defines
#define T1 1
#define T2 T1,2,T1
#define T3 T2,3,T2
#define T4 T3,4,T3
#define T5 T4,5,T4
#define T6 T5,6,T5
#define T7 T6,7,T6
#define T8 T7,8,T7

static char LSB[256]={0,T8};

// MISMATCH takes care of case insensitive matching
#define MISMATCH(a,b) ((a^b)&0xDF)

#define MIN(a,b) (a<b?a:b)

// HASH is just shorthand for charTable.
#define HASH(c) (charTable[c])

// Maximum length of strings
#define MAX_LEN 256

const enum {          //compr must be last
SUCCEED=0,FAIL,RETRY,READ,READ1,COMPF,COMPR=COMPF+8};

VMMatch
static int VMMatch(CCC* x,int slen,char* spec,Opcode* program) {
// Virtual machine interpreter.
// VMMatch executes "program" with the help of
// "spec" to determine if string "x" matches.
 int margin=slen-*program;
 if (margin<0) return FAIL;
 CCC* saveX;
 char* saveSpec;
 Opcode* savePC=0;
 Opcode* pgm=program;
 int matchLen,i;
 x--;spec--;
 for (;;) {
  Opcode op=*++pgm;
  switch (op) {
case SUCCEED:
case FAIL:
   return op;
case COMPF+7: if (MISMATCH(*++x,*++spec)) goto retry;
case COMPF+6: if (MISMATCH(*++x,*++spec)) goto retry;
case COMPF+5: if (MISMATCH(*++x,*++spec)) goto retry;
case COMPF+4: if (MISMATCH(*++x,*++spec)) goto retry;
case COMPF+3: if (MISMATCH(*++x,*++spec)) goto retry;
case COMPF+2: if (MISMATCH(*++x,*++spec)) goto retry;
case COMPF+1: if (MISMATCH(*++x,*++spec)) goto retry;
   break;
case COMPF:
   matchLen=*++pgm;
   for (i=0;i<matchLen;i++) {
    if (MISMATCH(*++x,*++spec)) goto retry;
   }
   break;
retry:
// this mysterious retry after mismatch takes care of
// "*x?y" to find word "xxay", matching the second x
   if ((savePC==0)||(0==margin--)) return op;
   pgm=savePC-1;
   x=saveX+1;
   spec=saveSpec;
   break;
case COMPR+7:
case COMPR+6:
case COMPR+5:
case COMPR+4:
case COMPR+3:
case COMPR+2:
case COMPR+1:
   savePC=pgm;
    matchLen=op-COMPR;
    goto try_again;
case COMPR:
    savePC=pgm;
    matchLen=*++pgm;
try_again:
    for (i=1;i<=matchLen;i++) {
    if (MISMATCH(x[i],spec[i])) {
     if (margin--) {
      ++x;
      goto try_again;
      }
      return op;
    }
    }
    saveX=x;
   saveSpec=spec;
   x+=matchLen;
   spec+=matchLen;
   break;
case READ:
   x+=*++pgm;
   break;
case READ1:
   ++x;
   break;
default:
   return op;
  }
 }
}

WordIndex
// The class WordIndex holds a pointer to a word from
// wordList and computes digests and signature for it.
struct WordIndex {
 ulong digest1;
 CCC* word;

 void Init(CCC* wordx) {
  CCC* wp=wordx;
  int s=HASH(*wp);
  ulong dig1=1L<<s;
  for (;;) {
   int c;
   if (0==(c=*++wp)) break;
   s=HASH(c);
   ulong bit=1L<<s;
   dig1 |= bit;
  }
  digest1=dig1;
  word=wordx;
 }

 ulong MultiDigest(char* sig) {
// Returns digest2 for the word,
// and computes its signature
  CCC* wp=word;
  int s=HASH(word[0]);
  ulong digest2=0;
  ulong dig1=1L<<s;
  sig[0]=s;
  for (;;) {
   int c;
   if (0==(c=*++wp)) break;
   s=HASH(c);
   ulong bit=1L<<s;
   if (0==(dig1 & bit)) {
    *++sig=s;
    dig1 |= bit;
   } else if (0==(digest2 & bit)) {
    *++sig=s+28;
    digest2 |= bit;
   }
  }
  *++sig=0;
  return digest2;
 }
 int CompDigest(WordIndex wx) {
  return (wx.digest1)-(digest1);
 }
};

Page
// The class Page holds pointers to 32 words from wordList
// which are of the same length (words >= 31 together).
// Page also contains both word digests for each word,
// in the bits[] array, stored in signature oriented columns.
// A page provides string matching for the 32 words it owns.
struct Page{
 CCC*   word[32];
 char  len[32];
 int   fill;
 Page* next;
 ulong pageDigest1;
// ulong pageDigest2; overlay on unused bits[0]
#define pageDigest2 bits[0]
 ulong bits[57];      
   
 void Init(Page* following) {
// clears all and sets linkage.
// memory may already be precleared, but not necessarily
// since pages may overlay the temporary wordIndex
  memset(this,0,sizeof(Page));
  next=following;
 }

 int IsFull() {return (fill>=32);}

 void Add(WordIndex* wip,int length) {
 // Adds one word to a page,
 // ORs the word digests into the page digests,
 // Also ORs the horizontal bit slice representing word digests
 //   into the bits array.
  char sig[64];
  ulong curbit=1L<<fill;
  len[fill]=length;
  word[fill++]=wip->word;
  pageDigest1 |= wip->digest1;
  pageDigest2 |= wip->MultiDigest(sig);
  int c;
  char* sigp=sig;
  while (0 !=(c=*sigp++)) {
   bits[c] |= curbit;
  }
 }

 ulong Match(char sig[])
 // Accumulates vertical bit slices from a given signature
 // Returns a bit map of likely candidate words
 {
  int c=sig[0];
  ulong acc=bits[c];
  while ((acc) && (0 != (c=*++sig))) {
   acc &= bits[c];
  }
  return acc;
 }

 ulong SendSelectionVM(
    ulong acc,
    char* findString,
    Opcode program[],
    CCC* matchList[],
    ulong numMatch)
 {
 // Uses the "acc" bitmap to identify words for full
 // string matching, and sends matching words to the matchList
 // Uses the LSB array to quickly isolate bits in acc.
  CCC**  wp=word-1;
  char*  lenp=len-1;
  do {
   ulong accLo=acc & 0xFF;
   if (accLo) {
    int j=LSB[accLo];
    acc>>=j;
    wp+=j;lenp+=j;
    if (SUCCEED==VMMatch(*wp,*lenp,findString,program))
     Send(matchList,*wp,numMatch++);
   } else {
    acc>>=8;
    wp+=8;
   }
  } while (acc);
  return numMatch;
 }

 ulong SendSelection(
    ulong acc,
    char* findString,
    Opcode program[],
    CCC* matchList[],
    ulong numMatch)
 {
 // Same as SendSelectionVM, but uses simpler string matching
  CCC** wp=word-1;
  int minLen=program[0];
  do {
   ulong accLo=acc & 0xFF;
   if (accLo) {
    int j=LSB[accLo];
    acc>>=j;
    wp+=j;
    if (SUCCEED==SimpleMatch(*wp,findString,minLen))
     Send(matchList,*wp,numMatch++);
   } else {
    acc>>=8;
    wp+=8;
   }
  } while (acc);
  return numMatch;
 }

 ulong SendAll(CCC* matchList[],ulong numMatch) {
 // Sends all words on page to matchList
  for (int i=0;i<fill;i++)
   Send(matchList,word[i],numMatch++);
  return numMatch;
 }
};

PageIndex
// The class PageIndex contains a pointer to a page, and
// keeps a copy of the page digest1.
// During the scan, PageIndex provides a screening function
// to eliminate unnecessary page accesses if digest1 can
// already rule out all words on a page.
struct PageIndex {
 ulong digest1;  
 Page* page;        
 void Init(Page* thePage) {
  digest1=thePage->pageDigest1;
  page=thePage;
 }
 Page* Screen(ulong findDigest1,ulong findDigest2) {
  if ((findDigest1 ^ (findDigest1 & digest1)))
   return 0;
  return page;
 }
};

PrivateData
// The class PrivateData is the main structure mapped into the
// private memory space allocated on the heap.
// The pageGroup[] array holds pointers to linked lists of
//    pages, according to word length.
// Once all pages are assembled, the page addresses are remapped
//     into a linear page index, sorted by ascending word length
// The indexGroup[i] array points to the the first page of each
//    group of pages of a given word length i.
struct PrivateData {
// Page* nextPage; overlay on unused pageGroup[0]
#define nextPage pageGroup[0]
 Page* pageGroup[32];
// PageIndex* endOfPageIndex; overlay on indexGroup[0]
#define endOfPageIndex indexGroup[0]
 PageIndex* indexGroup[32];
 int bottom[1];

 void Init(
     CCC * const wordList[],
     ulong numWords,
     ulong storageSize)
 {
// Must have at least this much storage to build minimal
// page index system
  ulong minimumStorage=
   sizeof(pageGroup) +
   sizeof(indexGroup) +
   numWords*sizeof(WordIndex) +
   sizeof(PageIndex) +
   sizeof(Page);
  if (storageSize<minimumStorage) {
   nextPage=0;
   return;
  }

#if SORTWORDINDEX  
// Build priority queue of word index items
// This is the insertion step of heap sort
  WordIndex* wordIndexList=(WordIndex*)bottom;
  int i,j,n;
  for (n=0;n<numWords;n++) {
   WordIndex wx;
   wx.Init(wordList[n]);
   WordIndex* base=wordIndexList-1;
    WordIndex z;
   i=n+1,j=i>>1;
   while ((j>0)&&(wx.CompDigest(z=base[j])>0)) {
    base[i]=z;
    i=j;j=i>>1;
   }
   base[i]=wx;
  }

// Unload priority queue, step 2 of heap sort
  nextPage=(Page*)this + storageSize/sizeof(Page) - 1;
  WordIndex* wip;
  WordIndex* base=wordIndexList-1;
  WordIndex x;
  ulong numUnsorted=numWords;
  wip=base+numUnsorted+1;
  if (numUnsorted>1) do {
   i=1;j=2;
   x=base[numUnsorted--];
   *(--wip) = base[1];
   if (numUnsorted<=1) {
    base[1]=x;
    break;
   }
   while (j<=numUnsorted) {
    if ((j<numUnsorted)
     && (base[j].CompDigest(base[j+1])<0))
     j++;
    if (x.CompDigest(base[j])>=0)
     break;
    base[i]=base[j];
    i=j;j+=j;
   }
   base[i]=x;
  } while(1);
#else
// No sorting, just copy and initialize word index
// in whatever order wordList is in.
// This reduces Initialization time at the expense
// of search efficiency because more pages will
// have to be filtered.
  WordIndex* wordIndexList=(WordIndex*)bottom;
  int n;
  for (n=0;n<numWords;n++)
   wordIndexList[n].Init(wordList[n]);

  WordIndex* wip;
#endif

// Unload word index and build pages in linked lists
// based on word length
  nextPage=(Page*)this + storageSize/sizeof(Page) - 1;
  wip=wordIndexList+numWords;
  while (wip>wordIndexList) {
   wip--;
   if (0==InsertWordInPage(wip)) return;
  }

// Map linked lists to a linear index of pages by length
  if (0==BuildIndex()) {
   nextPage=NULL; // not enough storage
   return;
  }
 }

 int InsertWordInPage(WordIndex* wip) {
// Inserts one word in a page, opens a new page if
// none exists or if the current page is full.
  int len=strlen(wip->word);
  if (len==0) return 1; // ignore 0-length words
  int cutLen=MIN(31,len);
  Page* page=pageGroup[cutLen];
  if ((page==0) || (page->IsFull())) {
   Page* temp=page;
   page=nextPage--;
// test if the bottom of the growing page array collides
// with the top of the shrinking word index array
   if (page <= (Page*)wip) {
// not enough storage, we have to bail out
    nextPage=NULL;
    return 0;
   }
   page->Init(temp);
   pageGroup[cutLen]=page;
  }
  page->Add(wip,len);
  return 1;
 }

 PageIndex* BuildIndex() {

// Builds the page index, starting at this->bottom,
// overwriting storage previously used by word index.
  PageIndex* pi=(PageIndex*)bottom;
    PageIndex* piTop=(PageIndex*)nextPage;
  for (int len=1;len<32;len++) {
   Page* page=pageGroup[len];
   indexGroup[len]=pi;
   while (page) {
    if (pi>=piTop) return 0;
    pi++->Init(page);
    page=page->next;
   }
  }
  return (endOfPageIndex=pi);
 }

// Use 'nextPage' as a flag to indicate if we initialized
// nextPage will be NULL if we ran out of storage during
// the initialization.
 void* IsInitialized(){return nextPage;}

 ulong SendAll(CCC* matchList[],ulong minLen) {
// Sends all words >= minimum length from all pages
  ulong numMatch=0;
  for (int len=MIN(31,minLen);len<32;len++) {
   Page* page=pageGroup[len];
   while (page) {
    numMatch=page->SendAll(matchList,numMatch);
    page=page->next;
   }
  }
  return numMatch;
 }

 ulong CollectVM(
     char* findString,
     CCC* matchList[],
     Opcode program[])
 {
// CollectVM scans all pages above the minimum length,
// matches using the virtual machine string matcher,
// and sends matched words into matchList
  char sig[64];
  ulong findDigest2;
  ulong findDigest1=MakeSubDigest(findString,&findDigest2,sig);
  ulong numMatch=0;
  PageIndex* pi=indexGroup[MIN(31,program[0])];
  for (;pi<endOfPageIndex;pi++) {
   Page* page=pi->Screen(findDigest1,findDigest2);
   if (page) {
    ulong acc=page->Match(sig);
    if (acc) numMatch=page->SendSelectionVM(
     acc,findString,program,matchList,numMatch);
   }
  }
  return numMatch;
 }

 ulong Collect(  
    char* findString,
    CCC* matchList[],
    Opcode program[])
 {
// Collect is similar to CollectVM but uses the simpler
// string matching function.
  char sig[64];
  ulong findDigest2;
  ulong findDigest1=MakeSubDigest(findString,&findDigest2,sig);
  ulong numMatch=0;
  PageIndex* pi=indexGroup[MIN(31,program[0])];
  for (;pi<endOfPageIndex;pi++) {
   Page* page=pi->Screen(findDigest1,findDigest2);
   if (page) {
    ulong acc=page->Match(sig);
    if (acc) numMatch=page->SendSelection(
     acc,findString,program,matchList,numMatch);
   }
  }
  return numMatch;
 }
};

InitDisambiguator
void InitDisambiguator(
// InitDisambiguator to be called from the application
 CCC *const wordList[],
 ulong numWords,
 void *privStorage,
 ulong storageSize
) {
// Just sets up the private data structure
 PrivateData* PD=(PrivateData*)privStorage;
 PD->Init(wordList,numWords,storageSize);
}

Disambiguator
ulong Disambiguator(
// Disambiguator to be called from the application
 CCC *const wordList[],
 ulong numWords,
 void *privStorage,
 ulong storageSize,
 char *findString,
 CCC *matchList[]
) {
 PrivateData* PD=(PrivateData*)privStorage;

// Allocates space for the longest possible VM program on
// the stack and parses find string
 Opcode program[MAX_LEN];
 int useVM=Parse(findString,program);

// findstring is now stripped of wildcards
 ulong numMatch;

 if (PD->IsInitialized()) {
// should always be true except ..
  if ((*findString) || (program[0]>31)) {
// Normal findString with alphanum characters
// or wildCards only, but minLength > 31
    
   if (useVM) {
// wild cards detected, must use VM matching
    numMatch=PD->CollectVM(findString,matchList,program);

   } else {
// no wild cards, can use faster simple matching function
    numMatch=PD->Collect(findString,matchList,program);
   }

  } else {
// No characters to match, minimum length <= 31
// we can simply send all >= minimum length
   numMatch=PD->SendAll(matchList,program[0]);
  }

 } else {
// if we get here, PD was not initialized, and we have to
// just scan the entire word list for matches
// We don't really expect to get here except with
// extremely short word lists.
  numMatch=0;
  if ((*findString) || (program[0]>31)) {
   if (useVM) {
    for (ulong i=0;i<numWords;i++)
     if (SUCCEED==
      VMMatch(wordList[i],strlen(wordList[i]),
        findString,program))
      Send(matchList,wordList[i],numMatch++);
   } else {
    for (ulong i=0;i<numWords;i++)
     if (SUCCEED==
      SimpleMatch(wordList[i],findString,program[0]))
      Send(matchList,wordList[i],numMatch++);
   }
  } else {
   for (ulong i=0;i<numWords;i++)
    if (strlen(wordList[i]) >= program[0])
     Send(matchList,wordList[i],numMatch++);
  }
 }

// Final sort (step 2 of heap sort) for match list
 Sort(matchList,numMatch);
 return numMatch;
}

SimpleMatch
static int SimpleMatch(
    CCC* x,
    char* findString,
    int minLen) {
// alphanumeric matche of x against findString,
// no wild cards allowed
 x--;findString--;
 for (int i=0;i<minLen;i++) {
  if (MISMATCH(*++x,*++findString)) return FAIL;
 }
 return SUCCEED;
}

Send
static void Send(
// Inserts word wp in matchList as a priority queue
// in preparation for sorting later
 CCC* matchList[],
 CCC* wp,
 ulong numMatch) {
 CCC** base=matchList-1;
 CCC* z;
 long i=numMatch+1,j=i>>1;
 while ((j>0)&&Comp(wp,z=base[j])>0) {
  base[i]=z;
  i=j;
  j=i>>1;
 }
 base[i]=wp;
}

Comp
static int Comp(CCC* ap,CCC* bp) {
// Alphabetic case insensitive string comparator
 char a=*ap;
 if (a) do {
  char b=*bp;
  if (MISMATCH(a,b)) {
   return (a | 0x20) - (b | 0x20);
  }
  a=*++ap;
  bp++;
 } while (a);
 return -1;
}

Parse
static int Parse(char* findString,Opcode program[]) {
// Scans the findString and creates a bytecode program.
// All wild cards are stripped from findString.
// program[0] contains the minimum length of words to match,
// the rest of program[] contains tokens from the enum set.
// Returns the number of wild cards removed from findString.
#define EMIT(x) {*++pgm=x;}
 Opcode* pgm=program;
 char* newFindString=findString;
 int onFailure=FAIL;
 char c=*findString;
 int n,minLen=0,usesWild=0;
 for (;;) {
  if (0==charTable[c])
  switch(c) {
case '?':
   n=0;
   do {n++;} while ('?'==(c=*++findString));
   if (n==1) EMIT(READ1) else {
    EMIT(READ);
    EMIT(n);
   }
   minLen+=n;
   usesWild++;
   break;
case '+':
   minLen++;
   EMIT(READ1);
case '*':
   c=*++findString;
   onFailure=RETRY;
   usesWild++;
   break;
default:
   EMIT(SUCCEED);
   *newFindString=0; //zero terminate the new FindString
   program[0]=minLen;
   return usesWild;
  } else {
   n=0;
   do {
    n++;
    *newFindString++=c;  // copy non-wilds to new string
   } while (charTable[c=*++findString]);
   if (FAIL==onFailure) {
    if (n<=7) EMIT(COMPF+n) else {
     EMIT(COMPF);
     EMIT(n);
    }
   } else {
    if (n<=7) EMIT(COMPR+n) else {
     EMIT(COMPR);
     EMIT(n);
    }
   }
   onFailure=FAIL;
   minLen+=n;
  }
 }
}

MakeSubDigest
static ulong MakeSubDigest(
    CCC* xString,
    ulong* dig2,
    char sig[]) {
// Creates a pair of word digests and a signature from
// findString. No duplicate signature characters.
 int s=HASH(xString[0]);
 ulong digest1=1L<<s;
 ulong multi=0;
 for (;;) {
  int c;
  if (0==(c=*++xString)) break;
  s=HASH(c);
  ulong bit=1L<<s;
  if (0==(digest1 & bit)) {
   digest1 |= bit;
  } else {
   if (0==(multi & bit)) {
    multi |= bit;
   }
  }
 }
// Make the signature in bit order, that is with the
// less frequent English letters first, so that page
// scanning will fail as soon as possible if no word
// in the page will match.
 ulong bit1=2,bit2=2;
 ulong single=digest1 & (~multi);
 int s1,s2;

// Test rarest letter combinations first
 if (multi) {
  for (s2=29;s2<=56;s2++) {
   if (multi & bit2) *sig++=s2;
   bit2 += bit2;
  }
 }

 for (s1=1;s1<=28;s1++) {
  if (single & bit1) *sig++=s1;
  bit1 += bit1;
 }
 *sig++=0;
 *dig2=multi;
 return digest1;
}

Sort
static void Sort(CCC* matchList[],long numMatch) {
// Heap sort step 2, used for final sorting of the
// match list.
 CCC** sList=matchList-1;
 CCC* x;
 int i,j;
 CCC** b=sList+numMatch+1;
 if (numMatch>1) do {
  i=1;j=2;
  x=sList[numMatch--];
  *(--b) = sList[1];
  if (numMatch<=1) {
   sList[1]=x;
   break;
  }
  while (j<=numMatch) {
   if ((j<numMatch)
   && (Comp(sList[j],sList[j+1])<0))
    j++;
   if (Comp(x,sList[j])>=0)
    break;
   sList[i]=sList[j];
   i=j;j+=j;
  }
  sList[i]=x;
 } while(1);
}
 

Community Search:
MacTech Search:

Software Updates via MacUpdate

OmniOutliner Pro 4.2 - Pro version of th...
OmniOutliner Pro is a flexible program for creating, collecting, and organizing information. Give your creativity a kick start by using an application that's actually designed to help you think. It's... Read more
VLC Media Player 2.2.1 - Popular multime...
VLC Media Player is a highly portable multimedia player for various audio and video formats (MPEG-1, MPEG-2, MPEG-4, DivX, MP3, OGG, ...) as well as DVDs, VCDs, and various streaming protocols. It... Read more
Nisus Writer Pro 2.1.1 - Multilingual wo...
Nisus Writer Pro is a powerful multilingual word processor, similar to its entry level products, but brings new features such as table of contents, indexing, bookmarks, widow and orphan control,... Read more
Tinderbox 6.2.0 - Store and organize you...
Tinderbox is a personal content management assistant. It stores your notes, ideas, and plans. It can help you organize and understand them. And Tinderbox helps you share ideas through Web journals... Read more
OmniOutliner 4.2 - Organize your ideas,...
OmniOutliner is a flexible program for creating, collecting, and organizing information. Give your creativity a kick start by using an application that's actually designed to help you think. It's... Read more
Things 2.5.4 - Elegant personal task man...
Things is a task management solution that helps to organize your tasks in an elegant and intuitive way. Things combines powerful features with simplicity through the use of tags and its intelligent... Read more
NeoOffice 2014.10 - Mac-tailored, OpenOf...
NeoOffice is a complete office suite for OS X. With NeoOffice, users can view, edit, and save OpenOffice documents, PDF files, and most Microsoft Word, Excel, and PowerPoint documents. NeoOffice 3.x... Read more
iPhoto Library Manager 4.2 - Manage mult...
iPhoto Library Manager allows you to organize your photos among multiple iPhoto libraries, rather than having to store all of your photos in one giant library. You can browse the photos in all your... Read more
Web Snapper 3.3.8 - Capture entire Web p...
Web Snapper lets you capture Web pages exactly as they appear in your browser. You can send them to a file as images or vector-based, multi-page PDFs. It captures the whole Web page - eliminating the... Read more
TeamViewer 10.0.41404 - Establish remote...
TeamViewer gives you remote control of any computer or Mac over the Internet within seconds, or can be used for online meetings. Find out why more than 200 million users trust TeamViewer! Free for... Read more

Chainsaw Warrior: Lords of the Night has...
It's time to put the Darkness back in its place now that Chainsaw Warrior: Lords of the Night has officially made it to iOS. | Read more »
A World of Ice and Fire Lets You Stalk 2...
George R. R. Martin’s A World of Ice and Fire, by Random House, is a mobile guide to the epic series. The new update gives you the Journeys map feture that will let you track the movements of 25 different characters. But don't worry, you can protect... | Read more »
Gameloft Announces Battle Odyssey, a New...
Battle Odyssey, Gameloft's newest puzzle RPG, is coming to the App Store next week. Set in the world of Pondera, you will need to control the power of the elements to defend the world from evil. You'll be able to entlist over 500 allies to aid you... | Read more »
Fusion - HDR Camera (Photography)
Fusion - HDR Camera 1.0.0 Device: iOS Universal Category: Photography Price: $1.99, Version: 1.0.0 (iTunes) Description: Fusion creates HDR (high dynamic range) photos by capturing different exposures and then combining them into one... | Read more »
Sago Mini Toolbox (Education)
Sago Mini Toolbox 1.1 Device: iOS Universal Category: Education Price: $2.99, Version: 1.1 (iTunes) Description: Come build with the Sago Mini friends! Use a wrench, try a saw, or hammer some nails. From sewing hand puppets to... | Read more »
You Should Probably Grab Hitman GO While...
Hitman GO is a surprisingly cool (yet also incredibly drastic) departure from the Hitman series. It's well worth playing for any puzzle game fans out there, and at the moment you can get your hands - or garrotte if you will - on it for a mere $0.99... | Read more »
IFTTT is Bringing Do Button and Do Note...
IFTTT has announced Do Button and Do Note for the Apple Watch. Do Button lets you make your own personalized button that can connect to things like your Google Drive, control the temperature in your home with Nest Thermostat, or turn the lights on... | Read more »
How Many Days, Hours, and Minutes Are Le...
Countdown, by Yves Tscherry, is now available on the App Store. The app keeps track of countdowns to your favorite things such as someones birthday or days till the New Year. You can display the time in seconds, minutes, hours, days, weeks, months,... | Read more »
The All-New Misfit 2.0 App is Available...
Misfit has just given their app a complete overhaul. Misfit 2.0 now has a brand new interface with a sleek design and is easier to navigate. You'll be able to sync your Misfit device and look up health and fitness information faster than ever before... | Read more »
Halo: Spartan Strike (Games)
Halo: Spartan Strike 1.0 Device: iOS Universal Category: Games Price: $5.99, Version: 1.0 (iTunes) Description: Delve into 30 challenging missions through cities and jungles using a devastating arsenal of weapons, abilities and... | Read more »

Price Scanner via MacPrices.net

TigerText Introduces First Secure Enterprise...
TigerText, a provider of secure, real-time messaging for the enterprise, has announced the launch of TigerText for the Apple Watch. TigerText for the Apple Watch enables users to securely send and... Read more
The Conservation Fund Partners with Apple To...
The Conservation Fund has announced that it will partner with Apple to help protect working forests in the United States. The Apple initiative will conserve more than 36,000 acres of working... Read more
Clearance 13-inch 2.6GHz Retina MacBook Pro a...
B&H Photo has clearance 2014 13″ 2.6GHz/128GB Retina MacBook Pros now available for $1099, or $200 off original MSRP. Shipping is free, and B&H charges NY sales tax only. Read more
Apple refurbished 2014 13-inch Retina MacBook...
The Apple Store has Apple Certified Refurbished 2014 13″ Retina MacBook Pros available for up to $400 off original MSRP, starting at $979. An Apple one-year warranty is included with each model, and... Read more
iMacs on sale for up to $205 off MSRP, NY tax...
B&H Photo has 21″ and 27″ iMacs on sale for up to $205 off MSRP including free shipping plus NY sales tax only: - 21″ 1.4GHz iMac: $1019 $80 off - 21″ 2.7GHz iMac: $1189 $110 off - 21″ 2.9GHz... Read more
Sale! 16GB iPhone 5S for $1 with service
Best Buy is offering 16GB iPhone 5Ss for $1.00 with 2-year activation at a participating cellular provider. Choose free home shipping and activation, or buy online and activate during in-store pickup... Read more
Apple refurbished 2014 MacBook Airs available...
The Apple Store has Apple Certified Refurbished 2014 MacBook Airs available starting at $679. An Apple one-year warranty is included with each MacBook, and shipping is free. These are currently the... Read more
27-inch 3.5GHz 5K iMac on sale for $2349, sav...
 Adorama has the 27″ 3.5GHz 5K iMac in stock today and on sale for $2349 including free shipping plus NY & NJ sales tax only. Their price is $150 off MSRP. For a limited time, Adorama will... Read more
Save up to $380 on an iMac with Apple refurbi...
The Apple Store has Apple Certified Refurbished iMacs available for up to $380 off the cost of new models. Apple’s one-year warranty is standard, and shipping is free: - 27″ 3.5GHz 5K iMac – $2119 $... Read more
iFixIt Teardown Awards 12-IInch Retina MacBoo...
iFixIt has posted its illustrated teardown of the new 12-inch MacBook Retina. They note that this new MacBook is less than half the thickness of the last Apple notebook called just “MacBook” back in... Read more

Jobs Board

*Apple* Solutions Consultant - Retail Sales...
**Job Summary** As an Apple Solutions Consultant (ASC) you are the link between our customers and our products. Your role is to drive the Apple business in a retail Read more
*Apple* Retail - Multiple Positions (US) - A...
Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, you're also the Read more
Technical Project Manager - *Apple* Pay - A...
**Job Summary** Apple Pay is seeking an experienced technical PM to…manage the on boarding of new merchants for the Apple Pay platform in the US Within this role you Read more
Senior Identity Architect - *Apple* Pay - A...
**Job Summary** Apple , Inc. is looking for a highly motivated, innovative and hands-on senior identity architect to join the Apple Pay Engineering team. You will Read more
Program Manager, *Apple* Retail Global Tale...
…for the worldwide creation and implementation of the key talent development initiatives within Apple Retail. He or she will work closely with our Retail Corporate team, Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.