
Support Multibyte Text

Volume Number: 14 (1998)
Issue Number: 1
Column Tag: Toolbox Techniques

Supporting Multi-byte Text in Your Application

by Nat McCully, Senior Software Engineer, Claris Corporation

How to make most efficient use of the Script Manager and International APIs in your text engine


You have developed the next greatest widget or application and you want to distribute it on the Net. Your application has a text engine, maybe simple TextEdit, or one of the more sophisticated engines available for license, or even one you wrote yourself. One day, you get an e-mail from someone in Japan:

Dear Mr. McCully,
My name is Takeshi Yamamoto and I use your program, MagicBook, everyday. But, I have a problem using Japanese characters in it. When I hit the delete key, I get weird garbage characters. My friends and I wish to use both Japanese and English in your program, but it does not work properly. Please fix it!
T. Yamamoto

Suddenly, there are people on the other side of the world who want to use your application in their language, and you are faced with a dilemma. You have no first-hand knowledge of the language itself, but you may be somewhat familiar with the Macintosh's ability to handle multiple languages in a single document with ease. Simply cracking open Inside Macintosh: Text seems daunting. How will these new routines affect the performance of your program? Will you introduce unwanted instability and anger your existing user base? Where can you find information on how to use which routines best, not just a description of what each routine does?

This article attempts to address some of these issues, and in general familiarize you with some of the best things that make the Mac an excellent international computing environment. Intelligent use of the Macintosh's international routines, WorldScript, and the other managers in the Toolbox can be the difference between a US-only application and a truly "world-ready" tool that any user, anywhere, can utilize as soon as they download it to their hard disk. Although this paper deals primarily with Japanese language issues, the concepts outlined herein can be used with any multi-lingual environment.

What is WorldScript?

WorldScript is the set of patches to the system that enables the correct display and measurement of multi-lingual text. Over time, many of these patches have been rolled into the base system software, but even in Mac OS 7.6.1, you will find a set of WorldScript extensions in the Extensions folder when you install one of the Language Kits available from Apple. The concepts and code snippets in this paper will work equally well on, for example, the Japanese localized Mac OS, or on a standard U.S. system with the Japanese Language Kit (JLK). A good source of localized system software and Language Kits is the Apple Developer Mailing CD-ROM, available from the Apple Developer Catalog. WorldScript is one of the Apple-only technologies that make multi-lingual computing far easier than it is with the other guys. When it comes to having Chinese, Korean and Japanese all in the same document, WorldScript, on Mac OS, is the only thing out there.

Multi-byte Text on the Mac

OK, let's get to the meat, you say. How do you make your text engine handle two-byte characters? Well, before giving you a bunch of code, let's explain how the Mac handles two-byte text.

What is a Script?

Each language that the Mac supports is grouped into categories called "scripts." For example, English and the other Roman letter-based languages like French and German all belong to the Roman script. Japanese belongs to the Japanese script. Character glyphs in the Roman script are each represented by a single-byte character code. Japanese characters are represented by a 16-bit (2 byte) character code.

Setting Up the Port: Pre-System 7 API's versus New API's

On the Mac, each font is also associated with a script. There are Roman script fonts like Helvetica and Palatino, and Japanese script fonts like Osaka and Heisei Minchou. As you know, when text is drawn into a QuickDraw grafPort, you first set up the port with the appropriate font, size and style, and then call DrawText() to draw the text. If you are using the old Script Manager API's (like CharByte() and GetEnvirons()), you need to set the port to a font in the script you are interested in processing. Once the port is set to a particular font, calls to the Script Manager will follow the rules of that font's script. So the port, by way of setting the font, also has an implicit script setting. This is the key to using the Script Manager routines so they will return the correct information to your application using the older API's. The new API's have a script parameter, so it is not necessary to set the font of the port before using them. Since the Script Manager doesn't have to call FontScript() to find out the script of the current font before passing the script to WorldScript, using the newer API's could speed up your application in certain cases.

Adding Script-savvy Features to your Application

First, you need to determine if the user's system has a non-Roman script system installed. Many programs boost performance by calling Script Manager routines only when absolutely necessary. One way to find out all the scripts installed is to loop through all 33 possible script codes (smRoman being 0 and smUninterp being 32) and call GetScriptVariable() with the selector smScriptEnabled for each one. (GetScriptManagerVariable() with the selector smEnabled takes no script code; it returns only the number of enabled scripts.) Roman script is always enabled.

Listing 1: Finding Which Scripts are Installed

ScriptCode   script;
for (script = smRoman + 1; script <= smUninterp; script++)
   if (GetScriptVariable(script, smScriptEnabled))
      // non-Roman script present...
      InitInternalScriptInfo(script);

InitInternalScriptInfo() refers to a routine you write that will initialize your internal data structures that deal with specific script systems, such as line-breaking tables, on a script-by-script basis.

Line Breaking

Most applications don't rely entirely on the Script Manager for line breaking, hit testing, or word selection, because using those routines is thought to be too slow. It is possible to optimize your text engine so that you incorporate the correct behaviors for each script system present, while maintaining the highest possible performance. The Toolbox call for finding line breaks is StyledLineBreak(). To use it, however, you must restrict the text you pass it to lengths of less than 32K (actually, this is true of the whole Script Manager, so tough) and text widths to whole pixel values (can you say 'rounding error?'), and if you are explicitly scaling the text, it won't work at all. You must also organize the text you pass to it in terms of script runs and style runs within them. Therefore, most applications that have word-processing functionality choose to implement their own line-breaking code that is customized for their own needs. Unfortunately, many of these private implementations break when used on WorldScript systems.

Line Breaking with Japanese Text

The simplest line-breaking algorithm for English text is to look for a space (ASCII 0x20) character in the line near the graphic break, and if there is none, to break on the byte-boundary nearest the graphic break. Japanese text is a bit more complicated. Japanese text has no spaces, so you must break at the character boundary nearest the graphic break. There is an additional wrinkle: Certain characters are not allowed to begin a line, and certain characters are not allowed to end a line. This set of line-breaking rules is referred to as Kinsoku shori. For example, you cannot begin a line with a two-byte period. You cannot end a line with a two-byte open parenthesis followed by more text on the next line. A list of kinsoku characters is available from the Japanese Standards Association in the form of a Japanese Industrial Standard (JIS) document. It is also in Ken Lunde's excellent book, Understanding Japanese Information Processing, in the section entitled "Japanese Hyphenation." While not all Japanese agree on the correct set of kinsoku characters, this set is a good default. Some applications allow the user to edit the kinsoku character set to their own liking.
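For comparison with the Japanese case, the simple Roman rule described above could be sketched roughly as follows. This is plain C with no Toolbox calls. The function name FindRomanBreak and its parameters are made up for illustration, and graphicBreak is assumed to be the byte offset of the first character that no longer fits on the line.

```c
#include <stddef.h>

/* Hypothetical helper, single-byte (Roman) text only.
   Returns the byte offset at which the next line should start.
   graphicBreak is the offset of the first byte that does not fit. */
size_t FindRomanBreak(const unsigned char *text, size_t length,
                      size_t graphicBreak)
{
    size_t i;

    if (graphicBreak >= length)
        return length;              /* everything fits on this line */

    /* Walk back from the graphic break looking for a space. */
    for (i = graphicBreak; i > 0; i--)
        if (text[i] == 0x20)
            return i + 1;           /* next line starts after the space */

    /* No space found: break on the byte boundary at the graphic
       break, but always consume at least one byte. */
    return graphicBreak > 0 ? graphicBreak : 1;
}
```

With the text "hello world foo" and a graphic break at offset 8 (the 'r'), the sketch backs up to the space and starts the next line at offset 6.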

Once you know that the current byte offset in your text is on or just before the graphic break, you need to see if that byte is part of a two-byte character. Then you need to see if the character is a character that can't end a line. Then you need to check the character after it to see if it is a character that can't begin a line. This can be repeated as necessary, for support of a string of kinsoku characters. For example, suppose the character on the graphic break (the break char) can end a line, but the character after it can't begin a line, causing the break char to wrap. However, the character before the break char is one that can't end a line, so you must then check the char before it, and so on, and so on, and... The example below is simplified to illustrate a particular case; actual code for an application would probably be organized differently.

Listing 2: Checking Graphic Break Char with Kinsoku Processing

Check a text run's length against the pixel edge of the line
(the so-called graphic break), and then adjust the line break
if there is an illegal Kinsoku character on the break. 

UInt16 *   gStartLineKinsokuChars;   // chars that can't begin
                                     // a line.
UInt8      gNumStartKinsokuChars;    // number of chars above.
UInt16 *   gEndLineKinsokuChars;     // chars that can't end a line
UInt8      gNumEndKinsokuChars;      // number of chars above.

// This function will return FALSE if the char at offset
// is not a valid break point. It checks the char after it,
// but not the char before it, for kinsoku.
static Boolean CheckGraphicBreak(UInt8 *      textPtr,
                                 UInt16       offset,
                                 ScriptCode   script)
{
   SInt16 result;

   // The textPtr starts at a known 'good' character
   // boundary. In this case it is the beginning of the
   // line, but it could be the beginning of the
   // stylerun.

   // Find out if script only has 1-byte chars. If so,
   // we assume it's ok to break at this char.
   if (GetScriptVariable(script, smScriptFlags) &
               (1 << smsfSingByte))
      return TRUE;

   result = CharacterByteType((Ptr)textPtr, offset, script);
   if (result == smSingleByte)
      return TRUE;   // In real life, you're not done
                        // until you check the chars before
                        // and after this one for kinsoku.
   if (result == smFirstByte)
      return FALSE;
   if (result == smLastByte)
   {
      UInt8      index;
      UInt16   theChar = *(UInt16 *)&textPtr[offset - 1];

      // Now we have a valid break on a 2-byte char.
      // We need to check if it's a kinsoku character.
      // This code checks Japanese kinsoku only, but
      // with a little work this could be extended to
      // all 2-byte scripts that don't break on spaces.
      if (script != smJapanese)
         return TRUE;
      for (index = 0; index < gNumEndKinsokuChars; index++)
         if (theChar == gEndLineKinsokuChars[index])
            return FALSE;
      // Now we check the char after this one, in case it
      // is a char that can't start a line. First see if
      // it's a 1-byte char. In real life, there are 1-byte
      // kinsoku chars to check for.
      if (textPtr[offset + 1] == '\0' ||
         CharacterByteType((Ptr)&textPtr[offset + 1],
                  0, script) == smSingleByte)
         return TRUE;

      theChar = *(UInt16 *)&textPtr[offset + 1];
      for (index = 0; index < gNumStartKinsokuChars; index++)
         if (theChar == gStartLineKinsokuChars[index])
            return FALSE;
   }
   return TRUE;
}
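The "and so on, and so on" part of the search, where one kinsoku character after another pushes the break further back, can be sketched without the Toolbox. The version below is only an illustration under several assumptions: the text is Shift-JIS, the two tables hold a handful of sample characters rather than the full JIS list, and the names FindKinsokuBreak, kCantStart and kCantEnd are invented here. Instead of backing up from the graphic break, it walks forward from the known-good line start and remembers the last legal break it passed, which gives the same answer.

```c
#include <stddef.h>
#include <stdint.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Shift-JIS lead bytes: 0x81-0x9F and 0xE0-0xFC. */
static int IsSJISLeadByte(uint8_t b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Sample tables only: two-byte comma, period, close paren, close
   bracket can't start a line; open paren, open bracket can't end one. */
static const uint16_t kCantStart[] = { 0x8141, 0x8142, 0x816A, 0x8176 };
static const uint16_t kCantEnd[]   = { 0x8169, 0x8175 };

static int InTable(uint16_t ch, const uint16_t *table, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (table[i] == ch)
            return 1;
    return 0;
}

/* Read the character starting at offset; *size receives 1 or 2. */
static uint16_t CharAt(const uint8_t *text, size_t length,
                       size_t offset, size_t *size)
{
    if (IsSJISLeadByte(text[offset]) && offset + 1 < length) {
        *size = 2;
        return (uint16_t)((text[offset] << 8) | text[offset + 1]);
    }
    *size = 1;
    return text[offset];
}

/* Returns the offset where the next line should start.
   text[0] must be a character boundary; graphicBreak is the offset
   of the first byte that does not fit on the line. */
size_t FindKinsokuBreak(const uint8_t *text, size_t length,
                        size_t graphicBreak)
{
    size_t   offset = 0, size = 0, lastLegal = 0, candidate = 0;
    uint16_t prev = 0, cur;
    int      havePrev = 0;

    if (graphicBreak >= length)
        return length;                     /* everything fits */

    while (offset < length && offset <= graphicBreak) {
        cur = CharAt(text, length, offset, &size);
        /* A break before cur is legal if prev may end a line
           and cur may begin one. */
        if (havePrev &&
            !InTable(prev, kCantEnd, COUNT(kCantEnd)) &&
            !InTable(cur, kCantStart, COUNT(kCantStart)))
            lastLegal = offset;
        candidate = offset;                /* char holding the break */
        prev = cur;
        havePrev = 1;
        offset += size;
    }
    if (lastLegal > 0)
        return lastLegal;
    /* No legal break on this line: force one, consuming at least
       one character so the caller always makes progress. */
    return candidate > 0 ? candidate : size;
}
```

A real engine would also handle one-byte kinsoku characters and might let the forbidden character hang past the margin rather than wrap early; those are design choices this sketch leaves out.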

Hit Testing

Hit testing is another area in your text engine that demands the highest possible performance. When the user clicks in the text, any delay in setting the insertion point there will be noticed. Drag selection is another example of the same code working hard to find the character boundaries and setting the correct hilite area.

Some applications use a locally allocated cache of possible first byte character codes that they use to test a particular character in the text stream for byteness. This is simple to create with the Toolbox call FillParseTable(). FillParseTable() returns in your pre-allocated 256 byte buffer all the bytes that can be a first byte of a two-byte character in the script you pass to it. Be aware that in some scripts, some character codes can be both the first byte of a two-byte character as well as the second byte of a two-byte character, depending on their context within the text stream. Therefore, you need more than just this information to successfully find out what kind of character the byte you're interested in is a part of. In a mixed stream of text with both one-byte and two-byte characters, using the parse table in a single pass over the text is much faster than calling CharacterByteType() for each byte. An example of this is below, in a sample function that goes through a text stream and counts the number of characters in it:

Listing 3: Counting the Chars in Mixed-byte Text

Counts the number of characters in a run of contiguous script
(font), checking for both one-byte and two-byte characters.

UInt32 CountCharsInScriptRun(UInt8 * textPtr, UInt32 length,
               ScriptCode script)
{
   UInt8      parseTable[256];
   UInt32   curByte, charCount;

   (void)FillParseTable((char *)parseTable, script);
   for (curByte = 0L, charCount = 0L; curByte < length;
         curByte++, charCount++)
      if (parseTable[textPtr[curByte]] == 1)
         curByte++;   // first byte of a two-byte char:
                        // skip over its second byte.
   return (charCount);
}

Notice that because we started at a known 'good' boundary, we were able to test only the first bytes of the two-byte characters in the stream as we counted along. This code would not work in all cases if we started at an arbitrary point in unknown text, because of the ambiguity of the byteness of some character codes in some scripts. Caching the parse tables for all installed scripts in the user's system at launch time would further speed up your processing, so you wouldn't have to call FillParseTable() every time.
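To make the ambiguity concrete, here is a rough, Toolbox-free sketch of classifying a byte at an arbitrary offset by scanning forward from a known-good boundary. It assumes Shift-JIS, where lead bytes are 0x81-0x9F and 0xE0-0xFC; the table built here only imitates the kind of table FillParseTable() hands back, and the names BuildSJISParseTable, ByteTypeAt and the k... constants are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

enum { kSingleByte = 0, kFirstByte = 1, kSecondByte = 2 };

/* Fill a 256-entry table: 1 marks a byte that can be the first
   byte of a two-byte character (Shift-JIS ranges assumed). */
void BuildSJISParseTable(uint8_t table[256])
{
    int b;
    for (b = 0; b < 256; b++)
        table[b] = (uint8_t)((b >= 0x81 && b <= 0x9F) ||
                             (b >= 0xE0 && b <= 0xFC));
}

/* Classify the byte at offset. text[0] must be a known character
   boundary; the scan never looks at a byte out of context. */
int ByteTypeAt(const uint8_t *text, size_t length, size_t offset,
               const uint8_t table[256])
{
    size_t i = 0;

    while (i < length) {
        if (table[text[i]] == 1 && i + 1 < length) {
            if (offset == i)     return kFirstByte;
            if (offset == i + 1) return kSecondByte;
            i += 2;
        } else {
            if (offset == i)     return kSingleByte;
            i += 1;
        }
    }
    return kSingleByte;   /* offset past the end of the text */
}
```

In the byte sequence 0x41 0x81 0x81 0x41, both 0x81 bytes pass the table test, yet only the first is a first byte; the second is the trailing byte of the same character. Only the scan from the start tells them apart.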

Measuring Two-byte Characters

On the Mac, all two-byte characters are the same width. In a future system software release, proportional two-byte characters will be supported, but up until now all two-byte-savvy applications assume mono-spaced two-byte characters, and even if proportional characters are supported, they will be mono-spaced by default so as not to break every application currently shipping.

Before the Mac OS supported measuring two-byte characters with TextWidth(), a special code point in the single-byte 256 char width table was reserved for the two-byte character width for that font. In the Japanese and both Chinese scripts, this code point is 0x83. In Korean script, it is 0x85. This code point still works, even though Apple now recommends you use TextWidth() for all measuring of multi-byte or mixed text. In the future for proportional measuring, TextWidth() will probably be what you will use.

Below is an example of a function that measures any text, and returns the amount in a Fixed variable. This is useful if you are measuring text and the user has Fractional Glyph Widths turned on (meaning you made a call to SetFractEnable()).

Listing 4: Measuring Mixed-byte Text

This function supports multiple styleruns in the text, and consists
of two loops: One for the styles and one for the characters in each
style. The port is set on each stylerun and then Macintosh APIs
are called to measure the text. 

#define JSCTCWidthChar   0x83
#define KWidthChar   0x85

typedef struct tagStyleRun {
   UInt32     styleStart;
   UInt16     font;
   UInt16     size;
   UInt8      face;
} StyleRun,   *StyleRunPtr;

Fixed GetTextWidth(UInt8 * textPtr, UInt32 length,
                           StyleRunPtr styleRuns, UInt32 numStyles)
{
   UInt32          byteNum, styleNum;
   Fixed           totWidth = 0L;
   ScriptCode      curScript;
   FMetricRec      curFontMetrics;
   WidthTable **   curWidthTable;
   UInt8           parseTable[256];

   // loop thru each stylerun, measure its characters
   for (styleNum = 0L; styleNum < numStyles; styleNum++)
   {
      // Set up the port (in real life, you'd restore the
      // old settings when you exit)
      TextFont(styleRuns[styleNum].font);
      TextSize(styleRuns[styleNum].size);
      TextFace(styleRuns[styleNum].face);
      FontMetrics(&curFontMetrics);
      curWidthTable = (WidthTable **)curFontMetrics.wTabHandle;
      curScript = FontScript();
      (void)FillParseTable((char *)parseTable, curScript);
      // loop thru each char in the stylerun
      for (byteNum = styleRuns[styleNum].styleStart;
            (styleNum + 1 < numStyles &&
               byteNum < styleRuns[styleNum+1].styleStart) ||
            (styleNum + 1 >= numStyles && byteNum < length);
            byteNum++)
      {
         if (parseTable[textPtr[byteNum]] == 1)
         {
            // two-byte char: use the reserved code point
            // in the width table where there is one.
            if (curScript == smJapanese ||
                  curScript == smTradChinese ||
                  curScript == smSimpChinese)
               totWidth +=
                  (*curWidthTable)->tabData[JSCTCWidthChar];
            else if (curScript == smKorean)
               totWidth +=
                  (*curWidthTable)->tabData[KWidthChar];
            else
               totWidth += (Fixed)
                  TextWidth(&textPtr[byteNum], 0, 2) << 16;
            byteNum++;   // skip the second byte
         }
         else
            totWidth +=
               (*curWidthTable)->tabData[textPtr[byteNum]];
      }
   }
   return (totWidth);
}

The above function still makes expensive calls like FontMetrics(), FillParseTable() and TextWidth() on each stylerun. It would be an even better idea to have a local cache of the width tables and parse tables of fonts you know are in the document, so you don't have to rebuild them every time the user clicks or drags or types in the text.

So, now that you have a relatively fast way of measuring the text, you can use it to find the pixel value of any character in the text, and use that for your internal CharToPixel and PixelToChar logic. Or, you can use the Mac OS Toolbox calls CharToPixel() and PixelToChar(), which will always work on any script but may be slower.
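A homemade PixelToChar built on cached widths might look something like the sketch below. It is a simplification under stated assumptions: Shift-JIS text in a single style, whole-pixel widths instead of Fixed, one shared width for all two-byte characters, and the click snapping to the nearer character edge. The name PixelToOffset and its parameters are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift-JIS lead bytes: 0x81-0x9F and 0xE0-0xFC. */
static int IsSJISLeadByte(uint8_t b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Returns the byte offset of the insertion point nearest to pixel,
   measured from the left edge of the run. widths[] holds the
   one-byte character widths; twoByteWidth is the width shared by
   every two-byte character in the font. */
size_t PixelToOffset(const uint8_t *text, size_t length,
                     const uint16_t widths[256], uint16_t twoByteWidth,
                     uint32_t pixel)
{
    size_t   offset = 0;
    uint32_t x = 0;                 /* left edge of current char */

    while (offset < length) {
        int      twoByte = IsSJISLeadByte(text[offset]) &&
                           offset + 1 < length;
        uint32_t w = twoByte ? twoByteWidth : widths[text[offset]];

        if (pixel < x + w / 2)      /* click is in the left half */
            return offset;
        x += w;
        offset += twoByte ? 2 : 1;  /* never land inside a char */
    }
    return length;                  /* click is past the last char */
}
```

Because the loop advances by whole characters, the offset it returns can never fall between the two bytes of a character, which is exactly the bug that produces garbage when the user then types or deletes.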

Localizing Your Application for Japan

Now that we have reviewed a few of the basic text engine issues for handling two-byte text, there are a few things about Japan in particular that make localization a challenge.

Japan is possibly the most interesting major software market to localize for, if you are interested in text and text layout. It is a mature market, with a diverse number of products enjoying many millions of dollars in sales each year. The Macintosh has a larger market share there than in the U.S. or Europe. Text in Japan has traditionally been difficult to input and output using machines, and the use of text in graphic design requires that the text layout be extremely flexible. The characters are complex (so complex that bolding them may make them illegible), and emphasis or adornment has forms that use background shading, different types of lines around the text, and even dots or ticks above or to one side of each character. Condensed and extended faces have different results on PostScript printers than they do on QuickDraw. Bold and italic faces were not supported on the first PostScript Japanese printers. Underlines are not drawn by QuickDraw when the font is a Japanese font. These last two things might be fixed in future releases of the system software, but for now the application developer must work around them.

For underline, you must draw a line under the text. The reason QuickDraw doesn't draw it for you is that it usually uses the font's baseline as the underline location, but Japanese fonts' two-byte glyphs take up more room and descend below the baseline. Where you draw your underline is up to you, but take a look at how other Japanese programs do it and make it fairly consistent.

Vertical text is pretty much a checkbox item nowadays in Japanese word-processing programs. Most novels and many magazines are laid out vertically, but until recently computers were horizontal-only. While the Windows95® APIs support drawing text vertically, the Mac OS still does not, outside of using QuickDrawGX typography (which is excellent, by the way).

In comparing vertical text to horizontal text, several things change about the line layout: The first line starts at the top right, and the text flows down to line-end, then wraps to the next line, which is to the left of the first line; the baseline is generally considered to be in the center of the line; underlines are drawn to the right of the text, as are emphasis dots; two-byte characters are not rotated, but single-byte characters are, 90° clockwise; certain characters have vertical text variants, like many punctuation characters. Where these variants are in the font can be found in the 'tate' table in the font ("tahteh" means "vertical" in Japanese).

Rubi are small annotation characters, placed above, below or to the side of the text they annotate. Usually they provide pronunciation guidance for unusual or hard-to-pronounce Kanji characters.

Date formats in Japan include the current year of the emperor's reign; again, supported on Windows95® but not on Mac OS. It is up to the application to support these formats if so desired. Also, date formats 2 and 3 produce identical results, due to the fact that the abbreviated month and the long month are the same thing in Japanese. Japanese applications may opt to substitute a different format in one of those formats' place.
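If you decide to support era-based years yourself, the conversion is a small table lookup. The sketch below is one possible shape for it, not anything the Toolbox provides: the name JapaneseEraYear is invented, the table starts at Showa, and a shipping product would want the earlier eras and a way to add new ones without recompiling.

```c
#include <stddef.h>

typedef struct {
    int         year, month, day;   /* first day of the era */
    const char *name;
} Era;

/* Era start dates on the Gregorian calendar, newest first. */
static const Era kEras[] = {
    { 2019,  5,  1, "Reiwa"  },
    { 1989,  1,  8, "Heisei" },
    { 1926, 12, 25, "Showa"  },
};

/* Returns the era name and stores the year within the era in
   *eraYear. Returns NULL if the date is before the first table
   entry. The first year of an era is year 1. */
const char *JapaneseEraYear(int year, int month, int day, int *eraYear)
{
    size_t i;

    for (i = 0; i < sizeof(kEras) / sizeof(kEras[0]); i++) {
        const Era *e = &kEras[i];
        if (year > e->year ||
            (year == e->year &&
             (month > e->month ||
              (month == e->month && day >= e->day)))) {
            *eraYear = year - e->year + 1;
            return e->name;
        }
    }
    return NULL;
}
```

Note the boundary: January 7, 1989 is still Showa 64, while January 8, 1989 is Heisei 1, so the lookup has to compare full dates and not just years.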

Find and Replace needs to be expanded to include the different types of characters used in Japanese. Standard Japanese text may contain any of the following types of characters: one-byte Roman, two-byte Roman, one-byte numerals and symbols, two-byte numerals and symbols, one-byte katakana syllables, two-byte katakana syllables, two-byte hiragana syllables, and two-byte Kanji characters. The hiragana and katakana characters are equivalent in terms of the sounds they represent in Japanese, so a good Find/Replace function should include an option to find the search string in either syllabary.
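One way to offer the "either syllabary" option is to fold both the search string and the text to a single syllabary before comparing. The sketch below assumes Shift-JIS character codes already assembled into 16-bit values, and handles only the two-byte hiragana-to-katakana direction; one-byte katakana and the two-byte Roman forms would need their own folding steps. The function names are invented for the example.

```c
#include <stdint.h>

/* Fold a two-byte hiragana character to the matching two-byte
   katakana (Shift-JIS code values). Other characters pass through.
   Hiragana run from 0x829F to 0x82F1; katakana start at 0x8340. */
uint16_t FoldHiraganaToKatakana(uint16_t ch)
{
    if (ch >= 0x829F && ch <= 0x82F1) {
        uint16_t kata = (uint16_t)(ch - 0x829F + 0x8340);
        /* Shift-JIS never uses 0x7F as a second byte, so the
           katakana row has a one-code gap at 0x837F. */
        if (kata >= 0x837F)
            kata++;
        return kata;
    }
    return ch;
}

/* Compare two characters, ignoring the hiragana/katakana difference. */
int KanaEqual(uint16_t a, uint16_t b)
{
    return FoldHiraganaToKatakana(a) == FoldHiraganaToKatakana(b);
}
```

The gap at 0x837F is the detail that is easy to miss: a fixed offset works for the first part of the row and is off by one for the rest.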

Sorting in Japanese is difficult because the Kanji characters can have different pronunciations depending on their context. To sort Kanji correctly, you need a separate kana key field that indicates the pronunciation and you sort on that. Also, Mac OS CompareText() doesn't sort the long sound symbol correctly (that symbol changes sound depending on the character before it, but Mac OS always sorts it in the symbols area), so for linguistically correct sorting you need to write your own sorting routine.
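The kana key idea amounts to carrying a second string with every record and sorting on that instead of on the displayed Kanji. A bare-bones sketch follows; the record layout and names are invented, and the plain byte comparison is a stand-in for the linguistically correct comparison routine you would write, including the long sound symbol handling mentioned above.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *display;   /* Kanji text shown to the user */
    const char *kanaKey;   /* its reading in kana, entered separately */
} SortRecord;

/* qsort callback: order records by reading, not by Kanji code.
   strcmp compares bytes as unsigned values, so this is code-point
   order of the kana key and nothing more refined than that. */
static int CompareByKanaKey(const void *a, const void *b)
{
    const SortRecord *ra = (const SortRecord *)a;
    const SortRecord *rb = (const SortRecord *)b;
    return strcmp(ra->kanaKey, rb->kanaKey);
}

void SortByReading(SortRecord *records, size_t count)
{
    qsort(records, count, sizeof(SortRecord), CompareByKanaKey);
}
```

The important design point is in the data structure rather than the comparison: the reading cannot be derived reliably from the Kanji, so the application has to give the user somewhere to enter it.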

If your application supports character tracking using the Color QuickDraw function CharExtra(), be aware that the CGrafPort member chExtra only uses 4 bits for signed integer values and the other 12 bits for the fraction. The value you pass to CharExtra() is a Fixed value of how many pixels you want to track out (or in) the text, and QuickDraw divides that by the current text size, to arrive at the chExtra value. This means that if the tracking value you pass to CharExtra() is greater than 8 times the text size, the chExtra field will go negative, and your text will be drawn incorrectly. Unfortunately, Japanese text is routinely tracked out beyond this limit in many applications. The only workaround is for you to draw the text one character at a time, and use the QuickDraw pen movement calls like MoveTo() to move the pen yourself. The same is true for SpaceExtra().
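The arithmetic behind the limit, and the shape of the workaround, can be sketched in a few lines. This is an illustration only: it takes the 4-bit integer, 12-bit fraction layout described above at face value, so the representable range is assumed to be -8 up to (but not including) 8, and it uses whole-pixel widths. The function names are invented; in a real application the pen positions would feed MoveTo() followed by a draw call for each character.

```c
/* Returns 1 if tracking / textSize fits in a signed fixed-point
   field with 4 integer bits and 12 fraction bits, i.e. the ratio
   lies in the range [-8, 8). */
int TrackingFitsChExtra(double trackingPixels, double textSize)
{
    double ratio = trackingPixels / textSize;
    return ratio >= -8.0 && ratio < 8.0;
}

/* Workaround: compute the pen position for each character yourself.
   charWidths[i] is the width of character i in pixels, tracking is
   the extra space to add after every character. penX[i] receives
   the horizontal pen position at which to draw character i. */
void ComputePenPositions(const int *charWidths, int count,
                         int startX, int tracking, int *penX)
{
    int i, x = startX;

    for (i = 0; i < count; i++) {
        penX[i] = x;
        x += charWidths[i] + tracking;
    }
}
```

For 12-point text, a tracking value of 96 pixels already gives a ratio of 8, so under the assumption above it is the first value that no longer fits.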

Inline Input

Inline input of Japanese, Chinese or Korean is a way of using an intermediate program (called an Input Method) to translate your keystrokes into the many thousands of possible characters in those languages, all in the same place on screen that you would normally see characters typed in the line. In Japanese, the Input Method changes your keystrokes into phonetic Japanese kana characters, then converts some of those characters into Kanji characters to form a mixed kana and Kanji sentence. Then the user hits the return key to confirm the text in the line, ending the inline input session. Inline input on the Mac on System 7.1 or later uses the Text Services Manager (TSM). If your application uses TextEdit as its main text engine, you can support inline input quite easily using TSMTE. If you have your own text engine, you will need to do more work to support TSM Inline Input.

TSM uses Apple events to send and receive data between your application and the Input Method. You must implement several AppleEvent handlers, the most complex of which is the kUpdateActiveInputArea. In that handler, you must draw the text in all its intermediate stages, as the user is composing and editing the Japanese sentence before s/he confirms it to the document. If there is text after the so-called 'inline hole,' you must actively reflow the text if such editing causes the length to change. Each time the user makes a change, the text in the inline hole is received from the Input Method in an Apple event. The application draws it in the text stream, along with special Inline styles that help the user tell which text in the inline hole is raw (unconverted) text, which is converted text, which is the active phrase, where the phrase boundaries are in the inline hole, and other information.

After implementing the TSM support in your application, it is imperative that you test it with third-party Input Methods. At the time TSM was introduced, the documentation for how to write an Input Method was still a little spotty. This resulted in each Input Method handling text slightly differently. Also, Kotoeri, Apple's Input Method, has fewer features than the leading third-party Input Methods. Be sure to test your application with all of them you can find, so you can verify that it won't crash or produce strange results. Some Input Methods have strange quirks, like always eating mouseDown events, or having different requirements about how large a buffer they can handle without crashing. This knowledge comes from testing, and sometimes can be found on the Internet in Usenet newsgroups (in Japanese).

What About Unicode?

Unicode is being billed as the latest panacea for the problems of internationalization. What does Unicode give you? Where does it fall short?

Unicode was designed to solve one problem: There are many incompatible, overlapping encoding schemes for different languages, and supporting all of these encodings is a complex problem. What if there was a single encoding scheme that supported all the writing systems of the world, and guaranteed that you could display text in all the languages Unicode supports if only you had the right Unicode font for each language? Unicode tries to be that encoding.

For Japanese text data, the Mac OS and Windows95 use Shift-JIS internally, while Rhapsody and WindowsNT® use Unicode. On the Internet, most Japanese text is encoded using the 7-bit ISO-2022-JP standard. Whether or not you use Unicode to represent text internally to your application, you will have to support all three standards for full file and data compatibility with the rest of the world. In Unicode, all characters are two bytes long. So, you no longer have to worry about testing for byteness in a Unicode stream. However, all ASCII characters are represented with a leading 0x00 in Unicode. So you can't have loops that look for a terminating NULL in a C-string. And, all your formerly one-byte text doubles in size unless you explicitly compress it (and then you lose the byteness testing advantage).
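The C-string problem is easy to see in a small example. In the sketch below, which assumes 16-bit Unicode characters terminated by a 16-bit zero, the length has to be counted in 16-bit units; a byte-oriented routine such as strlen() would stop at the leading 0x00 of the very first ASCII character. The name Unicode16Length is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Length, in 16-bit characters, of a string that ends with a
   16-bit zero. The terminator must be a whole zero character;
   a single zero byte is just half of an ASCII character. */
size_t Unicode16Length(const uint16_t *s)
{
    size_t n = 0;

    while (s[n] != 0)
        n++;
    return n;
}
```

The string "Hi" stored high byte first is the byte sequence 00 48 00 69 00 00: Unicode16Length() reports 2 characters, while strlen() on the same bytes reports 0.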

Whether or not you think testing byteness is too complex or expensive to do, you should know that Unicode also does another controversial thing: For the so-called "Han" languages (Japanese, Chinese, Korean) that use characters that originated in China, it attempts to unify them into one codepoint for each character judged by the Unicode Consortium to be unique, even if it has variant forms in each language. The same is true for the languages written in Arabic script (Arabic and Persian, for example). Because of this, you cannot tell what language a character is in just from its codepoint. Unicode was not designed to be a multi-lingual solution, in that representations of Chinese and Japanese in the same document will have overlapping character codes, requiring the OS to provide a parallel linguistically-coded data structure to render the glyph forms appropriately to each language. This might be another version of today's font/script/language relationship on Mac OS. As you can imagine, the Chinese, Japanese and Korean governments have each published competing encoding standards to Unicode, labeling the latter as something designed by foreigners who didn't understand the issues (both political and linguistic) involved in trying to make a worldwide encoding system.

Another issue about Unicode is that although it can represent 65,536 characters, there is not enough space for all the Han characters and their variants, plus all the other languages that Unicode currently supports. New languages are becoming computerized as more countries join the Digital Revolution and the Unicode Consortium cannot give space to all of them. Preferring the flat encoding model, they came up with another standard that uses four bytes per character (the ISO 10646 encoding standard). On the Internet, where many languages need simultaneous support and bandwidth is at a premium, I would rather use the ISO 2022 standard of mixed-byte text (7-bit and 14-bit characters), plus the escape codes that tell you what language the current stream is in, than send 32-bit characters through the wire. Since most Japanese text on the Internet uses this encoding, expect your OS to provide utilities for encoding conversion (like the Mac OS Encoding Converter debuting soon on a Mac near you).
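To show what "mixed-byte plus escape codes" means in practice, here is a rough sketch that counts the characters in an ISO-2022-JP stream. It assumes only the common escape sequences: ESC ( B and ESC ( J switch to a one-byte set, ESC $ @ and ESC $ B switch to the two-byte JIS X 0208 set, and the stream starts out in ASCII. The function name is invented, and a real converter would also validate the bytes it skips over.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the characters in an ISO-2022-JP stream. The escape
   sequences themselves are not counted; they only change how
   many bytes the following characters occupy. */
size_t CountISO2022JPChars(const uint8_t *text, size_t length)
{
    size_t i = 0, count = 0;
    int    twoByte = 0;              /* stream starts in ASCII */

    while (i < length) {
        if (text[i] == 0x1B && i + 2 < length) {
            if (text[i + 1] == '$' &&
                (text[i + 2] == '@' || text[i + 2] == 'B')) {
                twoByte = 1;         /* two-byte JIS X 0208 follows */
                i += 3;
                continue;
            }
            if (text[i + 1] == '(' &&
                (text[i + 2] == 'B' || text[i + 2] == 'J')) {
                twoByte = 0;         /* one-byte set follows */
                i += 3;
                continue;
            }
        }
        i += (twoByte && i + 1 < length) ? 2 : 1;
        count++;
    }
    return count;
}
```

Unlike Shift-JIS, byteness here depends on state carried from an earlier escape sequence, so you cannot classify a byte by looking at its value or its neighbors alone.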

Cross-Platform Development Issues

Going cross-platform is already complicated without having to think about internationalization. Should you have separate codebases for maximum use of each platform's unique features? Or should you have a single codebase and use an emulation layer for the other OS's APIs? Each has its advantages, but for this article I can speak to those of you who have a joint codebase, and tell you about some of the things that the Windows platform lacks that you have to write yourself for multi-byte support and internationalization.

Windows has no Script Manager. There is no Gestalt Manager. It cannot support multiple two-byte codepages at the same time. It uses totally separate fonts for vertical and horizontal text. It supports proportional kana in Japanese, so you can't assume all two-byte characters are the same width.

If your code uses the Script Manager routines heavily, then you will have to write them yourself on the Windows side. All the convenience of the Mac OS's international routines becomes very clear when you try the same things on a PC!

Also, Japan once again has its own special challenges. Until Windows came out in Japan, each computer manufacturer made its own proprietary OS and hardware. Even floppy disks were incompatible with each other. Now, most companies have adopted the Intel PC standard, but NEC continues to manufacture its own line of incompatible PCs. NEC has such a huge share of the market in Japan that it has teamed up with Microsoft to produce its own version of Windows95 for NEC. So when you buy Windows95 in Japan, you find there are three versions: MS Windows95 for Intel, MS Windows95 for NEC, and NEC Windows95 for NEC. All three versions are basically the same feature-for-feature, but the drivers are different and you need to test your application on each platform to verify compatibility.

On the hardware side, you will find that Japanese hardware is different: They use different displays, different keyboards, different printers, and different floppy formats. The drive lettering on NEC machines is different from Intel PCs: The hard disk drive is labeled 'A:' on one and 'C:' on the other. Make sure your installer isn't hard-coded to install on drive C:.


As we have seen, internationalization of your software on Mac OS is not very difficult to do, and it is to your benefit to try and enable as many users as possible to enter text in their own language when using your program. We have also examined Japanese localization in more depth, and demonstrated that Japanese language applications usually require some amount of new features designed specifically for that language's needs and conventions. As more markets around the world reach maturity, you can be sure that there will be ample opportunity to differentiate your product by adding locale-specific features. It is these locale-specific features that will tell your users that they are valued customers, and that their needs are being addressed in a very specific way. For your product, especially if you are in the initial designing phases, I would recommend you try to make it as easily expandable as possible. Design generic internationalization into the core modules, while leaving open the opportunity to add locale-specific features for certain markets like Japan, as you see your product's market expand and rise in success.

Bibliography and Related Reading

  • Apple Computer, Inc. Inside Macintosh: Text. Menlo Park, CA: Addison-Wesley, March 1993.
  • Apple Computer, Inc. "Technote OV 20, Internationalization Checklist." Cupertino, CA: Apple Computer, Inc., November 1993.
  • Apple Computer, Inc. "Technote TE 531, Text Services Manager Q&As." Cupertino, CA: Apple Computer, Inc., May 1993.
  • Griffith, Tague. "Gearing Up for Asia With the Text Services Manager and TSMTE." develop, Issue 29. Cupertino, CA: Apple Computer, Inc., March 1997.
  • Lunde, Ken. Understanding Japanese Information Processing. Sebastopol, CA: O'Reilly & Associates, September 1993.

See also Ken Lunde's home page for more information about multi-byte text processing on computers.

Nat McCully has been at Claris in the Japanese Development Group for the last six years. He has worked on numerous Japanese products, including MacWrite II-J, FileMaker Pro-J, Claris Impact-J, ClarisDraw-J, and ClarisWorks-J. He speaks, reads, and writes Japanese, and enjoys traveling in Japan. He is currently working as Development Lead on the next release of ClarisWorks-J.


All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved.