TweetFollow Us on Twitter

Adding Regular Expressions To Your Cocoa Application.

Volume Number: 19 (2003)
Issue Number: 4
Column Tag: Cocoa Development

Adding Regular Expressions To Your Cocoa Application.

Using MOKit to add the ability to match regular expressions in Cocoa.

by Ron Davis

Does your application need to parse data out of a bunch of text, or match strings that can vary some, but have a regular syntax? Do you have a Find command in your text editor? If you do you need to add regular expression matching to your app. Regular Expressions are textual representations of strings match pattern. They go beyond just finding a string and let you do things like find a string that begins and ends with certain characters, but can have anything in the middle. Or a string that contain four numbers followed by a letter.

I've been around the Mac a long time and never really thought about grep or regex or other commands that use regular expressions. But OSX changes that. Every UNIX geek out there knows about grep and it various offspring. Scripting languages like Perl use regular expressions as well, so I thought I needed to learn about them. Once I did I was hooked, and wanted to use them in my own applications. That lead me to Mike Ferris' MOKit, a Cocoa framework that lets you easily deal with regular expressions in your application.

Introduction to Regular Expressions

We'll start with a quick look at regular expression syntax for those of you who have no idea what I'm talking about. The introduction will be fast and shallow. If you need more information check out the URL in the Bibliography at the end of the article.

Symbol         Meaning                       Example
character      The character typed,          A is a, b is b, etc.
               with the exception of 
               special characters.

[character -   Any of a range of .           [a-d] = a,b,c, or d.
 character]    characters
.              Period matches any one 
               character, except line 
#              Matches any digit.            0,1,2,3,4,5,6,7,8,9 
\r             return

\t             tab 

\              The escape character like     \. matches a period. 
               in printf. Putting a slash    \\ matches a slash.
               in front of a special 
               character allows that 
               character to be matched.
?              0 or 1 of the previous .      ca?t, matches cat, or ct,
               characters                    but not caat.

*              0 or more of the              ca*t, matches ct, cat, 
               previous characters           caat, caaat.
+              1 or more of the              ca+t, matches cat, caat,
               previous characters           caaat, but not ct.
^              any character but the         (^r23) any character 
               ones after the carat.         but r, 2, or 3.
pattern |      match pattern or pattern.     ca|t, matches ca or t, 
pattern                                      but not cat.

(pattern)      Matching: treats what is      (ca)*t, matches cat, or 
               in the parenthesis as a       cacat, but not ct.
               single character.   c(*?)t,   on string coat, 
               Searching: delineates the     returns "oa".
               information to be 
               remembered in a find.

The last pattern there gives you a hint that regular expression can be used in two different ways. One way is matching, where you have a string and you want to know if it is equal to a regular expression. This returns a Boolean value, either the string matches or it doesn't. The other way to use regular expressions is to find a substring or strings in a longer string. When you do this you give an expression and you specify what part of the matched string you want back by placing that part in parentheses.

Let's look at an example or two. Say you let the user input a seven digit zip code and you want to make sure they didn't put any letters in there. You could get their input string and compare it against the regular expression "#+", which matches 1 or more digits, but wouldn't match an empty string, nor one with letters in it.

Now say you have an HTML tag for a link like <A HREF=>RAD productions</A> and you wanted to pull out the URL. You could search with the regular expression "=(.*?)>" and you would get back You may wonder why the ? is there. If you just put ".*", which means match 0 or more characters, you get to the end of the string because quotes and brackets are characters too. This is called a greedy search. Putting the ? tells it to only search until it finds the next part of the expression string.


MOKit is a Cocoa framework written by Mike Ferris. It contains some text manipulation classes, one of which handles regular expressions. The underlying regular expression engine is actually a standard package written by Henry Spencer and used in one form or another by a lot of interesting things such as tcl and perl. MOKit classes are "not public domain, but they are free" according to the web page. The code can be downloaded at You can get both compiled frameworks and the source to MOKit. Version 2.6 was used for this article.

MOKit has two main parts, classes for text completion and classes for regular expressions. We'll only be talking about the regular expression classes here. These classes are MORegularExpression and MORegexFormatter. MORegularExpression is the main class for handling the evaluation of regular expressions. It is the one we'll use in our sample code. Here's its declaration.

Listing 1: MORegularExpression interface.

@interface MORegularExpression : NSObject <NSCopying, NSCoding> {
    NSString *_expressionString;
    NSString *_lastMatch;
    NSRange _lastSubexpressionRanges 
    void *_compiledExpression;
    BOOL _ignoreCase;
+ (BOOL)validExpressionString:(NSString *)expressionString;
+ (id)regularExpressionWithString:(NSString *)
               expressionString ignoreCase:(BOOL)ignoreCaseFlag;
+ (id)regularExpressionWithString:(NSString *)
- (id)initWithExpressionString:(NSString *)expressionString
- (id)initWithExpressionString:(NSString *)
- (NSString *)expressionString;
- (BOOL)matchesString:(NSString *)candidate;
- (NSRange)rangeForSubexpressionAtIndex:(unsigned)index
                inString:(NSString *)candidate;
- (NSString *)substringForSubexpressionAtIndex:
               (unsigned)index inString:(NSString *)candidate;
- (NSArray *)subexpressionsForString:(NSString *)candidate;

As you can see, it is a fairly simple class. To use a regular expression in your code you create an instance of this class. If you need to keep it around, using the initWithExpressionString methods will probably be easiest. If you're just going to use it in the scope of a single method, use the class methods regularExpressionWithString, so you won't have to deal with releasing. Both of these methods have twins that take an ignoreCase parameter which, if set to YES, will cause evaluations to ignore the case of the characters in the expression and the search string. If you don't explicitly set case sensitivity then searches are case sensitive. Here's an example of how to create an expression for finding HREFs in a string of HTML:

MORegularExpression*   linkURLExp = [MORegularExpression regularExpressionWithString: 
                                    @"<A HREF=.*?</A>" ignoreCase:YES];

If you want to make sure the expression you create is valid you can call the class method validExpressionString, which will return YES if the expression is a valid regular expression. If you want to know what an MORegularExpression object's expression is you can get it from the expressionString accessor.

Now we can actually do some evaluations. As I said previously, there are two ways to use regular expressions, to match a string and to find a sub-string. If you have a string and you want to make sure it conforms to the regular expression you created, you can pass it into matchesString and the result will tell you if it matches. This is what MORegexFormatter does. It is a formatter you can add to a field and it will validate the value in that field by the regular expression you give it.

Getting sub-expressions is interesting. If you just want to find the location in the target string of a sub-string, you can use the rangeForSubexpressionAtIndex method. If you want the whole sub-string back as a new NSString* you use the substringForSubexpressionAtIndex, passing the string you are searching for in the inString parameter. The index is which value in parentheses you want back. There can be 0 to 20 sets of parentheses in a MOKit expression, and the index indicates which one you want the range for. So you could create an expression like "<A HREF=(.*?)>(.*?)</A>" to search for a link in an HTML page. If we used the HTML in Listing 2, and you asked for index 0 you would get the whole HREF tag: "<A HREF=>R.A.D. Productions</A>". If you asked for index 1, you'd get the link back "". If you asked for index 2, you'd get back the text "R.A.D. Productions".

Listing 2: Sample HTML

<TITLE>R.A.D. Productions Home Page</TITLE>
<A HREF=>R.A.D. Productions</A>

In a nutshell, that is all there is to finding sub-strings with MORegularExpression. The last method in the interface, subexpressionsForString, is there for backwards compatibility and I'm not even going to explain it.

There is one tricky thing about using MORegularExpression in a large amount of text. What happens if you want to find every link in an HTML page? substringForSubexpressionAtIndex is only going to return the first occurrence in the string. Turns out there is no way to say, start searching at character n in the candidate string. What I did was truncate the string after each search to find the next one. Here's my code to find all of the links and their URL in an HTML page.

Listing 3: Finding all of the links.

   MORegularExpression*   bothExp = 
                        @"<A HREF=(.*?)>(.*?)</A>" 
   MORegularExpression*   startStopExp = 
   NSString*            result = nil;
   NSRange               range;
   NSString*            curString = [startStopExp 
      range = [bothExp rangeForSubexpressionAtIndex:0
                     inString:curString ];
      if ( range.length > 0 )
         NSString*   URLString;
         NSString*   linkString;
         NSURL*      fullURL;
         result = [linkURLExp 
         URLString = [bothExp 
         fullURL = [NSURL URLWithString:URLString 
         URLString = [fullURL absoluteString];
         linkString = [bothExp
         if ( linkString == nil || 
               URLString == nil || 
               ([linkString length]== 0) || 
                     ([URLString length]== 0) )
            {} else 
            [self addURL:URLString withText:linkString];
         curString = [curString substringFromIndex:
                        (range.location + range.length)];
   while ([curString length] > 0 && 
               range.location != NSNotFound );

A little explanation. The method is in a class that has a method addURL. The class also keeps two arrays, one for URLs and one for the link text. When you call addURL the URL and the link string are added to the arrays for future reference. The class also knows what the URL of the page you are parsing is, and saves it in a variable called baseURL.

The first thing the method does is set up our regular expression for links. Then it makes a new string that will contain only the text between the <HTML> tag. You can use this to limit the search to just a certain part of the page. Then it sets up a loop, which will always execute once and will end when we don't get anything back from our search, or we run out of HTML to parse. Inside the loop we first try to find our expression's range in the HTML. If it isn't there, were done. If we find something, then we use our expression to get the sub-string for the URL. Some times a URL will be relative, so we use NSURL with the page's URL to create a full URL. Then we ask for the second index, which is the link text. If we get both, we add it to our list.

If we find something, then we need to search from the end of the string we found. So we create a sub-string from our current HTML string, that starts at the end of what we found and ends at the end of the current string. This effectively chops off everything from the beginning of the string to the end of what we just found. Then we loop.

Hopefully you've seen the coolness of regular expressions and want to use them in your Cocoa apps. MOKit makes this easy and is easy to use. So go to Mike Ferris' website and download it and add regular expressions to your app.


Mastering Regular Expressions, Jeffrey E. F. Friedl,

Using Regular Expressions, Stephen Ramsay,

Regular Expressions specification,

A Tao of Regular Expressions,

BBEdit Grep Tutorial,

Ron Davis is a long time Mac programmer, having worked on everything from Virex Anti-Virus to CodeWarrior. His day job is working for Alsoft, and his evening job is R.A.D. Productions, makers of Suck It Down and FinderEye.


Community Search:
MacTech Search:

Software Updates via MacUpdate

Safari Technology Preview 10.1 - The new...
Safari Technology Preview contains the most recent additions and improvements to WebKit and the latest advances in Safari web technologies. And once installed, you will receive notifications of... Read more
VueScan 9.5.60 - Scanner software with a...
VueScan is a scanning program that works with most high-quality flatbed and film scanners to produce scans that have excellent color fidelity and color balance. VueScan is easy to use, and has... Read more
Civilization VI 1.0.0 - Next iteration o...
Sid Meier’s Civilization VI is the next entry in the popular Civilization franchise. Originally created by legendary game designer Sid Meier, Civilization is a strategy game in which you attempt to... Read more
Adobe Flash Player - Plug-in...
Adobe Flash Player is a cross-platform, browser-based application runtime that provides uncompromised viewing of expressive applications, content, and videos across browsers and operating systems.... Read more
RestoreMeNot 2.0.4 - Disable window rest...
RestoreMeNot provides a simple way to disable the window restoration for individual applications so that you can fine-tune this behavior to suit your needs. Please note that RestoreMeNot is designed... Read more
Persecond 1.0.7 - Timelapse video made e...
Persecond is the easy, fun way to create a beautiful timelapse video. Import an image sequence from any camera, trim the length of your video, adjust the speed and playback direction, and you’re done... Read more
iShowU Instant 1.1.0 - Full-featured scr...
iShowU Instant gives you real-time screen recording like you've never seen before! It is the fastest, most feature-filled real-time screen capture tool from shinywhitebox yet. All of the features you... Read more
Spotify - Stream music, creat...
Spotify is a streaming music service that gives you on-demand access to millions of songs. Whether you like driving rock, silky R&B, or grandiose classical music, Spotify's massive catalogue puts... Read more
HoudahSpot 4.2.6 - Advanced file-search...
HoudahSpot is a powerful file search tool. Use HoudahSpot to locate hard-to-find files and keep frequently used files within reach. HoudahSpot will immediately feel familiar. It works just the way... Read more
Yummy FTP Pro 1.11.11 - $29.99
Yummy FTP Pro is an advanced Mac file transfer app which provides a full-featured professional toolkit combined with blazing speeds and impeccable reliability, so whether you want to transfer a few... Read more

Latest Forum Discussions

See All

5 Halloween mobile games for wimps
If you're anything like me, horror games are a great way to have nightly nightmares for the next decade or three. They're off limits, but perhaps you want to get in on the Halloween celebrations in some way. Fortunately not all Halloween themed... | Read more »
The 5 scariest mobile games
It's the most wonderful time of the year for people who enjoy scaring themselves silly with haunted houses, movies, video games, and what have you. Mobile might not be the first platform you'd turn to for quality scares, but rest assured there are... | Read more »
Lifeline: Flatline (Games)
Lifeline: Flatline 1.0.0 Device: iOS Universal Category: Games Price: $2.99, Version: 1.0.0 (iTunes) Description: The Lifeline series takes a terrifying turn in this interactive horror experience. Every decision you make could help... | Read more »
Game of Dice is now available on Faceboo...
After celebrating its anniversary in style with a brand new update, there’s even more excitement in store for Game of Dice has after just being launched on Facebook Gameroom. A relatively new platform, Facebook Gameroom has been designed for PC... | Read more »
4 addictive clicker games like Best Fien...
Clickers are passive games that take advantage of basic human psychology to suck you in, and they're totally unashamed of that. As long as you're aware that this game has been created to take hold of your brain and leave you perfectly content to... | Read more »
Smile Inc. Guide: How not to die on the...
As if Mondays weren't bad enough, at Smile Inc. you have to deal with giant killer donuts, massive hungry staplers, and blasting zones. It's not exactly a happy, thriving work environment. In fact, you'll be lucky to survive the nine to five.... | Read more »
Oh...Sir! The Insult Simulator (Games)
Oh...Sir! The Insult Simulator 1.0 Device: iOS Universal Category: Games Price: $1.99, Version: 1.0 (iTunes) Description: | Read more »
WitchSpring2 (Games)
WitchSpring2 1.27 Device: iOS Universal Category: Games Price: $3.99, Version: 1.27 (iTunes) Description: This is the story of Luna, the Moonlight Witch as she sets out into the world. This is a sequel to Witch Spring. Witch Spring 2... | Read more »
4 popular apps getting a Halloween makeo...
'Tis the season for all things spooky. So much, so, in fact, that even apps are getting into the spirt of things, dressing up in costume and spreading jack o' lanterns all about the place. These updates bring frightening new character skins, scary... | Read more »
Pokémon GO celebrates Halloween with can...
The folks behind Pokémon GO have some exciting things planned for their Halloween celebration, the first in-game event since it launched back in July. Starting October 26 and ending on November 1, trainers will be running into large numbers of... | Read more »

Price Scanner via

Macs’ Superior Enterprise Deployment Cost Eco...
IBM’s debunking of conventional wisdom and popular mythology about the relative cost of using Apple Mac computers as opposed to PCs running Microsoft Windows at the sixth annual Jamf Nation User... Read more
12-inch WiFi Apple iPad Pros on sale for $50-...
B&H Photo has 12″ WiFi Apple iPad Pros on sale for $50-$70 off MSRP, each including free shipping. B&H charges sales tax in NY only: - 12″ Space Gray 32GB WiFi iPad Pro: $749 $50 off MSRP -... Read more
Apple refurbished 12-inch 128GB iPad Pros ava...
Apple has Certified Refurbished 12″ Apple iPad Pros available for up to $160 off the cost of new iPads. An Apple one-year warranty is included with each model, and shipping is free: - 32GB 12″ iPad... Read more
Apple refurbished iPad minis and iPad Air 2s...
Apple recently dropped prices on several Certified Refurbished iPad mini 4s and 2s as well as iPad Air 2s. An Apple one-year warranty is included with each model, and shipping is free: - 16GB iPad... Read more
MacHTTP-js Preview Full-featured Web Server f...
MacHTTP.Org has released MacHTTP-js Preview for macOS, a full-featured Web server for 21st Century desktops and servers. MacHTTP-js is a modern take on the classic stand-alone, desktop computer Web... Read more
Samsung Galaxy Tab A 10.1 with S Pen Makes US...
Samsung Electronics America, Inc. has announced the release of the Galaxy Tab A 10.1 with S Pen in a highly mobile, lightweight tablet. “With an embedded S Pen, consumers can discover more ways to... Read more
13-inch 2.5GHz MacBook Pro (Apple refurbished...
Apple has Certified Refurbished 13″ 2.5GHz MacBook Pros available for $829, or $270 off the cost of new models. Apple’s one-year warranty is standard, and shipping is free: - 13″ 2.5GHz MacBook Pros... Read more
27-inch iMacs on sale for up to $220 off MSRP
B&H Photo has 27″ Apple iMacs on sale for up to $200 off MSRP including free shipping plus NY sales tax only: - 27″ 3.3GHz iMac 5K: $2099 $200 off MSRP - 27″ 3.2GHz/1TB Fusion iMac 5K: $1899.99 $... Read more
13-inch 2.5GHz MacBook Pro available for $927...
Overstock has the 13″ 2.5GHz MacBook Pro available for $926.99 including free shipping. Their price is $172 off MSRP. Read more
Apple refurbished 2015 13-inch MacBook Airs a...
Apple has Certified Refurbished 2015 13″ MacBook Airs available starting at $759. An Apple one-year warranty is included with each MacBook, and shipping is free: - 2015 13″ 1.6GHz/4GB/128GB MacBook... Read more

Jobs Board

*Apple* Retail - Multiple Positions - Apple,...
Job Description: Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, Read more
Software Engineering Intern: UI Applications...
Job Summary Apple is currently seeking enthusiastic interns who can work full-time for a minimum of 12-weeks between Fall 2015 and Summer 2016. Our software Read more
Security Data Analyst - *Apple* Information...
…data sources need to be collected to allow Information Security to better protect Apple employees and customers from a wide range of threats.Act as the subject Read more
*Apple* Retail - Multiple Positions - Apple,...
Job Description: Sales Specialist - Retail Customer Service and Sales Transform Apple Store visitors into loyal Apple customers. When customers enter the store, Read more
*Apple* Solutions Consultant - Apple (United...
# Apple Solutions Consultant Job Number: 52812872 Houston, Texas, United States Posted: Oct. 18, 2016 Weekly Hours: 40.00 **Job Summary** As an Apple Solutions Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.