September 96 - MPW TIPS AND TRICKS: Automated Editing With StreamEdit

Tim Maroney

In this column in Issue 26 of develop, I showed you a wide range of scriptable editing commands available from the MPW Shell. This time I'll discuss a single tool that provides a powerful self-contained text-editing scripting language, StreamEdit.

Why would you want to use StreamEdit instead of the other text-editing features of the MPW Shell?

  • Performance -- A StreamEdit script is faster than an MPW script containing various Replace and Find commands.

  • Self-containment -- Because StreamEdit is a self-contained tool, you can run it from within ToolServer, unlike the scriptable editing commands discussed in Issue 26, which are available only in the MPW Shell itself. This means you can use StreamEdit to create lightweight drag-and-drop grinder AppleScript scripts that send StreamEdit commands to ToolServer.

  • Consistency -- Keeping all your editing in a single scripting language confers the elusive mystical boon of code consistency, making your system easier to maintain and modify in the future.


GETTING TO KNOW YOU

StreamEdit is based very closely on the hoary UNIX tool named sed. If you already know sed, much of this will be familiar, but StreamEdit isn't directly compatible with sed scripts. StreamEdit implements a pattern-matching language. Every time a particular pattern is matched, a sequence of commands will be executed. As in most pattern-matching languages, StreamEdit's scripts are lists of pattern/command pairs, with the pattern coming before the command. The input file or files are read through the script interpreter, which searches for instances of the patterns and executes the corresponding commands. Anything that doesn't match a pattern is passed through unchanged.

StreamEdit scans one line at a time through the input, matching its current line to every pattern in its script. After processing each line, it writes out the modified line. The result is a concatenation of three internal buffers: the insert buffer, then the edit buffer, and finally the append buffer. The edit buffer gets filled with the current line, while the other buffers are empty at the start. The Insert and Append commands place text in the insert and append buffers, allowing you to add text to the beginning and end of the output line. The Change, Delete, and Replace commands modify the contents of the edit buffer.
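
As a rough sketch of how the three buffers combine, here is a single rule that touches all of them. It uses only the commands named above; the word TODO and the marker text are made up for illustration.

```
# For every line containing "TODO": put a marker in the insert buffer,
# change the word in the edit buffer (which holds the current line),
# and put a closing marker in the append buffer.
/TODO/
    Insert '--- check this line ---'
    Replace /TODO/ 'DONE'
    Append '--- end of check ---'
```

Going by the description above, an input line reading "Finish the TODO list" should come out as the insert text, then "Finish the DONE list", then the append text, in that order. Lines that don't contain TODO pass through unchanged.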


SHARING ADDRESSES

As usual, MPW uses words in ways previously unknown in human speech. In StreamEdit, patterns are referred to as "addresses." There are two kinds of addresses: line numbers and regular expressions. Line numbers ought to be self-explanatory, but it may help to note that the numbers must be Arabic numerals rather than Roman, and must be in base 10 rather than the hexadecimal or sexagesimal number systems. There are three special line numbers:
  • the bullet symbol (Option-8), meaning the point before the first line (enabling you to add a line before the first line, for example)

  • the infinity symbol (∞, Option-5), meaning the point after the last line

  • dollar sign ($), meaning the last line

The keyboard shortcuts, as always in this column, are for American QWERTY keyboards; if you've got some other type of keyboard, you're on your own.

Regular expressions are expressions that manage their diets sensibly. They can be used for searching, and were explained in detail in Issue 26. In StreamEdit addresses, though, regular expressions find the entire line containing the pattern, rather than just the pattern. Regular expressions are denoted by slashes. Only forward slashes are used (StreamEdit doesn't have a backward search mode, having been frightened at an early age by the legends of Eurydice and Lot's wife). Three new constructs have been added to regular expressions in StreamEdit:
  • ç (Option-C), which indicates a case-sensitive search

  • // (two slashes), which means the last regular expression that was matched

  • ≤variable≥ (a variable name embedded in inequality operators, here overloaded as a special kind of angle brackets, and typed as Option-comma and Option-period), which means the text of an expanded StreamEdit variable, treated as literal text to be matched rather than as a regular expression

StreamEdit has variables that can be set with the Set command (more on this later) or from the command line using the -set variable [=value] option.

You can form more complex addresses using a few operators. The Boolean and, or, and not operators are the same as in C (&&, ||, and !, respectively). Parentheses can be used for grouping within addresses. The comma operator matches the range of lines specified; for example, 3,5 matches lines 3 through 5. A range address matches each of the lines in the range, if any. It can be thought of as matching more than once: it fires off the accompanying command on the first line matched, the last one matched, and all lines in between. If the termination condition is never met, the address continues to match until the end of input. This could happen if you specify a range of lines ending at line 15, for instance, and there are only ten lines in the file, or if your range termination condition is a regular expression that doesn't appear anywhere in the input.
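
A few addresses built with these operators might look like the sketch below. The search words are invented for illustration, and the Insert, Append, and Delete commands attached to each address are described in the next section.

```
# Lines 3 through 5: prefix each line with a quote mark
3,5
    Insert '> ' -n

# From a line containing "begin" through the next line containing "end"
# (or through the end of input if "end" never appears): drop the lines
/begin/,/end/
    Delete

# Lines that mention "draft" but not "final": add a note line after each
/draft/ && !/final/
    Append '(needs review)'
```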


TAKING ACTION

Matching patterns is very nice, but what do you do once you match them? Statements in StreamEdit attach actions to patterns. An action consists of one or more commands, separated by semicolons or by the end of a line. There's no begin or end bracketing as in Pascal or C. Addresses and commands are syntactically distinct, so the script interpreter can figure out where the list of commands for a pattern ends and the next pattern begins.

Editing commands

  • Insert text [-n] -- Adds the specified text to the start of the line by putting it in the insert buffer. The -n option (in this command and in Append and Change) prevents adding a newline when the line is written out.

  • Append text [-n] -- Adds the specified text to the end of the line by putting it in the append buffer.

  • Change text [-n] -- Changes the line to the specified text by replacing the contents of the edit buffer.

  • Delete -- Clears the edit buffer.

  • Replace [-c count] /pattern/ text -- Replaces the pattern with the specified text. This is the second part of a two-step matching process: first the address matches a line, then Replace searches in the edit buffer and replaces the pattern. The count argument indicates the maximum number of times to perform the replacement in the line. It can be a positive integer or infinity (∞). The default count is 1.
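
Here is a small sketch of the commands that work on the edit buffer, following the syntax listed above. The search strings and replacement text are hypothetical; ∞ is typed as Option-5.

```
# Replace every "colour" on a matching line, not just the first one.
# // reuses the regular expression that the address just matched.
/colour/
    Replace -c ∞ // 'color'

# Swap a whole placeholder line for fixed text
/DATE GOES HERE/
    Change 'September 1996'

# Remove marker lines from the output entirely
/REMOVE ME/
    Delete
```

Without -c ∞, the first rule would change only the first "colour" on each line, since the default count is 1.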

Control commands

  • Exit [status] -- Stops StreamEdit with the given error status. The default is 0, which means execution completed with no errors. Any nonzero error status indicates a problem, and unless the built-in MPW variable Exit is set to something other than 0, this will stop execution of the script (if any) from which the StreamEdit command was executed.

  • Next -- Somewhat like the C keyword continue. When a Next command is executed, all pending changes are written out and no more addresses are matched against the current line; that is, StreamEdit immediately goes on to the next line without matching the rest of the rules against the current edit buffer.

  • Set variable text [-i | -a] -- Much like the MPW Shell Set command. The variable is set to the specified text. The -i and -a options allow text to be added to any existing setting of the variable at the start or the end, respectively.
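
The control commands might be combined along the following lines. This is only a sketch: the variable name Heading and the search strings are made up, and the period stands for the current input line, one of the text forms this column describes for command arguments.

```
# Remember the most recent chapter heading in a variable
/Chapter/
    Set Heading .

# Pass lines marked KEEP straight through; no later rule sees them
/KEEP/
    Next

# Label each FIXME line with the heading it falls under
/FIXME/
    Insert Heading

# Give up with a nonzero status if a forbidden phrase turns up
/DO NOT SHIP/
    Exit 1
```

Rule order matters here: a line containing both KEEP and FIXME never reaches the FIXME rule, because Next has already moved on to the following line. A starting value for Heading could be supplied with the -set option on the command line.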

Output commands

  • Print [text] [-appendto | -to file] -- Writes output to a specified file. If text is empty, the current line is printed without modification. The -appendto and -to options write at the end of the file or overwrite the file, respectively. If no file is specified, standard output is used. If the file name is empty, nothing gets printed.

  • Option AutoDelete -- Deletes all input lines, leaving only output from Next and Print commands. You can get the same effect by specifying the -d option on the StreamEdit command line or by including this in the script:

    /≈/   Delete

The text arguments to these commands are usually literal text, denoted by single or double quotes. There are a few other forms as well:
  • An unquoted variable name can be used, in which case the variable is expanded; no brackets need be (or even may be) supplied.

  • A period means the current input line up to but not including the newline at the end.

  • As discussed in Issue 26, you can use ® (Option-R) followed by a digit to mean the expression with that number matched in the pattern.

  • You can read text from a file with -from filename, which reads the next line of text from the specified file. The filename is usually literal text, but it could also be a variable, the current input line (denoted by a period), or a ® expression.
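
To see a few of these text forms together, consider the sketch below, which turns StreamEdit into a simple extraction tool. The word Author, the file name Errors.log, and the sample data are hypothetical. The tagged pattern uses the MPW regular expression notation covered in Issue 26: ≈ (Option-X) matches any string, and ® (Option-R) followed by a digit tags the parenthesized expression before it.

```
# Keep only what Print writes
Option AutoDelete

# Print just the part of the line that ≈ matched after "Author "
/Author (≈)®1/
    Print ®1

# Also collect complete lines mentioning "error" in a separate file
/error/
    Print . -appendto 'Errors.log'
```

Given a line such as "Author Tim Maroney", the first rule should print whatever the tagged ≈ matched rather than the whole line.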

A HYPOTHETICAL EXAMPLE

Let's say you're the director of corporate communications at a major computer maker and, without any warning except for inventory backlogs larger than the gross national products of many developing countries, you experience a sudden transition in chief executive officers, corporate policy, and product line. Your quarterly report (10-Q) is due in the SEC's EDGAR database tomorrow. Fortunately the SEC requires the cutting-edge ASCII format for its filings, and you realize that you can automate 90% of the tedious changes with a single StreamEdit script.
# Change nickname of CEO
/Diesel/
    Replace // 'Flyboy'

# Change corporate policy
1,$
    Replace /capture market share/ 'survive'

# Remove lines referring to obsolete products
/PowerTalk/ || /eWorld/
    Delete

# Change developer relations strategy
/third-party developers/
    Replace /evangelize/ 'listen to'

# Mark lines referring to old schedules
# with a distinctive string at the start
# of the line for manual editing later
/1996/
    Insert 'WHOOPS: ' -n

# Add new final line of report
∞
    Append 'May God have mercy on our souls.'
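
To run a script like this one, you'd save it in a file and hand it to StreamEdit from the MPW Shell or ToolServer. The lines below are a sketch under some assumptions: the file names are made up, and the -s (read the script from a file) and -e (take the script from the command line) options are given as I recall them from the MPW command reference rather than from anything shown in this column, so check the StreamEdit help text before relying on them.

```
# Run the script file against the report and capture the result
StreamEdit -s 'Report.script' '10-Q.txt' > '10-Q.new'

# A one-rule script can be given directly on the command line
StreamEdit -e '/Diesel/ Replace // "Flyboy"' '10-Q.txt' > '10-Q.new'
```

Writing to a new file instead of over the original leaves the unedited report on hand for comparison.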

CONTROL YOURSELF

StreamEdit is almost too powerful. People have used it for everything, including pretty-printing source code, converting files to HTML, and postprocessing object files for dynamic linking tools. If you use it for finding incriminating passages in coworkers' e-mail, karma may get you, but the limitations of the tool won't. Use your powers for good rather than evil, and a grateful world will thank you.

TIM MARONEY has appeared professionally in newspapers, magazines, compact discs, videotape, and of course, computer software. Tim is a technical lead in human interface software at Apple and is editing a series of books for a horror publisher. His skin burns easily in the sun and tans in the moon. He uses white T-shirts only for house painting and car repair.

Thanks to Arno Gourdol, Alex McKale, and Robert Ulrich for reviewing this column.

 
