Mac in the Shell: What Has Really Changed in Leopard?
Volume Number: 24 (2008)
Issue Number: 02
Column Tag: Mac in the Shell
Some differences for us Unixy scripting people
by Edward Marczak
The dichotomy strikes me as a bit amazing: on one hand, I don't find too many end-users that can really quantify what's better about Leopard as compared to Tiger. I've upgraded some people that just had to have the latest and greatest and honestly, they haven't noticed much difference. The flip side of this is the tech perspective: take just about everything you understood in Tiger and throw it out the window. The underlying technology in Leopard is radically different in many ways from its predecessor. So, specifically, what is different for us shell-dwellers?
Leopard is the first version of OS X that is Unix 03 certified. What does that mean? To understand this requires a short trip back in time (cue Time Machine space backdrop...), though not too far.
Unix and Unix-like operating systems fall into several families. AT&T/Bell Labs' Unix (UNICS, later UNIX System V) is the oldest variant and often the reference implementation. "BSD Unix" (the Berkeley Software Distribution) was originally distributed by the University of California, Berkeley, and did things differently from AT&T Unix. Over the years, BSD and AT&T Unix have diverged significantly. In the early 1990s, we began to see the rise of GNU/Linux and yet another method of doing things.
Often, one would find tools from another Unix platform that were missing from (or just better than) the tools on your current platform. Over time, these tools would get ported, but not necessarily in a way consistent with the target platform. Standards have emerged, such as POSIX, that generally smooth over these cross-platform differences, but not everyone follows the standards.
While OS X has, from the beginning, had a BSD layer on top of its Mach kernel, and typically followed BSD semantics, Leopard changes all of this. As of OS X v10.5, OS X follows AT&T semantics for most command-line utilities.
Where Does it Hurt?
OK, Apple changed this over, so what? Well, like any target upgrade, you need to ensure that long-time scripts still work. I personally got bitten by the change as the sort utility no longer uses the GNU semantics that I had gotten used to. Let's take a look there first.
I had a script that used sort in the simplest way: to sort data on a single column. To do this, I used the now-deprecated plus notation:
generate_data | sort +4 > final_file.txt
Along came Leopard, which acted like I was insane. "Plus notation? What are you talking about? That doesn't exist." A quick trip to the man page revealed that this is true:
sort [OPTION]... [FILE]...
...with no plus signs in the man page anywhere. However, a quick trip back to Tiger showed that I wasn't crazy:
sort [-cmus] [-t separator] [-o output-file] [-T tempdir] [-bdfiMnr]
[+POS1 [-POS2]] [-k POS1[,POS2]] [file...]
Ah ha! Of course, running under Leopard, I still needed to update my script. Naturally, there had to be a hurdle, no matter how small: the plus notation counts fields from zero like a computer, and the "-k" switch counts from 1 like a human. So, my incantation now had to read:
generate_data | sort -k 5 > final_file.txt
In any case, the issue was now fixed.
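As a minimal sketch of the portable form, the following sorts on the fifth field with "-k". The sample data and the function names sample_data and sort_by_fifth are hypothetical stand-ins for generate_data, and LC_ALL=C is added only to make the ordering predictable across locales:

```shell
# Hypothetical sample data: five whitespace-separated fields per line.
sample_data() {
    printf '%s\n' \
        'a b c d pear' \
        'a b c d apple' \
        'a b c d mango'
}

# -k counts fields from 1, so the fifth field is "-k 5".
# (The obsolete plus notation for the same key was "+4", counting from 0.)
sort_by_fifth() {
    LC_ALL=C sort -k 5
}

sample_data | sort_by_fifth
# Prints the lines in the order: apple, mango, pear
```

Because "-k" is understood by both the Tiger and Leopard versions of sort shown above, a script written this way should not care which one it runs under.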
While the case I just gave was easily solved, be aware that all of these differences can cause issues when writing scripts that will be deployed cross-platform -- whether the platforms in question are intra-OS X or beyond those bounds to Solaris, Linux, etc.
A command as seemingly simple as chown may trip you up if your script is to run on different OSes. Typically, if you use chown on a symlink, the symlink itself remains unmodified, while the target of the symlink is altered. This can be changed with the -P flag. In AT&T semantics, issuing the following commands alters the symlink:
ln -s file.txt test.txt
chown -RP user test.txt
In BSD, the owner of the symlink is not altered. Yes, this includes 10.4 and earlier.
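One way to avoid depending on either default is to say explicitly which object you mean. The sketch below assumes your chown supports the "-h" flag (it is specified by POSIX, but check your platform's man page); the function name set_link_owner is hypothetical:

```shell
# set_link_owner OWNER LINK
# Change the owner of the symlink itself rather than the file it points to.
# Assumption: chown supports -h ("do not follow the link") on this platform.
set_link_owner() {
    chown -h "$1" "$2"
}

# Demonstration in a scratch directory, using the current numeric user ID
# so that no special privileges are needed.
workdir=$(mktemp -d)
: > "$workdir/file.txt"
ln -s file.txt "$workdir/test.txt"
set_link_owner "$(id -u)" "$workdir/test.txt"
ls -l "$workdir"
rm -rf "$workdir"
```

Spelling out the intent this way costs a few characters and removes one thing to retest on each target platform.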
Another one that has caught many is echo. To prompt for input, it has been nice to fall back on the "-n" BSDism that stops echo from printing a trailing newline character. However, under Leopard, this seemingly broke. Frustratingly, visiting the man page for echo showed that "-n" was still valid and should do what you'd expect. What was going on?
Technically, echo is a shell built-in. Prior to 10.5, the built-in and the version in /bin happened to have the same behavior. 10.5 brings the built-in up to SUSv3 conformance using AT&T semantics, meaning no "-n" switch. Of course, the shell executes a built-in before looking along $PATH. So anyone who simply specified "echo" (just about everyone, myself included) and needed the "-n" switch got a surprise. There are a few ways to handle this. The easiest is to call the external echo in /bin, as it still supports "-n". This may actually be your best choice in many cases, as there are many tutorials and documents that assume the availability of the "-n" switch.
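A short sketch of that first approach: ask the shell how it resolves the bare name, then sidestep the built-in by using the full path. The exact wording that "type" prints varies by shell, and the example assumes /bin/echo exists and honors "-n" as described above:

```shell
# Show how the shell resolves the bare name "echo"
# (typically it reports a shell built-in).
type echo

# Bypass the built-in by naming the external utility explicitly.
# Assumption: /bin/echo is present and supports -n on this system.
/bin/echo -n "Enter number of seconds: "

# End the line so that later output starts on a fresh one.
/bin/echo
```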
Another option is to AT&T-ify it. This requires you to drop the "-n" flag and finish off the string with a "\c" escape sequence:
echo "Enter number of seconds: \c"
This is perfect, if you're going all-Leopard.
The best choice may be to drop the use of echo altogether. printf is a more robust and universal command. On all systems that I can think of, printf will not print a newline at all, and requires a '\n' character to do so. For cross-platform scripting, printf is the way to go.
In fact, printf is even more flexible than echo, mimicking some of C's formatting functionality. Of course, note that printf comes in both a built-in and external version.
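Here is a minimal sketch of a prompt built on printf instead of echo. The helper name prompt is hypothetical; passing the message through a '%s' format means any "%" or backslash in the text is printed literally rather than interpreted:

```shell
# prompt MESSAGE
# Print MESSAGE with no trailing newline, without relying on
# "echo -n" or "\c".
prompt() {
    printf '%s' "$1"
}

prompt 'Enter number of seconds: '
# In an interactive script you would follow this with something like:
#   read -r seconds
printf '\n'

# C-style formatting: %s for strings, %d for integers, \n for the newline.
printf 'Waiting %d seconds for %s\n' 5 'the user'
```

The last line prints "Waiting 5 seconds for the user" followed by a newline, and only because the format string asks for one.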
Finally, keep in mind that sometimes, it's simply return codes that differ between platforms. So, while some command may do the same thing, in exactly the same way:
file_pouncer -k -d 7 -y "*txt"
success and failure (or error) may be defined differently in $?. Some programs will silently ignore errors, returning a zero. Remember to test these scenarios between destination platforms.
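When testing across platforms, it helps to make the exit status visible rather than assumed. The wrapper below is only a sketch, and the name run_checked is hypothetical; it runs whatever command it is given, reports a non-zero status on stderr, and passes that status back to the caller:

```shell
# run_checked COMMAND [ARGS...]
# Run the command, report a non-zero exit status on stderr,
# and return that same status to the caller.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        printf 'exit status %d from: %s\n' "$status" "$*" >&2
    fi
    return "$status"
}

run_checked true  && printf 'true reported success\n'
run_checked false || printf 'false reported failure\n'
```

Running the same wrapped command on each destination platform and comparing the reported statuses is a quick way to spot a utility that defines success differently.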
A final example of this: we've been accustomed to BSD-based 'cp' to copy files. Under BSD, cp will march through despite errors and copy as much as possible. AT&T semantics have cp abort after the first error. Subtle, and something that will only crop up under a failure condition.
Of course, this doesn't begin to cover what has really changed in Leopard, as there is so much that has, on so many different levels. Those of us working in the shell, though, do need to be aware of what the new SUS certification means to us, and how the AT&T semantics affect our scripts. Remember: test, test, test!
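If a script needs the "copy as much as possible" behavior regardless of which cp it runs under, one option is to copy files one at a time and track failures yourself. This is a sketch under that assumption, and the function name copy_each is hypothetical:

```shell
# copy_each DEST_DIR FILE...
# Copy each file individually so that one failure does not stop the rest.
# Returns non-zero if any single copy failed.
copy_each() {
    dest=$1
    shift
    failures=0
    for src in "$@"; do
        if ! cp "$src" "$dest/"; then
            printf 'could not copy: %s\n' "$src" >&2
            failures=$((failures + 1))
        fi
    done
    [ "$failures" -eq 0 ]
}

# Demonstration in a scratch directory: the middle file does not exist.
workdir=$(mktemp -d)
mkdir "$workdir/out"
: > "$workdir/a.txt"
: > "$workdir/c.txt"
copy_each "$workdir/out" "$workdir/a.txt" "$workdir/missing.txt" "$workdir/c.txt" \
    || printf 'some files were not copied\n'
ls "$workdir/out"
rm -rf "$workdir"
```

The loop makes the failure policy part of the script instead of leaving it to whichever cp happens to be installed.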
Ed Marczak is a married, father-of-two technologist. Outside of learning about life, business and technology -- particularly OS X -- you can find him home with his family in New York. For a few spare hours at night, you can also occasionally find him riding through Azeroth on his Dreadsteed.