File System Metadata
Volume Number: 18 (2002)
Issue Number: 12
Column Tag: Section 7
What's in there, beside the data...
by Rich Morin
One of Apple's larger challenges, in merging Mac OS and BSD, has been the need to reconcile differences between their respective ways of handling files. Previous columns have touched on some of these issues; this month, we'll go into the topic in a bit more depth.
In order to keep both BSD and Mac OS program(mer)s happy, Apple has had to implement nearly every feature found in either HFS+ (Mac OS's "Extended Hierarchical File System") or UFS (BSD's "Unix File System"). This is a challenging task; the systems evolved independently and a variety of decisions were made in different manners.
A look at links
To take just one example, consider file links (BSD) and aliases (Mac OS). Conceptually, these are very similar: they provide ways to have multiple names for a single file. The devil, of course, lies in the details.
Hard links have been around since the earliest days of Unix. A hard link is a directory entry, within the same file system, to the same item. So, for example, the "." and ".." entries that you see in an "ls -al" directory listing are actually implemented as hard links.
Because all of BSD's file metadata is stored in the file's inode (information node), anything that affects any hard link to a file affects every hard link to the file. So, for example, you can't change the ownership or permissions on just one of a file's hard links.
Symbolic links (i.e., symlinks) were invented by the BSD developers, but other versions of Unix quickly picked up on the idea. A symlink is a specially-interpreted a file, containing the name of some other file, directory, etc. When the kernel is asked to access a symlink, it grabs the contained file name, then wanders off to resolve it. The referenced file may be anywhere on the system; in fact, it may not even exist!
Aliases are quite a bit like symlinks, but they differ in some subtle ways. For instance, Mac OS will try to find the file that an alias references, even if it has been moved or renamed. As long as the referenced file stays in the same file system, Mac OS will typically succeed in finding it. Symlinks, in contrast, either match completely or not at all.
These issues aside, symlinks act (mostly) like aliases, when viewed from the Finder. An alias, however, is unusable from the BSD command line; it simply appears to be a zero-length file. In short, if you need a link that is visible everywhere, use either a hard link or a symlink...
File System Metadata
Apple has done a very creditable job in merging file system metadata. Both Mac OS and BSD programs have access to the sorts of information they need; users and even programmers can largely ignore the underlying issues. To take full advantage of the combined environment, however, we need to know what kinds of metadata it supports.
The tables below list a variety of file system metadata and ways for users to get access to their values. Programmers will have to look up the appropriate libraries and/or frameworks, but the discussion below may provide some useful hints.
Perl programmers should look at Dan Kogai's MacOSX::File module, which provides access to a wealth of file metadata. Perl versions of several Mac OS X file access commands are included in the distribution (available on the CPAN: cpan.perl.org).
My table entries are a bit cryptic, so some explanation may be in order. Get Info (GI) is part of the standard Mac OS X distribution. File Buddy (FB) and Super Get Info (SGI) are available from SkyTag Software (www.skytag.com) and Bare Bones Software (www.barebones.com), respectively.
As their terse names hint, du and ls have been in the Unix lexicon for a number of years. GetFileInfo (and a companion command, SetInfo) are part of the Developer Tools and may be found (along with several other file-related utilities) in /Developer/Tools.
I realize that I may have left out some interesting data items (not to mention useful tools!); please drop me a line if you spot an egregious error or omission. As Apple continues to evolve the file system, I'm sure that a follow-on article will be needed...
Ownership FB, GI, SGI ls -l
Permissions FB, GI, SGI ls -l
set[gu]id ls -l
sticky ls -l
Prior to OSX, Mac OS didn't have much in the way of access control. In fact, the only limitations were on file sharing. As convenient as this may have been, it did not protect users from each others' mistakes, nor protect the system files from user errors. BSD, as a multi-user system, has long had the notions of file ownership and permissions. OSX now has these, as well, but you can expect to see other (e.g., capability-based) forms of access control being added over time.
The setgid and setuid bits allow programs to run with the gid (group id) or uid (user id) of the file, rather than that of the user running the command. This is a convenient way to let programs do privileged things, in a controlled manner.
The "sticky" bit has different meaning, depending on the item involved. Adding the sticky bit to a directory makes the directory "append only"; see sticky(8) for details. On some systems (but not OSX), adding the sticky bit to an executable file causes its binary image to stay in memory after the program has finished.
The first character of the permission string printed by "ls -l" indicates the type of the file. This is not the same as the Type (four-character code) shown by some Mac utilities. In fact, it is more similar to the "Kind" field: it specifies whether the item is a plain file, a directory, a symlink, etc.
Access FB ls -lu
Creation FB, GI, SGI ls -cl, GetFileInfo
Modification FB, GI, SGI ls -l, GetFileInfo
file length SGI
size on disk SGI ls -l
file length SGI
size on disk SGI
file length FB, GI, SGI ls -l
size on disk FB, GI, SGI du -k
The access and modification times, as you might expect, are the last times that the file's contents were accessed or modified, respectively. The creation time, on the other hand, is quite inconsistent and potentially confusing.
In both BSD and Mac OS, the creation time is initially defined as the time that the file was created. In BSD, however, the creation time gets updated when the file's directory entry (e.g.., its permissions) has been changed. To get the latter version, use ls -cl.
Because BSD does not have the notion of file "forks", standard command-line tools do not deal with them in a particularly consistent manner. In particular, ls may list the size of a resource fork as zero.
Alias file FB GetFileInfo
Bundle FB GetFileInfo
Custom icon FB GetFileInfo
Extension hidden FB, GI, SGI GetFileInfo
Inited FB GetFileInfo
Invisible FB, GI, SGI GetFileInfo
Locked FB, GI, SGI GetFileInfo
No INIT resource FB GetFileInfo
Shared FB GetFileInfo
Stationery Pad FB, GI, SGI GetFileInfo
System file GetFileInfo
Finder FB, GI, SGI
Creator FB, SGI GetFileInfo
Kind FB, GI, SGI
Open with SGI
Path FB, SGI
Type FB, SGI GetFileInfo
Name locked FB
Preview FB, GI, SGI
Version FB, SGI
To the best of my knowledge, all of these items are irrelevant to most BSD commands. That is, BSD commands neither affect nor respond to these items. Consequently, I won't discuss them in any detail.
The "Comments" information is, however, an exception. OSX stores comments in a binary file called ".DS_Store", located in the same directory as the file being commented upon. It also performs some hidden magic, copying the comments to the appropriate copy of .DS_Store when a file is moved, etc.
Because the BSD commands know nothing about this mechanism, they do NOT perform the same magic. So, if you use "cp" or "mv" on a file, don't expect the comments to follow along!
In Other News
If you're reading this column, you should probably take a look at O'Reilly 's "Mac OS X for Unix Geeks" (Jepson & Rothman) and "Learning Unix for Mac OS X (Taylor & Peek). Both volumes are slim and unassuming, but contain valuable nuggets of OSX lore .
O'Reilly has just released the second (and substantially larger) edition of "Mac OS X: The Missing Manual", by David Pogue. This looks like a handy book for all sorts of Mac OS X users, whatever their background. If you want a quick reference to the parts of OSX that aren't in your area(s) of specialization, give this book a once-over.
Rich Morin has been using computers since 1970, Unix since 1983, and Mac-based Unix since 1986 (when he helped Apple create A/UX 1.0). When he isn't writing this column, Rich runs Prime Time Freeware (www.ptf.com), a publisher of books and CD-ROMs for the Free and Open Source software community. Feel free to write to Rich at firstname.lastname@example.org.