Feb 01 Online
Volume Number: 17 (2001)
Issue Number: 2
Column Tag: MacTech Online
The New Basics: WebDAV
by Jeff Clites <firstname.lastname@example.org>
Over the last few months we have been covering a set of technologies that I've dubbed the New Basics. In brief, these are technologies that are rapidly becoming must-know items for all programmers, independent of platform, programming language, and even area of focus. Most of these technologies are new to everyone - not just to Macintosh programmers. So far, we have covered XML and POSIX threads. This month we are going to focus on a quiet revolutionary in the internet arena called WebDAV.
So What's WebDAV?
The name "WebDAV" stands for Web Distributed Authoring and Versioning (also referred to simply as "DAV"). The thumbnail sketch is that WebDAV marks the advent of the "read-write" web - where the same types of protocols which today allow us to read HTML documents over the web will also let us create and publish them.
WebDAV is an extension to the HTTP protocol, completing the set of primitive operations to allow read, write, create, copy, and delete, as well as locking, the creation and management of collections of resources (logically equivalent to directories or folders), and metadata about those resources. To put it another way, in the WebDAV world URLs get the semantics of files, so that it makes sense to save to a URL or copy a file from one URL to another. And as a true product of the times, WebDAV uses an XML-based format for its handling of metadata, allowing resources to be labeled (and searched for) by properties such as author or subject matter.
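To make the "extension to HTTP" idea a little more concrete, here is a minimal sketch in Python of what a metadata query looks like. It lists the methods that RFC 2518 adds to HTTP and builds the XML body of a PROPFIND request, the method used to ask for a resource's properties. The function names, the /docs/ path, and the choice of properties are my own illustrative assumptions, and nothing is actually sent over the network; treat it as a sketch of the request's shape rather than as a client implementation.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # XML namespace used for WebDAV elements and standard properties

# The HTTP methods that RFC 2518 adds on top of plain HTTP/1.1.
WEBDAV_METHODS = ("PROPFIND", "PROPPATCH", "MKCOL", "COPY", "MOVE", "LOCK", "UNLOCK")


def build_propfind_body(property_names):
    """Return the XML body asking a server for the named DAV: properties."""
    ET.register_namespace("D", DAV_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="unicode")


def build_propfind_request(path, property_names, depth="1"):
    """Describe a PROPFIND request as plain data (hypothetical helper, no I/O)."""
    if depth not in ("0", "1", "infinity"):
        raise ValueError("Depth must be '0', '1', or 'infinity'")
    return {
        "method": "PROPFIND",
        "path": path,
        "headers": {"Depth": depth, "Content-Type": 'text/xml; charset="utf-8"'},
        "body": build_propfind_body(property_names),
    }


if __name__ == "__main__":
    # "/docs/" is an assumed collection path used only for illustration.
    request = build_propfind_request("/docs/", ["displayname", "getlastmodified"])
    print(request["method"], request["path"], request["headers"])
    print(request["body"])
```

A Depth of "1" asks about the collection and its immediate members, which is roughly what a client needs in order to draw a folder listing. Custom properties such as author or subject matter would live in a namespace of the application's choosing rather than in DAV:.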
What's WebDAV Good For?
Several scenarios for usage of this type of remote or distributed authoring can be readily imagined. One common motivating example is to allow web page authors to access and edit their resources via the same URLs from which they are served, without the need to understand the mapping from a location on a server's filesystem to a location on the web. I don't actually find this example to be a particularly convincing argument for the need for a technology such as WebDAV. Professional web designers need to be (and in fact are) up to the task of understanding how their resources need to be organized on a server, and more importantly, commercial web sites will generally not want to perform authoring and editing directly on their live web site. (And of course, many such web sites will use application servers for much of their content, which breaks the direct URL-to-resource mapping anyway.) That said, this model could be useful for small, individual web sites, or for larger web sites with mostly static content that needs minor revisions from time to time. In fact, both GoLive and Dreamweaver have begun to incorporate WebDAV support into their products, allowing users to save directly to a URL.
More interesting uses present themselves as well. Once there is full support for versioning (it's actually not present in WebDAV, despite its name, and is now the goal of the separate Delta-V effort, which builds on WebDAV), web site authors can use this mechanism to track changes to their content, the way programmers today can use CVS to keep track of what changes were made to their code, when they were made, and by whom. (See below for more on the connection with CVS.) Even more interestingly, sites could continue to serve multiple revisions of their content - for instance, a news site could correct errors in its reporting but allow readers to access the original version of a story for historical purposes, or a web page containing a current project plan for a team could allow team members to access old versions in order to track how the plan is evolving over time.
Moving away from the web arena, a mundane but potentially very welcome application is document sharing within an office. Today, despite file-sharing protocols available on every platform, and ready access to tools such as FTP, it's very common to use email to transfer documents from person to person - it's easy and it works when other methods fail, and it has no trouble crossing platform boundaries. The drawbacks are many, however: email systems often don't handle large attachments well (either rejecting them, or causing the entire mail system to bog down), recipients may have to wait through a significant delay for messages to "show up", and collaborators may end up losing changes or accidentally working on out-of-date versions of documents as they are modified and passed back and forth. Using WebDAV, authors can simply save documents to a WebDAV-enabled server and then email the URL to co-workers, if necessary. Revisions can be performed directly from this location, so at any given time there is logically only one copy of this document (the most up-to-date copy), and WebDAV's ability to impose write-locks can prevent individual contributors from accidentally overwriting each other's changes if two users try to edit at the same time. And again, the ability to track and even revert changes adds an extra degree of safety, as well as an audit trail. This model should be easier to understand for most users than the transfer model of FTP, and in fact the actual transport is faster as well.
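The write-lock behavior described above can be illustrated with a toy, in-memory model in Python. This is emphatically not a WebDAV server: the class and method names are hypothetical, there is no networking, and lock timeouts, shared locks, and collections are all ignored. The one real protocol detail it borrows is the 423 (Locked) status code and the opaquelocktoken: token scheme, both defined by RFC 2518.

```python
import uuid


class ToyLockingStore:
    """In-memory stand-in for a server that honors exclusive write locks."""

    def __init__(self):
        self._files = {}  # path -> content
        self._locks = {}  # path -> (owner, lock token)

    def lock(self, path, owner):
        """Try to take an exclusive write lock; return (status, token)."""
        if path in self._locks:
            return 423, None  # 423 Locked: the resource is already locked
        token = "opaquelocktoken:" + str(uuid.uuid4())
        self._locks[path] = (owner, token)
        return 200, token

    def lock_owner(self, path):
        """Report who holds the lock on a path, or None if it is unlocked."""
        held = self._locks.get(path)
        return held[0] if held else None

    def put(self, path, content, token=None):
        """Save content; refuse with 423 unless the caller presents the lock token."""
        held = self._locks.get(path)
        if held is not None and held[1] != token:
            return 423
        status = 204 if path in self._files else 201  # overwrite vs. create
        self._files[path] = content
        return status

    def unlock(self, path, token):
        """Release the lock; returns True only if the token matches."""
        held = self._locks.get(path)
        if held is None or held[1] != token:
            return False
        del self._locks[path]
        return True

    def get(self, path):
        return self._files.get(path)


if __name__ == "__main__":
    store = ToyLockingStore()
    _, alice_token = store.lock("/plan.doc", "alice")
    print(store.put("/plan.doc", "bob's edit"))                  # refused: no token
    print(store.put("/plan.doc", "alice's edit", alice_token))   # accepted
```

The point of the token is that only the client that took the lock can write while it is held; everyone else is turned away until the lock is released, so two simultaneous editors cannot silently overwrite each other.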
WebDAV and Software Development
So how is WebDAV important to the programmer? On the one hand, even with OS-level integration, there will still be a need for applications which are WebDAV-aware - for instance, applications which administer WebDAV servers or collections and thus need to interact with a WebDAV repository as something other than a filesystem. WebDAV may also impact programmers in another way - by changing the tool set they use to do development. There has been considerable interest in using WebDAV to implement a successor to CVS. We covered CVS in this column several months back, and although it is probably the most widely used version-control system around, it does have several design-level limitations which happen to coincide with strengths of WebDAV. For one, CVS is unable to directly version directories, whereas collections are a first-class part of the WebDAV protocol. Also, CVS has only a half-hearted client-server implementation, which reflects its original design to operate only with repositories which are on the same filesystem as the user (possibly via an NFS mount), while WebDAV is inherently client-server. This can be especially important with repositories which need complicated security models to accommodate users with different access permissions.
Readers familiar with CVS may at first think that WebDAV is a complete mismatch, given that the latter uses locking extensively while the former allows multiple users to modify resources concurrently. (In fact, this is one of the strengths of CVS, as version control systems with a lock-based model can actually inhibit the development process. Note, however, that a lock-based model may be more appropriate in an office environment, where users tend to work with binary files which are difficult to compare and merge, such as images or Microsoft Office files.) It should be straightforward, however, to use WebDAV as merely a transport protocol for CVS, without modifying the semantics of CVS. (WebDAV's locking would even find a use during the checkout and commit processes, where CVS does in fact use locking to prevent corruption which could occur if a given file were simultaneously being checked out by one user and checked in by another.) Another point of significant difference is that editing under the CVS model occurs via local working copies, whereas WebDAV editing logically occurs directly on the server. Here again there is potential for an interesting synergy, if WebDAV's model is not taken too literally. One approach would be to maintain the CVS concept of editing via working copies (a "sandbox"), but to leave these copies actually on a remote WebDAV server. By mounting the remote sandbox as a filesystem, the user would retain the experience of a local copy, but with a few optimizations. For instance, checkout and merge speeds could be increased by delaying the creation of actual copies of files until the developer modifies them (copy-on-write semantics), so that files the developer doesn't actually modify are never physically copied.
Also, since the "checked out" files are still accessed via WebDAV, it should be possible to continue to version them, allowing the developer to track (and revert) changes that take place in between commits to the main repository. Builds could also occur remotely, on the server machine, with the generated object files also under version control, if desired, giving a team of developers a consistent build environment. (Server-resident files would also allow for a consistent backup policy, and a consistent set of tools for examining differences between versions.)
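The copy-on-write sandbox idea is easy to sketch in Python. The class below is purely hypothetical (no such tool exists as far as this column knows, and a real implementation would sit behind a mounted WebDAV volume rather than a dictionary), but it shows the essential trick: a "checkout" copies nothing, reads fall through to the shared repository snapshot, and a private copy comes into existence only when a file is first written.

```python
class CopyOnWriteSandbox:
    """Toy working copy that only stores the files its developer has changed."""

    def __init__(self, repository_files):
        self._base = repository_files  # shared snapshot; never modified here
        self._overlay = {}             # private copies, created on first write

    def read(self, path):
        """Return the private copy if one exists, else the shared original."""
        if path in self._overlay:
            return self._overlay[path]
        return self._base[path]

    def write(self, path, content):
        """First write to a path creates the private copy (the 'copy' in copy-on-write)."""
        self._overlay[path] = content

    def revert(self, path):
        """Discard local changes so reads fall through to the repository again."""
        self._overlay.pop(path, None)

    def modified_paths(self):
        """Paths a commit would need to send back to the main repository."""
        return sorted(self._overlay)

    def copies_made(self):
        return len(self._overlay)


if __name__ == "__main__":
    # Assumed repository contents, for illustration only.
    repository = {"main.c": "int main() {}", "util.c": "/* util */", "README": "docs"}
    sandbox = CopyOnWriteSandbox(repository)  # the "checkout" copies nothing
    sandbox.write("main.c", "int main() { return 0; }")
    print(sandbox.modified_paths(), sandbox.copies_made())
```

Because the overlay records exactly which files were touched, the same structure gives a cheap answer to "what would I commit?", and it is a natural place to hang the between-commit versioning mentioned above.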
Most of this, of course, is still at the "what if" stage, but it's pretty clear that WebDAV has a lot to offer to the developer community, both as a new technology to be used in creative applications, and as a tool to improve the development process itself.
Mac OS X v. Windows 2000
Judging from the beta version of Mac OS X, as well as a few other public comments, Apple's new operating system will support mounting remote WebDAV servers - in other words, allowing users to access them as though they were local disk drives. The bad news is that Microsoft's Windows 2000 has beaten us to the punch, but the good news is that they've taken the wrong approach. Windows 2000 (or possibly IE, as it is hard to tell where one ends and the other begins) has a feature called Web Folders, which allows applications to access WebDAV servers, but only if the applications are specifically written to do so - it treats the WebDAV server more like a database than a filesystem. The obvious downside of this is that most existing applications won't be able to take advantage of this feature, and it shifts the burden onto developers to make their applications "Web Folder aware". With Mac OS X's approach, any application will be able to join the party - to them, a WebDAV server will be just another volume.
Apache v. IIS
Given that WebDAV is an extension to HTTP (although it will be natural to apply it in contexts which are very different from what we traditionally think of as "the web"), WebDAV servers are usually traditional web servers. In particular, both Apache and Microsoft's Internet Information Server (IIS) support WebDAV, with Apache's support coming via the mod_dav module. It's interesting to look at the different approaches the two servers take to access permissions. IIS controls access to WebDAV resources via local file permissions, so that resources accessed via WebDAV have exactly the same restrictions as they have when accessed directly via the server's filesystem. Apache, on the other hand, "owns" all of the files it serves via WebDAV, so that access by way of the WebDAV protocol is controlled by Apache's permissions system, but direct access to these files on the server is not permitted. The IIS approach has the advantage that files may be accessed through several different protocols while maintaining a consistent access policy (this is a theme of Windows 2000 and its Active Directory permissions system), and Apache's approach has the advantage of permitting "WebDAV users" which are not otherwise known to the local system, and possibly imposing a more sophisticated security model than is implemented in the local file system. Different applications may naturally favor one approach over the other, and given the inherent flexibility of Apache it is likely that it will eventually support the "local permissions" model as well.
The main resource on the web for information on WebDAV is, of course, the WebDAV home page. There you will find all the expected information: news, links to relevant standards and working groups, links to an FAQ, and listings of products currently supporting WebDAV. WebTechniques magazine has a very good overview article by Jim Whitehead, chair of the IETF WebDAV Working Group. It makes a strong case for the synergy of WebDAV and CVS, and has an interesting sidebar on the details of Microsoft's approach to WebDAV in their Office 2000 suite. And while we're on the subject of Microsoft, they also have an interview with Jim Whitehead, which gives a further high-level overview. If you are interested in the version-control angle, you might also want to take a look at Subversion, an open-source effort to provide a CVS alternative which does, in fact, use WebDAV for its transport.
For Mac-OS-X-specific coverage, start with an article on the O'Reilly Network, which gives an overview of the WebDAV support in Mac OS X Public Beta. Also of interest is a note on one of the WebDAV mailing lists, simply entitled "Another Dav client", unofficially announcing the Mac's forthcoming support for the protocol at the filesystem level. Another tool worth a look is Goliath, a WebDAV-based web site management tool for the Macintosh. (It's a Carbon application, so it runs under both Mac OS 9 and Mac OS X.) Goliath shows off WebDAV's ability to manage locking, including supplying information about which user currently holds the lock on a given resource. For even more fun, check out an article on MacNN which steps through the (very easy) process of activating WebDAV support in Apache as it ships with Mac OS X Public Beta, allowing your machine to act as a WebDAV server. (Note that this is, of course, not necessary for you to act as a WebDAV client.)
For a more developer-centric view of things, start with WebDAV in 2 Minutes, and if you plan on implementing a WebDAV client or server, you'll want to look at the specification itself, RFC 2518, as well as related specifications, and keep up with the IETF Working Groups on WebDAV and Delta-V. For an introduction to the use of XML in WebDAV, check out Communicating XML Data Over the Web with WebDAV on Microsoft's developer site.
WebDAV promises some exciting developments for collaboration, and just for simple remote access to files. It remains to be seen how WebDAV will stack up against established protocols such as NFS (Sun's Network File System) for distributed filesystems, but I look forward to seeing where it is going to take us.