TweetFollow Us on Twitter

Iterating for Profit and Pleasure

Volume Number: 16 (2000)
Issue Number: 5
Column Tag: Programming

Iterating for Profit and Pleasure

By Neil Mayhew, Calgary, AB

Using iterators to tap into the power of the Standard Template Library

A Brave New world

Iterators are not a new idea. These lightweight objects provide a convenient way to step through the elements of a collection. Likewise, the Standard Template Library has been around for a while now, although many people have still to explore its depths. Iterators are a fundamental part of STL, and an understanding of their role is vital for effective use of the Library. Many of the data structures that we work with every day are 'virtual' collections, and by writing our own STL-compatible iterators for them we can make use of many powerful algorithms from STL.

As an example of custom iterators, this third and final article presents a set of classes that iterate through the items in a MacOS folder. These mesh with algorithms from STL to produce an application that checks the System Folder for additions and changes to its extensions and control panels since the last time the application was run.

Developers already familiar with STL and its use of iterators may wish to skip to the first implementation section, "Walking a Folder".

In search of elegance

In the most general sense, an iterator is an object that records an index into a collection of other objects, and allows those objects to be retrieved for processing. Of course, it must also be possible to advance the index forwards, and possibly backwards, to reference other members of the collection. These actions are invoked by calling methods of the iterator object.

An example of the effective use of iterators is in the reading of run-length encoded pixel data. It is possible to define an iterator that presents the illusion of a rectangular array of pixels, whereas this array is a virtual collection that is computed on the fly from the run-length-encoded data. The iterator's internal storage (instance variables) record a pointer to the current run and a count of the number of pixels still to be read from that run.

There are of course many ways to design a set of methods for an iterator. For example, next() and previous() methods can be used for advancing forwards and backwards through the associated collection, and atEnd() can be used to test for the availability of further elements. Unfortunately, a lack of standards in this area is a great hindrance to code reuse. Code that uses iterators from a given source cannot easily be adapted to use iterators from a different source. Likewise, programmers often find it easier to develop their own, simple-minded solutions to a problem than learn how to use yet another set of iterator conventions.

Yet C has some very powerful iterators of its own that many other languages lack: pointers. These are iterators in the sense that they step through a collection (an array or linked-list) without additional support. There is no need for a separate reference to a collection that must be used with a subscripting operation when retrieving elements. Countless algorithms owe their elegance to the use of pointers instead of arrays and subscripts. Learning to use pointers effectively is therefore one of the key requirements for becoming a skilled C programmer.

A standardized approach

Since C already has a well-established paradigm for the use of iterators, it makes sense to standardize on this model in C++. We discussed smart pointers in an earlier article. These are objects that have the syntax and semantics of a pointer, through overloading the *, -> and ++ operators. A standardized iterator is therefore a special case of a smart pointer.

The advantage of this approach is that code may be written in a style that is very similar to the one that would be used with raw pointers. This makes the code easy to write and easy to understand, and yet considerable sophistication may be present 'under the surface' within the iterators themselves. There need not be any hidden pitfalls, however: if the semantics of the iterator are clearly defined, and consistent with the usual behavior of pointers, then client code will work as expected.

In fact, code using smart pointers for iterators may often be safer than equivalent code using raw pointers. For example, a smart pointer is usually capable of performing bounds checking, in a debug build at least. Or, awkward and error-prone pointer manipulations can be taken care of in one place, inside the iterator, and client code is free from having to worry about such details.

Mastering new idioms

A subtle characteristic of the smart-pointer approach to iterators is that they will be passed around by value, since this is how regular pointers are treated. These iterators therefore need to be small, usually just 4 or 8 bytes; any bigger than this and client code would have to be aware of the need for efficient handling.

Sophisticated behavior, however, doesn't necessarily require large amounts of storage. An iterator for a linked list need only occupy four bytes, and yet the dereference operator can return a reference to the list element rather than to the node that contains it. Likewise, the increment operators can follow the link to the next node.

This standardization allows a linked list to be processed just as if it were an array. For example, the following construction is idiomatic for all standardized iterators:

	for (Iterator p = begin; p != end; ++p)
	{
		[Expressions using *p …]
	}

Note that the loop increment uses ++p and not p++. It is important to understand the reason for this, and to remember to do likewise in your code. Recall that the post-increment operator returns a copy of the iterator or pointer as it was before the increment. This behavior is not needed in this context, since the value is not used anyway. However, when the item being incremented is a user-defined object rather than a pointer, the compiler is not entitled to substitute a pre-increment rather than a post-increment, since the two methods may have different side effects. There is therefore no way to avoid creating a redundant copy of the iterator object each time around the loop.

Note too that the loop test uses != and not <. Again, there is a good reason for this. In the case of a linked list, the < operator is very difficult to implement, and not very useful either. Since there is no compelling reason to use <, even for an array, it is better to write all loops using !=.

These conventions enable client code to remain unchanged even when the collection's underlying representation is changed, for example from an array to a list. As well as easing the maintenance burden, this has the much more general advantage that template functions can be written that work with a wide range of representation types. The code can be written once, as a template, and reused with any collection that uses standardized iterators as an interface. These generic functions are usually referred to as algorithms.

Algorithms + Containers = Programs

Naturally, many of these algorithms are well-understood examples from computer science, such as sorting and searching. Now that we have a mechanism for writing these in a completely generic way, it makes sense to have a standard library of templates that can be re-used in almost any situation. Hence the name Standard Template Library or STL.

Just as the algorithms are classic examples, so are many of the collection types, such as linked lists and dynamically growing arrays. The Library therefore also includes a representative selection of collection types. These are usually referred to as containers, and they avoid the need for endless re-implementations of common container types. The template mechanism is still flexible enough to allow a wide variation in the behavior that can result from use of the template.

The magic ingredient that enables algorithms and containers to work well together is the standardized iterator. All of the containers have begin() and end() methods that return iterator values for the start and end of the collection, and all of the algorithms expect collections to be represented by two such values.

An algorithm typically has the following form:

	template
	some_algorithm(It begin, It end)
	{
		for (It p = begin; p != end; ++p)
		{
			…
		}
	}

It is called like this:

		some_algorithm(container.begin(), container.end());

Neither the algorithms nor the containers are closed sets: the user can very easily add new algorithms and containers, and provided standardized iterators are involved these will integrate seamlessly with the rest of the Library. We will see an example of this shortly.

Smart glue

Iterators are the glue that holds algorithms and containers together to make programs. The Standard Template Library elevates the use of iterators almost to the level of an art form.

For example, many algorithms produce a new collection as their output. They expect to write this output using an iterator. Such iterators are called output iterators, to distinguish them from the read-only input iterators that provide the input data.

However, it may not be known in advance exactly how many elements will be produced in the output. This shouldn't be a problem with a dynamically sized collection, except that the elements are being appended to the collection using an iterator. The iterator has to be smart enough to call the container's append method whenever the algorithm writes a value through the dereferenced iterator. Such an iterator is called a back inserter, and it works by some rather clever sleight of hand that includes overloading the = operator. Needless to say, this is implemented by yet another template in the Library.

Another example of the creative use of iterators is the set of stream iterators that is provided by the Library. These iterators enable the successive reading or writing of objects via iostreams, so that intermediate storage is not required when passing the data to or from an algorithm.

Not all iterators have to be so smart, however. Every algorithm in the library will work happily with raw pointers as the iterators, provided the underlying raw array is large enough to hold any output that may be generated. It is a fundamental principle of the library that iterators and pointers must be completely interchangeable.

Committing to the Standard

Many libraries have been based on excellent theoretical principles, and some have been promoted as universal standards. However, that doesn't mean they have always been universally accepted.

STL is exceptionally well designed, and has been thoroughly tested over a number of years. It is efficient, comprehensive, and portable. Nothing in STL precludes it from being a good citizen within the MacOS. Although iostreams are supported as an option in several areas, they are by no means required, and most of the functionality is purely computational.

Code that makes extensive use of STL over more homegrown solutions usually has much greater clarity, especially if the reader has a general familiarity with STL. A lot of the clarity results from the fact that loops are largely eliminated from user-code, since most of the looping takes place inside the algorithms. User-code is more parallel in nature, being expressed in terms of operations on entire collections. Also, groups of calls to standard algorithms are usually much easier to read than a series of custom loops.

As more and more developers become aware of the benefits of STL, and commit to using it in their work, the value of STL can only increase. It is increasingly likely that another developer studying one's code will be comfortable with its style and structure. The quality of STL implementations continues to improve, and new algorithms and collections are becoming available all the time, often for free via the web.

Just as the functions from the standard C library have become an integral part of the language over the years, the algorithms and containers from STL are becoming an integral part of C++. This is not surprising, given that Bjarne Stroustrup, the creator of C++, has been very active in the development of the Library. For a while, the language and the Library developed in parallel, as the template capabilities of the language were pushed to the limit and beyond.

Fortunately, the language and the Library are now static, having at last been adopted as an ISO standard. It is safe to assume that any modern C++ implementation will have a solid implementation of the Standard Library, which is a superset of STL. Code using STL should therefore be portable to any platform that has a modern C++ compiler.

Walking A Folder

Theory is of little value unless it is used. The combination of standardized iterators and STL algorithms can be very effective in a MacOS setting. Although there is always another way to skin any cat, an STL-based approach is usually rapid in its development and efficient in its implementation.

All operating systems have facilities for iterating through every file in a folder. The MacOS is no exception, although the sole method for doing it, using PBGetCatInfo, can be rather tedious. It would be nice to develop an iterator facility that gives this operation a simple, high-level interface. The result would be a virtual collection, since the elements need not all exist at once as objects in memory, although they do exist as a collection of 'objects' on disk.

Simple interface, subtle design

An interesting design issue concerns the exact nature of the information that is returned. The obvious answer might seem to be an FSSpec, but unfortunately this doesn't allow one to distinguish between files and folders. It would be possible to skip the folders, but this would prevent a recursive descent of an entire folder hierarchy. At the other extreme, the entire contents of PBGetCatInfo's parameter block could be returned, but this would be cumbersome and inefficient in most cases. The best compromise seems to be to return just the most frequently-used file and directory attributes, representing this using a subclass of FSSpec so that it can be used as a parameter to any of the MacOS APIs that expect this datatype.

Another design issue concerns storage management. This is almost always an issue in C++, and many people feel that this is an unwelcome complication of the language. However, it also allows great flexibility in the implementation tradeoffs that can be made. Other languages force one to store all non-scalar data in dynamically allocated heap storage, which introduces significant computational overhead. In contrast, C++ allows objects to be returned by value, rather than by a pointer or a reference. This eliminates the problem of object lifetimes, which would otherwise have to be handled by reference counting or garbage collection.

It is often thought that return-by-value is also expensive computationally, because of the amount of data that has to be copied, but this need not be so. When client code calls a function that returns an object, a hidden argument is passed that contains a pointer to a block of uninitialized storage that will hold the returned object. In effect, the called function copies the return value into this storage using the appropriate copy-constructor, and the client code uses a copy-constructor or assignment operator to move the object to another location. Two destructor calls are also involved, of course, to clean up the intermediate values.

However, if the return statement in the called function is itself a constructor call, most compilers know to optimize this as a direct construction into the space that was reserved by the caller and passed as the hidden argument. If in addition the client code uses the function call as the initializer for a named variable of the same type as the return value, the compiler will optimize this by passing the address of the uninitialized variable as the hidden argument. The net result is that just one object is involved, which is constructed in the called function's return statement.

For example:

	// Caller
	Foo x = GetAFoo();

	// Callee
	Foo GetAFoo() {
		return Foo(arg1, arg2);
	}

This is functionally equivalent to:

	// Caller
	Foo x = Foo(arg1, arg2);

Note that this is not dependent on the callee being an inline function. If the assignment operator is significantly more expensive than the constructor that is being used, it is better to use a fresh variable for every function call than to try to re-use the same variable several times, especially inside a loop.

Hand over your data!

Using this model, the iterator’s dereference operator needs to construct and return an object containing a subset of the information returned by PBGetCatInfo. We’ll call this a FolderItem. Since this will have quite a number of instance variables, we want to avoid having to pass every value as a separate constructor argument. A neat solution is to have FolderItem’s constructor perform the call to PBGetCatInfo. This means we only have to pass three arguments to the constructor: volume, directory, and index.

We also want to ensure that copying and assignment is as efficient as possible. Since the FolderItem will need to contain space for a variable-length name, it makes sense not to use the compiler’s automatically generated methods, and instead to copy just the bytes that are used by the name. On the other hand, it is quite handy not to have to include an explicit copy of every instance variable—it is easy to forget to add an extra line if a new instance variable is introduced. We can have the best of both worlds if we define a base class that contains the name, and a subclass that contains the other information. One uses explicit copy/assignment methods, and the other uses automatic ones. As already explained, we also want to have FSSpec as the ultimate base class.

This leads to the implementation shown in Listing 1.


Listing 1: FolderItem.h

// Class declaration for FolderItem and its bases
class FileSpec : public FSSpec
{
	// Constructors
protected:
	FileSpec() {}	// Used when subclass initializes everything
public:
	FileSpec(short vol, long dir, ConstStringPtr pName)
	{
		vRefNum = vol;
		parID = dir;
		BlockMoveData(pName, name, pName[0] + 1);
	}

	// Copy constructor	
	FileSpec(const FileSpec& other)
		{ *this = other; } // Uses assignment

	// Assignment operator	
	FileSpec& operator= (const FileSpec& other)
	{
		vRefNum = other.vRefNum;
		parID = other.parID;
		BlockMoveData(other.name, name, other.name[0] + 1);
		return *this;
	}
	
	// Comparison operators
	bool operator<  (const FileSpec& other) const
		{ return RelString(name, other.name, false, false) < 0; }
	bool operator== (const FileSpec& other) const
		{ return EqualString(name, other.name, false, false); }
	bool operator!= (const FileSpec& other) const
		{ return !operator== (other); }
};

class FolderItem : public FileSpec
{
public:
	// File attributes
	typedef unsigned long Date;
	Date			created, modified;
	Size			data, resource;
	OSType		type, creator;
	Boolean	folder;

	// Constructors
	FolderItem() {}		// Needed when creating arrays
	FolderItem(short vol, long dir, int index);
		
	// Comparison operators
	bool operator<  (const FolderItem& other) const;
	bool operator== (const FolderItem& other) const;
	bool operator!= (const FolderItem& other) const
		{ return !operator== (other); }
};

There are only three non-trivial methods, all in FolderItem: the constructor that calls PBGetCatInfo, and the two comparison operators. These have fairly obvious implementations as shown in Listing 2.


Listing 2: FolderItem.cp

// Implementation of FolderItem
FolderItem::FolderItem(short vol, long dir, int index)
{
	vRefNum  = vol;
	parID    = dir;

	CInfoPBRec pb;

	pb.hFileInfo.ioCompletion = 0;
	pb.hFileInfo.ioVRefNum = vol;
	pb.hFileInfo.ioDirID = dir;
	pb.hFileInfo.ioFDirIndex = index;
	pb.hFileInfo.ioNamePtr = name;

	PBGetCatInfoSync(&pb);

	folder = pb.hFileInfo.ioFlAttrib & ioDirMask;

	if (folder)
	{
		created  = pb.dirInfo.ioDrCrDat;
		modified = pb.dirInfo.ioDrMdDat;
		data     = 0;
		resource = 0;
		type     = 0;
		creator  = 0;
	}
	else
	{
		created  = pb.hFileInfo.ioFlCrDat;
		modified = pb.hFileInfo.ioFlMdDat;
		data     = pb.hFileInfo.ioFlLgLen;
		resource = pb.hFileInfo.ioFlRLgLen;
		type     = pb.hFileInfo.ioFlFndrInfo.fdType;
		creator  = pb.hFileInfo.ioFlFndrInfo.fdCreator;
	}
}

bool FolderItem::operator== (const FolderItem& other) const
{
	return FileSpec::operator== (other)
		&& created  == other.created
		&& modified == other.modified
		&& data     == other.data
		&& resource == other.resource
		&& type     == other.type
		&& creator  == other.creator
		&& folder   == other.folder;
}	

bool FolderItem::operator<  (const FolderItem& other) const
{
	if (FileSpec::operator< (other)) return true;
	if (FileSpec::operator!=(other)) return false;
	if (created   < other.created)   return true;
	if (created  != other.created)   return false;
	if (modified  < other.modified)  return true;
	if (modified != other.modified)  return false;
	if (data      < other.data)      return true;
	if (data     != other.data)      return false;
	if (resource  < other.resource)  return true;
	if (resource != other.resource)  return false;
	if (type      < other.type)      return true;
	if (type     != other.type)      return false;
	if (creator   < other.creator)   return true;
	if (creator  != other.creator)   return false;
	if (folder    < other.folder)    return true;
	return false;
}	

This takes care of most of the low-level grunt work. The rest is finesse.

Asking for it to happen

Typically, we don’t want to create FolderItems directly. It is much more convenient to define a Folder class that represents the entire collection, and obtain appropriately constructed boundary iterators from it. The Folder needs to store the volume and directory parameters, and it also needs to record the number of items in the folder so that it can supply an end() iterator when required. This item count can be initialized in the constructor, using PBGetCatInfo again.

It is helpful to be able to construct a Folder either from volume and directory numbers, such as returned from FindFolder, or from an FSSpec, such as returned from iterating another Folder. This enables recursive iteration of folder hierarchies. In fact, it is possible to write a ‘super-iterator’ that traverses a hierarchy as if it was a flat list of files. This involves storing a stack of folders and indexes, which is relatively easy with STL although it is still beyond the scope of this article.

Since FolderItem requires the Folder’s volume and directory information to be passed to its constructor, it makes sense to have Folder do the work of constructing FolderItems. This is done very conveniently by defining an array subscripting operator, operator[]. Implementation of the FolderIterator is then fairly trivial, since it is based on integer indexes, and delegates dereferencing to Folder::operator[].

There is some circularity in the definitions of Folder and its iterator: the iterator calls the subscript operator from Folder, and Folder has methods that return a FolderIterator. It is necessary to define FolderIterator first, since Folder passes FolderIterators by value. Then, although the implementation of FolderIterator::operator* is inline, this has to be postponed until after Folder has been declared. The results are shown in Listing 3.


Listing 3: Folder.h

class Folder;	// Forward declaration

// Class declaration for FolderIterator
class FolderIterator
{
	// Instance variables – 8 bytes
	const Folder&	folder;
	int						index;
public:
	// Constructor – zero-based index
	FolderIterator(const Folder& f, int i)
		: folder(f), index(i) {}

	// Dereference operator
	FolderItem operator* () const;

	// Increment and decrement
	FolderIterator& operator++ ()
		{ ++index; return *this; }
	FolderIterator  operator++ (int)	// Post-increment
		{ return FolderIterator(folder, index++); }
	FolderIterator& operator— ()
		{ —index; return *this; }
	FolderIterator  operator— (int) // Post-increment
		{ return FolderIterator(folder, index—); }
	
	// Comparison operators
	bool operator<  (const FolderIterator& other) const
		{ return index <  other.index; }
	bool operator== (const FolderIterator& other) const
		{ return index == other.index; }
	bool operator!= (const FolderIterator& other) const
		{ return index != other.index; }
};

// Class declaration for Folder
class Folder
{
	short	vol;
	long		dir;
	int		count;
public:
	// Constructors - initialize count using PBGetCatInfo
	Folder(short v, long d);
	Folder(const FSSpec&);

	// Subscript operator
	FolderItem operator[] (int i) const
		{ return FolderItem(vol, dir, i + 1); }

	// Bounds
	FolderIterator begin() const
		{ return FolderIterator(*this, 0); }
	FolderIterator end()   const
		{ return FolderIterator(*this, count); }

	// Number of elements
	int size() const { return count; }
	
	// Information
	short volume()    const { return vol; }
	long  directory() const { return dir; }
};

// Deferred implementation
inline FolderItem FolderIterator::operator* () const
	{ return folder[index]; }

The only non-trivial methods are the two constructors for Folder. These have straightforward implementations as shown in Listing 4.

Listing 4: Folder.cp

// Implementation of Folder class

Folder::Folder(short v, long d)
	: vol(v), dir(d)
{
	CInfoPBRec pb;
	Str255 name;

	pb.hFileInfo.ioCompletion = 0;
	pb.hFileInfo.ioVRefNum = vol;
	pb.hFileInfo.ioDirID = dir;
	pb.hFileInfo.ioFDirIndex = -1;
	pb.hFileInfo.ioNamePtr = name;

	PBGetCatInfoSync(&pb);

	count = pb.dirInfo.ioDrNmFls;
}

Folder::Folder(const FSSpec& spec)
	: vol(spec.vRefNum), dir(spec.parID)
{
	CInfoPBRec pb;

	pb.hFileInfo.ioCompletion = 0;
	pb.hFileInfo.ioVRefNum = vol;
	pb.hFileInfo.ioDirID = dir;
	pb.hFileInfo.ioFDirIndex = 0;
	pb.hFileInfo.ioNamePtr = const_cast(spec.name);

	PBGetCatInfoSync(&pb);

	count = pb.dirInfo.ioDrNmFls;
	dir   = pb.dirInfo.ioDrDirID;
}

Seeing the wood instead of the trees

It would be easy to lose sight of the fact that, despite the implementation details, these classes are designed to provide a clean, reusable interface for folder iteration. A typical use might be as follows:

Folder theFolder(anFSS);
long sum = 0;

FolderIterator p;
for (p = theFolder.begin(); p != theFolder.end(); ++p)
{
	sum += (*p).data;
}

Note that it is not possible to use p->data instead of (*p).data, since the iterator returns a copy of the value and not a reference to an object with an independent existence. If multiple uses of *p were needed, it would be much more efficient to create a named variable initialized with *p. This is a slightly dangerous aspect of the interface, since it is not obvious that each use of *p makes a fresh call to PBGetCatInfo. However, the convenience does seem to be worth the risk.

Healthy criticism

A natural question to ask at this point is whether using integer indexes wouldn’t have been simpler. Could this hype about iterators simply be another case of The Emperor’s New Clothes?

In the example above, where the folder is a local variable in the same routine as the loop, an integer approach would have worked well. However, we haven’t looked at any algorithms yet. An algorithm using integers would require the container to be passed as an additional argument. Some algorithms take two input sequences and produce a third, output sequence. This would be very cumbersome with integers, especially if the algorithm needed to operate on subsections of a collection. Also, an algorithm needs to be independent of any container type and so it can’t make any assumptions about begin() and end() methods. Raw arrays, which are a type of collection, don’t even have any methods!

FolderIterator uses an integer implementation internally, but it carries around within itself a reference to the associated collection. Tying essential information together in this way makes it algorithm-friendly, and yet retains the simplicity and robustness of implementation associated with array subscripting.

Checking Extensions

How might the folder-iterating facilities we have developed be put to use in a real-life application? As mentioned in the introduction, it is possible to combine standard algorithms with our own iterators to develop a program that checks the System Folder for additions and changes to its extensions. As well as being a good illustration of STL principles, this could be a useful tool in the continual battle to keep wayward installer programs under control. It seems almost an everyday occurrence that obsolete extensions get installed on one’s system, or shared libraries are overwritten with older versions, both of which can lead to inexplicable crashes at some random point in the future. Help stamp out installation abuse!

Planning the strategy

It is too much to expect that one will remember to run the checking program every time some new software is installed. A safer approach is to create a program fast enough to be run on every startup, that compares the currently installed extensions with their last known state, and records a new state if there have been any changes. This is combined with an interactive program that can generate an attractive display of the differences between any two states chosen from a list box.

The startup behavior breaks down into a number of discrete steps:

  1. Read the current list of extensions into memory
  2. Find and read the previous list from a preference file
  3. Check if the lists are equal
  4. Write a new preference file if they differ

The interactive behavior involves similar steps, but a more complex algorithm at step 3:

  1. Read the current list of previous preference files
  2. Ask the user to choose two of these
  3. Analyze the differences between them
  4. Display the results

We won’t cover the details of finding and creating preference files here, or the GUI mechanisms used to display differences between sets. We will however look at all the algorithms in detail.

Creating a suitable framework

As a framework for our code, we define a class Extensions that inherits from one of the STL containers. Creating, reading and comparing sets are then methods of this class. The class definition is as follows:

class Extensions : public std::vector
{
public:
	Extensions() {}
	Extensions(short vRefNum);

	OSErr Read (const FSSpec&);
	OSErr Write(const FSSpec&) const;

	bool operator== (const Extensions& other) const;
	bool operator!= (const Extensions& other) const
		{ return !operator== (other); }
};

The STL container class vector implements a dynamically sized array. The template argument FolderItem specifies that this is to be an array of FolderItems.

All Standard Library classes and functions are placed in the namespace std. It is possible to avoid having to specify this explicitly by means of a using declaration in .cp files, but it is not a good idea to do this in a header file. Any client that includes such a file will have the whole of the Standard Library dumped into its global namespace.

Reading the current extensions

We don’t want switches between different Extensions Manager sets to be recorded as a change. We therefore read the contents of both enabled and disabled extensions folders and merge them into a single list. This is accomplished very easily with STL. The code for this is in the Extensions constructor:

Extensions::Extensions(short vRefNum)
{
	long extensionsDirID, disabledDirID;
	
	FindFolder(vRefNum, kExtensionFolderType,
		kDontCreateFolder, &vRefNum, &extensionsDirID);
	FindFolder(vRefNum, kExtensionDisabledFolderType,
		kDontCreateFolder, &vRefNum, &disabledDirID);

	Folder extensionsFolder(vRefNum, extensionsDirID);
	Folder   disabledFolder(vRefNum,   disabledDirID);

	std::remove_copy_if(
		extensionsFolder.begin(), extensionsFolder.end(),
			std::back_inserter(*this), isFolder);
	std::remove_copy_if(
		disabledFolder.begin(), disabledFolder.end(),
			std::back_inserter(*this), isFolder);
	
	std::sort(begin(), end());
}

The STL algorithm remove_copy_if copies its input sequence—here corresponding to one of the two folders — to its output sequence, removing any elements that satisfy the given predicate. The predicate used here is a small user-defined function, isFolder, defined like this:

inline bool isFolder(const FolderItem& item)
	{ return item.folder; }

The output sequence is the collection stored in the Extensions object, which is a container derived from vector. However, we don’t know in advance how many elements will be needed, so we append them one by one using a back inserter. The compiler is smart enough to figure out what template arguments to use based on the data that is passed to back_inserter.

The two copying algorithms generate a list of extensions that is a straight concatenation of the contents of the two folders. We then sort the entire list in order to merge the two sections, and to make comparison with other sets easier. The arguments to the sort algorithm are very simple, because the compiler is able to deduce much of the necessary information from the data types being used. The comparison function reverts to the less-than operator unless a specific predicate is specified as a third argument.

Saving and restoring sets

The i/o functions are mostly plain MacOS code, except for the computations of the length and address to read or write. The resize() method (from std::vector) changes the number of elements in the vector, and the size() method returns the current number. The begin() method returns an iterator for the first element of the vector, but since vector is an implementation of an array this iterator is in fact a raw pointer to the beginning of a contiguous block containing all the elements.

OSErr Extensions::Read(const FSSpec& file)
{
	resize(FSpDataSize(&file) / sizeof(FolderItem));
	short fileRef;
	OSErr err = FSpOpenDF(&file, fsRdPerm, &fileRef);
	if (err != noErr)
		return err;
	long count = size() * sizeof(FolderItem);
	err = FSRead(fileRef, &count, begin());
	FSClose(fileRef);
	return err;
}

OSErr Extensions::Write(const FSSpec& file) const
{
	FSpCreate(&file, ‘7INI’, ‘ESET’, smSystemScript);
	short fileRef;
	OSErr err = FSpOpenDF(&file, fsWrPerm, &fileRef);
	if (err != noErr)
		return err;
	long count = size() * sizeof(FolderItem);
	err = FSWrite(fileRef, &count, begin());
	FSClose(fileRef);
	return err;
}

These functions, as written, are rather wasteful of disk space and i/o bandwidth, as most of the 256 bytes in FSSpec’s name field are unused. A better solution would be to compact the items into a character buffer, using variable-length space for the names, and write that buffer to and from disk.

Comparing sets

Checking for equality between Extensions collections is a very simple application of STL calls:

bool Extensions::operator== (const Extensions& other) const
{
	return size() == other.size()
		&& std::equal(begin(), end(), other.begin());
}

The equal method expects its input sequences to be of the same length, so it is necessary to check the lengths for equality first. If the lengths differ the sequences are obviously different.

Determining the differences between collections is a little more interesting. This is implemented as an algorithm using iterators, to allow it to be used with any container type and not just vector. For the same reason, we have not made it a method of Extensions.

The algorithm determines which items have been added, removed or changed between two sequences of FolderItems, and appends the results to its three output sequences. STL has a group of set algorithms that work on sorted sequences, and these are ideal for the implementation:

template<class In, class Out>
void FolderDifferences(
	In begin1, In end1,
	In begin2, In end2,
	Out unique1,
	Out unique2,
	Out changed)
{
	using namespace std;

	vector<FolderItem> common1, common2;

	set_difference(begin1, end1, begin2, end2,
		unique1, less<FileSpec>());
	set_difference(begin2, end2, begin1, end1, 
		unique2, less<FileSpec>());

	set_intersection(begin1, end1, begin2, end2, 
		back_inserter(common1), less<FileSpec>());
	set_intersection(begin2, end2, begin1, end1, 
		back_inserter(common2), less<FileSpec>());
	
	set_difference(
		common2.begin(), common2.end(),
		common1.begin(), common1.end(),
		changed);
}

The set_difference algorithm produces a sequence of elements that are members of its first, but not its second, input sequence. This is used to determine which items are unique to the first set, and then those unique to the second set. The set_intersection algorithm produces a sequence of elements from the first sequence that are also found in the second sequence. This is used to determine which items are found in both sets. It is used twice, to generate separate lists of common items from each of the sets, because the common items will later need to be checked for attribute differences.

In each of the first four set operations, the items are compared as FileSpecs rather than FolderItems, because at this point we are only interested in the names. Other attributes may differ, but if the names are the same we regard two items as one item that has been changed.

We arrange for the algorithm to make comparisons on this basis by passing a predicate as an additional argument. This predicate, less<>, is actually a template class from STL and not a function. Such classes are referred to as functors. The name is followed by parentheses in the argument list because we are constructing an unnamed temporary object and passing a reference to it. The algorithm calls the object's function-call operator, operator(), to perform the comparisons. The function-call operator of less in turn uses the < operator on its two arguments. By specifying FileSpec as the template argument, we ensure that the function call will take arguments of type const FileSpec&, producing the kind of comparison we want. Fortunately, it is not necessary to understand why it works in order to use it!

The final use of set_difference picks out the items from the second set that differ from their namesake in the first set. This time native FolderItem comparison is used, because we want to take all attributes into account. No special predicate is required. The reason set_difference produces the result we want is that unchanged items will be members of both common sequences, whereas changed items will not.

It would have been slightly more efficient to write a custom loop to implement FolderDifferences. This would reduce the number of element comparisons that have to be performed. However, the compute time is small compared with the time taken to scan the extensions folders in the first place. The clarity and ease of implementation easily outweigh the cost.

PowerPlant, anyone?

The algorithm is used as follows:

vector<FolderItem> unique1, unique2;
vector<FolderItem> changed;

FolderDifferences(
		previous.begin(), previous.end(),
		current.begin(), current.end(),
		back_inserter(unique1),
		back_inserter(unique2),
		back_inserter(changed));

These three vectors can then form the inputs to a three-pane PowerPlant window that displays the differences in three columns. A double-click in the Changed column could produce a dialog box that compares the item's attributes in each of the two sets.

A subtle point that is often overlooked is the fact that the elements of the output sequences do not need to be of the same type as the elements of the input sequences. Provided there is an appropriate assignment operator, elements will be converted as they are written to the output. The three output vectors above could have been vector<std::string> if there is an operator that can assign FolderItems to strings.

For testing, it may be appropriate to send the output to a text window. STL can help with this too. First a << operator must be written to output a FolderItem onto an ostream. Then, an STL ostream_iterator can be used to copy an entire vector to an ostringstream. The code looks like this:

	ostringstream text;
	if (unique1.size())
	{
		text << "Previous only:\r";
		copy(unique1.begin(), unique1.end(),
			ostream_iterator<FolderItem>(text, "\r"));
	}
	if (unique2.size())
	{
		text << "Current only:\r";
		copy(unique2.begin(), unique2.end(),
			ostream_iterator<FolderItem>(text, "\r"));
	}
	if (changed.size())
	{
		text << "Changed:\r";
		copy(changed.begin(), changed.end(),
			ostream_iterator<FolderItem>(text, "\r"));
	}

The string argument of the ostream_iterator will be output after every item. As used here, this will put each item onto a separate line. The generated text can then be assigned to a PowerPlant text window in one step:

mTextView->SetTextPtr(text.str().begin(), text.str().size());

Parting Shots

This article explores the principles behind the Standard Template Library in some depth, especially with regard to the role of iterators. It also introduces a representative slice of STL's capabilities in terms of algorithms, containers and iterators.

Unfortunately it is not possible to do proper justice to the Library in such a brief presentation, and the reader is encouraged to continue learning about STL through the many books and papers that have been written on the subject. Most good C++ books nowadays devote a significant proportion of their length to teaching the Standard Library, and indeed any C++ book that does not should be treated with some skepticism. Stroustrup (see Bibliography) devotes almost half the book to the Library, and uses library features in examples right from the first chapter.

STL in retrospect

STL encourages a very different style of programming. Very few loops are seen, and a variety of powerful library functions are used instead. Code is written in terms of logical operations on entire collections.

Although STL's capabilities are well thought-out and remarkably comprehensive, there are inevitably a few clumsy areas, and limitations that can be frustrating at times. However, as the standard begins to receive wide acceptance throughout the C++ industry, it is worth living with all of these limitations in order to reap the benefits of having a much-needed common standard. One has only to look at the ground that has been lost to other languages to realize how much the lack of a standard library has hurt the language. This is a pity, as the power and flexibility of the Standard Template Library are now far, far ahead of any equivalent capability available in other languages that don't implement templates.

Where there are limitations in STL, it is usually possible to overcome these relatively easily, due to the Library's highly extensible design. New containers can be added that will work seamlessly with all of the standard algorithms, and new algorithms that work with all of the standard containers. New iterators can be developed that allow existing algorithms to perform in ways not previously possible. All of these additions can be made by the application developer as required, either on an ad-hoc basis for a single application, or as part of a private library of reusable capabilities. Either way, code quality and clarity is greatly enhanced.

C++ in retrospect

Much has been said about this remarkable language, both good and bad. Whatever else is said, there is no denying that it is lean, elegant, expressive and advanced. These are characteristics to die for, amongst serious programmers. Hard work and discipline are needed in reaping the benefits, but the results are admirably worth it.

It is often said that C++ is a 'large' language, in that there is a lot to learn and remember in order to become proficient in using it. It certainly didn't start out as a large language-in the early days it was considered a big improvement on Ada in that respect! C++ is actually based around a relatively small number of features thoughtfully and consistently applied. If these concepts are discovered and learned, the rest of the language will inevitably fall into place.

C++ has some unique features. Well-applied, these can make a huge difference to our software development. Learn them, and use them!

Bibliography

I have refrained from including a bibliography with every article. Everything I have written about may be found in any good C++ book, and there is no shortage of those. However, I cannot let this series pass without acknowledging the incredible debt that we owe to Bjarne Stroustrup, the creator and refiner of C++, and I can think of no better C++ reference than his own. It is comprehensive, and provides unique insight into how the language is designed and the best way to use it.

  • Stroustrup, Bjarne: The C++ Programming Language. 3rd edn. Addison-Wesley. Reading, Mass. 1997.


Neil Mayhew works for Wycliffe Bible Translators, a non-profit organization dedicated to translating the Bible for the world's 400 million people that do not have it in their own language. Neil started programming in C in 1983, and graduated to the Mac in 1989. When he's not at his Mac or trying to beat his kids at video games you might find him flying a stunt kite if it's windy, or throwing a boomerang if it's not.

 
AAPL
$97.19
Apple Inc.
+2.47
MSFT
$44.87
Microsoft Corpora
+0.04
GOOG
$595.98
Google Inc.
+1.24

MacTech Search:
Community Search:

Software Updates via MacUpdate

Firefox 31.0 - Fast, safe Web browser. (...
Firefox for Mac offers a fast, safe Web browsing experience. Browse quickly, securely, and effortlessly. With its industry-leading features, Firefox is the choice of Web development professionals... Read more
Little Snitch 3.3.3 - Alerts you to outg...
Little Snitch gives you control over your private outgoing data. Track background activityAs soon as your computer connects to the Internet, applications often have permission to send any... Read more
Thunderbird 31.0 - Email client from Moz...
As of July 2012, Thunderbird has transitioned to a new governance model, with new features being developed by the broader free software and open source community, and security fixes and improvements... Read more
Together 3.2 - Store and organize all of...
Together helps you organize your Mac, giving you the ability to store, edit and preview your files in a single clean, uncluttered interface. Smart storage. With simple drag-and-drop functionality,... Read more
Cyberduck 4.5 - FTP and SFTP browser. (F...
Cyberduck is a robust FTP/FTP-TLS/SFTP browser for the Mac whose lack of visual clutter and cleverly intuitive features make it easy to use. Support for external editors and system technologies such... Read more
iExplorer 3.4 - View and transfer all th...
iExplorer is an iPhone browser for Mac lets you view the files on your iOS device. By using a drag and drop interface, you can quickly copy files and folders between your Mac and your iPhone or... Read more
Airmail 1.4 - Powerful, minimal email cl...
Airmail is a powerful, minimal mail client.It was designed to retain the same experience with a single or multiple accounts and provide a quick, modern and easy-to-use user experience. Airmail... Read more
Macs Fan Control 1.1.12 - Monitor and co...
Macs Fan Control allows you to monitor and control almost any aspect of your computer's fans, with support for controlling fan speed, temperature sensors pane, menu-bar icon, and autostart with... Read more
A Better Finder Rename 9.37 - File, phot...
A Better Finder Rename is the most complete renaming solution available on the market today. That's why, since 1996, tens of thousands of hobbyists, professionals and businesses depend on A Better... Read more
MacBook Air EFI Firmware Update 2.9 - Fo...
MacBook Air EFI Firmware Update is recommended for MacBook Air (Mid 2011) models. This update addresses an issue where systems may take longer to wake from sleep than expected and fixes a rare issue... Read more

Latest Forum Discussions

See All

Together for iOS (Productivity)
Together for iOS 1.0 Device: iOS Universal Category: Productivity Price: $9.99, Version: 1.0 (iTunes) Description: Together is an app for keeping things in one place. Notes, documents, images, movies, sounds, web pages and bookmarks... | Read more »
Traps n' Gemstones (Games)
Traps n' Gemstones 1.00 Device: iOS Universal Category: Games Price: $2.99, Version: 1.00 (iTunes) Description: LAUNCH SALE! 40% off, JULY ONLY! TRAPS N' GEMSTONES is an adventurous platform game, among gamers typically known as the... | Read more »
Soccer Physics (Games)
Soccer Physics 1.0 Device: iOS Universal Category: Games Price: $1.99, Version: 1.0 (iTunes) Description: One-button soccer game! So dumb it's fun. "Soccer Physics is probably the funniest football game you'll play on iOS" —... | Read more »
Ex-Angry Birds Developers Release Monsu...
Ex-Angry Birds Developers Release Monsu Teaser Trailer Posted by Jennifer Allen on July 23rd, 2014 [ permalink ] Finnish developer Boomlagoon has released a teaser trailer of their forthcoming side-scrolling action platformer, | Read more »
Lots of New Modes Have Been Added to Can...
Lots of New Modes Have Been Added to Canabalt Posted by Jennifer Allen on July 23rd, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »
Stronghold 3: The Campaigns Review
Stronghold 3: The Campaigns Review By Jennifer Allen on July 23rd, 2014 Our Rating: :: DULL STRATEGIZINGiPad Only App - Designed for the iPad A cumbersome strategy game, Stronghold 3: The Campaigns has a few too many issues to... | Read more »
Table Tennis Touch on Sale for a Limited...
Table Tennis Touch on Sale for a Limited Time Posted by Jessica Fisher on July 23rd, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »
Secret Files Tunguska Review
Secret Files Tunguska Review By Jennifer Allen on July 23rd, 2014 Our Rating: :: CONSPIRACY-LITTERED ADVENTURINGUniversal App - Designed for iPhone and iPad Offering traditional adventuring with no fear of in-app purchases, Secret... | Read more »
Celebrate Summer With a Cat in the Hat L...
Celebrate Summer With a Cat in the Hat Learning Library Sale Posted by Ellis Spice on July 22nd, 2014 [ permalink ] Universal App - Designed for iPhone and iPad | Read more »
Dragon Raiders Review
Dragon Raiders Review By Nadia Oxford on July 22nd, 2014 Our Rating: :: RUN, DRAGON, RUNUniversal App - Designed for iPhone and iPad Dragon Raiders is rough and scaly in some parts, but overall it’s an enjoyable level-based running... | Read more »

Price Scanner via MacPrices.net

What Should Apple’s Next MacBook Priority Be;...
Stabley Times’ Phil Moore says that after expanding its iMac lineup with a new low end model, Apple’s next Mac hardware decision will be how it wants to approach expanding its MacBook lineup as well... Read more
ArtRage For iPhone Painting App Free During C...
ArtRage for iPhone is currently being offered for free (regularly $1.99) during Comic-Con San Diego #SDCC, July 24-27, in celebration of the upcoming ArtRage 4.5 and other 64-bit versions of the... Read more
With The Apple/IBM Alliance, Is The iPad Now...
Almost since the iPad was rolled out in 2010, and especially after Apple made a 128 GB storage configuration available in 2012, there’s been debate over whether the iPad is a serious tool for... Read more
MacBook Airs on sale starting at $799, free s...
B&H Photo has the new 2014 MacBook Airs on sale for up to $100 off MSRP for a limited time. Shipping is free, and B&H charges NY sales tax only. They also include free copies of Parallels... Read more
Apple 27″ Thunderbolt Display (refurbished) a...
The Apple Store has Apple Certified Refurbished 27″ Thunderbolt Displays available for $799 including free shipping. That’s $200 off the cost of new models. Read more
WaterField Designs Unveils Cycling Ride Pouch...
High end computer case and bag maker WaterField Designs of San Francisco now enters the cycling market with the introduction of the Cycling Ride Pouch – an upscale toolkit with a scratch-free iPhone... Read more
Kingston Digital Ships Large Capacity Near 1T...
Kingston Digital, Inc., the Flash memory affiliate of Kingston Technology Company, Inc.,has announced its latest addition to the SSDNow V300 series, the V310. The Kingston SSDNow V310 solid-state... Read more
Apple’s Fiscal Third Quarter Results; Record...
Apple has announced financial results for its fiscal 2014 third quarter ended June 28, 2014, racking up quarterly revenue of $37.4 billion and quarterly net profit of $7.7 billion, or $1.28 per... Read more
15-inch 2.0GHz MacBook Pro Retina on sale for...
B&H Photo has the 15″ 2.0GHz Retina MacBook Pro on sale for $1829 including free shipping plus NY sales tax only. Their price is $170 off MSRP. B&H will also include free copies of Parallels... Read more
Apple restocks refurbished Mac minis for up t...
The Apple Store has restocked Apple Certified Refurbished Mac minis for up to $150 off the cost of new models. Apple’s one-year warranty is included with each mini, and shipping is free: - 2.5GHz Mac... Read more

Jobs Board

Senior Interaction Designer, *Apple* Online...
**Job Summary** Apple is looking for a hands on Senior…will be a key player in designing for the Apple Online Store. The ideal designer will have a Read more
*Apple* Sales Chat Rep - Apple (United State...
…is looking for motivated, outgoing, and tech savvy individuals who want to offer Apple Customers an unparalleled customer experience over chat. At Apple , we believe Read more
Mac Expert - *Apple* Online Store Mexico -...
…MUST be fluent in English and Spanish to be considered for this position At Apple , we believe that hard work, a fun environment, creativity and innovation fuel the Read more
*Apple* Industrial Design CAD Sculptor - App...
**Job Summary** The Apple Industrial Design team is looking for a CAD sculptor/Digital 3D modeler to create high quality CAD models used in the industrial design process Read more
*Apple* Developer Support Advisor - Portugue...
**Job Summary** Imagine what you could do here. At Apple , great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring Read more
All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.