Subject: [ANN] macstl turns 0.1.3 today with new CodeWarrior support, and
65,566 distinct Altivec constants

Pixelglow Software is proud to announce that we’ve updated macstl to 0.1.3,
which now works with Metrowerks CodeWarrior 9. The new version also
features 65,566 different generated Altivec constants, and all the standard
Altivec operators in a neat object-oriented package. We’ve listened to our
users and written over 100 new pages of reference documentation on the
website, and added a comprehensive unit test regime.

macstl is a C++ source library designed to bring the Macintosh into the
world of modern generic programming. The cornerstone is a fast valarray
optimized for Altivec: it runs 3.9x – 18.2x faster than gcc 3.3 libstdc++
and 5.2x – 16.2x faster than Metrowerks MSL C++. Developers unfamiliar with
Altivec can write to a portable, intuitive and standard component, and just
flicking a single compiler switch will make it run fast on a G4 or G5, or
run correctly on other non-Altivec systems.
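
As a rough illustration of what "writing to a standard component" means, here is a minimal sketch that uses only the standard valarray interface. It is written against std::valarray because this announcement does not show the header or namespace that macstl itself uses; the function name saxpy is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Computes a*x + y element-wise using only the standard valarray interface.
// Nothing here is macstl-specific: std::valarray stands in for it, since the
// announcement does not give macstl's own header or namespace.
std::valarray<float> saxpy(float a,
                           const std::valarray<float>& x,
                           const std::valarray<float>& y)
{
    assert(x.size() == y.size());
    return a * x + y;   // whole-array arithmetic, no explicit loop
}
```

Code in this style contains no Altivec-specific constructs, which is what would let a vectorized valarray be swapped in underneath it.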

The license is BSD-like, which means you can change, redistribute, resell,
chop up or torch the source code to your heart’s content without fee.
However, if you want to compile into object code for more than 30 days, you
should register a single developer for $99 or a site for $499. Object code
is similarly free from royalties and additional fees, and you get priority
support and a share of subsequent fees for code and debug contributions.

CodeWarrior Developers

macstl is particularly valuable for CodeWarrior developers, since the fast,
efficient Metrowerks compiler is hampered by a slow, basic valarray
implementation in its MSL library. On the other hand, macstl valarray is
10x faster on inline arithmetic than MSL C++, thanks to aggressive use of
the expression template technique. I’m especially interested in tests on
this compiler.
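
For readers who have not met expression templates, the following is a minimal sketch of the general technique, not macstl's actual code. All names (expr, farray, sum) are hypothetical. The idea is that a + b + c builds a small tree of lightweight objects, and the arithmetic only happens in a single loop when the tree is assigned to an array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base class that every expression derives from, passing itself as Derived,
// so generic operators can accept "any expression" without virtual calls.
template <typename Derived>
struct expr
{
    const Derived& self() const { return static_cast<const Derived&>(*this); }
    float operator[](std::size_t i) const { return self().at(i); }
    std::size_t size() const { return self().length(); }
};

// Leaf node: owns the storage.
class farray : public expr<farray>
{
public:
    farray(std::size_t n, float value) : data_(n, value) {}

    // Constructing from any expression evaluates it in one pass.
    template <typename E>
    farray(const expr<E>& e) : data_(e.size())
    {
        for (std::size_t i = 0; i != data_.size(); ++i)
            data_[i] = e[i];
    }

    float at(std::size_t i) const { return data_[i]; }
    std::size_t length() const { return data_.size(); }

private:
    std::vector<float> data_;
};

// Interior node: remembers its operands and adds lazily, element by element.
// It holds references, so it must be consumed within the same full expression.
template <typename L, typename R>
class sum : public expr<sum<L, R> >
{
public:
    sum(const L& l, const R& r) : l_(l), r_(r) {}
    float at(std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t length() const { return l_.size(); }

private:
    const L& l_;
    const R& r_;
};

// operator+ builds a node instead of computing a temporary array.
template <typename L, typename R>
sum<L, R> operator+(const expr<L>& l, const expr<R>& r)
{
    return sum<L, R>(l.self(), r.self());
}
```

With these definitions, farray r(a + b + c) runs one loop over the elements and allocates no intermediate arrays.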

Altivec Developers

Programmers who want to get down and dirty with direct Altivec programming
now get the full benefit of working with C++ classes, including intuitive
infix syntax for arithmetic operators, sensible function overloading and
most of the valarray functionality, like cshifting and summing within a
vector. An aggressive optimizer like gcc means the objects and temporaries
will live inside registers and not touch memory. And Holger Bettag’s famous
65536 + 30 generated constants are now in a convenient template function,
allowing neat tricks like compile-time arithmetic for constant selection.
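
To give a flavor of compile-time constant selection, here is a portable sketch under loud assumptions: byte_vector, splat and low_bits_mask are hypothetical names, and a std::array stands in for a real Altivec register. The actual macstl template function and its generated instruction sequences are not shown in this announcement; the sketch only illustrates how a value computed at compile time can select which constant template gets instantiated.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a 16-byte vector register.
using byte_vector = std::array<unsigned char, 16>;

// Hypothetical constant generator: the value is a template argument, so each
// distinct constant is a distinct instantiation chosen at compile time.
// (A real Altivec version would emit an instruction sequence here rather
// than fill an array.)
template <unsigned char Value>
byte_vector splat()
{
    byte_vector v;
    v.fill(Value);
    return v;
}

// Compile-time arithmetic on the template argument picks the constant.
template <unsigned Shift>
byte_vector low_bits_mask()
{
    static_assert(Shift < 8, "shift must fit in a byte");
    return splat<static_cast<unsigned char>((1u << Shift) - 1u)>();
}
```

For example, low_bits_mask<3>() instantiates splat<7>(), and the choice is made entirely by the compiler.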

C++ Aficionados

You can browse the internal references or download the code to look at some
portable, cutting-edge tips and techniques (I intend that valarray should
compile successfully on non-OS X systems):
* Expression templates with a twist: curiously recursive inheritance to
share code and minimize object size, creating an open-ended type system.
* Template template parameters as functors to switch between scalar and
vector code.
* Limited typeof using SFINAE to synchronize Altivec classes to the
inbuilt types and operators.
* Templates that both share code and have different functionality without
up-front policy, partial specialization or virtual inheritance — a sort of
internalized policy-based design.
* Partial specialization as an expression template pattern-matching
technique.
* Valarray calculation using STL iterators to allow algorithms to scale
* Versions of std algorithms further tuned for random access iterators:
indexing rather than incrementing yields aliasing improvements.
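
The last item in the list can be sketched as follows. This is an illustration of the general idea only, with hypothetical names (tuned_copy_n, copy_n_impl, is_random_access); it is not the macstl source. When both iterators are random access, the loop indexes from fixed base iterators instead of incrementing them, which is the form the list above credits with aliasing improvements.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// True when Iter's category is random access (or derived from it).
template <typename Iter>
using is_random_access = std::is_base_of<
    std::random_access_iterator_tag,
    typename std::iterator_traits<Iter>::iterator_category>;

// General version: advances both iterators one step at a time.
template <typename In, typename Out>
Out copy_n_impl(In first, std::ptrdiff_t n, Out result, std::false_type)
{
    for (; n > 0; --n, ++first, ++result)
        *result = *first;
    return result;
}

// Random access version: the iterators stay fixed and only the index changes.
template <typename In, typename Out>
Out copy_n_impl(In first, std::ptrdiff_t n, Out result, std::true_type)
{
    for (std::ptrdiff_t i = 0; i != n; ++i)
        result[i] = first[i];
    return result + n;
}

// Picks the indexed loop only when both iterators support indexing.
template <typename In, typename Out>
Out tuned_copy_n(In first, std::ptrdiff_t n, Out result)
{
    assert(n >= 0);
    typedef std::integral_constant<bool,
        is_random_access<In>::value && is_random_access<Out>::value> indexed;
    return copy_n_impl(first, n, result, indexed());
}
```

Copying between two vectors takes the indexed path, while copying into a std::back_inserter falls back to the incrementing loop.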


* Now builds with Metrowerks CodeWarrior 9
* Now builds with August 2003 gcc 3.3 Updater.
* Added 65,566 generated Altivec constants, using optimal instruction
sequences without memory loads [HBe].
* Added all intrinsic Altivec operators.
* Added unit tests for operators and scalarizers against the builtin
* Added doxygen comments to most public classes and some internal classes.
* Added cshift member to altivec class.
* Added pixel class.

* Now works correctly as #included system headers: added include guards,
fixed internal #includes.
* Simplified term use of template template functors, used explicit
enchunking of functors.
* Removed libstdc++ dependencies: added destroy_n algorithm and identity
functor, rewrote uninitialized_copy_n and copy_n, remapped type traits

* Fixed terms including )) and (( not compiling.
* Fixed initial valarray fill not compiling if the type had a non-trivial
* Fixed valarray of bool compiling incorrectly if sizeof (bool) != 4.
* Fixed binder1st/2nd with differing argument types not compiling.
* Fixed operator- of some term iterators not compiling.
* Fixed incorrect value for chunked sum of char.

Glen Low, Pixelglow Software