
Beat the Optimizer
Volume Number: 7
Issue Number: 4
Column Tag: Fortran's World

Beat The Optimizer!

By Jonathan Bell, Clinton, SC

Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.

[Jon earned a Ph.D. in elementary particle physics from the University of Michigan by doing a lot of number-crunching in FORTRAN on a VAX. Now he teaches physics and computer science at Presbyterian College in Clinton, SC, where he first saw a Macintosh and fell in love with it at first sight. Although he now does much of his teaching and programming in Pascal, and dabbles in assembler, he’s still fond of FORTRAN, especially the dialects with post-FORTRAN-77 extensions.]

Introduction

In heavy-duty number-crunching applications, the efficiency of the object code is a major consideration. For this reason, minicomputer and mainframe compilers can usually optimize the code in various ways. Most Macintosh compilers until recently have not done a very good job with optimization, but this is starting to change. One example of this is the Language Systems FORTRAN compiler, version 2.0, which runs under the Macintosh Programmer’s Workshop (MPW). As Jörg Langowski has pointed out (MacTutor v6#3), this new compiler does a creditable job of optimizing array-index calculations in code involving two-dimensional arrays.

This does not mean that assembly language programming has been rendered obsolete. In this article I would like to demonstrate (a) that it is still relatively easy to “beat the optimizer” with hand-coded assembly language, and (b) that assembly language code does not have to be especially “tricky” or obscure to be effective.

This article will also demonstrate how to use both implementations of Apple’s floating-point arithmetic package: “software SANE”, which is available on all Macintoshes via a package in the System file, and “hardware SANE”, which is available on 68020 and 68030 machines (Mac II or higher) via the 68881 or 68882 floating-point coprocessor chip. (Whenever I mention the 68030, I will also implicitly include the 68020; likewise the 68882 will also “include” the 68881.)

I will not attempt to teach assembly language programming in general, but will assume that you have at least some reading knowledge of 680x0 assembly language. For those who are interested in learning about assembly language, some references are listed at the end of the article.

Matrix Multiplication

As our “test case” I will use matrix multiplication. To refresh your memory (or perhaps initialize it), recall that a matrix is simply a two-dimensional table of numbers, typically represented in a computer program by a two-dimensional array. By convention, we let the first and second indices be the row and column numbers respectively. Thus in Pascal we write A[5,2] (in C, A[5][2]) for the number in the fifth row and second column of matrix A. In FORTRAN we use parentheses rather than square brackets: A(5,2).

If the number of columns in matrix A is the same as the number of rows in matrix B, or equivalently, if a row of A has the same length as a column of B, then we can multiply A and B to form a new matrix, C. To calculate C(5,2), for example, we take the fifth row of A and the second column of B, “match up” the corresponding elements, multiply them pairwise and finally add up the products:

        C(5,2) = A(5,1) * B(1,2)  
               + A(5,2) * B(2,2)  
               + A(5,3) * B(3,2)  
               + ...  
               + A(5,M) * B(M,2).

Multiplying two matrices requires three nested loops. The two outer loops cycle over the rows and columns of C, while the innermost loop cycles over the terms in the sum for each element of C. In Language Systems FORTRAN we might write the following subroutine to perform matrix multiplication:

      SUBROUTINE MXMPY (A, B, C, L, M, N)
C------------------------------------------------
C  Performs the matrix multiplication A*B = C.  
C  The dimensions of the matrices must be related
C  as shown below.
C------------------------------------------------
      IMPLICIT NONE
      INTEGER I, J, K, L, M, N
      EXTENDED A(L,M), B(M,N), C(L,N), SUM
      DO I = 1, L
         DO J = 1, N
            SUM = 0.0X0
            DO K = 1, M
               SUM = SUM + A(I,K) * B(K,J)
            END DO
            C(I,J) = SUM
         END DO
      END DO
      END

If you are familiar with FORTRAN, you will note that this code is not standard FORTRAN 77. It uses several “extensions” of the language:

1. Each DO-loop is terminated with an END DO statement rather than a labeled CONTINUE statement. The corresponding DO statements do not use a statement number (label) to specify the end of the loop.

2. The IMPLICIT NONE statement turns off FORTRAN’s default type-specification rules and forces the programmer to declare all variables with their data types, as in Pascal.

3. The EXTENDED data type is an 80-bit or 96-bit floating point number as used by the SANE packages on the Mac; which one is used depends on whether the routine is compiled with the -mc68882 compiler option. Alternatively, for portability to other systems, floating-point variables can be declared as REAL (32 bits) or DOUBLE PRECISION (64 bits), in which case the compiler’s -extended option can be used to force them to be compiled as if they were EXTENDED.

4. The constant 0.0 should be written in exponential notation as 0.0X0 to tell the compiler that it’s supposed to be stored in EXTENDED format. Otherwise it will be stored as a 32-bit REAL and converted to EXTENDED whenever used in calculations, which wastes time. Again, the compiler’s -extended option will force REAL constants to be stored as EXTENDED.

Non-FORTRAN programmers should pay special attention to the “variable dimensions” of the arrays A, B and C. In a FORTRAN subroutine, an array argument (or parameter, to you Pascal and C people) may be specified with variable dimensions which are also specified as arguments. Both the array and the dimensions must be arguments, not local variables. This allows a FORTRAN programmer to write a subroutine which can perform the same operation on arrays and matrices of different sizes.

Speed Tests

I wrote a simple test program [Listing #1] which sets up a matrix, multiplies that matrix by itself ten times, records the time using the Toolbox TickCount function, and divides by ten to get the time for a single matrix multiplication. The program repeats this process for 5x5, 10x10, 20x20 and 50x50 matrices. (I will present results only for 50x50.)
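
Since TickCount runs at roughly 60 ticks per second, converting the measured ticks to seconds per multiplication is simply a matter of dividing by 60 * 10 = 600, which is what the TIMER subroutine in Listing #1 does.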

I also wanted to find out what proportion of this time was spent in actual floating-point calculations as opposed to array-indexing overhead. To do this, I compiled the subroutine to assembly-language source code (using the compiler’s -a option), commented out the floating-point operations, renamed it DUMMY, and assembled it using the MPW Assembler. The test program runs DUMMY ten times, using the same matrix sizes as with the “real” MXMPY.

The Language Systems FORTRAN compiler has four levels of optimization, numbered from 0 (no optimization) to 3 (maximum optimization). Repeating the timing tests for each level revealed that for this example, levels 0 and 1 give the same results, as do levels 2 and 3. (Of course, I constructed a separate DUMMY for each optimization level!)

The timings I obtained for 50 x 50 matrix multiplication using the standard SANE library (“software SANE”) are shown in lines 1 and 2 of the table at the end of the article, for each of the two levels of optimization, on a Mac SE. The optimization does in fact reduce the overhead significantly. However, the overhead is only a small fraction of the total execution time, because software SANE is very slow, so we gain a speed increase of only about 12%.

The only way to speed up software SANE significantly is to switch to a faster machine. Lines 3 and 4 of the table show results for a Mac SE/30, demonstrating a speed increase of about 5x for both levels of optimization.

We can improve the SE/30 figures by taking advantage of the 68882 floating-point coprocessor (FPU). The Language Systems FORTRAN compiler allows you to specify that you want code which uses the FPU directly (“hardware SANE”). Recompiling the subroutines and test program using the -mc68030 and -mc68882 options (and creating a new dummy subroutine) gave the results shown in lines 5 and 6 of the table.

Overall, the hardware-SANE versions are 6.6x (opt=1) and 7.9x (opt=2) faster than the software-SANE versions. Interestingly, the indexing overhead runs faster in the hardware-SANE versions, probably because fewer machine-language instructions are necessary to set up the actual floating-point calculations.
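
(These factors come straight from the table: comparing lines 4 and 6, for example, gives 22.755 / 2.873, or about 7.9.)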

How Matrices Work

Before proceeding further, we need to look more closely at how matrices are implemented in high-level languages. When we store data in a matrix, just where does it go, and when we retrieve data from a matrix, where does the program look for it?

Normally we picture a matrix as a table of numbers, laid out in a neat two-dimensional pattern of rows and columns. However, computer memory is not two-dimensional. It is a one-dimensional linear sequence of addresses. A programming language that supports matrices must somehow “map” two dimensions into one. There are two obvious ways to do this.

One way is to take the rows of the matrix one after the other and lay them out “end to end”, forming a single long horizontal line. If we start at one end of the line and examine one element after another in sequence until we come to the other end, we find that the second index (the column number) passes through its complete range of values repeatedly, with the first index (the row number) changing by one after each cycle. This is called row-major ordering. It is the method used by Pascal, C, and most other languages for allocating matrices.

The other way, as you might expect, is to take the columns of the matrix one after the other and lay them out “end to end”, forming a single long vertical line. Now, as we step from one element to the next, the first index varies more rapidly than the second. This is called column-major ordering, and is used by FORTRAN, alone among the major programming languages.
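
To see column-major ordering in action, here is a small illustrative FORTRAN program (a sketch, not one of the numbered listings). Because the EQUIVALENCE statement forces A and V to share the same storage, printing V shows the elements of A one column at a time:

      PROGRAM COLMAJ
      IMPLICIT NONE
      INTEGER I, J
      REAL A(3,2), V(6)
      EQUIVALENCE (A(1,1), V(1))
      DO I = 1, 3
         DO J = 1, 2
            A(I,J) = 10.0 * I + J
         END DO
      END DO
C  Prints 11., 21., 31., 12., 22., 32.:  the first column
C  of A, then the second.
      DO I = 1, 6
         TYPE *, V(I)
      END DO
      END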

Using column-major ordering, we can easily verify that whenever a FORTRAN program needs to access an element of a matrix, it must compute the address as follows:

        addr B(K,J) = addr B(1,1)      [ the start of the matrix ]
                    + (J - 1) * n      [ completely filled columns ]
                    + (K - 1) * w      [ last, incomplete column ]

In this formula, n is the number of bytes in one complete column of the matrix, and w is the number of bytes in one matrix element. I’ve kept things simple by assuming that both the row and column indices have 1 as the lower bound, as is the most common practice; if you like, you can figure out the formula for the more general case as an exercise.
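
To make the formula concrete, take as an illustration a matrix B with 5 rows of 10-byte EXTENDED elements, so that n = 5 * 10 = 50 and w = 10; then

        addr B(3,2) = addr B(1,1) + (2 - 1) * 50 + (3 - 1) * 10
                    = addr B(1,1) + 70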

The indices J and K usually are variables, so their values have to be fetched from memory as the program runs. If we declare the matrix in the usual way, by specifying the size of the matrix explicitly, then the compiler can calculate the parameters n and w and insert them into the code as constants, possibly as immediate operands in arithmetic operations.

However, our example, MXMPY, uses FORTRAN’s variable dimension feature for array arguments to subroutines. In this case the compiler has no way of knowing what n and w will be for the particular matrices with which MXMPY is invoked. To solve this problem, Language Systems FORTRAN allocates an invisible six-element “bounds array” for each matrix argument to MXMPY, containing the following information:

1: lower bound of first index

2: upper bound of first index

3: number of bytes in one column

4: lower bound of second index

5: upper bound of second index

6: number of bytes in entire matrix
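
As an illustration, a 50 x 50 matrix of 10-byte EXTENDED elements would have the bounds array 1, 50, 500, 1, 50, 25000: each column occupies 50 * 10 = 500 bytes, and the 50 columns together occupy 25,000 bytes.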

At the beginning of MXMPY, therefore, the compiler must insert a sequence of instructions which uses the arguments L, M and N to initialize the bounds arrays for the matrices A, B and C. This takes about 40 instructions for each matrix, for a total of about 120. Fortunately they’re executed only once per call to the subroutine.

Now that we know about the bounds array, we can appreciate the code which Language Systems FORTRAN generates to calculate the address of B(K,J) in MXMPY. The compiler’s -a option produces an output file containing MPW Assembler source code corresponding to the compiled program, from which I obtained the following:

        MOVE.L  -164(A6),D2 ; K
        SUB.L   -136(A6),D2 ; bounds[1] of B
        EXG     D0,D2
        MOVE.L  D1,-(SP)
        MOVEQ   #10,D1
        JSR     F_IMUL  ; FORTRAN library call
        MOVE.L  (SP)+,D1
        EXG     D0,D2
        MOVE.L  D2,D1
        MOVE.L  -168(A6),D2  ; J
        SUB.L   -124(A6),D2  ; bounds[4] of B
        EXG     D0,D2
        MOVE.L  D1,-(SP)
        MOVE.L  -128(A6),D1  ; bounds[3] of B
        JSR     F_IMUL  ; FORTRAN library call
        MOVE.L  (SP)+,D1
        EXG     D0,D2
        ADD.L   D1,D2
        MOVEA.L 24(A6),A1   ; start address of B
        ADDA.L  D2,A1   ; address of B(K,J)

With optimization turned off, the compiler generates all 20 of the instructions above each time an array reference appears in the source code, regardless of whether the indices have changed since the last array reference. With optimization turned on, the compiler moves as many of the instructions as possible outside of loops, to avoid redundant recalculations. This decreases the execution time, as demonstrated above.

Clearly this address calculation could be done more compactly with carefully hand-written assembler code, even without changing the formula. Nevertheless, each calculation would still require two multiplications and four additions or subtractions.

Since we are accessing array elements in a regular sequence, rather than in a “random” fashion, we can do better than this. Using assembly language, we can step through the elements of each array in an orderly fashion which requires only one addition per array access. The key idea is to maintain a pointer into each array. To step to the element in the next row of the same column, we add 10 (the size of one element) to the pointer, since the elements of each column are stored contiguously. To step to the element in the next column of the same row, we add the number of bytes in one column.
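
The same idea can be sketched in FORTRAN before we turn to assembly language (an illustration only, not one of the numbered listings): treat each matrix as one long column-major vector and walk it with a single running index, adding 1 to step down a column of B and adding L, the length of one column of A, to step across a row of A.

      SUBROUTINE MXMPY1 (A, B, C, L, M, N)
      IMPLICIT NONE
      INTEGER I, J, K, L, M, N, IA, IB
      EXTENDED A(L*M), B(M*N), C(L,N), SUM
      DO J = 1, N
         DO I = 1, L
            SUM = 0.0X0
C  IA starts at A(I,1); IB starts at B(1,J).
            IA = I
            IB = (J - 1) * M + 1
            DO K = 1, M
               SUM = SUM + A(IA) * B(IB)
C  Step to the next column of A (same row) and to the
C  next row of B (same column).
               IA = IA + L
               IB = IB + 1
            END DO
            C(I,J) = SUM
         END DO
      END DO
      END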

Before looking at the details of how to do this, let’s look at how to do the floating-point arithmetic, using the Standard Apple Numerics Environment (SANE).

Software SANE in Assembly Language

Each Macintosh has three “packages” containing the SANE operations in its System file. FP68K contains the “fundamental operations” (arithmetic operations, some mathematical functions, and conversion routines). Elems68K contains the “elementary functions” (trig, log, exponential and a few others). DecStr68K contains scanners and formatters which convert floating-point values to character strings for display, and vice versa for input.

Most SANE routines take either one or two operands. Arithmetic operations take two operands, called the source and the destination. The result of the operation appears in the destination operand, wiping out its original value. For example, to perform the addition A + B = C, we might specify A as the source and B as the destination, in which case the value of C would replace the original value of B.

Other operations, such as the various mathematical functions (square root, sine, etc.) take only one operand, the destination operand. The result of the operation replaces the destination operand.

To pass operands to a SANE routine, we push their addresses onto the stack before calling the routine: first the source address (if any), then the destination address.

The SANE routines must be called using the trap dispatcher, rather than with a JSR. Each package uses a single Toolbox trap for access to all of its routines. To tell the package which operation we want, we must push an “opword” onto the stack, after the operand addresses. Normally, we don’t have to worry about looking up the various opwords, because the file SANEMacs.a (which comes with the MPW Assembler) defines a macro for each operation, which takes care of pushing the opword for you. The entire calling sequence for a single multiplication operation looks like this, using operands named “source” and “destination” which are defined in the current stack frame:

        PEA         source(A6)
        PEA         destination(A6)
        FMULX

SANE removes the operand addresses from the stack, and leaves the result in the destination.

Some of the more commonly used SANE routines are:

(two operands)

FADDX    addition
FSUBX    subtraction
FMULX    multiplication
FDIVX    division

(one operand)

FSQRTX   square root
FLNX     natural logarithm
FEXPX    exponential (base e)
FSINX    sine
FCOSX    cosine
FTANX    tangent
FATANX   arc tangent

It’s worth noting that although the destination operand must always be an extended-precision number, we can use any numeric data type as a source operand, by using different macros. For example, FMULS assumes that the source is a 32-bit single-precision floating-point number. Nevertheless, calculations are always done using 80-bit extended precision, so we don’t gain any speed by using smaller formats for data. For simplicity, then, data should be stored in extended format.

MXMPY with Software SANE

Listing #2 presents an assembly-language version of MXMPY, using software SANE.

Interfacing assembly-language routines to Language Systems FORTRAN is very straightforward, because LSF uses the standard MPW Pascal conventions for setting up stack frames for the arguments and local variables. The calling routine first pushes the arguments onto the stack, in sequence from left to right as listed in the CALL statement; then it performs a JSR which pushes the return address onto the stack. When the subprogram returns, it must remove the return address and all the arguments from the stack, leaving nothing behind if the subprogram is a SUBROUTINE, or leaving only the function result behind if the subprogram is a FUNCTION.

The only “twist” is that standard FORTRAN always passes subroutine arguments by reference (by pushing the address of the argument onto the stack) rather than by value (pushing the argument itself onto the stack). This actually makes it simpler to write assembly-language subroutines for LSF than for other languages. We simply allocate four bytes in the stack frame for each argument, no matter what type of data it is! (LSF does allow passing arguments by value when necessary, as an extension to standard FORTRAN.)
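
Putting these conventions together with the EQUates in Listings #2 and #3, the stack frame that MXMPY sees just after its LINK instruction looks roughly like this:

      offset from A6     contents
      --------------     ---------------------------------
           28            address of A   (pushed first)
           24            address of B
           20            address of C
           16            address of L
           12            address of M
            8            address of N   (pushed last)
            4            return address (pushed by the JSR)
            0            caller’s A6    (saved by LINK)
        negative         local variables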

The overall logic of the assembly-language version of MXMPY mimics the FORTRAN version. It contains three nested loops, of which the outer two scan through the rows and columns of the product array, C, and the innermost loop cycles over the terms of the sum which produces each individual element of C. The main innovation, as mentioned before, is that we step “intelligently” through each array by incrementing a pointer, rather than by recalculating each address from scratch every time.

Exactly how we increment the pointer depends on whether we need to step along a row or a column of the array. If we are stepping across a row of an array, we increment the pointer explicitly by the number of bytes in one column. If we are stepping down a column, we let the autoincrement indirect addressing mode increment the pointer for us.

Inside the innermost loop is the heart of the routine: the code which accumulates an element of C as the sum of products of elements of A and B. Here we use the SANE floating-point operations, as implemented in software via a package in the System file.

Just as with the FORTRAN version, I constructed a “dummy” subroutine by commenting out the SANE calls and operand-pushing instructions. Assembling the two subroutines and linking them with the same main program as before, I obtained the execution times listed in lines 7 and 8 of the table.

The indexing overhead of the assembly-language version is significantly smaller than that of the FORTRAN opt=2 version, on both the SE and the SE/30. For some reason, the improvement is smaller on the SE/30 than on the SE. Nevertheless, the total time doesn’t change very much. Overall, the assembly-language version is only 3.5% faster on the SE, and 8.0% faster on the SE/30. Not really worth bragging about, is it?

Hardware SANE in Assembly Language

Simply put, software SANE is such a CPU hog that once we’ve minimized the number of floating-point operations we use, we can’t gain significantly more speed. On the other hand, if we’re using hardware SANE on an FPU-equipped machine, the percentage of overhead is significantly larger. This suggests that we might be able to gain a really significant speed increase by using assembly language to cut down the overhead.

The 68882 coprocessor effectively extends the instruction set and register set of the 68030 CPU. The programmer gains eight new registers, FP0-FP7, for temporary storage of floating-point numbers, and a series of instructions for moving floating-point numbers to and from the coprocessor, and for performing arithmetic on those numbers.

The FPU works with 80-bit extended-precision data, just as the software SANE routines do, but its format in memory is slightly different. Although the FPU registers hold 80 bits each, the “natural” memory format is 96 bits (12 bytes). The extra bits are empty; their only purpose is to allow both the mantissa and exponent to be aligned to longword boundaries, so that the FPU can access them more efficiently. Therefore, for maximum speed, we should store all our floating-point data in 96-bit format. In our FORTRAN main program, EXTENDED data automatically takes up 12 bytes when the -mc68882 switch is used. In our assembly-language MXMPY, we need to change some of our equates and instructions to reflect the change from 10-byte format to 12-byte format.
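
Roughly sketched (the Apple Numerics Manual and the Motorola manual give the exact bit layouts), the two formats are:

   10-byte SANE extended:   [ sign + exponent: 2 bytes ][ mantissa: 8 bytes ]
   12-byte 68882 extended:  [ sign + exponent: 2 bytes ][ unused: 2 bytes ][ mantissa: 8 bytes ]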

It’s worth noting that, like most software SANE routines, most FPU instructions can handle data which is stored in other formats besides 96-bit extended. All calculations are still performed in extended precision, however, so it is best to keep floating-point data in the 96-bit format to begin with, unless memory space is at a premium.

Just as the basic 680x0 integer arithmetic instructions require at least one of the operands to be in a CPU register (A0-A7 and D0-D7), the 68882 floating-point instructions require at least one of the operands to be in a FPU register. Therefore our general sequence of operations must be as follows: first move the destination operand into one of the FPU registers, then do the arithmetic, then move the result out of the FPU register and back into program memory. Our previous multiplication example might look like this (the extension .X indicates that the instruction acts on 12-byte extended-precision data):

        FMOVE.X     destination(A6), FP0
        FMUL.X      source(A6), FP0
        FMOVE.X     FP0, destination(A6)

Some typical 68882 instructions are listed below, corresponding to the ones listed for software SANE. They fall into three groups, depending on what kinds of operands they accept. The second column shows the possible operand combinations for all operations in a group: <ea> indicates any valid memory addressing mode, while FPn and FPm indicate FPU registers. The first operand is the source, while the second is the destination.

Group 1:

FADD.X    <ea>, FPn    addition
FSUB.X    FPn, FPm     subtraction
FMUL.X                 multiplication
FDIV.X                 division

Group 2:

FLOGN.X   <ea>, FPn    natural logarithm
FETOX.X   FPm, FPn     exponential (base e)
FSIN.X    FPn, FPn     sine
FCOS.X                 cosine
FTAN.X                 tangent
FATAN.X                arc tangent

Group 3:

FMOVE.X   <ea>, FPn    copy data
          FPm, FPn
          FPn, <ea>

The Motorola manual for the 68881/68882 contains a complete listing of all the FPU instructions, as does Steve Williams’s book (see the list of references).

MXMPY with Hardware SANE

Listing #3 presents an assembly-language version of MXMPY which uses the FPU directly.

For speed, we should keep floating-point data in floating-point registers as much as possible. Therefore, MXMPY uses two FPU registers, one to hold the product which makes up each term, and one to accumulate the sum of all the terms. We don’t actually move any results out of the registers until we finish calculating a matrix element of C.

The sum register has to be initialized to zero. However, there’s no CLR.X instruction which clears an FPU register in the same way a CLR.L clears a normal register. Instead, we use a special instruction, FMOVECR (“floating move constant to register”), which can store one of several different constants into an FPU register. The first operand of FMOVECR is a “code number” which selects the constant. Code $00 gives you pi, code $0C gives you “e”, and code $0F gives you zero, among others. (Weird, isn’t it?)

Once again, I prepared a dummy version, DUMMY, in which the floating-point arithmetic was commented out, and linked the two subroutines to the test program, which had been recompiled with the -mc68030 and -mc68882 options. Running the resulting program on an SE/30 gave the results listed in line 9 of the table.

This version is more than twice as fast as the optimized FORTRAN version. Not only have we cut down the indexing overhead, we have also made the floating-point operations more efficient by keeping intermediate results in floating-point registers.

We also gain a substantial reduction in code size. The assembly-language FPU version of MXMPY occupies only 153 bytes, whereas the FORTRAN version (opt=2) occupies 624 bytes.

This isn’t the last word in speed, though. The innermost loop of MXMPY could be “unrolled” to calculate two (or more) terms on each pass, instead of just one. Combined with careful use of the floating-point registers to take advantage of the 68882’s “pipelining” capability (overlapping of instructions), this could significantly speed up the program. (This won’t do as much good on a 68881, which can’t pipeline.)
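
To illustrate the unrolling idea at the source level (a sketch only, not code I have timed), the inner loop could keep two partial sums, which is the high-level analogue of using two FPU registers so that the 68882 can overlap operations. For simplicity the sketch assumes M is even; an odd M would need one cleanup term after the loop.

      SUBROUTINE MXMPY2 (A, B, C, L, M, N)
      IMPLICIT NONE
      INTEGER I, J, K, L, M, N
      EXTENDED A(L,M), B(M,N), C(L,N), SUM1, SUM2
      DO I = 1, L
         DO J = 1, N
            SUM1 = 0.0X0
            SUM2 = 0.0X0
C  Two terms per pass through the loop, accumulated separately.
            DO K = 1, M, 2
               SUM1 = SUM1 + A(I,K)   * B(K,J)
               SUM2 = SUM2 + A(I,K+1) * B(K+1,J)
            END DO
            C(I,J) = SUM1 + SUM2
         END DO
      END DO
      END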

This would require some careful work and testing, and would make MXMPY harder to understand and maintain. I would like to emphasize that even the current, “unoptimized” version of MXMPY is still significantly faster than the FORTRAN version, demonstrating that assembly-language code doesn’t have to be “tricky” to be effective.

Executive Summary

If you’re using “software SANE”, don’t bother with assembly language. You wouldn’t gain much speed because the execution time is dominated by the SANE calls. It is worthwhile to examine your compiled code to make sure that you aren’t doing unnecessary SANE calls, and hand-optimize the FORTRAN code if necessary.

If you’re using “hardware SANE”, it may be worthwhile to rewrite key routines in assembler. Even relatively straightforward code can bring significant speed increases.

Table of Results (50 x 50 Matrix Multiplication; times in seconds)

      Mach    lang   opt   SANE     total      calcs      ovhd    ovhd %
 1    SE      LSF     1     sw     128.008    108.672    19.337    15.1
 2    SE      LSF     2     sw     114.628    108.620     6.008     5.2
 3    SE/30   LSF     1     sw      26.283     21.773     4.510    17.2
 4    SE/30   LSF     2     sw      22.755     21.653     1.102     4.8
 5    SE/30   LSF     1     hw       3.998      2.182     1.817    45.4
 6    SE/30   LSF     2     hw       2.873      2.158     0.715    24.9
 7    SE      Asm     -     sw     110.772    108.735     2.037     1.8
 8    SE/30   Asm     -     sw      21.070     20.575     0.495     2.3
 9    SE/30   Asm     -     hw       1.355      1.192     0.163    12.0

References

Apple Computer, Inc. Apple Numerics Manual, 2nd ed. (Addison-Wesley, 1988). Complete documentation of SANE for both the Apple II and Macintosh families.

Kane, Jeffrey. “Assembly Language for the Rest of Us.” MacTutor, vol. 5, no. 12 (December 1989). A good “user-friendly” introduction to Macintosh assembly-language programming, using an example similar to the one presented here.

Knaster, Scott. How to Write Macintosh Software (2nd ed.) (Hayden Books, 1988). Has good descriptions of how subroutines work in most Macintosh languages, and a good overview of assembly language, written with the goal of helping the reader understand a compiler’s output.

Motorola, Inc. MC68000 8-, 16-, 32-Bit Microprocessors User’s Manual, 6th ed. (Prentice-Hall, 1988). The “bible” for the CPU used in the Mac Plus, SE and Portable.

Motorola, Inc. MC68020 32-Bit Microprocessor User’s Manual, 3rd ed. (Prentice-Hall, 1988). The “bible” for the CPU used in the Mac II.

Motorola, Inc. MC68030 Enhanced 32-Bit Microprocessor User’s Manual (Prentice-Hall, 1988). The “bible” for the CPU used in the Mac SE/30, IIx, IIcx, IIci, and IIfx.

Motorola, Inc. MC68881/2 Floating-Point Coprocessor User’s Manual, 2nd ed. The “bible” for the FPUs used in the 68020- and 68030-based Macs.

Weston, Dan. The Complete Book of Macintosh Assembly Language Programming, vol. 1 (Scott, Foresman, 1986) and vol. 2 (1987). Still the best introduction to Macintosh assembly language. Its one major drawback is that it uses the CDS (formerly MDS) assembler, rather than the MPW assembler.

Williams, Steve. 68030 Assembly Language Reference (Addison-Wesley, 1989). Includes a complete listing of instructions for all members of the 680x0 family, plus the 6888x FPU’s and the 68851 memory management unit. Also has code examples and discusses Macintosh programming conventions. You could probably get along without the Motorola manuals if you have this.

Listing #1: Timing program

!!M Inlines.f
C  The compiler directive above gives us access
C  to the Macintosh Toolbox.
C------------------------------------------------
C  This program determines the time needed to
C  perform matrix multiplication, and also finds
C  out how the total time is divided between
C  actual floating-point calculations and index-
C  ing overhead.
C
C  April 1990.
C  Jon Bell, Dept. of Physics & Computer Science
C  Presbyterian College, Clinton SC.
C
C  Written in Language Systems FORTRAN, v2.0.
C------------------------------------------------
      IMPLICIT NONE
      INTEGER I, J
      EXTENDED TOTAL, OVERHEAD, CALCS
      EXTENDED A(3,5), B(5,3), C(3,3)
      DATA ((A(I,J), J=1,5), I=1,3)
     &     /  -2.0, -7.0, -4.0,  8.0,  1.0,
     &        -3.0,  0.0, -1.0,  0.0, -9.0,
     &         0.0, -1.0, -8.0, -9.0, -3.0  /
      DATA ((B(I,J), J=1,3), I=1,5)
     &     /  -1.0,  3.0, -2.0,
     &         4.0, -1.0,  1.0,
     &         9.0, -4.0, -8.0,
     &        -5.0,  0.0, -3.0,
     &         6.0,  3.0, -6.0  /
C
C  First demonstrate that the matrix 
C  multiplication works properly.
C
      CALL MXMPY (A, B, C, 3, 5, 3)
      TYPE *, 'Matrix A is:'
      TYPE *
      CALL MATPRINT (A, 3, 5)
      TYPE *
      TYPE *, 'Matrix B is:'
      TYPE *
      CALL MATPRINT (B, 5, 3)
      TYPE *
      TYPE *, 'Their product is:'
      TYPE *
      CALL MATPRINT (C, 3, 3)
      TYPE *
C
C  Find the time it takes to multiply matrices
C  of various sizes.
C
      TYPE *
      TYPE *, 'Time to multiply two matrices:'
      TYPE *
      TYPE *, '      Size           Total',
     *        '      Overhead          Calcs'
      TYPE *, '----------     ----------',
     *        '    ----------     ----------'
      CALL TIMER (5, TOTAL, OVERHEAD, CALCS)
      TYPE 100, '5 x 5', TOTAL, OVERHEAD, CALCS
      CALL TIMER (10, TOTAL, OVERHEAD, CALCS)
      TYPE 100, '10 x 10', TOTAL, OVERHEAD, CALCS
      CALL TIMER (20, TOTAL, OVERHEAD, CALCS)
      TYPE 100, '20 x 20', TOTAL, OVERHEAD, CALCS
      CALL TIMER (50, TOTAL, OVERHEAD, CALCS)
      TYPE 100, '50 x 50', TOTAL, OVERHEAD, CALCS
100   FORMAT (1X, A10, 3(5X, F10.3))
      END

      SUBROUTINE MATINIT (A, M, N)
C------------------------------------------------
C  Initialize the array A as a M x N matrix.
C------------------------------------------------
      IMPLICIT NONE
      INTEGER M, N, I, J
      EXTENDED A(M,N)
      DO I = 1, M
         DO J = 1, N
            A(I,J) = REAL(I+J)
         END DO
      END DO
      END

      SUBROUTINE MATPRINT (A, M, N)
C------------------------------------------------
C  Prints the contents of the M x N matrix A.
C  Both dimensions must not be greater than 10.
C------------------------------------------------
      IMPLICIT NONE
      INTEGER M, N, I, J
      EXTENDED A(M,N)
      DO I = 1, M
          TYPE '(10F8.1)', (A(I,J), J=1,N)
      END DO
      END

      SUBROUTINE TIMER 
     *   (SIZE, TOTAL, OVERHEAD, CALCS)
C------------------------------------------------
C  Determine the time required to multiply two
C  SIZE x SIZE matrices (where SIZE <= 50).
C     TOTAL = total time (seconds). 
C     OVERHEAD = indexing overhead (seconds).
C     CALCS = actual time spent in float-
C  ing-point calculations (seconds).
C------------------------------------------------
      IMPLICIT NONE
      INTEGER SIZE, TICKS, STARTTICKS, 
     *        STOPTICKS, J
      EXTENDED TOTAL, OVERHEAD, CALCS
      EXTENDED SOURCE(50,50), RESULT(50,50)
      CALL MATINIT (SOURCE, SIZE, SIZE)
C
C  Multiply the matrix by itself ten times and 
C  find the average time per multiplication.
C  Notice that we’re using only part of the 
C  matrix.  Although it’s declared here as 50x50,
C  all the other subroutines will think it’s
C  really SIZE x SIZE!
C
      TICKS = 0
      DO J = 1, 10
         STARTTICKS = TICKCOUNT()
         CALL MXMPY (SOURCE, SOURCE, RESULT, SIZE,
     *               SIZE, SIZE)
         STOPTICKS = TICKCOUNT()
         TICKS = TICKS + (STOPTICKS - STARTTICKS)
      END DO
      TOTAL = REAL(TICKS) / 600.0
C
C  Repeat with a “dummy” routine which has all the 
C  indexing overhead of the matrix multiplication
C  but doesn’t actually do any arithmetic.
C
      TICKS = 0
      DO J = 1, 10
         STARTTICKS = TICKCOUNT()
         CALL DUMMY (SOURCE, SOURCE, RESULT,
     *                 SIZE, SIZE, SIZE)
         STOPTICKS = TICKCOUNT()
         TICKS = TICKS + (STOPTICKS - STARTTICKS)
      END DO
      OVERHEAD = REAL(TICKS) / 600.0
C
C  Calculate the average time spent doing actual
C  arithmetic.
C
      CALCS = TOTAL - OVERHEAD
      END
Listing #2: MXMPY, assembly-language version (software SANE)

      PRINT OFF
      INCLUDE 'Traps.a'
      INCLUDE 'SANEMacs.a'
      PRINT ON
;------------------------------------------------
Mxmpy       PROC        EXPORT
;- - - - - - - - - - - - - - - - - - - - - - - - -
;  Performs the matrix multiplication C = A * B.
;
;  Calling sequence (FORTRAN):
;      CALL MXMPY (A, B, C, L, M, N)
;
;  where
;      A is an array with L rows and M columns
;      B is an array with M rows and N columns
;      C is an array with L rows and N columns
;      L, M, and N are INTEGERs.
;
;  NOTE:  All arrays must be completely filled,
;  with no gaps.  Do not try to pass part of an
;  array unless it forms a contiguous block of
;  memory locations.
;
;  April 1990
;  Jon Bell, Dept. of Physics & Computer Science
;  Presbyterian College, Clinton SC 29325
;
;  Written for MPW Assembler, v3.0.
;- - - - - - - - - - - - - - - - - - - - - - - - -
;  Locations of arguments to the subroutine,
;  relative to the address stored in register A6.
a       EQU     28          ; addr. of a
b       EQU     24          ; addr. of b
c       EQU     20          ; addr. of c
l       EQU     16          ; addr. of # rows in a
m       EQU     12          ; addr. of # cols in a
n       EQU     8           ; addr. of # cols in b
;  Locations of local variables, relative to the
;  address stored in register A6.
sum         EQU -10 ; accumulates an element of c
term        EQU -20 ; terms for an element of c
termCount   EQU -24 ; initial value of term index
rowCount    EQU -28 ; initial value of row index
aColSize    EQU -32 ; # of bytes per column of a
bColSize    EQU -36 ; # of bytes per column of b
;  Other constants.
ParamSize   EQU  24 ; # of bytes of parameters
LocalSize   EQU -36 ; # of bytes of local var’s.
;  Register usage.
aPtr        EQU  A2 ; pointer into a
bPtr        EQU  A3 ; pointer into b
cPtr        EQU  A4 ; pointer into c
rowIndex    EQU  D3 ; row-loop index
colIndex    EQU  D4 ; column-loop index
termIndex   EQU  D5 ; term-loop index
aRowBase    EQU  D6 ; start of current row in a
bColBase    EQU  D7 ; start of current col. in b
;- - - - - - - - - - - - - - - - - - - - - - - - -
;  Set up the stack frame, and save registers on
;  the stack.
      LINK        A6, #LocalSize
      MOVEM.L     A2-A4/D3-D7, -(SP)
;  Calculate and save the length of one 
;  column of a.
      MOVE.L      l(A6), A0
      MOVE.L      (A0), D0      ; # of rows
      MULU        #10, D0       ; bytes per column
      MOVE.L      D0, aColSize(A6)
;  Calculate and save the length of one 
;  column of b.
      MOVE.L      m(A6), A0
      MOVE.L      (A0), D0      ; # of rows
      MULU        #10, D0       ; bytes per column
      MOVE.L      D0, bColSize(A6)
;  Save the initial value of the term index.
      MOVE.L      m(A6), A0
      MOVE.L      (A0), termCount(A6)
;  Save the initial value of the row index.
      MOVE.L      l(A6), A0
      MOVE.L      (A0), rowCount(A6)
;  Initialize the column index.
      MOVE.L      n(A6), A0
      MOVE.L      (A0), colIndex
;  Initialize the base address of the current
;  column in b to the start of b.
      MOVE.L      b(A6), bColBase
;  Initialize pointer into c.
      MOVE.L      c(A6), cPtr
BeginColLoop      ;  Cycle over the columns of c.
            SUB.L       #1, colIndex
            BMI.S       EndColLoop
      ;  Initialize the row index.
            MOVE.L      rowCount(A6), rowIndex
      ;  Initialize the base address of the
      ;  current row in a to the start of a.
            MOVE.L      a(A6), aRowBase
BeginRowLoop      ; Cycle over the rows of c.
                  SUB.L       #1, rowIndex
                  BMI.S       EndRowLoop
            ;  Initialize the a and b pointers 
            ;  for the next sum of terms.
                  MOVE.L      aRowBase, aPtr
                  MOVE.L      bColBase, bPtr
            ;  Initialize the sum.
                  LEA         sum(A6), A0
                  CLR.L       (A0)+
                  CLR.L       (A0)+
                  CLR.W       (A0)
            ;  Initialize the term index.
                  MOVE.L  termCount(A6), termIndex
BeginTermLoop     ; Cycle over the terms 
                  ; in the sum.
                        SUB.L       #1, termIndex
                        BMI.S       EndTermLoop
                  ;  Push the source and
                  ;  destination addresses on the
                  ;  stack for the multiplication.
                        MOVE.L      aPtr, -(SP)
                        LEA         term(A6), A0
                        MOVE.L      A0, -(SP)
                  ;  Copy the current element of b
                  ;  to the destination, and
                  ;  advance to the next element
                  ;  in the current column of b.
                        MOVE.L      (bPtr)+, (A0)+
                        MOVE.L      (bPtr)+, (A0)+
                        MOVE.W      (bPtr)+, (A0)
                  ;  Perform the multiplication.
                        FMULX
                  ;  Add the new term to the sum.
                        PEA         term(A6)
                        PEA         sum(A6)
                        FADDX
                  ;  Advance to the next element
                  ;  in the current row of a.
                        ADDA.L  aColSize(A6), aPtr
                        BRA.S       BeginTermLoop
EndTermLoop
            ;  Move the sum into the current
            ;  element of c, and advance to the
            ;  next row in the current column of
            ;  c.  (At the end of the current
            ;  column, this will wrap around to
            ;  the next column.)
                  LEA         sum(A6), A0
                  MOVE.L      (A0)+, (cPtr)+
                  MOVE.L      (A0)+, (cPtr)+
                  MOVE.W      (A0), (cPtr)+
            ;  Advance to the next row of a.
                  ADD.L       #10, aRowBase
                  BRA.S       BeginRowLoop
EndRowLoop
      ;  Advance to the next column of b.
            ADD.L       bColSize(A6), bColBase
            BRA.S       BeginColLoop
EndColLoop
;  All done.  Restore the saved registers, 
;  clean up the stack and return.
      MOVEM.L     (SP)+, A2-A4/D3-D7
      UNLK        A6
      MOVE.L      (SP)+, A0
      ADDA.L      #ParamSize, SP
      JMP         (A0)
      DC.B        'MXMPY   '  ; label for debugger
      ENDPROC
      END
Listing #3: MXMPY, assembly-language version (hardware SANE)

      MACHINE MC68020
      MC68881
;------------------------------------------------
Mxmpy       PROC        EXPORT
;- - - - - - - - - - - - - - - - - - - - - - - - -
;  Performs the matrix multiplication C = A * B.
;
;  Calling sequence (FORTRAN):
;      CALL MXMPY (A, B, C, L, M, N)
;
;  where
;      A is an array with L rows and M columns
;      B is an array with M rows and N columns
;      C is an array with L rows and N columns
;      L, M, and N are INTEGERs.
;
;  NOTE:  All arrays must be completely filled,
;  with no gaps. Do not try to pass part of an
;  array unless it forms a contiguous block of
;  memory locations.
;
;  April 1990
;  Jon Bell, Dept. of Physics & Computer Science
;  Presbyterian College, Clinton SC 29325 
;
;  Written for MPW Assembler, v3.0.
;- - - - - - - - - - - - - - - - - - - - - - - - -
;  Locations of arguments to the subroutine,
;  relative to the address stored in register A6.
a           EQU     28      ; addr. of a
b           EQU     24      ; addr. of b
c           EQU     20      ; addr. of c
l           EQU     16      ; addr. of # rows in a
m           EQU     12      ; addr. of # cols in a
n           EQU     8       ; addr. of # cols in b
;  Locations of local variables, relative to the
;  address stored in register A6.
termCount   EQU  -4  ; initial value of term index
rowCount    EQU  -8  ; initial value of row index
aColSize    EQU -12  ; # of bytes per column of a
bColSize    EQU -16  ; # of bytes per column of b
;  Other constants.
ParamSize   EQU  24  ; # of bytes of parameters
LocalSize   EQU -16  ; # of bytes of local var’s
zero        EQU $0F  ; 68882 code for constant 0
;  Register usage.
aPtr        EQU  A2  ; pointer into a
bPtr        EQU  A3  ; pointer into b
cPtr        EQU  A4  ; pointer into c
rowIndex    EQU  D3  ; row-loop index
colIndex    EQU  D4  ; column-loop index
termIndex   EQU  D5  ; term-loop index
aRowBase    EQU  D6  ; start of current row in a
bColBase    EQU  D7  ; start of current col. in b
term        EQU  FP0 ; one term for an elem. of c
sum         EQU  FP1 ; sum for an element of c
;- - - - - - - - - - - - - - - - - - - - - - - - -
;  Set up the stack frame, and save registers on
;  the stack.
      LINK        A6, #LocalSize
      MOVEM.L     A2-A4/D3-D7, -(SP)
;  Calculate and save the length of 
;  one column of a.
      MOVE.L      l(A6), A0
      MOVE.L      (A0), D0      ; # of rows
      MULU        #12, D0       ; bytes per column
      MOVE.L      D0, aColSize(A6)
;  Calculate and save the length of 
;  one column of b.
      MOVE.L      m(A6), A0
      MOVE.L      (A0), D0      ; # of rows
      MULU        #12, D0       ; bytes per column
      MOVE.L      D0, bColSize(A6)
;  Save the initial value of the term index.
      MOVE.L      m(A6), A0
      MOVE.L      (A0), termCount(A6)
;  Save the initial value of the row index.
      MOVE.L      l(A6), A0
      MOVE.L      (A0), rowCount(A6)
;  Initialize the column index.
      MOVE.L      n(A6), A0
      MOVE.L      (A0), colIndex
;  Initialize the base address of the current
;  column in b to the start of b.
      MOVE.L      b(A6), bColBase
;  Initialize the pointer into c.
      MOVE.L      c(A6), cPtr
BeginColLoop      ;  Cycle over the columns of c.
            SUBQ.L      #1, colIndex
            BMI.S       EndColLoop
      ;  Initialize the row index.
            MOVE.L      rowCount(A6), rowIndex
      ;  Initialize the base address of the
      ;  current row in a to the start of a.
            MOVE.L      a(A6), aRowBase
BeginRowLoop      ; Cycle over the rows of c.
                  SUBQ.L      #1, rowIndex
                  BMI.S       EndRowLoop
            ;  Initialize the a and b pointers 
            ;  for the next sum of terms.
                  MOVE.L      aRowBase, aPtr
                  MOVE.L      bColBase, bPtr
            ;  Initialize the sum.
                  FMOVECR.X   #zero, sum
            ;  Initialize the term index.
                  MOVE.L  termCount(A6), termIndex
BeginTermLoop     ; Cycle over the terms 
                  ; in the sum.
                        SUBQ.L      #1, termIndex
                        BMI.S       EndTermLoop
                  ;  Multiply the current element
                  ;  of b by the current element
                  ;  of a, and advance to the
                  ;  next element in the current
                  ;  column of b.
                        FMOVE.X     (bPtr)+, term
                        FMUL.X      (aPtr), term
                  ;  Add the new term to the sum.
                        FADD.X      term, sum
                  ;  Advance to the next element
                  ;  in the current row of a.
                        ADDA.L  aColSize(A6), aPtr
                        BRA.S       BeginTermLoop
EndTermLoop
            ;  Move the sum into the current
            ;  element of c, and advance to the
            ;  next row in the current 
            ;  column of c.
                  FMOVE.X     sum, (cPtr)+
            ;  Advance to the next row in the
            ;  current column of a.
                  ADD.L       #12, aRowBase
                  BRA.S       BeginRowLoop
EndRowLoop
      ;  Advance to next column in b.
            ADD.L       bColSize(A6), bColBase
            BRA.S       BeginColLoop
EndColLoop
;  All done.  Restore the saved registers, 
;  clean up the stack and return.
      MOVEM.L     (SP)+, A2-A4/D3-D7
      UNLK        A6
      RTD         #ParamSize
      DC.B        'MXMPY   ' ; label for debugger
      ENDPROC
      END

 
