daxpy performance with veclib

Hi Mac Special Interest Group folks,

We've noticed some curious behavior in the vecLib BLAS implementation
while developing the OpenMx library.  The vecLib daxpy appears to be
twice as slow as the daxpy in the reference BLAS implementation.
Attached is a test kernel that has been run under both
implementations.  The kernel consists of repeated calls to daxpy with
vectors of varying size.  In the output files, the first column is the
dimension of the vector.  The second through fourth columns report the
runtime of the kernel in seconds, one column for each of three
identical trials per vector size.
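
For readers who cannot retrieve the attachment, here is a rough sketch
of what a timing kernel of this kind might look like.  It is not the
attached omxTest.c; the names time_daxpy and plain_daxpy are
hypothetical, and the use of clock() (CPU time rather than wall-clock
time) is an assumption.  The routine under test is passed as a function
pointer with the usual Fortran-style daxpy signature, so the same
harness could be pointed at the daxpy_ symbol of whichever BLAS is
linked.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Fortran-style BLAS calling convention for daxpy: y := alpha*x + y. */
typedef void (*daxpy_fn)(const int *n, const double *alpha,
                         const double *x, const int *incx,
                         double *y, const int *incy);

/* Hypothetical stand-in so the sketch builds without a BLAS.
 * It handles unit strides only and ignores incx/incy. */
static void plain_daxpy(const int *n, const double *alpha,
                        const double *x, const int *incx,
                        double *y, const int *incy)
{
    (void)incx;
    (void)incy;
    for (int i = 0; i < *n; i++)
        y[i] += *alpha * x[i];
}

/* Time `reps` calls of `f` on vectors of length n.
 * Returns elapsed CPU seconds, or -1.0 if allocation fails. */
static double time_daxpy(daxpy_fn f, int n, int reps)
{
    assert(f != NULL && n > 0 && reps > 0);

    double *x = malloc((size_t)n * sizeof *x);
    double *y = malloc((size_t)n * sizeof *y);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1.0;
    }
    for (int i = 0; i < n; i++) {
        x[i] = 1.0;
        y[i] = 0.0;
    }

    const double alpha = 0.5;
    const int one = 1;

    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        f(&n, &alpha, x, &one, y, &one);
    clock_t stop = clock();

    /* Read the result so the compiler cannot discard the loop. */
    volatile double sink = y[n - 1];
    (void)sink;

    free(x);
    free(y);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_daxpy three times for each vector size and printing the
size followed by the three timings would give rows in the layout
described above.  With a multithreaded BLAS, CPU time and wall-clock
time can differ, so a wall-clock timer may be the better choice there.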

It may be more appropriate to send this information upstream to the
vecLib maintainers.  However, I thought it would be of interest to Mac
R folks, too.  For our own project, the workaround will be to write
our own basic implementation of daxpy and continue to link against the
vecLib BLAS library, so that we still get the speedup on dgemm and the
other functions.
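
To illustrate the kind of workaround meant here, a basic daxpy can be
written as a plain loop.  The following is only a sketch under
assumptions, not the OpenMx code: the name simple_daxpy is
hypothetical, and arguments are passed by value rather than in the
Fortran by-reference style.  It computes y := alpha*x + y and treats
negative strides the way the reference BLAS does, by starting from the
far end of the vector.

```c
#include <assert.h>
#include <stddef.h>

/* y := alpha * x + y over n elements, with strides incx and incy.
 * simple_daxpy is a hypothetical name used for illustration. */
static void simple_daxpy(int n, double alpha,
                         const double *x, int incx,
                         double *y, int incy)
{
    /* Nothing to do for an empty vector or a zero scale factor. */
    if (n <= 0 || alpha == 0.0)
        return;
    assert(x != NULL && y != NULL);

    /* Common case: both vectors are contiguous. */
    if (incx == 1 && incy == 1) {
        for (int i = 0; i < n; i++)
            y[i] += alpha * x[i];
        return;
    }

    /* General strides.  A negative stride walks the vector backwards,
     * so the starting index is the last element touched. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - n) * incy : 0;
    for (int i = 0; i < n; i++) {
        y[iy] += alpha * x[ix];
        ix += incx;
        iy += incy;
    }
}
```

Because only daxpy is replaced, calls to dgemm and the other routines
would still resolve to the vecLib versions at link time.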

The benchmarks were executed on a Mac Pro with two quad-core Xeons at
3 GHz (MacPro2,1) running OS X 10.5.8.  They were run with R 2.12.0,
and the same behavior has been observed with R 2.10.1.

Thanks,
--Michael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: omxTest.c
Type: text/x-csrc
Size: 1022 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mac/attachments/20101020/fdde1502/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: daxpy.veclib.results
Type: application/octet-stream
Size: 677 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mac/attachments/20101020/fdde1502/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: daxpy.refblas.results
Type: application/octet-stream
Size: 659 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mac/attachments/20101020/fdde1502/attachment-0001.obj>