NaN and linear algebra - R-devel

Bill Northcott · 2005-03-24T12:59:36Z

On 24/03/2005, at 10:04 PM, Andy wrote: > I just tested this problem on a Windows 2000 Pro P4-Xeon system. I > tried > this in patched 2.0.1 (2005-2-28) and 2.1.0-alpha (2005-3-23) with both > generic BLAS and precompiled ATLAS BLAS dll (downloaded from CRAN > windows > binaries). In all 4 combinations I get > >> d > f > f > [,1] [,2] [,3] > [1,] NaN NaN NaN > [2,] NaN NaN NaN > [3,] NaN NaN NaN >> det(d) > [1] NaN >> >> I just want to clarify some

Bill Northcott

Thu, Mar 24, 2005 4:59 AM #

On 24/03/2005, at 10:04 PM, Andy wrote:

I just want to clarify some of the discussion here for my own benefit.

As I understand it, there are two software libraries involved (three if 
you count R).  BLAS is the lowest layer providing basic matrix 
operations.  It comes in vanilla and platform optimised (ATLAS or Goto) 
flavours.  Presumably R's built in version is vanilla.

Relying on BLAS is LAPACK which provides the functions which R actually 
calls like dgesv.  LAPACK relies on BLAS to provide the platform 
optimisations.  As far as I can see the issue(s) here is(are) with 
LAPACK not BLAS.  It is LAPACK that issues the error and halts the 
program.  (The MacOS X Accelerate Framework contains BLAS and LAPACK as 
distinct libraries libBlas.dylib and libLapack.dylib)

If, as reported, the behaviour is fundamentally similar on MacOS X, 
Linux and IRIX then it would appear that this is a feature(bug?) of the 
LAPACK reference implementation and that whoever wrote the Windows 
version saw fit to improve on it.  I don't think LAPACK ever pretended 
to properly handle NaNs.

It seems to me that the Windows version is an improvement, because it 
complies with the robust arithmetic principles that programs should not 
halt but propagate NaN or other indicators that can tested as 
appropriate in the application.  So it might be good idea to press 
other implementers of LAPACK to do the same.

My view for what it is worth is that R should also be robust in this 
sense and not halt on NaNs.  For preference this should be handled in 
the LAPACK implementations, but if not R should should trap them.  As I 
see it, it is incorrect to say a matrix containing NaNs is singular, 
not positive definite or whatever, because since it contains NaNs the 
computations to make those determinations cannot be performed.

Is this the wrong way to look at it?

Bill Northcott

PS  This discussion has given me another thought.  On MacOS X, it would 
be much faster to use floats instead of doubles because they can be 
directly processed by the Altivec section of the CPU.  I can see no way 
to have R use floats.  Would that be possible as a future enhancement 
or is it just too hard?

Simon Urbanek

Thu, Mar 24, 2005 9:54 AM #

On Mar 24, 2005, at 6:59 AM, Bill Northcott wrote:

To have just a rough idea about how "much faster" this would be, here  
are some timings for 2000 runs of 1000x1000 matrix-vector  
multiplications using vecLib's BLAS:
G5 1.8GHz
sgemv: 2.9s
dgemv: 6.7s

G4 1.5GHz
sgemv: 10.1s
dgemv: 17.3s

The performance difference between double and float is actually less  
astonishing (although is proves that altiVec is used - still) than  
the G4 vs G5 comparison. vecLib for G5 seems to be fairly well  
optimized and the G5 architecture seems to have better throughput.

Anyway, I'm not quite sure switching from doubles to floats yields  
big enough speed up to justify the effort. Maybe some Lapack test  
would helpful, too, to have a bigger picture of this.

Cheers,
Simon

PS: I'd rather move this part to R-SIG-Mac as this is actually a Mac- 
only question.

Kjell Konis

Thu, Mar 24, 2005 1:03 PM #

Although both Linux and OSX produce an error the behavior of LAPACK is 
not similar.

(It would be nice if someone using Windows could try the following test 
too)

(1) Compile the following C code

void testdgesv(long* N, long* NRHS, double* A, long* LDA, long* IPIVOT, 
double* B, long* LDB, long* INFO)
{
   DGESV(N, NRHS, A, LDA, IPIVOT, B, LDB, INFO);
}

(you will have to use the right combination of case and _s)

(2) Run these commands in R:

dyn.load("test.so")
N <- as.integer(3)
NRHS <- as.integer(3)
A <- matrix(as.double(NaN), 3, 3)
LDA <- as.integer(3)
IPIVOT <- integer(3)
B <- diag(3)
storage.mode(B) <- "double"
LDB <- as.integer(3)
INFO <- as.integer(0)

.C("testdgesv", N = N, NRHS = NRHS, A = A, LDA = LDA, IPIVOT = IPIVOT, 
B = B, LDB = LDB, INFO = INFO, NAOK = TRUE)

$N
[1] 3

$NRHS
[1] 3

$A
      [,1] [,2] [,3]
[1,]  NaN  NaN  NaN
[2,]  NaN  NaN  NaN
[3,]  NaN  NaN  NaN

$LDA
[1] 3

$IPIVOT
[1] 1 2 3

$B
      [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

$LDB
[1] 3

$INFO
[1] 3

//////////////////

On my Mac mini this exits with no errors or warnings (other than INFO = 
3 anyway).  However,

solve(A)

generates the error:

Error in solve.default(A) : Lapack routine dgesv: system is exactly 
singular

This behavior seems to suggest that error is generated somewhere in R, 
not in LAPACK.

Okay, now for the weird bit.

When I run this in Linux I get:

 > .C("testdgesv", N = N, NRHS = NRHS, A = A, LDA = LDA, IPIVOT = 
IPIVOT, B = B, LDB = LDB, INFO = INFO, NAOK = TRUE)
$N
[1] 3

$NRHS
[1] 3

$A
      [,1] [,2] [,3]
[1,]  NaN  NaN  NaN
[2,]  NaN  NaN  NaN
[3,]  NaN  NaN  NaN

$LDA
[1] 3

$IPIVOT
[1] 3 2 3

$B
      [,1] [,2] [,3]
[1,]  NaN  NaN  NaN
[2,]  NaN  NaN  NaN
[3,]  NaN  NaN  NaN

$LDB
[1] 3

$INFO
[1] 0

where INFO == 0 indicates "successful exit" but

 > solve(A)
Error in solve.default(A) : system is computationally singular: 
reciprocal condition number = 0



In case it is important:

platform powerpc-apple-darwin7.8.0
arch     powerpc
os       darwin7.8.0
system   powerpc, darwin7.8.0
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

and

platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

prompt> more Makevars
PKG_LIBS=${LAPACK_LIBS} ${BLAS_LIBS} ${FLIBS}