Skip to content

Lapack, determinant, multivariate normal density, solution to linear system, C language

3 messages · Matt Shotwell, Douglas Bates

#
r-devel list,

I have recently written an R package that solves a linear least squares
problem, and computes the multivariate normal density function. The bulk
of the code is written in C, with interfacing code to the BLAS and
Lapack libraries. The motivation here is speed. I ran into a problem
computing the determinant of a symmetric matrix in packed storage.
Apparently, there are no explicit routines for this as part of Lapack.
While there IS an explicit routine for this in Linpack, I did not want
to use the older library. Also, right before I needed the determinant of
the matrix A, I had used the Lapack routine dspsv to solve the linear
system Ax=b, which does much of the work of computing a determinant
also. In fact, the solution I came up with involves the output of this
routine (which might be obvious to Lapack designers, but not me)

My modest Googleing turned up very little unique material (as is typical
with BLAS/Lapack/Linpack queries). Hence, I am writing the r-devel list
partly to document the solution I've come up with, but mainly to elicit
additional wisdom from seasoned R programmers.

My solution to the problem is illustrated in the appended discussion and
C code. Thanks for your input.

-Matt Shotwell

--------------

The Lapack routine dspsv solves the linear system of equations Ax=b,
where A is a symmetric matrix in packed storage format. The dspsv
function performs the factorization A=UDU', where U is a unitriangular
matrix and D is a block diagonal matrix where the blocks are of
dimension 1x1 or 2x2. In addition to the solution for x, the dspsv
function also returns the matrices U and D. The matrix D may then be
used to compute the determinant of A. Recall from linear algebra that
det(A) = det(UDU') = det(U)det(D)det(U'). Since U is unitriangular,
det(U) = 1. The determinant of D is the product of the determinants of
the diagonal blocks. If a diagonal block is of dimension 1x1, then the
determinant of the block is simply the value of the single element in
the block. If the diagonal block is of dimension 2x2 then the
determinant of the block may be computed according to the well-known
formula b11*b22-b12*b21, where bij is the value in the i'th row and j'th
column of the block.

  int i, q, info, *ipiv, one = 1;
  double *b, *A, *D, det;

  /* 
  ** A and D are upper triangular matrices in packed storage
  ** A[] = a00, a01, a11, a02, a12, a22, a03, a13, a23, a33, ...
  ** use the following macro to address the element in the 
  ** i'th row and j'th column for i <= j
  */
  #define UMAT(i, j) (i + j * ( j + 1 ) / 2)

  /* 
  ** additional code should be here
  ** - set q
  ** - allocate ipiv...
  ** - allocate and fill A and b... 
  */

  /* 
  ** solve Ax=b using A=UDU' factorization where 
  ** A represents a qxq matrix, b a 1xq vector.
  ** dspsv outputs the elements of the matrix D
  ** is upper triangular packed storage
  ** in the memory addressed by A. That is, A is
  ** replaced by D when dspsv returns.
  */
  F77_CALL(dspsv)("U", &q, &one, A, ipiv, b, &q, &info); 
  if( info > 0 ) { /*issue warning, system is singular*/ }
  if( info < 0 ) { /*issue error, invalid argument*/ }

  /* 
  ** compute the determinant det = det(A)
  ** if ipiv[i] > 0, then D(i,i) is a 1x1 block diagonal
  ** if ipiv[i] = ipiv[i-1] < 0, then D(i-1,i-1), 
  ** D(i-1,i), and D(i,i) form the upper triangle 
  ** of a 2x2 block diagonal
  */
  D = A;
  det = 1.0;
  for( i = 0; i < q; i++ ) { 
    if( ipiv[ i ] > 0 ) {
      det *= D[ UMAT(i,i) ];
    } else if( i > 0 && ipiv[ i ] < 0 && ipiv[ i-1 ] == ipiv[ i ] ) { 
      det *= D[ UMAT(i,i) ] * D[ UMAT(i-1,i-1) ] -\
             D[ UMAT(i-1,i) ] * D[ UMAT(i-1,i) ];
    }
  }
6 days later
#
On Mon, Apr 12, 2010 at 10:27 PM, shotwelm <shotwelm at musc.edu> wrote:
For both of those applications you can use a Cholesky decomposition of
the symmetric matrix.  If the Cholesky decomposition fails then you
have a singular least squares problem or a singular
variance-covariance matrix for your multivariate normal density
function.

Have you tried comparing the speed of your code to

prod(diag(chol(mm))^2

or, probably better, is to use the logarithm of the determinant

2 * sum(log(diag(chol(mm)))

If you use the Matrix package class dpoMatrix to solve the linear
system it will cache the results of the Cholesky decomposition when
solving the system so later evaluation of the determinant will be very
fast - although I suspect you would need to be working with matrices
of sizes in the hundreds or doing the same operation thousands of
times before you would notice a difference.

If you really insist on doing this in compiled code you just need to
call F77_CALL(dpotrf) then accumulate the product of the diagonal
elements of the resulting factor.

You could use packed storage but the slight advantage in memory usage
(at best, 1/2 of the full storage usage) is not worth the pain of
writing code to navigate the packed storage locations.
4 days later
#
Douglas, 

Thanks for your reply. I took your suggestion and tried the Cholesky
factorization with dppsv, the positive definite version of dspsv. I
found dppsv to be a great deal faster than dspsv, as might be expected
since dspsv uses a more complicated factorization. However, I ran into
trouble with computational singularities with dppsv. In this particular
case, I don't mind having least squares solutions that aren't unique.
The dspsv routine appears to be more robust in this regard. The speed is
sufficiently improved with dppsv that I opted to use both in the code,
trying dppsv first, then dspsv if dppsv fails.

I agree, packed storage was more trouble than it was worth in this case,
but yielded mild to moderate improvements in speed and memory usage.
Addressing the elements of a packed matrix can be complicated, but it
makes a difference whether it's upper packed, or lower packed. To
address each type of matrix, I use the following macros

//address upper triangular packed storage matrix
//a00, a01, a11, a02, a12, a22, a03, a13, a23, a33, ...
//0 <= i <= j < n
#define UMAT(i, j) (i + j * ( j + 1 ) / 2)

//address symmetric full storage (by column) matrix
//a00, a10, ..., an0, a01, a11, ..., an1, ...
//0 <= i, j < n
#define FMAT(i, j, n) (i + j * n)

//address lower triangular packed storage matrix
//a00, a10, ... ,an0, a11, a21, ..., an1, a22, a32, ...
//0 <= j <= i < n
#define LMAT(i, j, n) (i + j * ( 2 * n - j + 1 ) / 2)

Notice that UMAT doesn't require the matrix dimension n! Hence, when n
is retrieved from memory, it may be more efficient to address an upper
packed storage matrix than a full storage matrix. I'll have to test this
theory, though it's probably compiler dependent.

Thanks again,

-Matt
On Mon, 2010-04-19 at 12:44 -0400, Douglas Bates wrote: