Skip to content

[Rcpp-devel] Detecting Singular Matrices in RcppArmadillo

6 messages · Alex Ilich, Neal Fultz, Dirk Eddelbuettel

#
Hi, I have some code I've written to calculate regression parameters using
ordinarily least squares using RcppArmadillo. This is meant to be performed
iteratively over many small subsets (sliding window over spatial raster
data) so I'd like to automatically detect when the regression can't be
performed and just return a vector of NA's. Currently I check to see if the
determinant is 0 in order to see if the XtX matrix can be inverted and it
seems to work in the vast majority of cases but I did run into a case where
a small amount of precision error was introduced and for some reason this
causes the determinant to be non-zero in C++ however, the matrix still is
considered singular (uninvertible) leading to an error. I was wondering if
anyone had any suggestions? Maybe there's a minimum value non-zero value
where it's considered essentially singular that I could use instead of 0,
or maybe there's a way to return a vector of NAs if an error is detected
rather than stopping execution and printing an error message. I've also
seen that pinv can be used, but from some tests I ran it's substantially
slower, so I'd like to avoid using that. I've included some example code
below. Any suggestions would be greatly appreciated.

Thanks,
Alex

library(Rcpp)
cppFunction(depends="RcppArmadillo", "NumericVector C_OLS(arma::mat X,
arma::mat Y){
  arma::mat Xt = trans(X);
  arma::mat XtX = Xt * X;
  double d = det(XtX);
  Rcout << d; //print determinant
  if(d==0){
    NumericVector B2(X.n_cols, NA_REAL);
    return B2;
  } else{
    arma::mat XtX_inv= inv(XtX);
    NumericVector B = Rcpp::as<Rcpp::NumericVector>(wrap(XtX_inv * (Xt *
Y)));
    return B;
  }}")

X<- matrix(data=c(-250, -250, -250, -250, -250, -250, 250, 200, 150,
                  100, 50, 0, 1, 1, 1, 1, 1, 1), ncol=3, nrow=6)


X2<- X

X2[1,2]<- X[1,2] + 5.684342e-14 #Add a slight numeric precision issue

Y<- matrix(1:6, ncol=1)

#Determinant of XtX in R is 0 for both
det(t(X) %*% X) ==0 #TRUE
det(t(X2) %*% X2) ==0 #TRUE

C_OLS(X,Y) #Determinant is 0 in C++ and correctly returns a vector of NA's
#0[1] NA NA NA

C_OLS(X2,Y) #Determinant is not 0 in C++ so the if statement is not
triggered but the matrix is still uninvertible
# 4.57764e-05
# error: inv(): matrix is singular
# Error in C_OLS(X2, Y) : inv(): matrix is singular
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20220128/05ebe0b5/attachment.html>
#
There's an R object that has the machine precision, which could be a
reasonable threshold.

.Machine$double.eps

I believe there is a similar constant in the C++ standard library.

You might also try checking the condition of the matrix instead of the
determinant, but you might take a performance hit. EG:
https://www.quora.com/Why-is-a-condition-number-more-useful-than-a-determinant-for-spotting-problems-with-the-solution-to-ax-b

- Neal
On Fri, Jan 28, 2022 at 8:54 AM Alex Ilich <alexilich93 at gmail.com> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20220128/6a25fefe/attachment.html>
#
On 28 January 2022 at 09:16, Neal Fultz wrote:
| There's an R object that has the machine precision, which could be a
| reasonable threshold.
| 
| .Machine$double.eps
| 
| I believe there is a similar constant in the C++ standard library.
| 
| You might also try checking the condition of the matrix instead of the
| determinant, but you might take a performance hit. EG:
| https://www.quora.com/Why-is-a-condition-number-more-useful-than-a-determinant-for-spotting-problems-with-the-solution-to-ax-b

Exactly -- here is an old SO answer of mine featuring the condition
number and kappa (its inverse):

https://stackoverflow.com/questions/3042117/screening-multicollinearity-in-a-regression-model/3042262#3042262

I don't have an implementation of either 'from C++ / Armadillo basics' handy
but it is likely not hard, and possibly a nice exercise.

Dirk
#
Thank you Neal and Dirk. Based on these comments I think using arma::rcond
to get the reciprocal of the condition factor, and checking if that is zero
rather than the determinant may be the route to go. At least for this
example that works.

library(Rcpp)

cppFunction(depends="RcppArmadillo", "NumericVector C_OLS(arma::mat X,
arma::mat Y){
  arma::mat Xt = trans(X);
  arma::mat XtX = Xt * X;
  double rcf = rcond(XtX);
  Rcout << rcf; //print reciprocal of condition factor
  if(rcf==0){
    NumericVector B2(X.n_cols, NA_REAL);
    return B2;
  } else{
    arma::mat XtX_inv= inv(XtX);
    NumericVector B = Rcpp::as<Rcpp::NumericVector>(wrap(XtX_inv * (Xt *
Y)));
    return B;
  }}")

X<- matrix(data=c(-250, -250, -250, -250, -250, -250, 250, 200, 150,
                  100, 50, 0, 1, 1, 1, 1, 1, 1), ncol=3, nrow=6)


X2<- X

X2[1,2]<- X[1,2] + 5.684342e-14 #Add a slight numeric precision issue

Y<- matrix(1:6, ncol=1)

C_OLS(X,Y)
# 0[1] NA NA NA

C_OLS(X2,Y)
# 0[1] NA NA NA
On Fri, Jan 28, 2022 at 11:53 AM Alex Ilich <alexilich93 at gmail.com> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20220128/73e370be/attachment-0001.html>
#
On 28 January 2022 at 13:25, Alex Ilich wrote:
| Thank you Neal and Dirk. Based on these comments I think using arma::rcond

Ah, yes, of course. Should have remembered arma::rcond.

| to get the reciprocal of the condition factor, and checking if that is zero
| rather than the determinant may be the route to go. At least for this
| example that works.
| 
| library(Rcpp)
| 
| cppFunction(depends="RcppArmadillo", "NumericVector C_OLS(arma::mat X,
| arma::mat Y){
|   arma::mat Xt = trans(X);
|   arma::mat XtX = Xt * X;
|   double rcf = rcond(XtX);
|   Rcout << rcf; //print reciprocal of condition factor
|   if(rcf==0){

Re-read Neal's first email.  You probably do not want zero (apart from all
the R FAQ 7.31 reasons to not compare a double with equality :wink: )

Dirk

|     NumericVector B2(X.n_cols, NA_REAL);
|     return B2;
|   } else{
|     arma::mat XtX_inv= inv(XtX);
|     NumericVector B = Rcpp::as<Rcpp::NumericVector>(wrap(XtX_inv * (Xt *
| Y)));
|     return B;
|   }}")
| 
| X<- matrix(data=c(-250, -250, -250, -250, -250, -250, 250, 200, 150,
|                   100, 50, 0, 1, 1, 1, 1, 1, 1), ncol=3, nrow=6)
| 
| 
| X2<- X
| 
| X2[1,2]<- X[1,2] + 5.684342e-14 #Add a slight numeric precision issue
| 
| Y<- matrix(1:6, ncol=1)
| 
| C_OLS(X,Y)
| # 0[1] NA NA NA
| 
| C_OLS(X2,Y)
| # 0[1] NA NA NA
| 
| 
|
| On Fri, Jan 28, 2022 at 11:53 AM Alex Ilich <alexilich93 at gmail.com> wrote:
| 
| > Hi, I have some code I've written to calculate regression parameters using
| > ordinarily least squares using RcppArmadillo. This is meant to be performed
| > iteratively over many small subsets (sliding window over spatial raster
| > data) so I'd like to automatically detect when the regression can't be
| > performed and just return a vector of NA's. Currently I check to see if the
| > determinant is 0 in order to see if the XtX matrix can be inverted and it
| > seems to work in the vast majority of cases but I did run into a case where
| > a small amount of precision error was introduced and for some reason this
| > causes the determinant to be non-zero in C++ however, the matrix still is
| > considered singular (uninvertible) leading to an error. I was wondering if
| > anyone had any suggestions? Maybe there's a minimum value non-zero value
| > where it's considered essentially singular that I could use instead of 0,
| > or maybe there's a way to return a vector of NAs if an error is detected
| > rather than stopping execution and printing an error message. I've also
| > seen that pinv can be used, but from some tests I ran it's substantially
| > slower, so I'd like to avoid using that. I've included some example code
| > below. Any suggestions would be greatly appreciated.
| >
| > Thanks,
| > Alex
| >
| > library(Rcpp)
| > cppFunction(depends="RcppArmadillo", "NumericVector C_OLS(arma::mat X,
| > arma::mat Y){
| >   arma::mat Xt = trans(X);
| >   arma::mat XtX = Xt * X;
| >   double d = det(XtX);
| >   Rcout << d; //print determinant
| >   if(d==0){
| >     NumericVector B2(X.n_cols, NA_REAL);
| >     return B2;
| >   } else{
| >     arma::mat XtX_inv= inv(XtX);
| >     NumericVector B = Rcpp::as<Rcpp::NumericVector>(wrap(XtX_inv * (Xt *
| > Y)));
| >     return B;
| >   }}")
| >
| > X<- matrix(data=c(-250, -250, -250, -250, -250, -250, 250, 200, 150,
| >                   100, 50, 0, 1, 1, 1, 1, 1, 1), ncol=3, nrow=6)
| >
| >
| > X2<- X
| >
| > X2[1,2]<- X[1,2] + 5.684342e-14 #Add a slight numeric precision issue
| >
| > Y<- matrix(1:6, ncol=1)
| >
| > #Determinant of XtX in R is 0 for both
| > det(t(X) %*% X) ==0 #TRUE
| > det(t(X2) %*% X2) ==0 #TRUE
| >
| > C_OLS(X,Y) #Determinant is 0 in C++ and correctly returns a vector of NA's
| > #0[1] NA NA NA
| >
| > C_OLS(X2,Y) #Determinant is not 0 in C++ so the if statement is not
| > triggered but the matrix is still uninvertible
| > # 4.57764e-05
| > # error: inv(): matrix is singular
| > # Error in C_OLS(X2, Y) : inv(): matrix is singular
| >
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
#
Thanks! I'll use if(rcf <= std::numeric_limits<double>::epsilon()) instead
of rcf==0.
On Fri, Jan 28, 2022 at 7:32 PM Dirk Eddelbuettel <edd at debian.org> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20220128/db411ea1/attachment.html>