I've been toying with both RcppArmadillo and RcppEigen the past few days and don't know which library to continue using. RcppEigen seems really slick, but appears to be lacking some of the decompositions I want and isn't nearly as fast to code. RcppArmadillo seems about as fast, easier to code up, etc. What are some of the advantages/disadvantages of both? Can you call LAPACK or BLAS from either? Is there a wrapper in RcppEigen to call LAPACK functions? I want some other decomposition methods; I don't like the JacobiSVD method in Eigen.
[Rcpp-devel] Differences between RcppEigen and RcppArmadillo
19 messages · Dirk Eddelbuettel, Julian Smith, c s +3 more
On 13 June 2012 at 10:57, Julian Smith wrote:
| I've been toying with both RcppArmadillo and RcppEigen the past few days and
| don't know which library to continue using. RcppEigen seems really slick, but
| appears to be lacking some of the decompositions I want and isn't nearly as
| fast to code. RcppArmadillo seems about as fast, easier to code up etc. What
| are some of the advantages/disadvantages of both?

That's pretty close. I have been a fan of [Rcpp]Armadillo, which I find
easier to get my head around. Doug, however, moved from [Rcpp]Armadillo to
[Rcpp]Eigen as it has some things he needs. Eigen should have a "larger" API
than Armadillo, but I find the code and docs harder to navigate.

And you should find Eigen to be a little faster. Andreas Alfons went as far
as building 'robustHD' using RcppArmadillo with a drop-in for RcppEigen (in
package 'sparseLTSEigen'; both package names from memory, and I may have
mistyped). He reported a performance gain of around 25% for his problem
sets. On the 'fastLm' benchmark, we find the fast Eigen-based decompositions
to be much faster than Armadillo.

| Can you call LAPACK or BLAS from either? Is there a wrapper in RcppEigen to
| call LAPACK functions? Want some other decomposition methods, don't like the
| JacobiSVD method in Eigen.

You need to differentiate between the Eigen and Armadillo docs _for their
libraries_ and what happens when you access the Rcpp* variant from R.

RcppArmadillo will use the very same LAPACK and BLAS libs your R session
uses. So MKL, OpenBLAS, ... are all options. Eigen actually has its own code
outperforming LAPACK, so it doesn't matter as much there.

Hope this helps,  Dirk (at useR!)

_______________________________________________
Rcpp-devel mailing list
Rcpp-devel at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
I agree that RcppEigen is a little bit faster, but ease of use is important
to me, so I feel like RcppArmadillo might win out in my application.
| RcppArmadillo will use the very same LAPACK and BLAS libs your R session
| uses. So MKL, OpenBLAS, ... are all options. Eigen actually has its own
| code outperforming LAPACK, so it doesn't matter as much there.
Why do you think R outperforms RcppArmadillo in this example below? Any way
to speed this up?
require(RcppArmadillo)
require(inline)
arma.code <- '
  using namespace arma;
  NumericMatrix Xr(Xs);
  int n = Xr.nrow(), k = Xr.ncol();
  mat X(Xr.begin(), n, k, false);
  mat U;
  vec s;
  mat V;
  svd(U, s, V, X);
  return wrap(s);
'
rcppsvd <- cxxfunction(signature(Xs="numeric"),
                       arma.code,
                       plugin="RcppArmadillo")
A<-matrix(rnorm(5000^2), 5000)
system.time(rcppsvd(A))
    user  system elapsed
1992.406   4.862 1988.737
system.time(svd(A))
   user  system elapsed
652.496   2.641 652.614
On 13 June 2012 at 15:05, Julian Smith wrote:
| I agree that RcppEigen is a little bit faster, but ease of use is important to
| me, so I feel like RcppArmadillo might win out in my application.

Yup, that is my personal view too.

| | RcppArmadillo will use the very same LAPACK and BLAS libs your R session
| | uses. So MKL, OpenBLAS, ... are all options. Eigen actually has its own
| | code outperforming LAPACK, so it doesn't matter as much there.
|
| Why do you think R outperforms RcppArmadillo in this example below? Any way to
| speed this up?

That is odd. "I guess it shouldn't." I shall take another look -- as I
understand it both should go to the same underlying Lapack routine. I may
have to consult with Conrad on this.

Thanks for posting a full and reproducible example!

Dirk
On 13 June 2012 at 17:16, Dirk Eddelbuettel wrote:
|
| On 13 June 2012 at 15:05, Julian Smith wrote:
| | I agree that RcppEigen is a little bit faster, but ease of use is important to
| | me, so I feel like RcppArmadillo might win out in my application.
|
| Yup, that is my personal view too.
|
| | | RcppArmadillo will use the very same LAPACK and BLAS libs your R session
| | | uses. So MKL, OpenBLAS, ... are all options. Eigen actually has its own
| | | code outperforming LAPACK, so it doesn't matter as much there.
| |
| | Why do you think R outperforms RcppArmadillo in this example below? Any way to
| | speed this up?
|
| That is odd. "I guess it shouldn't." I shall take another look -- as I
| understand it both should go to the same underlying Lapack routine. I may
| have to consult with Conrad on this.
Does your machine by chance have four cores? It looks like I get a scaling
in the number of (hyperthreaded) cores on my box:
R> require(RcppArmadillo)
R> require(inline)
R> require(rbenchmark)
R>
R> arma.code <- 'using namespace arma; NumericMatrix Xr(Xs); int n = Xr.nrow(), k = Xr.ncol(); mat X(Xr.begin(), n, k, false); mat U; vec s; mat V; svd(U, s, V, X); return wrap(s);'
R> rcppsvd <- cxxfunction(signature(Xs="numeric"), body=arma.code,plugin="RcppArmadillo")
R>
R> N <- 1000
R> A <- matrix(rnorm(N^2), N)
R>
R> res <- benchmark(rcppsvd(A), svd(A), replications=10)
R> print(res)
        test replications elapsed relative user.self sys.self user.child sys.child
1 rcppsvd(A)           10 121.389  7.77089   127.448    1.748          0         0
2     svd(A)           10  15.621  1.00000    26.290    4.224          0         0
R>
Why it does this I am not clear about -- may well be a bug.
Dirk
On Wed, Jun 13, 2012 at 5:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
On 13 June 2012 at 15:05, Julian Smith wrote:
| Why do you think R outperforms RcppArmadillo in this example below? Any way to
| speed this up?

That is odd. "I guess it shouldn't." I shall take another look -- as I
understand it both should go to the same underlying Lapack routine. I may
have to consult with Conrad on this.

Thanks for posting a full and reproducible example!

Dirk

| arma.code <- '
|   using namespace arma;
|   NumericMatrix Xr(Xs);
|   int n = Xr.nrow(), k = Xr.ncol();
|   mat X(Xr.begin(), n, k, false);
|   mat U;
|   vec s;
|   mat V;
|   svd(U, s, V, X);
|   return wrap(s);
| '
Because the arma code is evaluating the singular vectors (U and V) as well as the singular values (S) whereas the R code is only evaluating the singular values. There is considerably more effort required to evaluate the singular vectors in addition to the singular values.
Doesn't svd in R by default compute D, U and V? http://stat.ethz.ch/R-manual/R-patched/library/base/html/svd.html
On Wed, Jun 13, 2012 at 6:53 PM, Julian Smith <hilbertspaces at gmail.com> wrote:
Doesn't svd in R by default compute D, U and V?
You're right but the default is the 'thin' U when X is n by p and n >= p. Does the svd in Armadillo return the full n by n matrix U? Another thing to check is what underlying Lapack routine is called. There are two: dgesvd, which is the older algorithm and the one with the expected name if you know the Lapack conventions, and dgesdd which is a newer and faster algorithm. R uses dgesdd.
On 14 June 2012 at 08:01, Douglas Bates wrote:
| On Wed, Jun 13, 2012 at 6:53 PM, Julian Smith <hilbertspaces at gmail.com> wrote:
| > Doesn't svd in R by default compute D, U and V?
| >
| > http://stat.ethz.ch/R-manual/R-patched/library/base/html/svd.html
|
| You're right but the default is the 'thin' U when X is n by p and n >= p.
| Does the svd in Armadillo return the full n by n matrix U?

Thanks for that earlier hint re 'thin' and 'full' SVDs.

| Another thing to check is what underlying Lapack routine is called.
| There are two: dgesvd, which is the older algorithm and the one with
| the expected name if you know the Lapack conventions, and dgesdd which
| is a newer and faster algorithm. R uses dgesdd.

Good call. It looks like Armadillo uses dgesv(d); grep did not find dgesdd, it seems:

edd at max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ grep dges *
atlas_bones.hpp: int wrapper_clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, double *A, const int lda, int *ipiv, double *B, const int ldb);
atlas_wrapper.hpp: return arma_atlas(clapack_dgesv)(Order, N, NRHS, (T*)A, lda, ipiv, (T*)B, ldb);
glue_histc_meat.hpp: "histc(): parameter 'edges' must be a vector"
lapack_bones.hpp: #define arma_dgesvd dgesvd
lapack_bones.hpp: #define arma_dgesv dgesv
lapack_bones.hpp: #define arma_dgesvd DGESVD
lapack_bones.hpp: #define arma_dgesv DGESV
lapack_bones.hpp: void arma_fortran(arma_dgesvd)(char* jobu, char* jobvt, blas_int* m, blas_int* n, double* a, blas_int* lda, double* s, double* u, blas_int* ldu, double* vt, blas_int* ldvt, double* work, blas_int* lwork, blas_int* info);
lapack_bones.hpp: void arma_fortran(arma_dgesv)(blas_int* n, blas_int* nrhs, double* a, blas_int* lda, blas_int* ipiv, double* b, blas_int* ldb, blas_int* info);
lapack_wrapper.hpp: arma_fortran(arma_dgesvd)(jobu, jobvt, m, n, (T*)a, lda, (T*)s, (T*)u, ldu, (T*)vt, ldvt, (T*)work, lwork, info);
lapack_wrapper.hpp: arma_fortran(arma_dgesv)(n, nrhs, (T*)a, lda, ipiv, (T*)b, ldb, info);

Conrad, any interest in switching to dgesdd?
Dirk, at useR and across the room from Doug
| >> > | ? ? | I've been toying with both RcppArmadillo and RcppEigen the past | >> > few days | >> > | ? ? and | >> > | ? ? | don't know which library to continue using. RcppEigen seems | >> > really slick, | >> > | ? ? but | >> > | ? ? | appears to be lacking some of the decompositions I want and | >> > isn't nearly | >> > | ? ? as | >> > | ? ? | fast to code. RcppArmadillo seems about as fast, easier to code | >> > up etc. | >> > | ? ? What | >> > | ? ? | are some of the advantages/disadvantages of both? | >> > | | >> > | ? ? That's pretty close. ?I have been a fan of [Rcpp]Armadillo which I | >> > find | >> > | ? ? easier to get my head around. ?Doug, however, moved from | >> > [Rcpp]Armadillo | >> > | ? ? to | >> > | ? ? [Rcpp]Eigen as it has some things he needs. ?Eigen should have a | >> > "larger" | >> > | ? ? API | >> > | ? ? than Armadillo, but I find the code and docs harder to navigate. | >> > | | >> > | ? ? And you should find Eigen to be a little faster. Andreas Alfons | >> > went as far | >> > | ? ? as building 'robustHD' using RcppArmadillo with a drop-in for | >> > RcppEigen (in | >> > | ? ? package 'sparseLTSEigen'; both package names from memmory and I | >> > may have | >> > | ? ? mistyped). ?He reported a performance gain of around 25% for his | >> > problem | >> > | ? ? sets. ?On the 'fastLm' benchmark, we find the fast Eigen-based | >> > | ? ? decompositions | >> > | ? ? to be much faster than Armadillo. | >> > | | >> > | ? ? | Can you call LAPACK or BLAS from either? Is there a wrapper in | >> > RcppEigen | >> > | ? ? to | >> > | ? ? | call LAPACK functions? Want some other decomposition methods, | >> > dont like | >> > | ? ? the | >> > | ? ? | JacobiSVD method in Eigen. | >> > | | >> > | ? ? You need to differentiate between the Eigen and Armadillo docs | >> > _for their | >> > | ? ? libraries_ and what happens when you access the Rcpp* variant from | >> > R. | >> > | | >> > | ? ? 
RcppArmadillo will use the very same LAPACK and BLAS libs your R | >> > session | >> > | ? ? uses. So MKL, OpenBlas, ... are all options. ?Eigen actually has | >> > its own | >> > | ? ? code | >> > | ? ? outperforming LAPACK, so it doesn't ?as much there. | >> > | | >> > | ? ? Hope this helps, ? Dirk (at useR!) | >> > | | >> > | ? ? | | >> > | ? ? | | >> > ---------------------------------------------------------------------- | >> > | ? ? | _______________________________________________ | >> > | ? ? | Rcpp-devel mailing list | >> > | ? ? | Rcpp-devel at lists.r-forge.r-project.org | >> > | ? ? | | >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel | >> > | ? ? -- | >> > | ? ? Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com | >> > | | >> > | | >> > | >> > -- | >> > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com | >> > _______________________________________________ | >> > Rcpp-devel mailing list | >> > Rcpp-devel at lists.r-forge.r-project.org | >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel | > | >
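The cost difference pointed out above (singular vectors versus singular values only) is easy to see from NumPy, which also calls into LAPACK underneath. This is only an illustrative sketch, not the original benchmark; it uses a smaller matrix so it runs quickly, and absolute timings will vary by machine and BLAS.

```python
# Sketch: requesting only the singular values is much cheaper than
# also forming the singular vectors U and V.
import time
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((800, 800))   # far smaller than the 5000x5000 example

t0 = time.perf_counter()
U, s_full, Vt = np.linalg.svd(A)      # singular values AND vectors
t_full = time.perf_counter() - t0

t0 = time.perf_counter()
s_only = np.linalg.svd(A, compute_uv=False)   # singular values only
t_vals = time.perf_counter() - t0

# Both paths agree on the singular values themselves.
assert np.allclose(s_full, s_only)
print(f"full SVD: {t_full:.3f}s, values-only: {t_vals:.3f}s")
```

In Armadillo the analogous values-only call is the two-argument overload `svd(s, X)`, which skips computing U and V.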
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
On Thu, Jun 14, 2012 at 4:43 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
And you should find Eigen to be a little faster. Andreas Alfons went as far as building 'robustHD' using RcppArmadillo with a drop-in for RcppEigen (in package 'sparseLTSEigen'; both package names from memory and I may have mistyped). He reported a performance gain of around 25% for his problem sets. On the 'fastLm' benchmark, we find the fast Eigen-based decompositions to be much faster than Armadillo.
This is a misconception that needs to be addressed. For equivalent functionality, Armadillo is not necessarily any slower than Eigen, provided suitable Lapack and/or Blas libraries are used (such as Intel's MKL or AMD's ACML, or even the open-source Atlas or OpenBlas in many cases). Standard Lapack and Blas are just that: a "better than nothing" baseline implementation in terms of performance.

Armadillo doesn't reimplement Lapack and it doesn't reimplement any decompositions -- it uses Lapack. (**This is a very important point, which I elaborate on below**). As such, the speed of Armadillo for matrix decompositions is directly dependent on the particular implementation of Lapack that's installed on the user's machine. I've seen some ridiculous speed differences between standard Lapack and MKL. The latter not only has CPU-specific optimisations (eg. using the latest AVX extensions), but can also do multi-threading.

Simply installing ATLAS (which provides speed-ups for several Lapack functions) on Debian/Ubuntu systems can already make a big difference. (Debian & Ubuntu use a trick to redirect Lapack and Blas calls to ATLAS). Under Mac OS X, the Accelerate framework provides fast implementations of Lapack and Blas functions (eg. using multi-threading).

I've taken the modular approach to Armadillo (ie. using Lapack rather than reimplementing decompositions), as it specifically allows other specialist parties (such as Intel) to provide Lapack that is highly optimised for particular architectures. I myself would not be able to keep up with the specific optimisations required for each CPU. This also "future-proofs" Armadillo for each new CPU generation.

More importantly, numerically stable implementation of computational decompositions/factorisations is notoriously difficult to get right. The core algorithms in Lapack have been evolving for the past 20+ years, being exposed to a bazillion corner-cases. Lapack itself is related to Linpack and Eispack, which are even older.
I've been exposed to software development long enough to know that in the end only time can shake out all the bugs. As such, using Lapack is far less risky than reimplementing decompositions from scratch. A "home-made" matrix decomposition might be a bit faster on a particular CPU, but you have far less knowledge as to when it's going to blow up in your face. High-performance variants of Lapack, such as MKL, take an existing proven implementation of a decomposition algorithm and recode parts of it in assembly, and/or parallelise other parts.
On Jun 15, 2012 12:11 AM, "Dirk Eddelbuettel" wrote:
Thanks for that earlier hint re 'thin' and 'full' SVDs.
Armadillo has the standard svd() and the thin version too: svd_econ(). See http://arma.sourceforge.net/docs.html#svd_econ
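For readers who think in NumPy terms, the difference between a full SVD and the "thin" (economy) SVD that svd_econ() computes is only in the shapes of the returned factors. A small sketch (NumPy's `full_matrices=False` plays the role of svd_econ() here):

```python
# Thin ("economy") vs full SVD: for a tall m x n matrix, the full SVD
# carries an m x m matrix U, while the thin SVD keeps only its first
# n columns -- much less work and memory for tall, skinny matrices.
import numpy as np

A = np.random.default_rng(0).standard_normal((200, 30))

U_full, s, Vt = np.linalg.svd(A, full_matrices=True)
U_thin, s2, Vt2 = np.linalg.svd(A, full_matrices=False)

print(U_full.shape)  # (200, 200)
print(U_thin.shape)  # (200, 30) -- the economy-size factor

# The thin factors still reconstruct A to numerical precision.
assert np.allclose(U_thin @ np.diag(s2) @ Vt2, A)
```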
Conrad, any interest in switching to dgesdd?
Yes, but this is not simply a matter of making svd() use dgesdd() instead of dgesvd(). The Lapack authors added the dgesdd() function instead of simply replacing the underlying algorithm used by dgesvd(). This was done for a reason: the results produced by dgesvd() and dgesdd() can be "slightly" different (where "slightly" is dependent on matrix size and matrix content). If Armadillo starts giving different results all of a sudden, there would be a lot of displeased people.

I'll probably add an option to Armadillo's svd() to optionally use dgesdd(). I've done something very similar for eig_sym() in Armadillo 3.2, where the faster "divide and conquer" algorithm for eigen decomposition can be optionally used. See http://arma.sourceforge.net/docs.html#eig_sym
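The driver difference described above can be checked from SciPy, which lets the caller select either LAPACK routine (this sketch assumes SciPy is installed; the `lapack_driver` argument requires a reasonably recent SciPy):

```python
# Comparing LAPACK's two SVD drivers: dgesvd (the classic algorithm)
# and dgesdd (divide and conquer). scipy.linalg.svd can select either.
import numpy as np
from scipy.linalg import svd

A = np.random.default_rng(1).standard_normal((300, 300))

s_gesvd = svd(A, compute_uv=False, lapack_driver='gesvd')
s_gesdd = svd(A, compute_uv=False, lapack_driver='gesdd')

# The results agree to floating-point tolerance, but not necessarily
# bit-for-bit -- which is exactly why silently switching drivers
# could change users' results slightly.
assert np.allclose(s_gesvd, s_gesdd)
```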
On 15 June 2012 at 02:56, c s wrote:
| Simply installing ATLAS (which provides speed-ups for several Lapack
| functions) on Debian/Ubuntu systems can already make a big difference.
| (Debian & Ubuntu use a trick to redirect Lapack and Blas calls to
| ATLAS). Under Mac OS X, the Accelerate framework provides fast
| implementations of Lapack and Blas functions (eg. using
| multi-threading).

I found OpenBLAS to be faster than Atlas (with both coming from the
distribution), and tend to install that now. It uses a multithreaded
approach by default, but I have to double check one thing about cpu
affinity which, when used from R, has been seen to interfere. That is a
possible reason for the single-threaded performance here. I'll report
back.

| I've taken the modular approach to Armadillo (ie. using Lapack rather
| than reimplementing decompositions), as it specifically allows other
| specialist parties (such as Intel) to provide Lapack that is highly
| optimised for particular architectures. I myself would not be able to
| keep up with the specific optimisations required for each CPU. This
| also "future-proofs" Armadillo for each new CPU generation.
|
| More importantly, numerically stable implementation of computational
| decompositions/factorisations is notoriously difficult to get right.
| The core algorithms in Lapack have been evolving for the past 20+
| years, being exposed to a bazillion corner-cases. Lapack itself is
| related to Linpack and Eispack, which are even older. I've been
| exposed to software development long enough to know that in the end
| only time can shake out all the bugs. As such, using Lapack is far
| less risky than reimplementing decompositions from scratch. A
| "home-made" matrix decomposition might be a bit faster on a particular
| CPU, but you have far less knowledge as to when it's going to blow up
| in your face.
|
| High-performance variants of Lapack, such as MKL, take an existing
| proven implementation of a decomposition algorithm and recode parts of
| it in assembly, and/or parallelise other parts.

All excellent points which nobody disputes. It just so happens that
Eigen still looks better in some benchmarks but as you say we have to
ensure we really compare apples to apples.

Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
1 day later
On Thu, Jun 14, 2012 at 2:36 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| [...]
| All excellent points which nobody disputes. It just so happens that
| Eigen still looks better in some benchmarks but as you say we have to
| ensure we really compare apples to apples.
While I realize that Conrad has strong opinions on these issues and is unlikely to be moved from those opinions, I would make a few points in favor of Eigen.

- Using BLAS/Lapack is effective because that is mature, well-tested code. It is ineffective because the reference implementations of BLAS/Lapack are in Fortran 77 and suffer from all the idiosyncrasies of that ancient language. These include call-by-reference semantics, even for scalars, lack of structures beyond pointers to arrays, and the inability to allocate storage. Hence, the caller frequently must pass an array 'work' and its length 'lwork' and, occasionally, 'rwork' and 'iwork'. There is a helpful convention for getting the length of the work array correct which involves a preliminary call with lwork = -1 that does nothing but write the desired work array length into the first element of the work array. The caller must then allocate the work array, change the value of lwork and call the routine a second time. Because the only structure available is a pointer to a one-dimensional array, you must pass a general matrix as four arguments M, N, A, LDA. There are no machine-readable header files. Jeff Bezanson recently joked about writing a parser for the upper-case Fortran comments to create headers. Does any of this sound like 21st century software design?

- MKL blas is very good on Intel hardware. Naturally Intel doesn't go to great lengths to support AMD processors to the same extent. MKL is proprietary, closed-source software, and plugging it into an open-source software system like R or Armadillo feels like a compromise. There are open alternatives like Atlas and OpenBlas. However, these all need to conform to the Fortran calling sequence specifications (Lapacke and CBLAS do provide C calling sequences but it is not a good idea to count on their availability at present). And notice that this approach contradicts the "Lapack code is complex but widely tested so it should be left untouched" argument. The way to get good performance in OpenBlas and MKL is to rewrite the BLAS and heavily-used parts of Lapack, like dgetrf, in a combination of C and Assembler. I would also point out that one of the reasons that Lapack code is complex and difficult to understand is that it is written in the archaic language Fortran 77.

- The approach in Eigen was to use templated C++ classes for dense and sparse matrices and for the decompositions and solvers. I don't think readers of this group need to be reminded of the advantages of templated object-oriented C++ code. Eigen has a bit of a learning curve but it also has a great range of capabilities.

These comments may provoke a heated response from Conrad but, if so, I don't plan to respond further. Eigen and Armadillo are different approaches, each with their own advantages and disadvantages.
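The lwork = -1 two-call convention described above can be sketched with a mock routine. Note that `fake_dgesvd` below is a hypothetical stand-in written only to show the calling protocol; it is NOT the real LAPACK dgesvd, and its workspace-size formula is purely illustrative.

```python
# Mock of the Fortran 77 two-call convention: a first call with
# lwork = -1 writes the required workspace size into work[0]; the
# caller then allocates the array and calls again "for real".
def fake_dgesvd(m, n, work, lwork):
    optimal = 3 * min(m, n) + max(m, n)   # illustrative size formula only
    if lwork == -1:                       # workspace query call
        work[0] = float(optimal)
        return 0                          # info = 0: success
    if lwork < optimal:
        return -13                        # LAPACK-style "bad argument" info
    # ... the actual decomposition would happen here ...
    return 0

work = [0.0]
assert fake_dgesvd(100, 50, work, -1) == 0    # step 1: query
lwork = int(work[0])
work = [0.0] * lwork                          # step 2: caller allocates
assert fake_dgesvd(100, 50, work, lwork) == 0 # step 3: real call
```

The awkwardness is plain: the caller manages memory the routine cannot allocate for itself, which is precisely the Fortran 77 limitation being criticised.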
1 day later
On Sun, Jun 17, 2012 at 4:21 AM, Douglas Bates wrote:
These comments may provoke a heated response from Conrad but, if so, I don't plan to respond further. Eigen and Armadillo are different approaches, each with their own advantages and disadvantages.
I'm actually in favour of heterogeneous solutions and support the use of C++ for performance critical implementations. Relying on only one solution (ie. homogeneity) is bad from a "survival" point of view, for very similar reasons as in the biological world: a fault can affect the entire population, or a virus can propagate itself quite easily and do lots of nasty things. For example, see the recent snafu with state-sponsored viruses attacking Windoze installations, ie. Stuxnet and Flame: http://www.h-online.com/security/features/FAQ-Flame-the-super-spy-1587063.html

In my view, the C++ brethren Armadillo and Eigen are not really up against each other, but against Matlab. Matlab is like crack or crystal meth. Highly addictive, but in the end bad for you. Mathworks, the company that produces Matlab, hooks people in while they are university students. It provides low prices to educational institutions, and charges through the roof for everyone else. Students become graduates, who are then employed by the companies that are in effect stuck with using Matlab. This is a strategy known as vendor lock-in, exploited by monopolists. Other descriptions consistent with the business model of Mathworks are "parasite" and "leech".

One way of keeping Mathworks in check is to provide alternative solutions that have similar functionality (eg. easy visualisation of data and rapid prototyping). One of them is GNU Octave, which really needs a JIT compiler to effectively compete with Matlab. Another is R. I believe R currently doesn't have a JIT compiler (I haven't checked lately), and hence the very useful Rcpp fills in the performance gap.

One also has to address the current reality that people still overly rely on Matlab, and have the problem of converting their Matlab code into production environments / products. This is where C++ comes in, and libraries such as Armadillo (with its Matlab-like syntax) can be quite handy.
On Sun, Jun 17, 2012 at 11:09 PM, c s <conradsand.arma at gmail.com> wrote:
On Sun, Jun 17, 2012 at 4:21 AM, Douglas Bates wrote:
These comments may provoke a heated response from Conrad but, if so, I don't plan to respond further. Eigen and Armadillo are different approaches, each with their own advantages and disadvantages.
I'm actually in favour of heterogeneous solutions and support the use of C++ for performance critical implementations. Relying on only one solution (ie. homogeneity) is bad from a "survival" point of view, for very similar reasons as in the biological world: a fault can affect the entire population, or a virus can propagate itself quite easily and do lots of nasty things. For example, see the recent snafu with state-sponsored viruses attacking Windoze installations, ie. Stuxnet and Flame: http://www.h-online.com/security/features/FAQ-Flame-the-super-spy-1587063.html

In my view, the C++ brethren Armadillo and Eigen are not really up against each other, but against Matlab. Matlab is like crack or crystal meth. Highly addictive, but in the end bad for you. Mathworks, the company that produces Matlab, hooks people in while they are university students. It provides low prices to educational institutions, and charges through the roof for everyone else. Students become graduates, who are then employed by the companies that are in effect stuck with using Matlab. This is a strategy known as vendor lock-in, exploited by monopolists. Other descriptions consistent with the business model of Mathworks are "parasite" and "leech".
I completely and happily agree.
One way of keeping Mathworks in check is to provide alternative solutions that have similar functionality (eg. easy visualisation of data and rapid prototyping). One of them is GNU Octave, which really needs a JIT compiler to effectively compete with Matlab. Another is R. I believe R currently doesn't have a JIT compiler (I haven't checked lately), and hence the very useful Rcpp fills in the performance gap.
The recently-developed Julia language (JuliaLang.org) has a JIT and a syntax that is familiar to Matlab and R users. It is still very much a work in progress but shows great promise.
One also has to address the current reality that people still overly rely on Matlab, and have the problem of converting their Matlab code into production environments / products. This is where C++ comes in, and libraries such as Armadillo (with its Matlab-like syntax) can be quite handy.
On Sun, Jun 17, 2012 at 9:09 PM, c s <conradsand.arma at gmail.com> wrote:
Another is R. I believe R currently doesn't have a JIT compiler (I haven't checked lately), and hence the very useful Rcpp fills in the performance gap.
Still work in progress but see http://www.milbo.users.sonic.net/ra/jit.html. Doesn't replace Rcpp by any stretch of the imagination, but may close the gap a little bit.
On 18 June 2012 at 11:55, Antonio Piccolboni wrote:
| On Sun, Jun 17, 2012 at 9:09 PM, c s <conradsand.arma at gmail.com> wrote:
| | Another is R. I believe R currently doesn't have a JIT compiler (I
| | haven't checked lately), and hence the very useful Rcpp fills in the
| | performance gap.
|
| Still work in progress but see http://www.milbo.users.sonic.net/ra/jit.html.
| Doesn't replace Rcpp by any stretch of the imagination, but may close
| the gap a little bit.

Sadly, Ra and jit are no longer maintained by Stephen Milborrow. I
actually used to provide a patched binary for this within Debian. This
was ahead of its time (given the renewed interest in JITs from LLVM and
other places) and quite useful, but Ra/jit is at this point a dead
project.

Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
Ooops, wrong JIT! Dirk is absolutely right. I meant to say: see enableJIT in the compiler package, http://www.r-statistics.com/2012/04/speed-up-your-r-code-using-a-just-in-time-jit-compiler/ I second Dirk's comments on Ra.

Antonio
As I understand it, both Ra and the R 'compiler' package are bytecode JITs, so roughly like Java '.class' and Emacs '.elc' files. There is no current method that compiles R into machine code, which is what JIT in the JVM does, and which is why Java does so well in the language shootout [1]. Davor [1] http://shootout.alioth.debian.org/
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel