[R-pkg-devel] Cannot create C code with acceptable performance with respect to internal R command.

Dear package developers,

in creating a package lavaanC for use in lavaan, I need to perform some matrix computations involving matrix products and crossproducts. As far as I see I cannot directly call the C code in the R core. So I copied the code in the R core, but the same C/C++ code in a package is 2.5 ? 3 times slower than executed directly in R :?

C code in package :?
? SEXP prod0(SEXP mat1, SEXP mat2) {
? ? SEXP u1 = Rf_getAttrib(mat1, R_DimSymbol);
? ? int m1 = INTEGER(u1)[0];
? ? int n1 = INTEGER(u1)[1];
? ? SEXP u2 = Rf_getAttrib(mat2, R_DimSymbol);
? ? int m2 = INTEGER(u2)[0];
? ? int n2 = INTEGER(u2)[1];
? ? if (n1 != m2) Rf_error("matrices not conforming");
? ? SEXP retval = PROTECT(Rf_allocMatrix(REALSXP, m1, n2));
? ? double* left = REAL(mat1);
? ? double* right = REAL(mat2);
? ? double* ret = REAL(retval);
? ? double werk = 0.0;
? ? for (int j = 0; j < n2; j++) {
? ? ? for (int i = 0; i < m1; i++) {
? ? ? ? ? werk = 0.0;
? ? ? ? for (int k = 0; k < n1; k++) werk += (left[i + m1 * k] * right[k + m2 * j]);
? ? ? ? ret[j * m1 + i] = ?werk;
? ? ? }
? ? }
? ? UNPROTECT(1);
? ? return retval;
? }

Test script :
m1 <- matrix(rnorm(300000), nrow = 60)
m2 <- matrix(rnorm(300000), ncol = 60)
print(microbenchmark::microbenchmark(
  m1 %*% m2, .Call("prod0", m1, m2), times = 100
))

Result on my pc:
Unit: milliseconds
                   expr     min      lq     mean  median       uq     max neval
              m1 %*% m2 10.5650 10.8967 11.13434 10.9449 11.02965 15.8397   100
 .Call("prod0", m1, m2) 29.3336 30.7868 32.05114 31.0408 33.85935 45.5321   100

Can anyone explain why the compiled code in the package is so much slower than in R core?

and

Is there a way to improve the performance in R package?

Best regards,

Luc De Wilde