Skip to content

[Rcpp-devel] Filling NumericMatrix with NumericVectors with apply by row/column?

12 messages · Christian Gunning, Dirk Eddelbuettel, Romain Francois

#
I was thinking about this today, and I wondered if getter/setter
functions for NumericMatrix, along with MatrixIndex classes might make
any sense, as opposed to sugar? Something like this:

SEXP foo1( int n ){
  NumericVector x(n);
  MatrixIndex i(1);  // row index
  NumericMatrix xx(n,n) ;
  // possible to assign by row or column?
  for (i.index =0; i.index < n; i.index++) {
    xx.Setter(i) = zeros(n) ;
  }
  return(xx)
}

class MatrixIndex:
{
  // use apply()'s convention of 1=row, 2=col
  // row/col specification is set on creation
  private:
    int m_rowcol;
  public:
    int index;
    MatrixIndex( int rowcol, int i=0) {
      m_rowcol = rowcol;
      index = i;
    }
};

best,
Christian

On Sun, Aug 22, 2010 at 5:23 AM, Romain Francois
<romain at r-enthusiasts.com> wrote:

  
    
#
On 7 September 2010 at 17:26, Christian Gunning wrote:
| I was thinking about this today, and I wondered if getter/setter
| functions for NumericMatrix, along with MatrixIndex classes might make
| any sense, as opposed to sugar? Something like this:
| 
| SEXP foo1( int n ){
|   NumericVector x(n);
|   MatrixIndex i(1);  // row index
|   NumericMatrix xx(n,n) ;
|   // possible to assign by row or column?
|   for (i.index =0; i.index < n; i.index++) {
|     xx.Setter(i) = zeros(n) ;
|   }
|   return(xx)
| }
| 
| class MatrixIndex:
| {
|   // use apply()'s convention of 1=row, 2=col
|   // row/col specification is set on creation
|   private:
|     int m_rowcol;
|   public:
|     int index;
|     MatrixIndex( int rowcol, int i=0) {
|       m_rowcol = rowcol;
|       index = i;
|     }
| };

I have to admit that questions like this always leads to _numeric classes_
such as Armadillo or NewMat rather than our R proxy classes.  Did you see
what Armadillo:  http://arma.sourceforge.net/docs.html#insert_rows

Would that help?  Passing matrices from R via Rcpp to Armadillo (and back) is
pretty easy.

Cheers, Dirk

PS We are a little behind wrapping Armadillo 0.9.70 in RcppArmadillo but we
will get there "soon".

 
| best,
| Christian
| 
| On Sun, Aug 22, 2010 at 5:23 AM, Romain Francois
| <romain at r-enthusiasts.com> wrote:
| > Hello,
| >
| > There currently is no sugar facility to generate a matrix the way you want.
| > The last option is probably the best thing to do for now.
| >
| > Perhaps outer can help you :
| >
| > NumericVector xx(x) ;
| > NumericVector yy(y);
| > NumericMatrix m = outer( xx, yy, std::plus<double>() ) ;
| > return m ;
| >
| > Romain
| >
| > Le 21/08/10 23:13, Christian Gunning a ?crit :
| >>
| >> Dear list,
| >>
| >> I'm amazed at the ability to use the apply family in Rcpp. ?Yet I'm still
| >> unsure of the best way to assign NumericVector objects into
| >> NumericMatrix objects. ?Must this be done element-by-element, or is
| >> there something equivalent to R's MyMatrix[,1] = MyColVector?
| >> (As an aside, are both mymatrix[i,j] and mymatrix(i,j) equivalent? ?It
| >> seems that I've seen them used interchangably on the list.)
| >>
| >> A simple example of what I'm aiming for:
| >> Make an n*n matrix, and use sapply to perform a vector operation by
| >> row, here constructing a vector of length n full of zeros.
| >>
| >> // a simple vector-returning function
| >> NumericVector zeros( int n){
| >> ?NumericVector ret(n);
| >> ?ret = 0;
| >> ?return ret;
| >> }
| >>
| >> // sapply version, doesn't work but is easy to read
| >> SEXP foo( int n ){
| >> ?NumericVector x(n);
| >> ?x = n;
| >> ?NumericMatrix xx(n,n) ;
| >> ?// possible to assign by row or column?
| >> ?xx = sapply( x, zeros ) ;
| >> ?return(xx);
| >> }
| >>
| >> // the looped version, where xx[,i] is not syntactically valid
| >> SEXP foo1( int n ){
| >> ?NumericVector x(n);
| >> ?int i;
| >> ?NumericMatrix xx(n,n) ;
| >> ?// possible to assign by row or column?
| >> ?for (i =0; i<n; i++) {
| >> ? ?xx[,i] = zeros(n) ;
| >> ?}
| >> ?return(xx)
| >> }
| >>
| >>
| >> // syntactically valid, element-wise assignment
| >> SEXP foo2( int n ){
| >> ?NumericVector x(n);
| >> ?int i, j;
| >> ?NumericMatrix xx(n,n) ;
| >> ?// possible to assign by row or column?
| >> ?for (i=0; i<n; i++) {
| >> ? ?x = zeros(n) ;
| >> ? ?for (j=0; ?j<n; j++) {
| >> ? ? ?xx(i,j) = x[j]
| >> ? ?}
| >> ?}
| >> ?return(xx)
| >> }
| >>
| >> thanks so much,
| >> Christian Gunning
| >> --
| >> A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal ? Panama!
| >
| > --
| > Romain Francois
| > Professional R Enthusiast
| > +33(0) 6 28 91 30 30
| > http://romainfrancois.blog.free.fr
| > |- http://bit.ly/bzoWrs : Rcpp svn revision 2000
| > |- http://bit.ly/b8VNE2 : Rcpp at LondonR, oct 5th
| > `- http://bit.ly/aAyra4 : highlight 0.2-2
| >
| >
| 
| 
| 
| -- 
| A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal ? Panama!
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
#
Yes, I see the reinvent-the-wheel problem, especially after combing
through rcpp-devel archives some more.  I admit that I don't well
understand the relative costs/benefits of extending the R proxy class
versus outsourcing to Armadillo.  Two examples that come to mind are
Rcpp::Array and Rcpp::DataFrame.

In the first case, could I conceivably turn an input 3D SEXP array
into, e.g., NumericVector( Dimension(2,3,4) ), create an arma::cube
out of it, process it and return?

In the second case, there's a column level but no row-level accessor
for the DataFrame (that I could find), nor is there a mapping into
Armadillo. I get the impression that DataFrame is more of a
convenience class for *returning* data rather than processing it?

best,
Christian
On Tue, Sep 7, 2010 at 5:48 PM, Dirk Eddelbuettel <edd at debian.org> wrote:

  
    
#
Le 08/09/10 06:07, Christian Gunning a ?crit :
I think so. I have not tried though ... have you ?
Yes. It makes it easy to create a data frame and access it by column.

We'd be open to extend the interface if you provide some sort of design 
of the sort of code you'd want to write.

Romain

  
    
#
On Tue, Sep 7, 2010 at 11:40 PM, Romain Francois
<romain at r-enthusiasts.com> wrote:
No, not yet.  I just now went mucking about through the armadillo
classes (and, by extension, the rcpp-devel archives) at Dirk's
suggestion.
Yeah, I apologize - I'm just at the edge of my experience here.  I
spent the last few days learning proper C++ OOP writing (as opposed to
reading) in response to your earlier comment about references.  I
admit, templates are still magic to me.

I guess I'm still curious about the fundamental design goals.
RcppArmadillo allows us to easily use armadillo's rich object model,
with a wealth of object-specific methods, e.g.:

mat A = randu<mat>(5,10);
mat B = ones<mat>(5,2);
// at column 2, insert a copy of B;
A.insert_cols(2, B);

Personally, R's indexing is one of my favorite parts of the language
(and C++ indexing my least). With this in mind, I've wondered about
the possibility of rolling some sort of generic Index class that
contains enough information to do (as above):

A(aColIndex) = B;

So, if the overall goal is Simple/Reads-more-like-C++, then it's a
moot point.  If something like this is worth doing, is it worth doing
for a variety of Rcpp classes?  I admit that i'm not entirely clear on
the cost side of the "worth" equation here :)

Christian

  
    
#
Le 08/09/10 11:24, Christian Gunning a ?crit :
I'm sure people will be interested about the results of this experiment :-)
that's alright. templates can do amazing things, but can also give 
persistant headaches.
Me too
We have started very shy steps towards this. See for example the 
convolve5_cpp.cpp file :

RcppExport SEXP convolve5cpp(SEXP a, SEXP b) {
     NumericVector xa(a); int n_xa = xa.size() ;
     NumericVector xb(b); int n_xb = xb.size() ;
     NumericVector xab(n_xa + n_xb - 1,0.0);

     for(int i=0; i<n_xa; i++){
     	xab[ Range(i, i+n_xb-1) ] += xa[i] * xb ;
     }
     return xab ;
}

as opposed to the usual implementation:

RcppExport SEXP convolve3cpp(SEXP a, SEXP b){
     Rcpp::NumericVector xa(a);
     Rcpp::NumericVector xb(b);
     int n_xa = xa.size() ;
     int n_xb = xb.size() ;
     int nab = n_xa + n_xb - 1;
     Rcpp::NumericVector xab(nab);

     for (int i = 0; i < n_xa; i++)
         for (int j = 0; j < n_xb; j++)
             xab[i + j] += xa[i] * xb[j];

     return xab ;
}

would be interesting to go beyond Range and have a more flexible system.
Me neither (although I might have a better idea).

We want C++ code to look more like R code, yet be more efficient, so 
there is scope for such things. It will be easier to come up with an 
estimate when you hand the design document. Then we can decide if this 
is something we want to do upfront, something for which we will need 
extra help (patches, someone's time, funding to get some extra time from 
us, etc ...), something we might do later ...

Romain

  
    
12 days later
#
On Tue, Sep 7, 2010 at 11:40 PM, Romain Francois
<romain at r-enthusiasts.com> wrote:
Here we go - it works under certain circumstances:

fx_flat <- cxxfunction( signature(x = "integer", y = "array" ) ,
  '
    NumericVector ret(Dimension(2,2,2));
    ret = NumericVector(y);
    ret = ret * as<int>(x);  // this "squishes" ret back to 1D
    return ret;
  ', plugin = "RcppArmadillo" )
[1]  2  4  6  8 10 12 14 16

fx_cube <- cxxfunction( signature(x = "integer", y = "array" ) ,
  '
      // note - NumericVector ret(y) is flat
    NumericVector ret(Dimension(2,2,2));
      // use copy constructor to keep dimension
    ret = NumericVector(y);
      // copy_aux_mem = false is the only way to return retcube?
    arma::cube retcube(ret.begin(), 2, 2, 2, false);
    retcube = retcube * as<int>(x);
    return ret;
  ', plugin = "RcppArmadillo" )
, , 1
     [,1] [,2]
[1,]    2    6
[2,]    4    8
, , 2
     [,1] [,2]
[1,]   10   14
[2,]   12   16

A few points stand out to me -

* If i'm not doing pointer-magic and/or only using Rcpp/Armadillo
operators, is copy_aux_mem = false a reasonable way to return retcube
after processing?  I don't see any other way to get an SEXP out of an
arma::cube, but i'm skeptical of "can be dangerous unless you know",
since i don't.

* It seems to me that row/column/slice indexer semantics, rather than
linearArray[ thisrow + thiscol*(ncol-1) ] semantics (which i always
seem to mess up) is one reason these mailing list questions about
arrays keep popping up.

* What about "ret = ret * as<int>(x);" causes ret to lose it's dimension?

best,
Christian
#
Le 21/09/10 10:57, Christian Gunning a ?crit :
Ah. We should implement NumericVector::operator*= etc ...

You can always attach dimensions later:

ret.attr( "dim" ) = Dimension( 2, 2, 2) ;

I'll try to think of some ways to keep the dimensions. Perhaps we do 
need some sort of Rcpp::Array class
Yes. Armadillo has a separate class for cubes, which makes all this 
work. That's why I think we need Rcpp::Array too.
I think this is fine. You essentially tell armadillo that the memory is 
yours and no to mess with it, it won't.

You can however Rcpp::wrap a cube. It is safer because the memory gets 
copied, but less efficient, because the memory is copied.
Sure. I need to think a little bit more about how to do Rcpp::Array, etc 
... because an Array could have an arbitrary number of dimensions.
Perhaps embedding information into the class would work.

// make an array of 3 dimensions, etc ...
Rcpp::Array<3> ret( y ) ;

I have very little time to devote to this right now.
Omission, lack of time, I'm not using arrays all that much personally, 
etc ...

  
    
#
Ah, yes.  Thanks.
Got it.  I erroneously assumed an implicit wrap somewhere.
Yes, I've been thinking similarly about an Indexer class.  Accounting
for a variable number of dimensions looks unpleasantly messy.
Oh, I was thinking more along the lines of "by what mechanism is
dimension lost" than "why is grass green" :)

-Christian
#
Le 21/09/10 11:46, Christian Gunning a ?crit :
I'd rather encapsulate this a bit rather than let people touch the dim 
attribute, or perhaps mimic what R is doing :

 > x <- 1:24
 > attr( x, "dim" ) <- c(2,12)
 > x
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]    1    3    5    7    9   11   13   15   17    19    21    23
[2,]    2    4    6    8   10   12   14   16   18    20    22    24
 > attr( x, "dim" ) <- c(2,17)
Erreur dans attr(x, "dim") <- c(2, 17) :
   dims [produit 34] ne correspond pas ? la longueur de l'objet [24]

and gatekeep inside the attr() function
There was no implicit wrap in the code you shown.
Yes. Feel free to share thoughts on that.
Because we make a new vector and don't keep the attributes of the old 
one, including the "dim" attribute.
#
On Tue, Sep 21, 2010 at 2:56 AM, Romain Francois
<romain at r-enthusiasts.com> wrote:
Maybe solving the just the 3D case instead of n-D solve the majority
of use cases?
Since an R-style mat(2,) is illegal, perhaps a simple solution is
using booleans in the index position to indicate "all",   This solves
a decent range of use cases, and should be accessible to the "native
R" crowd. Here's a simple one:

NumericMatrix ret(inmat);
double lambda = as<double>(inlambda);
int nr = ret.nrows();
int nc = ret.ncols();
int i;
for (i=0; i<nr; i++) {
  ret(i,true) = rpois(nc, lambda);
};

The next logical extension of matrix/array indexing beyond boolean
would be to allow NumericVectors in each of the index positions.  At
this point, though, there's an explosion of over-loaded functions - a
3D Num myArray(x, y, z) gives 27 separate functions for x,y,z chosen
from {bool, int, NumericVector}, plus NumericMatrix and NumericVector.
 Here's where an indexer class starts to make sense - unified
constructors, along with clear dimensionality of the indexer and
simple checks of dimensional conformance between indexer and indexed.

As a side-note, I just spent some time with arma, and was sad to find
that arma_mat.insert_rows(atrow, rowvec) extends arma_mat, with no
apparent way to do row/col-level in-place replacement. So, at least
we're not whipping a dead horse.

Can you point me to the relevant NumericMatrix(i,j) indexer code?

thanks much,
Christian
#
Le 21/09/10 13:20, Christian Gunning a ?crit :
I don't like that. I'd rather go with a dummy class.

class Empty{} ;

and have something like this:

ret( i, Empty() ) = ...


Or perhaps use the "_" thing :

ret( i, _ ) = ...


Or perhaps have this syntax :

ret.row(i) = ...

We aleardy have row member function that returns a Row object, but it 
currently does not have a operator=, that could be taken care of.
Sure.
Look for the "offset" member functions in
https://r-forge.r-project.org/scm/viewvc.php/pkg/Rcpp/inst/include/Rcpp/vector/Vector.h?view=markup&revision=2031&root=rcpp

This is what is used by operator()(int,int) in Matrix.h:
https://r-forge.r-project.org/scm/viewvc.php/pkg/Rcpp/inst/include/Rcpp/vector/Matrix.h?view=markup&revision=1907&root=rcpp