Skip to content

[Rcpp-devel] matrix indexing

11 messages · Dirk Eddelbuettel, Whit Armstrong, Romain Francois

#
Hello,

Motivated by Christophe's question on R-help (cc'ed), I'd like to add 
facilities for matrix (and maybe arrays) indexing into the new Rcpp api.

The classic api already ships matrix indexing through RcppMatrix and 
RcppMatrixView classes. I'm not a big fan of using round brackets for 
indexing, but apparently one cannot pass more than one parameter to [] 
so ... I guess I'll have to sacrifice look and feel for a minute.


Anyway I was thinking we can maybe add NumericVector::operator()( int, 
int) (other vectors as well), so that we can do :

NumericVector m(x) ;
double x11 = m(0,0) ;


I was also thinking maybe making some sort of indexer class to take care 
of conversion of 2 indices into 1d index; something like this pseudo code :

NumericVector m(x) ;
NumericVector::MatrixIndexer indexer(m) ;
double x11 = m[ indexer(0,0) ] ;

But this might be even more ugly than operator().


Given than a REALSXP can change its mind about being a matrix, I'm not 
sure I want to subclass NumericVector into NumericMatrix, etc. ...

Ideas ?

Romain
#
On 23 January 2010 at 21:05, Romain Francois wrote:
| Hello,
| 
| Motivated by Christophe's question on R-help (cc'ed), I'd like to add 
| facilities for matrix (and maybe arrays) indexing into the new Rcpp api.
| 
| The classic api already ships matrix indexing through RcppMatrix and 
| RcppMatrixView classes. I'm not a big fan of using round brackets for 
| indexing, but apparently one cannot pass more than one parameter to [] 
| so ... I guess I'll have to sacrifice look and feel for a minute.
| 
| 
| Anyway I was thinking we can maybe add NumericVector::operator()( int, 
| int) (other vectors as well), so that we can do :
| 
| NumericVector m(x) ;
| double x11 = m(0,0) ;
| 
| 
| I was also thinking maybe making some sort of indexer class to take care 
| of conversion of 2 indices into 1d index; something like this pseudo code :
| 
| NumericVector m(x) ;
| NumericVector::MatrixIndexer indexer(m) ;
| double x11 = m[ indexer(0,0) ] ;
| 
| But this might be even more ugly than operator().
| 
| 
| Given than a REALSXP can change its mind about being a matrix, I'm not 
| sure I want to subclass NumericVector into NumericMatrix, etc. ...
| 
| Ideas ?

It's a really important topic, and I really don't know yet what I want. 

Recall another r-help questions of not too long ago when the almost-always-
complaining-yet-never-contributing Ivo Welch went on about needing a
faster lm().

I once wrote one using GNU gsl for the ordinary least squares part, but one
can do much_ better with proper C++ classes. As a grad student, I published a
review paper in the J of Applied Econometrics about Newmat (a C++ library I
used a lot at the time), and things have come a long way since.

Whit (also CC'ed) and I independently landed at using the clever armadillo
(templated C++) project at (IIRC) http://arma.sf.net, and there is also the
Eigen2 project (also templated C++, initially used only inside KDE).  

Whit essentially answered Ivo with a fast.lm project on github [ the boy is
lost in the wilderness and has yet not found the goodness that R-Forge is ;-)
-- see http://github.com/armstrtw/fast.lm ] 

Long story short: given that we pass these objects down to C++, we should
figure out a fast / light / convenient / elegant / [your term here] / ... way
to make use of them with other libraries.

Dirk
#
On 01/23/2010 09:23 PM, Dirk Eddelbuettel wrote:
As I see it, there are two distinct issues:

- convenience api to extract data from a matrix, what should it look 
like: m(i,j) or m[ something(i,j) ] or m[i][j] or whatever else ....

- how to pass data back and forth between Rcpp classes and the rest of 
the world. for this we have as and wrap so I guess the only things 
needed here are :

* converting a SEXP to some c++ class

as< armadillo::Matrix > (SEXP)
as< Newmat::Matrix >(SEXP)

(implicit conversion from RObject to SEXP takes care of the rest), so 
that we can write :

IntegerVector dims = { 4, 4 } ;
NumericVector x(16) ;
x.attr( "dim" ) = dims ;

Newmat::Matrix foo = as< Newmat::Matrix >(x) ;

* The other way is wrap's business, and so would come by overloading 
wrap, as in:

RObject wrap( Newmat::Matrix );
RObject wrap( armadillo::Matrix ) ;

(note that I have never used armadilla, and I only used newmat briefly a 
looooong time ago, so these class names are just made up for the purpose 
of the email)...

Anyway, just to say I don't think Rcpp has to ship these converters, but 
it would maybe fit nicely in another package depending on Rcpp... dunno.

Romain
#
yes, same issue for me in tslib.  but (i,j) is not so bad esp for
matlab users.  tslib users have it even harder since x() is used as
lag/lead. x[] is used as 1d accessor/setter, and x(i,j) is used as 2d
accessor/setter.

I think this is part of the reason people don't like c++. Overloading
can confuse the hell out of people, but good documentation should take
care of the negative sentiment.
I think this is useful.  I called mine "offset(i,j)" in tslib.
yes, but if you follow R's implicit class system, then you have to go
this way.  One possibility is to thow if you don't find a dims
attribute when calling a constructor on an SEXP, and what to do when
length dims == 3?
#
I think the Armadillo stuff is great.  MTL4 is also great, but lacking
some of the features of Armadillo.

http://www.osl.iu.edu/research/mtl/mtl4/

All the good libraries that I've seen allow you to construct objects
from a pointer to raw memory and dimensions which should be very easy
to write for Rcpp objects.  Armadillo also allows you to tell it not
to copy (which is essentially what tslib does when it wraps SEXP's).

You all are also touching indirectly on another topic.  We are all
doing interesting things in with R/c++ wrappers to make our lives
easier and make our code faster. What we really want is R++?  Should
we all be working on it as a separate project?  Or join the existing
one?

-Whit
On Sat, Jan 23, 2010 at 3:23 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
#
On 23 January 2010 at 18:42, Whit Armstrong wrote:
| You all are also touching indirectly on another topic.  We are all
| doing interesting things in with R/c++ wrappers to make our lives
| easier and make our code faster. What we really want is R++?  

Don;t forget we all want a pony too.  Now, seriously, what is R++?

| Should we all be working on it as a separate project?  Or join the existing
| one?

If I knew what you were talking about I might have a chance of answering this :)

Dirk
#
On 01/24/2010 12:42 AM, Whit Armstrong wrote:
I'll have to do some reading/googling before I can make sense of the above.
As Dirk, I am not sure to understand what you mean, so I'll second guess.

Do you mean :
- R/C++ wrappers ? In that case, I will be proprietary and pretend that 
Rcpp is the best available. I'd be happy for you to participate. I 
however will not start a new one. up to you.
- R written in C++ ? This is CXXR. The project is awesome and basically 
reimplement all of R on top of c++ rather than C. I still prefer Rcpp's 
approach because it can be used with regular R, which is what people 
use. Although if R core wants to follow Andrew's lead and rebase on C++, 
then I'll volunteer to help.

Romain

  
    
#
On 01/24/2010 12:33 AM, Whit Armstrong wrote:
Makes sense.
yes. but you cannot lock your object so that you control how attributes 
(e.g. dim) are set, so you have to check for bounds each time, etc ... 
we'll have to document that this is convenience operators and that there 
is a performance penalty, etc ...

  
    
#
Oh, boy. I _do_ want a pony!
I was talking about this one:
http://www.cs.kent.ac.uk/projects/cxxr

which Romain has pointed out in the next post.  For some reason, it is
a devil to find it on the web.

Anyhow, I know no one what's to give up the good thing we have with R,
but eventually I think this has to happen.  In some ways it's even
supportive of further language development. Just look at ruby and
python, reimplemented as jruby, jython, iron ruby, and iron python.

I'm not saying I'm the one who's going to kickstart this, but I'm
definitely supportive if anyone can mount a serious effort to do this.
 Which leads to the question about cxxr.  Is it any good?  Dirk, I
think you've mentioned that performance seems to be awful, but that
was some time ago.

Anyway, I took this thread seriously off topic w/ that reply, so feel
free to just ping me back off list if you all want to keep going on
this.

-Whit
#
On 24 January 2010 at 13:37, Whit Armstrong wrote:
| Oh, boy. I _do_ want a pony!
| 
| > Don;t forget we all want a pony too. ?Now, seriously, what is R++?
| > If I knew what you were talking about I might have a chance of answering this :)
| 
| I was talking about this one:
| http://www.cs.kent.ac.uk/projects/cxxr

Oh, we are aware of Andrew's efforts. We have talked to him too. It is just
that he has completely different goals ("redo R in C++") than we do.  While there
is some overlap, it is not clear it is all that useful.

Dirk
1 day later
#
Hi,

We now (in svn) have matrix-like indexing through operator() for all 
vectors : (character, integer, numeric, raw, logical, generic and 
expression).

Here is an example from the unit tests of NumericVector :

 > funx <- cfunction(signature(x = "numeric" ), '
+ NumericVector m(x) ;
+ double trace = 0.0 ;
+ for( size_t i=0 ; i<4; i++){
+ trace += m(i,i) ;
+ }
+ return wrap( trace ) ;
+ ', Rcpp = TRUE, includes = "using namespace Rcpp;"  )
 > x <- matrix( 1:16 + .5, ncol = 4 )
 > diag( x )
[1]  1.5  6.5 11.5 16.5
 > sum( diag( x ) )
[1] 36
 > funx(x)
[1] 36

Arbitrary dimension array indexing is planned, but not yet implemented. 
I'll probably hard-code dimension 3 and let variadic templates take care 
of generalization to 4+ dimensions since it is not used that much.

Or maybe we can do it through classic C variable argument lists so that 
it does not depend on C++0x ? anyone familiar with this ?

Romain
On 01/23/2010 09:05 PM, Romain Francois wrote: