
[Rcpp-devel] Matrix columns

5 messages · Nick Sabbe, Dirk Eddelbuettel

#
Hi all,

 

I've been trying the following, with the main part ripped from
rcpp-quickref:

    SEXP dfr2Mat(SEXP dfr)
    {
        DataFrame df = dfr;
        int* dm = INTEGER( ::Rf_getAttrib( df, R_DimSymbol ) ) ;
        int rows = dm[0];
        int cols = dm[1];
        NumericMatrix retMat (rows, cols);
        for(int i = 0; i < cols; i++)
        {
            NumericVector curcol = df(i);
            //NumericMatrix::Column zzcol = xx( _, 1);
            NumericMatrix::Column cl = retMat.( _, i);
            cl = curcol;
        }
        return retMat;
    }

When compiling this, I get some errors starting from the line after the
comment (i.e. line 64):

 

main.cpp: In function 'SEXPREC* dfr2Mat(SEXPREC*)':
main.cpp:64: error: invalid conversion from 'SEXPREC*' to 'int'
main.cpp:64: error:   initializing argument 1 of 'typename Rcpp::Vector<RTYPE>::Proxy Rcpp::Matrix<RTYPE>::operator()(int, int) [with int RTYPE = 14]'
main.cpp:64: error: conversion from 'double' to non-scalar type 'Rcpp::MatrixColumn<14>' requested
main.cpp:65: error: no match for 'operator=' in 'cl = curcol'
C:/Users/nisabbe/Documents/R/win-library/2.11/Rcpp/include/Rcpp/vector/MatrixColumn.h:40: note: candidates are: Rcpp::MatrixColumn<RTYPE>& Rcpp::MatrixColumn<RTYPE>::operator=(Rcpp::MatrixColumn<RTYPE>&) [with int RTYPE = 14]

 

I suspect it has something to do with this underscore syntax.

I compiled with the following statement:

g++ -I"C:/PROGRA~1/R/R-211~1.1/include"
-I"C:/Users/nisabbe/Documents/R/win-library/2.11/Rcpp/include" -O2 -Wall -c
main.cpp -o main.o

And the version of Rcpp I installed is 0.9.4, on the most recent version of
R.

 

Can anyone show me what is wrong here?

Besides the syntax problems, what I am doing here should work, right?

I know there are non-Rcpp ways to get this done, but this is just something
to get me started.

 

And as an addendum: where can I find out how (not) to use this underscore?

 

Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

 

#
On 14 May 2011 at 23:12, Nick Sabbe wrote:
| I've been trying the following, with the main part ripped from rcpp-quickref:
| 
|     SEXP dfr2Mat(SEXP dfr)

A much easier way is to call, say, as.matrix() in R and then assign the result
to a NumericMatrix object in C++.

Another way is to turn the data.frame into a list via as.list() and just pick
off column after column.

|     {
|         DataFrame df = dfr;
|         int* dm = INTEGER( ::Rf_getAttrib( df, R_DimSymbol ) ) ;
|         int rows = dm[0];
|         int cols = dm[1];
|         NumericMatrix retMat (rows, cols);
|         for(int i = 0; i < cols; i++)
|         {
|             NumericVector curcol = df(i);
|             NumericMatrix::Column cl = retMat.( _, i);

Why is retMat on the right-hand side if you try to fill it?  Makes no real sense.

|             cl = curcol;
|         }
|         return retMat;
|     }

I can see no straightforward way to write this, given how objects are
organized internally.  Once an Rcpp::DataFrame object is instantiated, you can
call size(), which gives you the number of columns.  So here is a really
complicated way to do this, but let me reiterate that I think you are
starting from the wrong starting point: if your data.frame really is a
matrix, just use a matrix:

R> require(inline)
Loading required package: inline
R> 
R> df2mat <- cxxfunction(signature(Dsexp="ANY"), plugin="Rcpp", body='
+    // construct the data.frame object
+    Rcpp::DataFrame DF = Rcpp::DataFrame(Dsexp);
+ 
+    // we get ncol() from DF
+    int k = DF.size();
+ 
+    // but for nrow() we need to assign first; one way is:
+    Rcpp::NumericVector V = DF[0];
+    int n = V.size();
+ 
+    Rcpp::NumericMatrix M(n,k);
+ 
+    for (int i=0; i<k; i++) {
+        V = DF[i];
+        M(_,i) = V;  // one way to assign using sugar operator _
+    }
+ 
+    return M;
+ ')
R> M <- df2mat(data.frame(a=1:5, b=seq(1.1,5.5,by=1.1)))
R> M
     [,1] [,2]
[1,]    1  1.1
[2,]    2  2.2
[3,]    3  3.3
[4,]    4  4.4
[5,]    5  5.5
R> class(M)
[1] "matrix"
R> 


Hope this helps,  Dirk
#
Hi Dirk. Thanks for your time and trouble.
I agree there are simpler ways to achieve this functionality, but if I don't
get around this, I won't be able to produce the thing I'm really after.
#
Nick,
On 15 May 2011 at 20:51, Nick Sabbe wrote:
| Hi Dirk. Thanks for your time and trouble.
| I agree there are simpler ways to achieve this functionality, but if I don't
| get around this, I won't be able to produce the thing I'm really after.

I fear that you think that because data.frame is a preferred structure in R,
it also must be in C++. That is not the case, and the reason is the internal
representation of a data.frame at the C (and hence C++) level.  You may be
better off decomposing your data.frame into columns of different types (in R
or C++), or just using a matrix if there is a common type.
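
To illustrate the representation point above: at the C level a data.frame is a list of separate, equally long column vectors, whereas a matrix is a single contiguous buffer in column-major order. The following is only a minimal standalone sketch of that difference. It does not use Rcpp at all, and the names Columns, DenseMatrix and columns_to_matrix are hypothetical, made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data.frame with numeric columns only:
// a list of separately stored, equally long column vectors.
using Columns = std::vector<std::vector<double>>;

// Hypothetical stand-in for a numeric matrix: one contiguous buffer in
// column-major order plus its dimensions.
struct DenseMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;  // element (r, c) lives at data[r + c * nrow]

    double at(std::size_t r, std::size_t c) const { return data[r + c * nrow]; }
};

// Copy the columns, one after another, into a single column-major buffer.
DenseMatrix columns_to_matrix(const Columns& cols) {
    const std::size_t ncol = cols.size();
    const std::size_t nrow = ncol == 0 ? 0 : cols[0].size();
    DenseMatrix m{nrow, ncol, std::vector<double>()};
    m.data.reserve(nrow * ncol);
    for (const std::vector<double>& col : cols) {
        assert(col.size() == nrow);  // every column must have the same length
        m.data.insert(m.data.end(), col.begin(), col.end());
    }
    return m;
}
```

Because the row count is not stored anywhere in the list of columns, the sketch reads it off the first column, which is the same idea as taking the size of DF[0] in the df2mat example earlier in the thread.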
 
| > Why is retMat on the right-hand side if you try to fill it?  Makes no real
| sense.
| 
| From rcpp-quickref:
| // Reference the second column
| // Changes propagate to xx (same applies for Row)
| NumericMatrix::Column zzcol = xx( _, 1);
| zzcol = zzcol * 2;
| 
| Only difference in my code was that I was assigning from a NumericVector
| instead of from an altered version of the original NumericMatrix::Column. Or
| am I mistaken here? I've put it here again:
| NumericVector curcol = df(i);
| NumericMatrix::Column cl = retMat( _, i);
| cl = curcol;//from rcpp-quickref, I understand this alters the matrix?

I believe you to be mistaken. Twice assigning overwrites. Feel free to debug
and test to convince yourself.
 
| Then you appear to suggest that my approach using the "dim" attribute is no
| good, or at least you get the dimensions in a very different way (by using
| the data.frame as a list). I agree your way works, but why shouldn't mine?
| In any case, this sort of thing would be a valuable addition to Rcpp in my
| opinion.
| 
| Finally, you use:
| >+        V = DF[i];
| >+        M(_,i) = V;  // one way to assign using sugar operator _
| Which makes me wonder:
| * why DF[i] and not DF(i) (what's the difference - I still have not found a
| clear explanation)

It's only stylistic. You can use either one.

| * is M(_,i) = V; any different from NumericVector cl = M(_,i);cl=V; ? If so,
| how so?

As I recall, the former ("one-step") created a compiler error as the compiler
could not disambiguate some intermediate type. The second ("two-step")
worked, so I used that. So the difference is that one works :)   [ In
theory, both should. Patches to make the former work are welcome. ]

| * thanks for letting me know _ is part of sugar. I may be able to figure out
| its use now, but isn't this worthy of a mention in rcpp-sugar.pdf?

As I said a few days ago, patches which enhance documentation or code are
always welcome and will always be properly credited (that is, if we agree
with them and incorporate them).

Dirk
#
Dirk,

Thanks again!
I thought I ought to let you know: the main problem was that my compilation
still pointed to older versions of R and Rcpp that did not contain some of
the additions in question. Goes to show!

But still some comments/questions below...
What I'm actually hoping to achieve is an improvement to the answer to my
question in
http://stackoverflow.com/questions/5980240/performance-of-rbind-data-frame
(though I doubt that I will improve on it). Apart from that, I'm just
trying to learn Rcpp.
Then what's with the comment in rcpp-quickref (// Changes propagate to xx
(same applies for Row))?
As far as I recall (that old Effective C++ has gathered quite a bit of dust
by now), the first = is not an assignment but initialization. So I only
assign once (in the last line). But I'll debug and test to convince myself.
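
The initialization-versus-assignment distinction can be sketched in plain C++, independent of Rcpp. The classes ToyMatrix and ColumnView below are hypothetical and are not Rcpp's NumericMatrix or MatrixColumn. They only assume a column proxy that keeps a reference to the matrix storage, which is what the quick reference comment ("Changes propagate to xx") suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column proxy: it owns no data, it only refers to a slice
// of the matrix storage.
class ColumnView {
public:
    ColumnView(std::vector<double>& storage, std::size_t offset, std::size_t length)
        : storage_(storage), offset_(offset), length_(length) {}

    // Assigning a vector writes its values through to the matrix storage.
    ColumnView& operator=(const std::vector<double>& values) {
        assert(values.size() == length_);
        for (std::size_t r = 0; r < length_; ++r) {
            storage_[offset_ + r] = values[r];
        }
        return *this;
    }

private:
    std::vector<double>& storage_;
    std::size_t offset_;
    std::size_t length_;
};

// Hypothetical column-major matrix, zero-initialized.
struct ToyMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;

    ToyMatrix(std::size_t r, std::size_t c) : nrow(r), ncol(c), data(r * c, 0.0) {}

    ColumnView column(std::size_t c) { return ColumnView(data, c * nrow, nrow); }
    double at(std::size_t r, std::size_t c) const { return data[r + c * nrow]; }
};

ToyMatrix fill_second_column() {
    ToyMatrix m(3, 2);
    ColumnView cl = m.column(1);              // initialization: binds the view, writes nothing
    cl = std::vector<double>{7.0, 8.0, 9.0};  // assignment: writes into m's second column
    return m;
}
```

In this toy model the line with the type name in front is a constructor call, so nothing is overwritten there, and only the second statement changes the matrix. Whether a given Rcpp version offers the matching assignment operator on its column type is a separate question that depends on the installed headers.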
Truly lol.
Intriguingly, I could not find any information on how to get my suggested
"patches" to you, besides a note on your site to "post them on the mailing
list" (admittedly I could have looked better). Do I create a LaTeX file with
additions to the helpfiles and send it here? Or is there a more formal
(read: less annoying for you guys who would then still have to edit it until
it fits the rest of the docs) way? I _do_ intend to try and improve the
documentation, though I don't feel worthy to improve upon the
implementation.

Thanks again!


Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove