Hi all,
I've been trying the following, with the main part ripped from
rcpp-quickref:
SEXP dfr2Mat(SEXP dfr)
{
    DataFrame df = dfr;
    int* dm = INTEGER( ::Rf_getAttrib( df, R_DimSymbol ) ) ;
    int rows = dm[0];
    int cols = dm[1];
    NumericMatrix retMat (rows, cols);
    for(int i = 0; i < cols; i++)
    {
        NumericVector curcol = df(i);
        //NumericMatrix::Column zzcol = xx( _, 1);
        NumericMatrix::Column cl = retMat( _, i);
        cl = curcol;
    }
    return retMat;
}
When compiling this, I get some errors starting from the line after the
comment (i.e. line 64):
main.cpp: In function 'SEXPREC* dfr2Mat(SEXPREC*)':
main.cpp:64: error: invalid conversion from 'SEXPREC*' to 'int'
main.cpp:64: error: initializing argument 1 of 'typename
Rcpp::Vector<RTYPE>::Proxy Rcpp::Matrix<RTYPE>::operator()(int, int) [with
int RTYPE = 14]'
main.cpp:64: error: conversion from 'double' to non-scalar type
'Rcpp::MatrixColumn<14>' requested
main.cpp:65: error: no match for 'operator=' in 'cl = curcol'
C:/Users/nisabbe/Documents/R/win-library/2.11/Rcpp/include/Rcpp/vector/Matri
xColumn.h:40: note: candidates are: Rcpp::MatrixColumn<RTYPE>&
Rcpp::MatrixColumn<RTYPE>::operator=(Rcpp::MatrixColumn<RTYPE>&) [with int
RTYPE = 14]
I suspect it has something to do with this underscore syntax.
I compiled with the following statement:
g++ -I"C:/PROGRA~1/R/R-211~1.1/include"
-I"C:/Users/nisabbe/Documents/R/win-library/2.11/Rcpp/include" -O2 -Wall -c
main.cpp -o main.o
The version of Rcpp I installed is 0.9.4, on the most recent version of
R.
Can anyone show me what is wrong here?
Besides the syntax problems, what I am doing here should work, right?
I know there are non-Rcpp ways to get this done, but this is just something
to get me started.
And as an addendum: where can I find out how (not) to use this underscore?
Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: <http://biomath.ugent.be/> http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36
-- Do Not Disapprove
[Rcpp-devel] Matrix columns
5 messages · Nick Sabbe, Dirk Eddelbuettel
On 14 May 2011 at 23:12, Nick Sabbe wrote:
| I've been trying the following, with the main part ripped from rcpp-quickref:
|
| SEXP dfr2Mat(SEXP dfr)
A much easier way is to, say, call as.matrix() in R and then assign to a
NumericMatrix object in C++.
Another way is to turn the data.frame into a list via as.list() and just pick
off column after column.
| {
| DataFrame df = dfr;
| int* dm = INTEGER( ::Rf_getAttrib( df, R_DimSymbol ) ) ;
| int rows = dm[0];
| int cols = dm[1];
| NumericMatrix retMat (rows, cols);
| for(int i = 0; i < cols; i++)
| {
| NumericVector curcol = df(i);
| NumericMatrix::Column cl = retMat( _, i);
Why is retMat on the right-hand side if you try to fill it? Makes no real sense.
| cl = curcol;
| }
| return retMat;
| }
I can see no straightforward way to write this, given how objects are organized
internally. Once a Rcpp::DataFrame object is instantiated, you can call
size() which gives you the number of columns. So here is a really
complicated way to do this, but let me reiterate that I think you are
starting from the wrong starting point: if your data.frame really is a
matrix, just use a matrix:
R> require(inline)
Loading required package: inline
R>
R> df2mat <- cxxfunction(signature(Dsexp="ANY"), plugin="Rcpp", body='
+ // construct the data.frame object
+ Rcpp::DataFrame DF = Rcpp::DataFrame(Dsexp);
+
+ // we get ncol() from DF
+ int k = DF.size();
+
+ // but for nrow() we need to assign first; one way is:
+ Rcpp::NumericVector V = DF[0];
+ int n = V.size();
+
+ Rcpp::NumericMatrix M(n,k);
+
+ for (int i=0; i<k; i++) {
+ V = DF[i];
+ M(_,i) = V; // one way to assign using sugar operator _
+ }
+
+ return M;
+ ')
R> M <- df2mat(data.frame(a=1:5, b=seq(1.1,5.5,by=1.1)))
R> M
     [,1] [,2]
[1,]    1  1.1
[2,]    2  2.2
[3,]    3  3.3
[4,]    4  4.4
[5,]    5  5.5
R> class(M)
[1] "matrix"
R>
Hope this helps, Dirk
Gauss once played himself in a zero-sum game and won $50.
-- #11 at http://www.gaussfacts.com
Hi Dirk. Thanks for your time and trouble. I agree there are simpler ways to achieve this functionality, but if I don't get around this, I won't be able to produce the thing I'm really after.
> Why is retMat on the right-hand side if you try to fill it? Makes no real
> sense.
Nick,
On 15 May 2011 at 20:51, Nick Sabbe wrote:
| Hi Dirk. Thanks for your time and trouble.
| I agree there are simpler ways to achieve this functionality, but if I don't
| get around this, I won't be able to produce the thing I'm really after.
I fear that you think that because data.frame is a preferred structure in R,
it also must be in C++. That is not the case, and the reason is the internal
representation of a data.frame at the C (and hence C++) level. You may be
better off decomposing your data.frame into columns of different types (in R
or C++), or just using a matrix if there is a common type.
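
To make the point about internal representation concrete, here is a minimal sketch in plain C++ rather than Rcpp. It assumes the usual picture that a data.frame is a list of separately stored columns while a numeric matrix is one contiguous column-major block. The names Columns and columnsToColumnMajor are made up for this illustration and are not part of Rcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data.frame with a common column type:
// a list of equally long columns, each stored in its own vector.
using Columns = std::vector<std::vector<double>>;

// Copy the columns into one contiguous column-major buffer, so that
// element (i, j) ends up at index i + j * nrow.
std::vector<double> columnsToColumnMajor(const Columns& cols) {
    if (cols.empty()) return {};
    const std::size_t nrow = cols.front().size();
    std::vector<double> out;
    out.reserve(nrow * cols.size());
    for (const std::vector<double>& col : cols) {
        assert(col.size() == nrow);  // every column must have the same length
        out.insert(out.end(), col.begin(), col.end());
    }
    return out;
}
```

The number of columns is known up front, but the number of rows has to be read from a column, which is roughly why the df2mat example above first assigns DF[0] to a vector before sizing the matrix.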
| > Why is retMat on the right-hand side if you try to fill it? Makes no real
| sense.
|
| From rcpp-quickref:
| // Reference the second column
| // Changes propagate to xx (same applies for Row)
| NumericMatrix::Column zzcol = xx( _, 1);
| zzcol = zzcol * 2;
|
| Only difference in my code was that I was assigning from a NumericVector
| instead of from an altered version of the original NumericMatrix::Column. Or
| am I mistaken here? I've put it here again:
| NumericVector curcol = df(i);
| NumericMatrix::Column cl = retMat( _, i);
| cl = curcol;//from rcpp-quickref, I understand this alters the matrix?
I believe you to be mistaken. Twice assigning overwrites. Feel free to debug
and test to convince yourself.
| Then you appear to suggest that my approach using the "dim" attribute is no
| good, or at least you get the dimensions in a very different way (by using
| the data.frame as a list). I agree your way works, but why shouldn't mine?
| In any case, this sort of thing would be a valuable addition to Rcpp in my
| opinion.
|
| Finally, you use:
| >+ V = DF[i];
| >+ M(_,i) = V; // one way to assign using sugar operator _
| Which makes me wonder:
| * why DF[i] and not DF(i) (what's the difference - I still have not found a
| clear explanation)
It's only stylistic. You can use either one.
| * is M(_,i) = V; any different from NumericVector cl = M(_,i);cl=V; ? If so,
| how so?
As I recall, the former ("one-step") created a compiler error as the compiler
could not disambiguate some intermediate type. The second ("two-step")
worked, so I used that. So the difference is that one works :) [ In
theory, both should. Patches to make the former work are welcome. ]
| * thanks for letting me know _ is part of sugar. I may be able to figure out
| its use now, but isn't this worthy of a mention in rcpp-sugar.pdf?
As I said a few days ago, patches which enhance documentation or code are
always welcome and will always be properly credited (that is, if we agree
with them and incorporate them).
Dirk
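
For readers wondering how a placeholder such as _ can select a whole column, the following is a rough sketch in plain C++. It is not Rcpp's implementation: the namespace sketch, the Placeholder type and this Matrix class are all hypothetical names invented here. The idea is only that an empty tag type picks a different overload of operator().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical tag type: it carries no data and only selects an overload.
struct Placeholder {};
const Placeholder _{};

// Hypothetical column-major matrix of doubles.
class Matrix {
public:
    Matrix(std::size_t nrow, std::size_t ncol)
        : nrow_(nrow), ncol_(ncol), data_(nrow * ncol, 0.0) {}

    // (row, column) addresses a single element.
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < nrow_ && j < ncol_);
        return data_[i + j * nrow_];
    }

    // (_, column) returns the whole column; a copy keeps this sketch short.
    std::vector<double> operator()(Placeholder, std::size_t j) const {
        assert(j < ncol_);
        return std::vector<double>(data_.begin() + j * nrow_,
                                   data_.begin() + (j + 1) * nrow_);
    }

private:
    std::size_t nrow_, ncol_;
    std::vector<double> data_;
};

}  // namespace sketch
```

If a class only offers the (int, int) form, a call with _ has no overload to match, which at least resembles the "invalid conversion ... to 'int'" error quoted at the start of this thread.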
Dirk, thanks again! I thought I ought to let you know: the main problem was that my compilation still pointed to older versions of R and Rcpp that did not contain some of the additions in question. Goes to show! But still some comments/questions below...
-----Original Message-----
From: Dirk Eddelbuettel [mailto:edd at debian.org]
Sent: Sunday 15 May 2011 21:22
To: Nick Sabbe
Cc: 'Dirk Eddelbuettel'; rcpp-devel at r-forge.wu-wien.ac.at
Subject: RE: [Rcpp-devel] Matrix columns

> Nick,
> I fear that you think that because data.frame is a preferred structure in R,
> it also must be in C++. That is not the case, and the reason is the internal
> representation of a data.frame at the C (and hence C++) level. You may be
> better off decomposing your data.frame into columns of different types (in R
> or C++), or just using a matrix if there is a common type.
What I'm actually hoping to achieve is an improvement to the answer to my question in http://stackoverflow.com/questions/5980240/performance-of-rbind-data-frame (though in doubt that I will improve on it). Apart from that, I'm just trying to learn Rcpp.
| > Why is retMat on the right-hand side if you try to fill it? Makes no real
| sense.
|
| From rcpp-quickref:
| // Reference the second column
| // Changes propagate to xx (same applies for Row)
| NumericMatrix::Column zzcol = xx( _, 1);
| zzcol = zzcol * 2;
|
| Only difference in my code was that I was assigning from a NumericVector
| instead of from an altered version of the original NumericMatrix::Column. Or
| am I mistaken here? I've put it here again:
| NumericVector curcol = df(i);
| NumericMatrix::Column cl = retMat( _, i);
| cl = curcol;//from rcpp-quickref, I understand this alters the matrix?
> I believe you to be mistaken. Twice assigning overwrites. Feel free to debug
> and test to convince yourself.
Then what's with the comment in rcpp-quickref (// Changes propagate to xx (same applies for Row))? As far as I recall (that old Effective C++ has gathered quite a bit of dust by now), the first = is not an assignment but an initialization. So I only assign once (in the last line). But I'll debug and test to convince myself.
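
The distinction between initialization and assignment can be sketched in plain C++ as follows. This only assumes a column proxy that behaves the way the rcpp-quickref comment describes ("changes propagate"); Grid, ColumnView and fillColumn are hypothetical names and not the Rcpp classes, so it says nothing definite about what NumericMatrix::Column does in a given Rcpp version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major matrix of doubles.
struct Grid {
    std::size_t nrow, ncol;
    std::vector<double> data;
    Grid(std::size_t r, std::size_t c) : nrow(r), ncol(c), data(r * c, 0.0) {}
    double at(std::size_t i, std::size_t j) const { return data[i + j * nrow]; }
};

// Hypothetical reference-like view of one column of a Grid.
class ColumnView {
public:
    ColumnView(Grid& g, std::size_t j) : grid_(&g), col_(j) { assert(j < g.ncol); }

    // Assigning values writes them through into the matrix storage.
    ColumnView& operator=(const std::vector<double>& values) {
        assert(values.size() == grid_->nrow);
        for (std::size_t i = 0; i < grid_->nrow; ++i)
            grid_->data[i + col_ * grid_->nrow] = values[i];
        return *this;
    }

private:
    Grid* grid_;
    std::size_t col_;
};

void fillColumn(Grid& g, std::size_t j, const std::vector<double>& values) {
    ColumnView cl = ColumnView(g, j);  // initialization: binds the view, copies no data
    cl = values;                       // the only assignment: writes into g
}
```

With a proxy of this kind, the first = constructs the view and the second one fills the column, so nothing is overwritten twice.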
| * is M(_,i) = V; any different from NumericVector cl = M(_,i);cl=V; ? If so,
| how so?
> As I recall, the former ("one-step") created a compiler error as the compiler
> could not disambiguate some intermediate type. The second ("two-step")
> worked, so I used that. So the difference is that one works :) [ In
> theory, both should. Patches to make the former work are welcome. ]
Truly lol.
| * thanks for letting me know _ is part of sugar. I may be able to figure out
| its use now, but isn't this worthy of a mention in rcpp-sugar.pdf?
> As I said a few days ago, patches which enhance documentation or code are
> always welcome and will always be properly credited (that is, if we agree
> with them and incorporate them).
Intriguingly, I could not find any information on how to get my suggested "patches" to you, besides a note on your site to "post them on the mailing list" (admittedly I could have looked better). Do I create a LaTeX file with additions to the helpfiles and send it here? Or is there a more formal (read: less annoying for you guys who would then still have to edit it until it fits the rest of the docs) way? I _do_ intend to try and improve the documentation, though I don't feel worthy to improve upon the implementation.
Thanks again!
Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36
-- Do Not Disapprove