Hi,
I have a question about passing a large data frame into Rcpp. Consider
the following function:
foo(SEXP myframe) {
RcppFrame &fr_ref = (RcppFrame &) myframe;
}
This somehow seems to work without calling a constructor, which would
copy the large data frame into an RcppFrame object. However, you can
see that the code is not safe: there's no guarantee that myframe is a
data frame. This is my first question: is there any way to check the
type of the input SEXP? Or is there a better way to do this?
Secondly, I'm wondering why the POSIXct column in my data frame
appears as a double when I pass the data frame as an argument to a
function, or when I read it from the global environment map. Is there
any way to ensure it appears as RcppDatetime? Thank you.
Robert
[Rcpp-devel] Passing large data frame
3 messages · R_help Help, Romain Francois, Dirk Eddelbuettel
Hi,
On 14/06/10 05:38, R_help Help wrote:
> Hi,
> I have a doubt regarding passing large data frame into Rcpp. If we
> consider the following function
> foo(SEXP myframe) {
> RcppFrame&fr_ref = (RcppFrame&) myframe;
> }
> Somehow seems to work without a need to call a constructor and thus
> causes copy of large data frame to RcppFrame object.
This is very wrong code, you are just getting lucky about the internal
representation of RcppFrame.
Consider:
require( Rcpp )
require( inline )
inc <- '
class Foo{
public:
Foo( SEXP x) : y(5), xx(x) {
Rprintf( "hello" ) ;
}
Foo( ) : y(6), xx(R_NilValue) {
Rprintf( "hello from default" );
}
inline SEXP gety(){
return IntegerVector::create( y ) ;
}
private:
int y ;
SEXP xx ;
} ;
'
code <- '
Foo& foo = (Foo&) x ;
return foo.gety() ;
'
df <- data.frame( x = 1:5, y = 1:5 )
fx <- cxxfunction( signature( x = "data.frame" ), code, include = inc,
plugin = "Rcpp" )
I get :
> fx( df )
[1] 35966160
> fx( df )
[1] 35966160
Had you used the C++ cast static_cast instead of the C-style cast, the compiler would have reported the error:
file10d63af1.cpp: In function 'SEXPREC* file10d63af1(SEXPREC*)':
file10d63af1.cpp:49: error: invalid static_cast from type 'SEXPREC*' to
type 'Foo&'
make: *** [file10d63af1.o] Error 1
ERROR(s) during compilation: source code errors or compiler
configuration errors!
> However, you can see that the code is not safe.
It is more than "not safe", it is just plain wrong.
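To make the difference between the two casts concrete without needing R, here is a minimal standalone sketch. `OpaqueRec` and `Handle` are hypothetical stand-ins for SEXPREC and SEXP (they are not the real R definitions), and `can_static_cast` is a small helper trait written just for this illustration. It shows that the compiler rejects static_cast from the pointer type to `Foo&`, while going through the constructor is well-formed:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the R types in this thread.
struct OpaqueRec;            // plays the role of SEXPREC (opaque record)
using Handle = OpaqueRec*;   // plays the role of SEXP (pointer to it)

class Foo {
public:
    explicit Foo(Handle h) : y(5), h_(h) {}
    int gety() const { return y; }
private:
    int y;
    Handle h_;
};

// Helper: maps any type list to void (C++11-compatible void_t).
template <typename...> struct make_void { using type = void; };

// Trait: true only when static_cast<To>(From) is a well-formed expression.
template <typename From, typename To, typename = void>
struct can_static_cast : std::false_type {};

template <typename From, typename To>
struct can_static_cast<From, To,
    typename make_void<decltype(static_cast<To>(std::declval<From>()))>::type>
    : std::true_type {};

// A pointer cannot be static_cast to a reference to an unrelated class.
// A C-style cast would silently fall back to reinterpreting the bytes.
static_assert(!can_static_cast<Handle, Foo&>::value,
              "pointer -> Foo& must be rejected by static_cast");

// Constructing a Foo from the handle is fine: this runs the constructor.
static_assert(can_static_cast<Handle, Foo>::value,
              "pointer -> Foo goes through Foo(Handle)");

// The safe route: build the object, so y is actually initialised to 5.
int gety_via_constructor(Handle h) {
    Foo foo(h);
    return foo.gety();
}
```

Calling `gety_via_constructor(nullptr)` returns 5 because the constructor really runs, whereas the C-style reference cast in the example above reads whatever bytes happen to sit at the address of the handle.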
> there's no guarantee that myframe is a data frame. This is my first question, is there any way to check type of the input SEXP? Or is there any better way to do this?
RcppFrame is a class of what we call the "classic" api, which indeed is largely inefficient because it copies data all the time.
The new api, and in particular the class Rcpp::DataFrame, is much more efficient. For example, the constructor Rcpp::DataFrame( SEXP ) will not make a copy of the SEXP you pass in.
You can find example code for Rcpp::DataFrame in the unit tests:
> system.file( "unitTests", "runit.DataFrame.R", package = "Rcpp" )
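As a rough illustration of the two ideas here (check the type first, then wrap the object without copying it), the following standalone toy model may help. `ToyObject` and `FrameView` are hypothetical names invented for this sketch. They are not Rcpp classes and say nothing about how Rcpp::DataFrame is implemented. In real code one would presumably test the class through R's C API (for instance `Rf_inherits( x, "data.frame" )`) before constructing `Rcpp::DataFrame( x )`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an R object: a class tag plus one column.
struct ToyObject {
    std::string klass;
    std::vector<double> column;
};

// Hypothetical non-copying wrapper: it validates the tag once, then
// keeps only a pointer to the caller's object instead of its data.
class FrameView {
public:
    explicit FrameView(ToyObject* obj) : obj_(obj) {
        if (obj == nullptr || obj->klass != "data.frame") {
            throw std::invalid_argument("expected a data.frame");
        }
    }
    std::size_t nrows() const { return obj_->column.size(); }
    // Exposes the original storage, so no element is ever duplicated.
    const double* data() const { return obj_->column.data(); }
private:
    ToyObject* obj_;
};
```

With a `ToyObject` tagged "data.frame", `FrameView::data()` returns the same address as the original vector's storage. Any other tag makes the constructor throw rather than silently misreading memory.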
> Secondly, I'm wondering why the POSIXct column in my data frame appears as double when I pass a data frame as an argument into a function or when I read it out from global environment map? Is there anyway to ensure it appears as RcppDatetime? Thank you.
> Robert
Someone else will pick this up.
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/98Uf7u : Rcpp 0.8.1
|- http://bit.ly/c6YnCi : graph gallery collage
`- http://bit.ly/bZ7ltC : inline 0.3.5
Robert,
On 14 June 2010 at 12:36, Romain Francois wrote:
| On 14/06/10 05:38, R_help Help wrote:
| [...]
| > Secondly, I'm wondering why the POSIXct column in my data frame
| > appears as double when I pass a data frame as an argument into a
| > function or when I read it out from global environment map? Is there
| > anyway to ensure it appears as RcppDatetime? Thank you.
| >
| > Robert
|
| Someone else will pick this up.

a) POSIXct really is a double and nothing more, so you could re-create a
   RcppDatetimeVector from the double vector -- no information loss
b) RcppFrame is a data structure for _creating a data frame in C++ for
   return_ rather than for retrieving a data frame from R
c) As Romain said, you are better off with Rcpp::DataFrame anyway
d) But that class (and the new API in general) does not have a datetime
   class yet, so see point a)

I have been mulling over what to do about a simple datetime class. So far, I haven't needed one (comparison between doubles works fine), so I had no real motivation. Eventually we should have one. For now you can just use doubles and/or the old class.
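To illustrate point a) without any Rcpp dependency: a POSIXct value is a count of seconds since 1970-01-01 00:00:00 UTC stored as a double, so plain standard C++ can turn it back into calendar fields. The helper names `split_posixct` and `format_utc` below are made up for this sketch and are not part of Rcpp.

```cpp
#include <cassert>
#include <cmath>
#include <ctime>
#include <string>

// Whole seconds since the epoch plus the fractional remainder.
struct SplitTime {
    std::time_t seconds;
    double fraction;
};

// Split a POSIXct-style double into whole and fractional seconds.
// floor() keeps the fraction non-negative for pre-1970 values too.
SplitTime split_posixct(double x) {
    double whole = std::floor(x);
    return { static_cast<std::time_t>(whole), x - whole };
}

// Format the whole-second part as "YYYY-MM-DD HH:MM:SS" in UTC.
// Returns an empty string if the value cannot be converted.
std::string format_utc(double x) {
    SplitTime t = split_posixct(x);
    const std::tm* ptr = std::gmtime(&t.seconds);  // not thread-safe
    if (ptr == nullptr) {
        return std::string();
    }
    std::tm parts = *ptr;  // copy out of gmtime's static buffer
    char buf[32];
    std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &parts);
    return std::string(buf);
}
```

For example, `format_utc(1276493880.25)` gives "2010-06-14 05:38:00", and `split_posixct` keeps the remaining 0.25 seconds separately. Nothing is lost by carrying the column around as a double.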
Regards, Dirk