Dear Rcpp developer,
I tried to return a big DataFrame from Rcpp to R, but ran into a problem.
### begin dataframetest.cpp
#include <Rcpp.h>
using namespace Rcpp;
using namespace std;
// [[Rcpp::export]]
DataFrame dataframetest(NumericVector close){
  int nrow = close.size();
  vector<double> txn_qty = vector<double>(nrow);
  vector<double> txn_prc = vector<double>(nrow);
  vector<double> txn_fee = vector<double>(nrow);
  vector<double> pos_qty = vector<double>(nrow);
  vector<double> close_prc = as<vector<double> >(close);
  vector<double> PL = vector<double>(nrow);
  DataFrame PLrecord = DataFrame::create(Named("txn.qty", txn_qty),
                                         Named("txn.prc", txn_prc),
                                         Named("txn.fee", txn_fee),
                                         Named("pos.qty", pos_qty),
                                         Named("close.prc", close_prc),
                                         Named("PL", PL));
  return PLrecord;
}
#### end dataframetest.cpp
### R code
n <- 4e5
x.prc <- 1:n
library(Rcpp)
sourceCpp("./dataframetest.cpp")
aa <- dataframetest(x.prc)
##### end R code
When n is large, such as 4e5, the call exhausts memory or crashes; when n
is small, such as 4e3, it returns the correct DataFrame. I was wondering
whether Rcpp::DataFrame can handle a DataFrame this big. In my opinion,
n = 4e5 is not big: I can easily create a data.frame of that length from
R code without any problem. Why can't Rcpp? Or am I missing something?
### R code
n <- 4e5
x.prc <- rnorm(n)
a <- data.frame(x = x.prc,
                y = x.prc,
                d = x.prc,
                e = x.prc,
                f = x.prc,
                k = x.prc)
head(a)
            x           y           d           e           f           k
1 -0.45145433 -0.45145433 -0.45145433 -0.45145433 -0.45145433 -0.45145433
2 -0.55851370 -0.55851370 -0.55851370 -0.55851370 -0.55851370 -0.55851370
3  0.18209145  0.18209145  0.18209145  0.18209145  0.18209145  0.18209145
4 -0.56092768 -0.56092768 -0.56092768 -0.56092768 -0.56092768 -0.56092768
5  0.25689622  0.25689622  0.25689622  0.25689622  0.25689622  0.25689622
6 -0.04558792 -0.04558792 -0.04558792 -0.04558792 -0.04558792 -0.04558792
#### sessionInfo
sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-suse-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rcpp_0.10.3 data.table_1.8.8
loaded via a namespace (and not attached):
[1] compiler_2.15.3 tools_2.15.3
[Rcpp-devel] Rcpp can not return big DataFrame
4 messages · Dirk Eddelbuettel, 该走了, Romain Francois
Hi,
On 27 March 2013 at 21:19, 该走了 wrote:
| Dear Rcpp developer,
| I tried to return a big DataFrame from Rcpp to R, but ran into a problem.
If you check the list archives you will see that this has been discussed before.
| ### begin dataframetest.cpp
|
| #include <Rcpp.h>
| using namespace Rcpp;
| using namespace std;
|
| // [[Rcpp::export]]
| DataFrame dataframetest(NumericVector close){
|   int nrow = close.size();
|   vector<double> txn_qty = vector<double>(nrow);
|   vector<double> txn_prc = vector<double>(nrow);
|   vector<double> txn_fee = vector<double>(nrow);
|   vector<double> pos_qty = vector<double>(nrow);
|   vector<double> close_prc = as<vector<double> >(close);
|   vector<double> PL = vector<double>(nrow);
|   DataFrame PLrecord = DataFrame::create(Named("txn.qty", txn_qty),
|                                          Named("txn.prc", txn_prc),
|                                          Named("txn.fee", txn_fee),
|                                          Named("pos.qty", pos_qty),
|                                          Named("close.prc", close_prc),
|                                          Named("PL", PL));
|   return PLrecord;
| }
| #### end dataframetest.cpp
|
| ### R code
| n <- 4e5
| x.prc <- 1:n
| library(Rcpp)
| sourceCpp("./dataframetest.cpp")
| aa <- dataframetest(x.prc)
|
| ##### end R code
|
| When n is large, such as 4e5, the call exhausts memory or crashes; when n is
| small, such as 4e3, it returns the correct DataFrame. I was wondering if
I agree.
But it probably "just" has to do with temp objects, which are co-managed by
R, so this is hard to sort out.
| Rcpp::DataFrame can handle such a big DataFrame. In my opinion, n = 4e5 is not big:
| I can easily create a data.frame of that length from R code without any problem.
| Why can't Rcpp? Or am I missing something?
You are welcome to debug it. Maybe valgrind will help.
Or if you don't want to or can't, just return a list of vectors and call
as.data.frame() on it when you are back in R.
That's what we used to do anyway before we added the convenience wrapping.
Dirk
| ----------------------------------------------------------------------
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
Hi Dirk,
Thank you for your prompt reply and suggestion. I tried many times: sometimes I got a segfault, sometimes I got the error message "Error: error calling the data.frame function", and sometimes the DataFrame was returned but its elements were not correct.
Hmm. This does fix the problem:
  DataFrame PLrecord = DataFrame::create(
      Named("txn.qty"  , wrap( txn_qty ) ),
      Named("txn.prc"  , wrap( txn_prc ) ),
      Named("txn.fee"  , wrap( txn_fee ) ),
      Named("pos.qty"  , wrap( pos_qty ) ),
      Named("close.prc", wrap( close_prc ) ),
      Named("PL"       , wrap( PL ) )
  );
So we might be doing something wrong when copying objects.
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
R Graph Gallery: http://gallery.r-enthusiasts.com
blog: http://blog.r-enthusiasts.com
|- http://bit.ly/ZTFLDo : Simpler R help tooltips
`- http://bit.ly/YFsziW : R Help tooltips
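For reference, Romain's wrap() fix placed back into the full function might look like the sketch below. It is only an illustration under some assumptions: PLColumns and make_columns are hypothetical names introduced here (they are not part of Rcpp or of the original post), and the __has_include guard is an addition so that the column-building part can be compiled and checked without R. Whether this avoids the crash on Rcpp 0.10.3 has not been verified here beyond Romain's report in the thread.

```cpp
// Sketch: the original dataframetest() with every column passed through
// Rcpp::wrap() before DataFrame::create(), as suggested in the thread.
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of Rcpp): the six columns as plain
// C++ vectors.
struct PLColumns {
    std::vector<double> txn_qty, txn_prc, txn_fee, pos_qty, close_prc, PL;
};

// Build the columns as in the original post: close_prc is a copy of
// `close`, all other columns are zero-filled and of the same length.
PLColumns make_columns(const std::vector<double>& close) {
    const std::size_t nrow = close.size();
    PLColumns cols;
    cols.txn_qty.assign(nrow, 0.0);
    cols.txn_prc.assign(nrow, 0.0);
    cols.txn_fee.assign(nrow, 0.0);
    cols.pos_qty.assign(nrow, 0.0);
    cols.close_prc = close;
    cols.PL.assign(nrow, 0.0);
    return cols;
}

// The R-facing part is compiled only when the Rcpp headers are found
// (for example under sourceCpp), so the helper above also builds alone.
#if defined(__has_include)
#if __has_include(<Rcpp.h>)
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::DataFrame dataframetest(Rcpp::NumericVector close) {
    const PLColumns cols =
        make_columns(Rcpp::as<std::vector<double> >(close));
    // wrap() converts each std::vector<double> to an R numeric vector
    // before it is handed to DataFrame::create().
    return Rcpp::DataFrame::create(
        Rcpp::Named("txn.qty",   Rcpp::wrap(cols.txn_qty)),
        Rcpp::Named("txn.prc",   Rcpp::wrap(cols.txn_prc)),
        Rcpp::Named("txn.fee",   Rcpp::wrap(cols.txn_fee)),
        Rcpp::Named("pos.qty",   Rcpp::wrap(cols.pos_qty)),
        Rcpp::Named("close.prc", Rcpp::wrap(cols.close_prc)),
        Rcpp::Named("PL",        Rcpp::wrap(cols.PL)));
}
#endif
#endif
```

From R, the same sourceCpp("./dataframetest.cpp") and dataframetest(x.prc) calls as in the original post would exercise it. Dirk's alternative should amount to changing the return type to Rcpp::List, swapping DataFrame::create for List::create, and calling as.data.frame() on the result in R.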