-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi
I am new to C, and my question might be basic, but I am struggling with it:
I have about 10 lines of code in a simulation, which take up about 50%
of the whole simulation, As the simulation, takes several hours to run,
I want to make the code faster. After trying different approaches in R,
with different degrees of success, I decided to try out inline and Rcpp
for this.
I did some progress (using sugar), and the net-result is so far, that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s <- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y] <- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor <- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2 <- 2*dx
dy2 <- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s <- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y] <- NA
} else {
dispSeeds[x,y] <- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar <- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
library(inline)
library(Rcpp)
dx2 <- 2*dx
dy2 <- 2*dy
fx2 <- cxxfunction(
sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s <- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y] <- NA
} else {
dispSeeds[x,y] <- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
Ultimately, I would like to have the complete dispRparFor function in C.
For the ones interested, I could send a working example with data.
Cheers,
Rainer
- --
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa
Tel: +33 - (0)9 53 10 27 44
Cell: +27 - (0)8 39 47 90 42
Fax (SA): +27 - (0)8 65 16 27 82
Fax (D) : +49 - (0)3 21 21 25 22 44
Fax (FR): +33 - (0)9 58 10 27 44
email: Rainer at krugs.de
Skype: RMkrug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAkziXZ0ACgkQoYgNqgF2egoL7wCfR5RapMcddSmM2f8ACBzjVG6p
GaQAni+XQJs2wzcJQNXpMxwFY/FqRa/W
=PmVZ
-----END PGP SIGNATURE-----
[Rcpp-devel] Question concerning use of Raster in C vi Rcpp plugin and inline
12 messages · Rainer M Krug, Douglas Bates, Romain Francois +1 more
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug <r.m.krug at gmail.com> wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi I am new to C, and my question might be basic, but I am struggling with it: I have about 10 lines of code in a simulation, which take up about 50% of the whole simulation, As the simulation, takes several hours to run, I want to make the code faster. After trying different approaches in R, with different degrees of success, I decided to try out inline and Rcpp for this. I did some progress (using sugar), and the net-result is so far, that it is as fast / slow as R; but as I said, this is a first step and I am happy so far. My problem now is, in the R code, I have the following (the complete code is below): s <- seeds[x:(x+dx2), y:(y+dy2)] dispSeeds[x,y] <- sum( ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rbinom( ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ), ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?na.rm = TRUE ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
?dispRparFor <- function(
? ? ? ? ? ? ? ? ? ? ? ? ?maxx ?= maxx,
? ? ? ? ? ? ? ? ? ? ? ? ?maxy ?= maxy,
? ? ? ? ? ? ? ? ? ? ? ? ?seeds = seeds,
? ? ? ? ? ? ? ? ? ? ? ? ?sd2D ?= sd2D,
? ? ? ? ? ? ? ? ? ? ? ? ?dx ? ?= dx,
? ? ? ? ? ? ? ? ? ? ? ? ?dy ? ?= dy,
? ? ? ? ? ? ? ? ? ? ? ? ?dispSeeds = dispSeeds # the return value
? ? ? ? ? ? ? ? ? ? ? ? ?) {
? ?dx2 <- 2*dx
? ?dy2 <- 2*dy
? ?for ( y in 1:maxy ) {
? ? ?cat(y, " ")
? ? ?for (x in 1:maxx) {
? ? ? ?s <- seeds[x:(x+dx2), y:(y+dy2)]
? ? ? ?if (all(is.na(s))) {
? ? ? ? ?dispSeeds[x,y] <- NA
? ? ? ?} else {
? ? ? ? ?dispSeeds[x,y] <- sum(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rbinom(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?na.rm = TRUE
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?)
? ? ? ?}
? ? ?}
? ?}
? ?return(dispSeeds)
?}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
?dispRparForSugar <- function(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? maxx ?= maxx,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? maxy ?= maxy,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? seeds = seeds,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D ?= sd2D,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dx ? ?= dx,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dy ? ?= dy,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dispSeeds = dispSeeds # the return value
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ) {
? ?library(inline)
? ?library(Rcpp)
? ?dx2 <- 2*dx
? ?dy2 <- 2*dy
? ?fx2 <- cxxfunction(
? ? ? ? ? ? ? ? ? ? ? sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
? ? ? ? ? ? ? ? ? ? ? body = '
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::IntegerVector n (N);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::IntegerVector size (SIZE);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::NumericVector prob (PROB);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int res;
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? res = 0;
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? for( int i=0; i<n.size(); i++){
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?if (size[i]>0) {
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?res += rbinom( 1, size[i],
prob[i] )[0];
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?}
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? }
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return wrap( res );
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ',
? ? ? ? ? ? ? ? ? ? ? plugin = "Rcpp",
? ? ? ? ? ? ? ? ? ? ? verbose = TRUE
? ? ? ? ? ? ? ? ? ? ? )
? ?for ( y in 1:maxy ) {
? ? ?cat(y, " ")
? ? ?for (x in 1:maxx) {
? ? ? ?s <- seeds[x:(x+dx2), y:(y+dy2)]
? ? ? ?if (all(is.na(s))) {
? ? ? ? ?dispSeeds[x,y] <- NA
? ? ? ?} else {
? ? ? ? ?dispSeeds[x,y] <- fx2(s, s, sd2D)
? ? ? ?}
? ? ?}
? ?}
? ?return(dispSeeds)
?}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition. It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional? And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer - -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: ? ? ? ?+33 - (0)9 53 10 27 44 Cell: ? ? ? +27 - (0)8 39 47 90 42 Fax (SA): ? +27 - (0)8 65 16 27 82 Fax (D) : ? +49 - (0)3 21 21 25 22 44 Fax (FR): ? +33 - (0)9 58 10 27 44 email: ? ? ?Rainer at krugs.de Skype: ? ? ?RMkrug -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkziXZ0ACgkQoYgNqgF2egoL7wCfR5RapMcddSmM2f8ACBzjVG6p GaQAni+XQJs2wzcJQNXpMxwFY/FqRa/W =PmVZ -----END PGP SIGNATURE-----
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug <r.m.krug-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:
Hi
I am new to C, and my question might be basic, but I am struggling with it:
I have about 10 lines of code in a simulation, which take up about 50%
of the whole simulation, As the simulation, takes several hours to run,
I want to make the code faster. After trying different approaches in R,
with different degrees of success, I decided to try out inline and Rcpp
for this.
I did some progress (using sugar), and the net-result is so far, that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s <- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y] <- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor <- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2 <- 2*dx
dy2 <- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s <- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y] <- NA
} else {
dispSeeds[x,y] <- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar <- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
library(inline)
library(Rcpp)
dx2 <- 2*dx
dy2 <- 2*dy
fx2 <- cxxfunction(
sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s <- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y] <- NA
} else {
dispSeeds[x,y] <- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general? In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer at krugs.de Skype: RMkrug -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: dispersal.org URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20101116/4b8c5707/attachment-0001.txt> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: dispersal.R URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20101116/4b8c5707/attachment-0001.asc>
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug <Rainer at krugs.de> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug <r.m.krug-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote: Hi I am new to C, and my question might be basic, but I am struggling with it: I have about 10 lines of code in a simulation, which take up about 50% of the whole simulation, As the simulation, takes several hours to run, I want to make the code faster. After trying different approaches in R, with different degrees of success, I decided to try out inline and Rcpp for this. I did some progress (using sugar), and the net-result is so far, that it is as fast / slow as R; but as I said, this is a first step and I am happy so far. My problem now is, in the R code, I have the following (the complete code is below): s <- seeds[x:(x+dx2), y:(y+dy2)] dispSeeds[x,y] <- sum( ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rbinom( ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ), ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?na.rm = TRUE ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. ?The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
?dispRparFor <- function(
? ? ? ? ? ? ? ? ? ? ? ? ?maxx ?= maxx,
? ? ? ? ? ? ? ? ? ? ? ? ?maxy ?= maxy,
? ? ? ? ? ? ? ? ? ? ? ? ?seeds = seeds,
? ? ? ? ? ? ? ? ? ? ? ? ?sd2D ?= sd2D,
? ? ? ? ? ? ? ? ? ? ? ? ?dx ? ?= dx,
? ? ? ? ? ? ? ? ? ? ? ? ?dy ? ?= dy,
? ? ? ? ? ? ? ? ? ? ? ? ?dispSeeds = dispSeeds # the return value
? ? ? ? ? ? ? ? ? ? ? ? ?) {
? ?dx2 <- 2*dx
? ?dy2 <- 2*dy
? ?for ( y in 1:maxy ) {
? ? ?cat(y, " ")
? ? ?for (x in 1:maxx) {
? ? ? ?s <- seeds[x:(x+dx2), y:(y+dy2)]
? ? ? ?if (all(is.na(s))) {
? ? ? ? ?dispSeeds[x,y] <- NA
? ? ? ?} else {
? ? ? ? ?dispSeeds[x,y] <- sum(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rbinom(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?na.rm = TRUE
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?)
? ? ? ?}
? ? ?}
? ?}
? ?return(dispSeeds)
?}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
?dispRparForSugar <- function(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? maxx ?= maxx,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? maxy ?= maxy,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? seeds = seeds,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D ?= sd2D,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dx ? ?= dx,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dy ? ?= dy,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dispSeeds = dispSeeds # the return value
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ) {
? ?library(inline)
? ?library(Rcpp)
? ?dx2 <- 2*dx
? ?dy2 <- 2*dy
? ?fx2 <- cxxfunction(
? ? ? ? ? ? ? ? ? ? ? sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
? ? ? ? ? ? ? ? ? ? ? body = '
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::IntegerVector n (N);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::IntegerVector size (SIZE);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::NumericVector prob (PROB);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int res;
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? res = 0;
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? for( int i=0; i<n.size(); i++){
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?if (size[i]>0) {
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?res += rbinom( 1, size[i],
prob[i] )[0];
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?}
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? }
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return wrap( res );
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ',
? ? ? ? ? ? ? ? ? ? ? plugin = "Rcpp",
? ? ? ? ? ? ? ? ? ? ? verbose = TRUE
? ? ? ? ? ? ? ? ? ? ? )
? ?for ( y in 1:maxy ) {
? ? ?cat(y, " ")
? ? ?for (x in 1:maxx) {
? ? ? ?s <- seeds[x:(x+dx2), y:(y+dy2)]
? ? ? ?if (all(is.na(s))) {
? ? ? ? ?dispSeeds[x,y] <- NA
? ? ? ?} else {
? ? ? ? ?dispSeeds[x,y] <- fx2(s, s, sd2D)
? ? ? ?}
? ? ?}
? ?}
? ?return(dispSeeds)
?}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. ?cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). ?Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: ? ? ? ?+33 - (0)9 53 10 27 44 Cell: ? ? ? +27 - (0)8 39 47 90 42 Fax (SA): ? +27 - (0)8 65 16 27 82 Fax (D) : ? +49 - (0)3 21 21 25 22 44 Fax (FR): ? +33 - (0)9 58 10 27 44 email: ? ? ?Rainer at krugs.de Skype: ? ? ?RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
I enclose a rewrite which compiles and executes without changing the arguments. I am not entirely sure that it is doing the correct calculation. I tried to compare the results from calls like print(range(sd2D)) set.seed(1234) print(system.time(dispSeedsC <- disperse(seeds.m, sd2D, dispRparForSugarOrg))) set.seed(1234) print(system.time(dispSeedsR <- disperse(seeds.m, sd2D, dispRparFor))) print(range(sd2D)) and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced errors. I am still somewhat confused by the logic.
On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug <Rainer at krugs.de> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug <r.m.krug-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote: Hi I am new to C, and my question might be basic, but I am struggling with it: I have about 10 lines of code in a simulation, which take up about 50% of the whole simulation, As the simulation, takes several hours to run, I want to make the code faster. After trying different approaches in R, with different degrees of success, I decided to try out inline and Rcpp for this. I did some progress (using sugar), and the net-result is so far, that it is as fast / slow as R; but as I said, this is a first step and I am happy so far. My problem now is, in the R code, I have the following (the complete code is below): s <- seeds[x:(x+dx2), y:(y+dy2)] dispSeeds[x,y] <- sum( ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rbinom( ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ), ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?na.rm = TRUE ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. ?The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
?dispRparFor <- function(
? ? ? ? ? ? ? ? ? ? ? ? ?maxx ?= maxx,
? ? ? ? ? ? ? ? ? ? ? ? ?maxy ?= maxy,
? ? ? ? ? ? ? ? ? ? ? ? ?seeds = seeds,
? ? ? ? ? ? ? ? ? ? ? ? ?sd2D ?= sd2D,
? ? ? ? ? ? ? ? ? ? ? ? ?dx ? ?= dx,
? ? ? ? ? ? ? ? ? ? ? ? ?dy ? ?= dy,
? ? ? ? ? ? ? ? ? ? ? ? ?dispSeeds = dispSeeds # the return value
? ? ? ? ? ? ? ? ? ? ? ? ?) {
? ?dx2 <- 2*dx
? ?dy2 <- 2*dy
? ?for ( y in 1:maxy ) {
? ? ?cat(y, " ")
? ? ?for (x in 1:maxx) {
? ? ? ?s <- seeds[x:(x+dx2), y:(y+dy2)]
? ? ? ?if (all(is.na(s))) {
? ? ? ? ?dispSeeds[x,y] <- NA
? ? ? ?} else {
? ? ? ? ?dispSeeds[x,y] <- sum(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rbinom(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?na.rm = TRUE
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?)
? ? ? ?}
? ? ?}
? ?}
? ?return(dispSeeds)
?}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
?dispRparForSugar <- function(
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? maxx ?= maxx,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? maxy ?= maxy,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? seeds = seeds,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? sd2D ?= sd2D,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dx ? ?= dx,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dy ? ?= dy,
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? dispSeeds = dispSeeds # the return value
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ) {
? ?library(inline)
? ?library(Rcpp)
? ?dx2 <- 2*dx
? ?dy2 <- 2*dy
? ?fx2 <- cxxfunction(
? ? ? ? ? ? ? ? ? ? ? sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
? ? ? ? ? ? ? ? ? ? ? body = '
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::IntegerVector n (N);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::IntegerVector size (SIZE);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Rcpp::NumericVector prob (PROB);
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? int res;
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? res = 0;
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? for( int i=0; i<n.size(); i++){
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?if (size[i]>0) {
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?res += rbinom( 1, size[i],
prob[i] )[0];
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?}
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? }
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? return wrap( res );
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ',
? ? ? ? ? ? ? ? ? ? ? plugin = "Rcpp",
? ? ? ? ? ? ? ? ? ? ? verbose = TRUE
? ? ? ? ? ? ? ? ? ? ? )
? ?for ( y in 1:maxy ) {
? ? ?cat(y, " ")
? ? ?for (x in 1:maxx) {
? ? ? ?s <- seeds[x:(x+dx2), y:(y+dy2)]
? ? ? ?if (all(is.na(s))) {
? ? ? ? ?dispSeeds[x,y] <- NA
? ? ? ?} else {
? ? ? ? ?dispSeeds[x,y] <- fx2(s, s, sd2D)
? ? ? ?}
? ? ?}
? ?}
? ?return(dispSeeds)
?}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. ?cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). ?Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. ?You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: ? ? ? ?+33 - (0)9 53 10 27 44 Cell: ? ? ? +27 - (0)8 39 47 90 42 Fax (SA): ? +27 - (0)8 65 16 27 82 Fax (D) : ? +49 - (0)3 21 21 25 22 44 Fax (FR): ? +33 - (0)9 58 10 27 44 email: ? ? ?Rainer at krugs.de Skype: ? ? ?RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
-------------- next part -------------- A non-text attachment was scrubbed... Name: disp.cpp Type: text/x-c++src Size: 734 bytes Desc: not available URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20101116/cbe2df0c/attachment.cpp>
On 11/16/2010 05:49 PM, Douglas Bates wrote:
Hi Douglas,
I enclose a rewrite which compiles and executes without changing the arguments.
That looks much better and "C like" - thanks a lot.
I am not entirely sure that it is doing the correct calculation. I tried to compare the results from calls like print(range(sd2D)) set.seed(1234) print(system.time(dispSeedsC <- disperse(seeds.m, sd2D, dispRparForSugarOrg))) set.seed(1234) print(system.time(dispSeedsR <- disperse(seeds.m, sd2D, dispRparFor))) print(range(sd2D)) and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced errors.
I will look at that - and I am optimistic that I will sort that out now.
I am still somewhat confused by the logic.
What I want to calculate is the dispersal of seeds into each cell from the neighboring cells, defined in sd2D. sd2D is a probability density function, and seeds.m is the initial seed distribution of to be dispersed seeds. Well - the basic idea is of a mowing-window calculation over seeds.m, where the window is defined by sd2D, and the calculation is based on the rbinom numbers drawn from all cells around a particular cell (x,y), covered y sd2D; the prob of rbinom is the value of sd2D, the size is the value of the underlying cell in seeds.m, n is 1. All numbers drawn are added up and stored in the cell (x,y). And this is repeated for all cells. Now, I need the values for all cells, even those where sd2D does not completely overlap with seeds.m, and this is why I buffer seeds.m with NAs to get seeds, on which the calculations are now performed. Hope this clarifies my problem, and my somehow strange approach (I am thinking of skipping the buffering and to check for overlap in the loop itself). A procedural question: You attached below a file named disp.cpp, which indicates to me that your c code is in a separate file - is this the case? And if yes, how can I do that with the inline command? Or do I have to compile externally and use Rcpp directly? Cheers and thanks a lot for your help, Rainer
On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates <bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org> wrote:
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug <Rainer-vfylz/Ys1k4 at public.gmane.org> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug <r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org> wrote:
Hi
I am new to C, and my question might be basic, but I am struggling with it:
I have about 10 lines of code in a simulation, which take up about 50%
of the whole simulation, As the simulation, takes several hours to run,
I want to make the code faster. After trying different approaches in R,
with different degrees of success, I decided to try out inline and Rcpp
for this.
I did some progress (using sugar), and the net-result is so far, that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s <- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y] <- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor <- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2 <- 2*dx
dy2 <- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s <- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y] <- NA
} else {
dispSeeds[x,y] <- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar <- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
library(inline)
library(Rcpp)
dx2 <- 2*dx
dy2 <- 2*dy
fx2 <- cxxfunction(
sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s <- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y] <- NA
} else {
dispSeeds[x,y] <- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer-vfylz/Ys1k4 at public.gmane.org Skype: RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Le 17/11/10 10:17, Rainer M Krug a ?crit :
On 11/16/2010 05:49 PM, Douglas Bates wrote: Hi Douglas,
I enclose a rewrite which compiles and executes without changing the arguments.
That looks much better and "C like" - thanks a lot.
I am not entirely sure that it is doing the correct calculation. I tried to compare the results from calls like print(range(sd2D)) set.seed(1234) print(system.time(dispSeedsC<- disperse(seeds.m, sd2D, dispRparForSugarOrg))) set.seed(1234) print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor))) print(range(sd2D)) and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced errors.
I will look at that - and I am optimistic that I will sort that out now.
I am still somewhat confused by the logic.
What I want to calculate is the dispersal of seeds into each cell from the neighboring cells, defined in sd2D. sd2D is a probability density function, and seeds.m is the initial seed distribution of to be dispersed seeds. Well - the basic idea is of a mowing-window calculation over seeds.m, where the window is defined by sd2D, and the calculation is based on the rbinom numbers drawn from all cells around a particular cell (x,y), covered y sd2D; the prob of rbinom is the value of sd2D, the size is the value of the underlying cell in seeds.m, n is 1. All numbers drawn are added up and stored in the cell (x,y). And this is repeated for all cells. Now, I need the values for all cells, even those where sd2D does not completely overlap with seeds.m, and this is why I buffer seeds.m with NAs to get seeds, on which the calculations are now performed. Hope this clarifies my problem, and my somehow strange approach (I am thinking of skipping the buffering and to check for overlap in the loop itself). A procedural question: You attached below a file named disp.cpp, which indicates to me that your c code is in a separate file - is this the case? And if yes, how can I do that with the inline command? Or do I have to compile externally and use Rcpp directly?
Ultimately, making a package is the best action.
Otherwise, I guess you could do something like this:
f <- cxxfunction( signature(....),
paste( readLines( "disp.cpp" ), collapse = "\n" ),
plugin = "Rcpp" )
I'll look at the rest of the thread and try to provide some help later.
Romain
Cheers and thanks a lot for your help, Rainer
On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org> wrote:
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug<Rainer-vfylz/Ys1k4 at public.gmane.org> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org> wrote:
Hi
I am new to C, and my question might be basic, but I am struggling with it:
I have about 10 lines of code in a simulation, which take up about 50%
of the whole simulation, As the simulation, takes several hours to run,
I want to make the code faster. After trying different approaches in R,
with different degrees of success, I decided to try out inline and Rcpp
for this.
I did some progress (using sugar), and the net-result is so far, that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s<- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in inline
using sugar?
As the code above sits in two large loops, I would like to translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2<- 2*dx
dy2<- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
library(inline)
library(Rcpp)
dx2<- 2*dx
dy2<- 2*dy
fx2<- cxxfunction(
sig = signature(N = "integer", SIZE = "integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer-vfylz/Ys1k4 at public.gmane.org Skype: RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/9VOd3l : ZAT! 2010 |- http://bit.ly/c6DzuX : Impressionnism with R `- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 11/17/2010 10:26 AM, Romain Francois wrote:
Le 17/11/10 10:17, Rainer M Krug a ?crit :
On 11/16/2010 05:49 PM, Douglas Bates wrote: Hi Douglas,
I enclose a rewrite which compiles and executes without changing the arguments.
That looks much better and "C like" - thanks a lot.
I am not entirely sure that it is doing the correct calculation. I tried to compare the results from calls like print(range(sd2D)) set.seed(1234) print(system.time(dispSeedsC<- disperse(seeds.m, sd2D, dispRparForSugarOrg))) set.seed(1234) print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor))) print(range(sd2D)) and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced errors.
I will look at that - and I am optimistic that I will sort that out now.
I am still somewhat confused by the logic.
What I want to calculate is the dispersal of seeds into each cell from the neighboring cells, defined in sd2D. sd2D is a probability density function, and seeds.m is the initial seed distribution of to be dispersed seeds. Well - the basic idea is of a mowing-window calculation over seeds.m, where the window is defined by sd2D, and the calculation is based on the rbinom numbers drawn from all cells around a particular cell (x,y), covered y sd2D; the prob of rbinom is the value of sd2D, the size is the value of the underlying cell in seeds.m, n is 1. All numbers drawn are added up and stored in the cell (x,y). And this is repeated for all cells. Now, I need the values for all cells, even those where sd2D does not completely overlap with seeds.m, and this is why I buffer seeds.m with NAs to get seeds, on which the calculations are now performed. Hope this clarifies my problem, and my somehow strange approach (I am thinking of skipping the buffering and to check for overlap in the loop itself). A procedural question: You attached below a file named disp.cpp, which indicates to me that your c code is in a separate file - is this the case? And if yes, how can I do that with the inline command? Or do I have to compile externally and use Rcpp directly?
Ultimately, making a package is the best action.
True - But I haven't looked at that option yet, as I was happy with sourcing the code, but I will for my next project.
Otherwise, I guess you could do something like this:
f <- cxxfunction( signature(....),
paste( readLines( "disp.cpp" ), collapse = "\n" ),
plugin = "Rcpp" )
Sounds good - would there be an option of including an alternative argument to "body", i.e. "file", of which exactly one has to be supplied?
I'll look at the rest of the thread and try to provide some help later.
Thanks - I'll work on it today again, and post (hopefully) some progress later today. Rainer
Romain
Cheers and thanks a lot for your help, Rainer
On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org> wrote:
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug<Rainer-vfylz/Ys1k4 at public.gmane.org> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M
Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org>
wrote:
Hi
I am new to C, and my question might be basic, but I am struggling
with it:
I have about 10 lines of code in a simulation, which take up about
50%
of the whole simulation, As the simulation, takes several hours to
run,
I want to make the code faster. After trying different approaches
in R,
with different degrees of success, I decided to try out inline and
Rcpp
for this.
I did some progress (using sugar), and the net-result is so far,
that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s<- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I
have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in
inline
using sugar?
As the code above sits in two large loops, I would like to
translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2<- 2*dx
dy2<- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return
value
) {
library(inline)
library(Rcpp)
dx2<- 2*dx
dy2<- 2*dy
fx2<- cxxfunction(
sig = signature(N = "integer", SIZE =
"integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer-vfylz/Ys1k4 at public.gmane.org Skype: RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
- -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer at krugs.de Skype: RMkrug -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzjpBYACgkQoYgNqgF2egrPFQCcDhoEaLbiV87gQLV+i21BSsbO cSoAoIsUsUx6CCLZBIYVZYnzY2ry9TxC =mnzJ -----END PGP SIGNATURE-----
Le 17/11/10 10:44, Rainer M Krug a ?crit :
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/17/2010 10:26 AM, Romain Francois wrote:
Le 17/11/10 10:17, Rainer M Krug a ?crit :
On 11/16/2010 05:49 PM, Douglas Bates wrote: Hi Douglas,
I enclose a rewrite which compiles and executes without changing the arguments.
That looks much better and "C like" - thanks a lot.
I am not entirely sure that it is doing the correct calculation. I tried to compare the results from calls like print(range(sd2D)) set.seed(1234) print(system.time(dispSeedsC<- disperse(seeds.m, sd2D, dispRparForSugarOrg))) set.seed(1234) print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor))) print(range(sd2D)) and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced errors.
I will look at that - and I am optimistic that I will sort that out now.
I am still somewhat confused by the logic.
What I want to calculate is the dispersal of seeds into each cell from the neighboring cells, defined in sd2D. sd2D is a probability density function, and seeds.m is the initial seed distribution of to be dispersed seeds. Well - the basic idea is of a mowing-window calculation over seeds.m, where the window is defined by sd2D, and the calculation is based on the rbinom numbers drawn from all cells around a particular cell (x,y), covered y sd2D; the prob of rbinom is the value of sd2D, the size is the value of the underlying cell in seeds.m, n is 1. All numbers drawn are added up and stored in the cell (x,y). And this is repeated for all cells. Now, I need the values for all cells, even those where sd2D does not completely overlap with seeds.m, and this is why I buffer seeds.m with NAs to get seeds, on which the calculations are now performed. Hope this clarifies my problem, and my somehow strange approach (I am thinking of skipping the buffering and to check for overlap in the loop itself). A procedural question: You attached below a file named disp.cpp, which indicates to me that your c code is in a separate file - is this the case? And if yes, how can I do that with the inline command? Or do I have to compile externally and use Rcpp directly?
Ultimately, making a package is the best action.
True - But I haven't looked at that option yet, as I was happy with sourcing the code, but I will for my next project.
Otherwise, I guess you could do something like this:
f<- cxxfunction( signature(....),
paste( readLines( "disp.cpp" ), collapse = "\n" ),
plugin = "Rcpp" )
Sounds good - would there be an option of including an alternative argument to "body", i.e. "file", of which exactly one has to be supplied?
Perhaps. Feel free to submit a patch to the maintainer of inline.
I'll look at the rest of the thread and try to provide some help later.
Thanks - I'll work on it today again, and post (hopefully) some progress later today. Rainer
Romain
Cheers and thanks a lot for your help, Rainer
On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org> wrote:
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug<Rainer-vfylz/Ys1k4 at public.gmane.org> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M
Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org>
wrote:
Hi
I am new to C, and my question might be basic, but I am struggling
with it:
I have about 10 lines of code in a simulation, which take up about
50%
of the whole simulation, As the simulation, takes several hours to
run,
I want to make the code faster. After trying different approaches
in R,
with different degrees of success, I decided to try out inline and
Rcpp
for this.
I did some progress (using sugar), and the net-result is so far,
that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s<- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I
have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in
inline
using sugar?
As the code above sits in two large loops, I would like to
translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2<- 2*dx
dy2<- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return
value
) {
library(inline)
library(Rcpp)
dx2<- 2*dx
dy2<- 2*dy
fx2<- cxxfunction(
sig = signature(N = "integer", SIZE =
"integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer-vfylz/Ys1k4 at public.gmane.org Skype: RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
- -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer at krugs.de Skype: RMkrug -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzjpBYACgkQoYgNqgF2egrPFQCcDhoEaLbiV87gQLV+i21BSsbO cSoAoIsUsUx6CCLZBIYVZYnzY2ry9TxC =mnzJ -----END PGP SIGNATURE-----
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/9VOd3l : ZAT! 2010 |- http://bit.ly/c6DzuX : Impressionnism with R `- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube
On 17 November 2010 at 10:44, Rainer M Krug wrote:
| On 11/17/2010 10:26 AM, Romain Francois wrote:
| > Ultimately, making a package is the best action. | | True - But I haven't looked at that option yet, as I was happy with | sourcing the code, but I will for my next project. Are you sure you haven't? ;-) You and I are co-authors of the earthmovdist package on R-Forge which is just that: a small package with Rcpp around a single file you once for help with in wrapping to C++. Feel free to take it as a stanza. https://r-forge.r-project.org/projects/earthmovdist/ Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
On 11/17/2010 01:57 PM, Dirk Eddelbuettel wrote:
On 17 November 2010 at 10:44, Rainer M Krug wrote: | On 11/17/2010 10:26 AM, Romain Francois wrote: | > Ultimately, making a package is the best action. | | True - But I haven't looked at that option yet, as I was happy with | sourcing the code, but I will for my next project. Are you sure you haven't? ;-)
Not for this project - but that package in which we are co-authors definitely is doing that. To clarify: Up to now, the whole simulation (of which the the code discussed here is s small part) was purely R, as it worked fine and was relatively easy to maintain. Now the simulation is growing, and the need for speed becomes more evident. So I am looking into the coding sections in C++ and to use inline or Rcpp. When the C++ code is working, and before the simulations can run on a hpc cluster (I don't suppose gcc is installed on the nodes), I have to compile it and the idea of a package, also in relation to version control, is really appealing. But that is the next step. Don't worry - I haven't forgotten earthmovdist and I will use the code there as a blueprint for wrapping the C++ code.
You and I are co-authors of the earthmovdist package on R-Forge which is just that: a small package with Rcpp around a single file you once for help with in wrapping to C++. Feel free to take it as a stanza. https://r-forge.r-project.org/projects/earthmovdist/
Cheers, Rainer
Dirk
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 11/17/2010 10:44 AM, Rainer M Krug wrote:
On 11/17/2010 10:26 AM, Romain Francois wrote:
Le 17/11/10 10:17, Rainer M Krug a ?crit :
On 11/16/2010 05:49 PM, Douglas Bates wrote: Hi Douglas,
I enclose a rewrite which compiles and executes without changing the arguments.
That looks much better and "C like" - thanks a lot.
I am not entirely sure that it is doing the correct calculation. I tried to compare the results from calls like print(range(sd2D)) set.seed(1234) print(system.time(dispSeedsC<- disperse(seeds.m, sd2D, dispRparForSugarOrg))) set.seed(1234) print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor))) print(range(sd2D)) and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced errors.
I will look at that - and I am optimistic that I will sort that out now.
I am still somewhat confused by the logic.
What I want to calculate is the dispersal of seeds into each cell from the neighboring cells, defined in sd2D. sd2D is a probability density function, and seeds.m is the initial seed distribution of to be dispersed seeds. Well - the basic idea is of a mowing-window calculation over seeds.m, where the window is defined by sd2D, and the calculation is based on the rbinom numbers drawn from all cells around a particular cell (x,y), covered y sd2D; the prob of rbinom is the value of sd2D, the size is the value of the underlying cell in seeds.m, n is 1. All numbers drawn are added up and stored in the cell (x,y). And this is repeated for all cells. Now, I need the values for all cells, even those where sd2D does not completely overlap with seeds.m, and this is why I buffer seeds.m with NAs to get seeds, on which the calculations are now performed. Hope this clarifies my problem, and my somehow strange approach (I am thinking of skipping the buffering and to check for overlap in the loop itself). A procedural question: You attached below a file named disp.cpp, which indicates to me that your c code is in a separate file - is this the case? And if yes, how can I do that with the inline command? Or do I have to compile externally and use Rcpp directly?
Ultimately, making a package is the best action.
True - But I haven't looked at that option yet, as I was happy with sourcing the code, but I will for my next project.
Otherwise, I guess you could do something like this:
f <- cxxfunction( signature(....),
paste( readLines( "disp.cpp" ), collapse = "\n" ),
plugin = "Rcpp" )
Sounds good - would there be an option of including an alternative argument to "body", i.e. "file", of which exactly one has to be supplied?
I'll look at the rest of the thread and try to provide some help later.
Thanks - I'll work on it today again, and post (hopefully) some progress later today.
As promised, here is the reworked code below. The code is working now, and about four times as fast as the fastest R version - brilliant. Nevertheless, I assume that there is still some scope for improvement - although I have no idea where. Any ideas on how to streamline the code are welcome. To run the code, please download the surrounding R code and required data files from: http://thekrugs.free.fr/rainer/examples/ Cheers and thanks, Rainer Here is the working code: #+BEGIN_SRC c++ :tangle spreadSim.windDispersal.cpp // The input parameter int dx2 = as<int>(DX2); // by reference or value? int dy2 = as<int>(DY2); NumericVector sd2D (SD2D); // by reference! IntegerMatrix seeds (SEEDS); IntegerMatrix mask (MASK); // result vector IntegerMatrix dispSeeds = clone<IntegerMatrix>(mask); // internal variables IntegerVector s (sd2D.size()); RNGScope scope; // N.B. Needed when calling random number generators int res; int nc = dispSeeds.ncol(); int nr = dispSeeds.nrow(); // BEGIN loop over seeds grid ("moving") for( int y=0; y < nc; y++ ){ for( int x=0; x < nr; x++ ){ // BEGIN loop over sd2D ("window") if ( dispSeeds(x, y) >= 0 ) { int indS = 0; for( int xS=x; xS <= x + dx2; xS++ ) for( int yS=y; yS <= y + dy2; yS++, indS++ ) s[indS]=seeds(xS, yS); res = 0; for( int i=0; i<s.size(); i++ ){ if (s[i]>0 && sd2D[i]>0) { res += (int) ::Rf_rbinom((double)(s[i]), sd2D[i]); } } dispSeeds(x, y) = res; } } } // END loop over seeds return wrap( dispSeeds ); #+END_SRC
Rainer
Romain
Cheers and thanks a lot for your help, Rainer
On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org> wrote:
On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug<Rainer-vfylz/Ys1k4 at public.gmane.org> wrote:
On 11/16/2010 03:16 PM, Douglas Bates wrote:
On Tue, Nov 16, 2010 at 4:31 AM, Rainer M
Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org>
wrote:
Hi
I am new to C, and my question might be basic, but I am struggling
with it:
I have about 10 lines of code in a simulation, which take up about
50%
of the whole simulation, As the simulation, takes several hours to
run,
I want to make the code faster. After trying different approaches
in R,
with different degrees of success, I decided to try out inline and
Rcpp
for this.
I did some progress (using sugar), and the net-result is so far,
that it
is as fast / slow as R; but as I said, this is a first step and I am
happy so far. My problem now is, in the R code, I have the following
(the complete code is below):
s<- seeds[x:(x+dx2), y:(y+dy2)]
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
I think it unlikely that you will be able to write C/C++ code that runs much faster than that expression in R. The sum and rbinom functions are vectorized and other than a small interpretation overhead not much can be saved by going to compiled code.
The main reason why I am trying to use C are the loops around the code above. See attached code example.
As the rbinom part is only using the vector representation of s, I
have
already translated it into C.
So my problem / question is: can I use a similar syntax for s in
inline
using sugar?
As the code above sits in two large loops, I would like to
translate the
whole loops into C.
Below the code (R and R / inline):
maxx, maxy, dx, dy: scalar integers
seeds: integer matrix
dispSeeds: integer matrix
The pure R code:
#+BEGIN_SRC R
dispRparFor<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return value
) {
dx2<- 2*dx
dy2<- 2*dy
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- sum(
rbinom(
s,
s,
sd2D
),
na.rm = TRUE
)
}
}
}
return(dispSeeds)
}
#+END_SRC
How far I got with using inline and Rcpp / sugar:
#+BEGIN_SRC R
dispRparForSugar<- function(
maxx = maxx,
maxy = maxy,
seeds = seeds,
sd2D = sd2D,
dx = dx,
dy = dy,
dispSeeds = dispSeeds # the return
value
) {
library(inline)
library(Rcpp)
dx2<- 2*dx
dy2<- 2*dy
fx2<- cxxfunction(
sig = signature(N = "integer", SIZE =
"integer",
PROB = "double"),
body = '
Rcpp::IntegerVector n (N);
Rcpp::IntegerVector size (SIZE);
Rcpp::NumericVector prob (PROB);
int res;
res = 0;
for( int i=0; i<n.size(); i++){
if (size[i]>0) {
res += rbinom( 1, size[i],
prob[i] )[0];
}
}
return wrap( res );
',
plugin = "Rcpp",
verbose = TRUE
)
for ( y in 1:maxy ) {
cat(y, " ")
for (x in 1:maxx) {
s<- seeds[x:(x+dx2), y:(y+dy2)]
if (all(is.na(s))) {
dispSeeds[x,y]<- NA
} else {
dispSeeds[x,y]<- fx2(s, s, sd2D)
}
}
}
return(dispSeeds)
}
#+END_SRC
It is peculiar to call cxxfunction within another function if the code being compiled is fixed. cxxfunction is a function generator and one usually regards a call to cxxfunction as the equivalent of entering a function definition.
I changed that - yes, it was peculiar but based on my experimentation.
It doesn't appear that you are using n in your C++ code (other than taking its length) and your C++ code seems to be evaluating sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)). Is that intentional?
Left over - changed as well. Thanks.
And if your fx2 is always to be called as fx2(s, s, sd2D) then why not just pass s and sd2D?
Done (partly). To add some more strange things: I attach the code in a working example as an R script file which will show something strange (for me): the function overwrites a variable, namely sd2D. Am I doing something wrong or do I underestimate the dangers of using Rcpp and inline and C in general?
Yes, it will. You are assigning the NumericVector's s and sd2D to use the storage of the SD2D argument so when you assign s[indS] in the first loop you are overwriting the contents of SD2D. I am testing a rewrite now.
In addition, the code is not doing what it is supposed to be doing, but that is a different question at the moment. Cheers, Rainer
Ultimately, I would like to have the complete dispRparFor function in C. For the ones interested, I could send a working example with data. Cheers, Rainer
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer-vfylz/Ys1k4 at public.gmane.org Skype: RMkrug _______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
- -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer at krugs.de Skype: RMkrug -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzlArkACgkQoYgNqgF2egq7jACfR3jzZeKkYifHBKeGyiTkEddU 4MMAnAvWSpB7panUtpt0fh6TEHcldZRx =eY1p -----END PGP SIGNATURE-----