I am a new Rcpp user, and I have been trying to do a simple exercise of generating random numbers to put into a vector. To start off, the following code works:
cygwin warning:
MS-DOS style path detected: C:/R/R-215~1.1/etc/i386/Makeconf
Preferred POSIX equivalent is: /cygdrive/c/R/R-215~1.1/etc/i386/Makeconf
CYGWIN environment variable option "nodosfilewarning" turns off this warning.
Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
(res <- fun())
[1] 1.7517515 0.3178733 -0.6957581 1.7030360 -1.6546936
When I try to do the same thing, but in a loop as shown here:
the result is an error message indicating that it cannot convert a NumericVector to double. Since in the first example I did not index, I am assuming that I have indexed incorrectly, but I can't see how. Below is the detailed error message - any idea what I am doing wrong?
cygwin warning:
MS-DOS style path detected: C:/R/R-215~1.1/etc/i386/Makeconf
Preferred POSIX equivalent is: /cygdrive/c/R/R-215~1.1/etc/i386/Makeconf
CYGWIN environment variable option "nodosfilewarning" turns off this warning.
Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
file15182ecd2ee.cpp: In function 'SEXPREC* file15182ecd2ee()':
file15182ecd2ee.cpp:33:31: error: cannot convert 'Rcpp::NumericVector {aka Rcpp::Vector<14>}' to 'Rcpp::traits::storage_type<14>::type {aka double}' in assignment
make: *** [file15182ecd2ee.o] Error 1
ERROR(s) during compilation: source code errors or compiler configuration errors!
Program source:
1:
2: // includes from the plugin
3:
4: #include <Rcpp.h>
5:
6:
7: #ifndef BEGIN_RCPP
8: #define BEGIN_RCPP
9: #endif
10:
11: #ifndef END_RCPP
12: #define END_RCPP
13: #endif
14:
15: using namespace Rcpp;
16:
17:
18: // user includes
19:
20:
21: // declarations
22: extern "C" {
23: SEXP file15182ecd2ee( ) ;
24: }
25:
26: // definition
27:
28: SEXP file15182ecd2ee( ){
29: BEGIN_RCPP
30: Rcpp::RNGScope scope;
31: Rcpp::NumericVector rn(5);
32: for (int i=0; i < 5; i++) {
33: rn(i) = rnorm(1,0,1);
34: }
35: return Rcpp::wrap(rn);
36:
37: END_RCPP
38: }
39:
40:
Error in compileCode(f, code, language = language, verbose = verbose) :
Compilation ERROR, function(s)/method(s) not created! cygwin warning:
MS-DOS style path detected: C:/R/R-215~1.1/etc/i386/Makeconf
Preferred POSIX equivalent is: /cygdrive/c/R/R-215~1.1/etc/i386/Makeconf
CYGWIN environment variable option "nodosfilewarning" turns off this warning.
Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
file15182ecd2ee.cpp: In function 'SEXPREC* file15182ecd2ee()':
file15182ecd2ee.cpp:33:31: error: cannot convert 'Rcpp::NumericVector {aka Rcpp::Vector<14>}' to 'Rcpp::traits::storage_type<14>::type {aka double}' in assignment
make: *** [file15182ecd2ee.o] Error 1
In addition: Warning message:
running command 'C:/R/R-215~1.1/bin/i386/R CMD SHLIB file15182ecd2ee.cpp 2> file15182ecd2ee.cpp.err.txt' had status 1
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20120925/e263abed/attachment.html>
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
Rodney Sparapani, PhD Center for Patient Care and Outcomes Research
Sr. Biostatistician http://www.mcw.edu/pcor
4 wheels good, 2 wheels better! Medical College of Wisconsin (MCW)
WWLD?: What Would Lombardi Do? Milwaukee, WI, USA
Of course - that did the trick. Thanks so much.
-----Original Message-----
From: Rodney Sparapani [mailto:rsparapa at mcw.edu]
Sent: Tuesday, September 25, 2012 2:53 PM
To: Goldfeld, Keith
Cc: 'rcpp-devel at lists.r-forge.r-project.org'
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an array
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
Rodney Sparapani, PhD Center for Patient Care and Outcomes Research
Sr. Biostatistician http://www.mcw.edu/pcor
4 wheels good, 2 wheels better! Medical College of Wisconsin (MCW)
WWLD?: What Would Lombardi Do? Milwaukee, WI, USA
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
That works but you wouldn't want to use it as a general model because
you are creating vectors of length 1 then dereferencing the first
element in each of these vectors and copying it to another vector.
Obviously with n=5 the difference in execution time will be minimal
but with n=5 million it won't be. Using the Rcpp sugar function rnorm
will be the easiest approach unless, like me, you get a little queasy
when using Rcpp sugar functions just because it is so hard to pick
them apart and see what is actually happening.
I would probably end up going back to the R API and calling norm_rand
or Rf_rnorm. That latter returns a double from two double arguments,
mu and sigma.
Yes - I agree that iterating is not the best way to go, but I am using this for an agent based simulation where iteration is really my only option. I won't be generating more than a few thousand records, so the time shouldn't be much of a factor. Of course, in R it is painfully slow - so that is why I am going the Rcpp route.
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Tuesday, September 25, 2012 3:13 PM
To: Rodney Sparapani
Cc: Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an array
On Tue, Sep 25, 2012 at 1:53 PM, Rodney Sparapani <rsparapa at mcw.edu> wrote:
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
That works but you wouldn't want to use it as a general model because you are creating vectors of length 1 then dereferencing the first element in each of these vectors and copying it to another vector.
Obviously with n=5 the difference in execution time will be minimal but with n=5 million it won't be. Using the Rcpp sugar function rnorm will be the easiest approach unless, like me, you get a little queasy when using Rcpp sugar functions just because it is so hard to pick them apart and see what is actually happening.
I would probably end up going back to the R API and calling norm_rand or Rf_rnorm. That latter returns a double from two double arguments, mu and sigma.
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative",
replications = 1e6L)
test replications elapsed relative user.self sys.self user.child
sys.child
3 fun3() 1000000 7.15 1.000 7.01 0.00 NA
NA
4 fun4() 1000000 7.33 1.025 7.27 0.01 NA
NA
2 fun2() 1000000 7.53 1.053 7.41 0.00 NA
NA
1 fun1() 1000000 7.99 1.117 7.91 0.00 NA
NA
On Tue, Sep 25, 2012 at 8:19 PM, Goldfeld, Keith
<Keith.Goldfeld at nyumc.org>wrote:
Yes - I agree that iterating is not the best way to go, but I am using
this for an agent based simulation where iteration is really my only
option. I won't be generating more than a few thousand records, so the
time shouldn't be much of a factor. Of course, in R it is painfully slow -
so that is why I am going the Rcpp route.
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
Bates
Sent: Tuesday, September 25, 2012 3:13 PM
To: Rodney Sparapani
Cc: Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an
array
On Tue, Sep 25, 2012 at 1:53 PM, Rodney Sparapani <rsparapa at mcw.edu>
wrote:
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
That works but you wouldn't want to use it as a general model because you
are creating vectors of length 1 then dereferencing the first element in
each of these vectors and copying it to another vector.
Obviously with n=5 the difference in execution time will be minimal but
with n=5 million it won't be. Using the Rcpp sugar function rnorm will be
the easiest approach unless, like me, you get a little queasy when using
Rcpp sugar functions just because it is so hard to pick them apart and see
what is actually happening.
I would probably end up going back to the R API and calling norm_rand or
Rf_rnorm. That latter returns a double from two double arguments, mu and
sigma.
Jeff and List,
I was interested in the same fairly recently. The full shebang is here <
http://www.rochester.edu/college/gradstudents/jolmsted/blog/2012/08/22/rcpp-rng-performance/>.
I, unlike you, found some difference between the various approaches.
If I had to guess why we come up with different results, I'd say that the
time for any one call for any of your inline functions is 90% overhead and
the differences between the various approaches, while present, is on a
vastly different (and smaller) scale than the overhead itself due to only 5
draws being taken. Of course, how any of these timings inform one's
approach depends on the specifics of how they'll set up their analysis.
For what it is worth, instead of an explicit "as<>()" conversion, I usually
pull out the first element of the returned NumericVector. That is,
rn[i] = as<double>(rnorm(1, 0.0, 1.0));
becomes
rn[i] = rnorm(1, 0.0, 1.0)[0];
I haven't ever timed this, though.
JPO
-------------------------------------------------------------------------
J. P. Olmsted
j.p.olmsted at gmail.com
http://about.me/olmjo
-------------------------------------------------------------------------
On Tue, Sep 25, 2012 at 4:08 PM, Jeffrey Pollock <jeffpollock9 at gmail.com>wrote:
I was interested to see if there was any real speed difference between the
different methods suggested, and it looks like... there isn't...
library(inline)
library(Rcpp)
library(rbenchmark)
fun1 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = as<double>(rnorm(1, 0.0, 1.0));
}
return rn;
', plugin = "Rcpp")
fun2 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = Rf_rnorm(0.0, 1.0);
}
return rn;
', plugin = "Rcpp")
fun3 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = norm_rand();
}
return rn;
', plugin = "Rcpp")
fun4 <- cxxfunction(body = '
RNGScope scope;
return rnorm(5, 0.0, 1.0);
', plugin = "Rcpp")
set.seed(1)
print(fun1())
set.seed(1)
print(fun2())
set.seed(1)
print(fun3())
set.seed(1)
print(fun4())
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative", replications
= 1e6L)
output;
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative",
replications = 1e6L)
test replications elapsed relative user.self sys.self user.child
sys.child
3 fun3() 1000000 7.15 1.000 7.01 0.00
NA NA
4 fun4() 1000000 7.33 1.025 7.27 0.01
NA NA
2 fun2() 1000000 7.53 1.053 7.41 0.00
NA NA
1 fun1() 1000000 7.99 1.117 7.91 0.00
NA NA
On Tue, Sep 25, 2012 at 8:19 PM, Goldfeld, Keith <Keith.Goldfeld at nyumc.org
wrote:
Yes - I agree that iterating is not the best way to go, but I am using
this for an agent based simulation where iteration is really my only
option. I won't be generating more than a few thousand records, so the
time shouldn't be much of a factor. Of course, in R it is painfully slow -
so that is why I am going the Rcpp route.
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
Bates
Sent: Tuesday, September 25, 2012 3:13 PM
To: Rodney Sparapani
Cc: Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an
array
On Tue, Sep 25, 2012 at 1:53 PM, Rodney Sparapani <rsparapa at mcw.edu>
wrote:
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
That works but you wouldn't want to use it as a general model because you
are creating vectors of length 1 then dereferencing the first element in
each of these vectors and copying it to another vector.
Obviously with n=5 the difference in execution time will be minimal but
with n=5 million it won't be. Using the Rcpp sugar function rnorm will be
the easiest approach unless, like me, you get a little queasy when using
Rcpp sugar functions just because it is so hard to pick them apart and see
what is actually happening.
I would probably end up going back to the R API and calling norm_rand or
Rf_rnorm. That latter returns a double from two double arguments, mu and
sigma.
There does appear to be a difference between the native Rf_rnorm and the R rnorm function when the sample sizes get large. In fact, the Rf_rnorm appears to be about twice as fast (using benchmark). Not surprisingly, doing a loop in R is about 50 times slower.
From: Jonathan Olmsted [mailto:jpolmsted at gmail.com]
Sent: Tuesday, September 25, 2012 4:29 PM
To: Jeffrey Pollock
Cc: Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an array
Jeff and List,
I was interested in the same fairly recently. The full shebang is here <http://www.rochester.edu/college/gradstudents/jolmsted/blog/2012/08/22/rcpp-rng-performance/>. I, unlike you, found some difference between the various approaches.
If I had to guess why we come up with different results, I'd say that the time for any one call for any of your inline functions is 90% overhead and the differences between the various approaches, while present, is on a vastly different (and smaller) scale than the overhead itself due to only 5 draws being taken. Of course, how any of these timings inform one's approach depends on the specifics of how they'll set up their analysis.
For what it is worth, instead of an explicit "as<>()" conversion, I usually pull out the first element of the returned NumericVector. That is,
rn[i] = as<double>(rnorm(1, 0.0, 1.0));
becomes
rn[i] = rnorm(1, 0.0, 1.0)[0];
I haven't ever timed this, though.
JPO
-------------------------------------------------------------------------
J. P. Olmsted
j.p.olmsted at gmail.com<mailto:j.p.olmsted at gmail.com>
http://about.me/olmjo
-------------------------------------------------------------------------
On Tue, Sep 25, 2012 at 4:08 PM, Jeffrey Pollock <jeffpollock9 at gmail.com<mailto:jeffpollock9 at gmail.com>> wrote:
I was interested to see if there was any real speed difference between the different methods suggested, and it looks like... there isn't...
library(inline)
library(Rcpp)
library(rbenchmark)
fun1 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = as<double>(rnorm(1, 0.0, 1.0));
}
return rn;
', plugin = "Rcpp")
fun2 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = Rf_rnorm(0.0, 1.0);
}
return rn;
', plugin = "Rcpp")
fun3 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = norm_rand();
}
return rn;
', plugin = "Rcpp")
fun4 <- cxxfunction(body = '
RNGScope scope;
return rnorm(5, 0.0, 1.0);
', plugin = "Rcpp")
set.seed(1)
print(fun1())
set.seed(1)
print(fun2())
set.seed(1)
print(fun3())
set.seed(1)
print(fun4())
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative", replications = 1e6L)
output;
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative", replications = 1e6L)
test replications elapsed relative user.self sys.self user.child sys.child
3 fun3() 1000000 7.15 1.000 7.01 0.00 NA NA
4 fun4() 1000000 7.33 1.025 7.27 0.01 NA NA
2 fun2() 1000000 7.53 1.053 7.41 0.00 NA NA
1 fun1() 1000000 7.99 1.117 7.91 0.00 NA NA
On Tue, Sep 25, 2012 at 8:19 PM, Goldfeld, Keith <Keith.Goldfeld at nyumc.org<mailto:Keith.Goldfeld at nyumc.org>> wrote:
Yes - I agree that iterating is not the best way to go, but I am using this for an agent based simulation where iteration is really my only option. I won't be generating more than a few thousand records, so the time shouldn't be much of a factor. Of course, in R it is painfully slow - so that is why I am going the Rcpp route.
-----Original Message-----
From: dmbates at gmail.com<mailto:dmbates at gmail.com> [mailto:dmbates at gmail.com<mailto:dmbates at gmail.com>] On Behalf Of Douglas Bates
Sent: Tuesday, September 25, 2012 3:13 PM
To: Rodney Sparapani
Cc: Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org<mailto:rcpp-devel at lists.r-forge.r-project.org>
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an array
On Tue, Sep 25, 2012 at 1:53 PM, Rodney Sparapani <rsparapa at mcw.edu<mailto:rsparapa at mcw.edu>> wrote:
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
That works but you wouldn't want to use it as a general model because you are creating vectors of length 1 then dereferencing the first element in each of these vectors and copying it to another vector.
Obviously with n=5 the difference in execution time will be minimal but with n=5 million it won't be. Using the Rcpp sugar function rnorm will be the easiest approach unless, like me, you get a little queasy when using Rcpp sugar functions just because it is so hard to pick them apart and see what is actually happening.
I would probably end up going back to the R API and calling norm_rand or Rf_rnorm. That latter returns a double from two double arguments, mu and sigma.
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel at lists.r-forge.r-project.org<mailto:Rcpp-devel at lists.r-forge.r-project.org>
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel at lists.r-forge.r-project.org<mailto:Rcpp-devel at lists.r-forge.r-project.org>
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20120925/e87dc5b4/attachment.html>
Indeed, changing the sample size from 5 to 500 results in;
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative",
replications = 1e5L)
test replications elapsed relative user.self sys.self user.child
sys.child
4 fun4() 100000 6.16 1.000 6.06 0 NA
NA
3 fun3() 100000 6.19 1.005 5.91 0 NA
NA
2 fun2() 100000 6.92 1.123 6.75 0 NA
NA
1 fun1() 100000 13.01 2.112 12.73 0 NA
NA
On Tue, Sep 25, 2012 at 9:33 PM, Goldfeld, Keith
<Keith.Goldfeld at nyumc.org>wrote:
There does appear to be a difference between the native Rf_rnorm and the
R rnorm function when the sample sizes get large. In fact, the Rf_rnorm
appears to be about twice as fast (using benchmark). Not surprisingly,
doing a loop in R is about 50 times slower.****
** **
*From:* Jonathan Olmsted [mailto:jpolmsted at gmail.com]
*Sent:* Tuesday, September 25, 2012 4:29 PM
*To:* Jeffrey Pollock
*Cc:* Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org
*Subject:* Re: [Rcpp-devel] NumericVector Double mismatch when indexing
an array****
** **
Jeff and List,****
** **
I was interested in the same fairly recently. The full shebang is here <
http://www.rochester.edu/college/gradstudents/jolmsted/blog/2012/08/22/rcpp-rng-performance/>.
I, unlike you, found some difference between the various approaches.****
** **
If I had to guess why we come up with different results, I'd say that the
time for any one call for any of your inline functions is 90% overhead and
the differences between the various approaches, while present, is on a
vastly different (and smaller) scale than the overhead itself due to only 5
draws being taken. Of course, how any of these timings inform one's
approach depends on the specifics of how they'll set up their analysis.***
*
** **
For what it is worth, instead of an explicit "as<>()" conversion, I
usually pull out the first element of the returned NumericVector. That is,
****
rn[i] = as<double>(rnorm(1, 0.0, 1.0));****
becomes****
rn[i] = rnorm(1, 0.0, 1.0)[0];****
I haven't ever timed this, though.****
** **
JPO****
** **
** **
** **
-------------------------------------------------------------------------
J. P. Olmsted
j.p.olmsted at gmail.com
http://about.me/olmjo
-------------------------------------------------------------------------
****
On Tue, Sep 25, 2012 at 4:08 PM, Jeffrey Pollock <jeffpollock9 at gmail.com>
wrote:****
I was interested to see if there was any real speed difference between the
different methods suggested, and it looks like... there isn't...
library(inline)
library(Rcpp)
library(rbenchmark)
fun1 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = as<double>(rnorm(1, 0.0, 1.0));
}
return rn;
', plugin = "Rcpp")
fun2 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = Rf_rnorm(0.0, 1.0);
}
return rn;
', plugin = "Rcpp")
fun3 <- cxxfunction(body = '
RNGScope scope;
NumericVector rn(5);
for (int i = 0; i < 5; i++) {
rn[i] = norm_rand();
}
return rn;
', plugin = "Rcpp")
fun4 <- cxxfunction(body = '
RNGScope scope;
return rnorm(5, 0.0, 1.0);
', plugin = "Rcpp")
set.seed(1)
print(fun1())
set.seed(1)
print(fun2())
set.seed(1)
print(fun3())
set.seed(1)
print(fun4())
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative", replications
= 1e6L)
output;
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative",
replications = 1e6L)
test replications elapsed relative user.self sys.self user.child
sys.child
3 fun3() 1000000 7.15 1.000 7.01 0.00
NA NA
4 fun4() 1000000 7.33 1.025 7.27 0.01
NA NA
2 fun2() 1000000 7.53 1.053 7.41 0.00
NA NA
1 fun1() 1000000 7.99 1.117 7.91 0.00
NA NA
****
On Tue, Sep 25, 2012 at 8:19 PM, Goldfeld, Keith <Keith.Goldfeld at nyumc.org>
wrote:****
Yes - I agree that iterating is not the best way to go, but I am using
this for an agent based simulation where iteration is really my only
option. I won't be generating more than a few thousand records, so the
time shouldn't be much of a factor. Of course, in R it is painfully slow -
so that is why I am going the Rcpp route.****
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
Bates
Sent: Tuesday, September 25, 2012 3:13 PM
To: Rodney Sparapani
Cc: Goldfeld, Keith; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] NumericVector Double mismatch when indexing an
array****
On Tue, Sep 25, 2012 at 1:53 PM, Rodney Sparapani <rsparapa at mcw.edu>
wrote:
You just need an explicit type conversion like so:
funk2 <- cxxfunction(body='RNGScope scope;
NumericVector rn(5);
for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
return wrap(rn);',
includes="using namespace Rcpp;\n", plugin="Rcpp")
funk2()
That works but you wouldn't want to use it as a general model because you
are creating vectors of length 1 then dereferencing the first element in
each of these vectors and copying it to another vector.
Obviously with n=5 the difference in execution time will be minimal but
with n=5 million it won't be. Using the Rcpp sugar function rnorm will be
the easiest approach unless, like me, you get a little queasy when using
Rcpp sugar functions just because it is so hard to pick them apart and see
what is actually happening.
I would probably end up going back to the R API and calling norm_rand or
Rf_rnorm. That latter returns a double from two double arguments, mu and
sigma.
On 25 September 2012 at 13:53, Rodney Sparapani wrote:
| You just need an explicit type conversion like so:
|
| funk2 <- cxxfunction(body='RNGScope scope;
| NumericVector rn(5);
| for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
| return wrap(rn);',
| includes="using namespace Rcpp;\n", plugin="Rcpp")
|
| funk2()
In the interest of constructive criticism, my version (and I must posted
essentially the same maybe ten times already on this very list):
R> f3 <- cxxfunction(signature(), plugin="Rcpp", body='
+ RNGScope scope;
+ NumericVector rn = Rcpp::rnorm(5,0,1);
+ return wrap(rn);
+ ')
R> f3()
[1] -0.837983 -0.706765 0.551781 0.662024 0.495562
R>
Changes:
-- no 'using namespace Rcpp;' as inline already does it; I still don't use
it as I prefer to be explicit about provenance of functions and objects
-- no loop (as Doug said) --- rnorm() is a sugar functions and vectorised
-- no wrap, we already have an Rcpp vector for which wrap is implicit
Dirk
On 25 September 2012 at 21:08, Jeffrey Pollock wrote:
| I was interested to see if there was any real speed difference between the
| different methods suggested, and it looks like... there isn't...
Good move -- measuring is a good idea.
And yes: at the end of the days everything is just a call to the C-level R
API which supplies the values one at a time. "Sugar" just hides some loops,
and unrolls (some) longer loops for speed. It's easier to read and maintain
(we think) but not automagically faster.
Dirk
On Tue, Sep 25, 2012 at 4:41 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
On 25 September 2012 at 13:53, Rodney Sparapani wrote:
| You just need an explicit type conversion like so:
|
| funk2 <- cxxfunction(body='RNGScope scope;
| NumericVector rn(5);
| for (int i=0; i < 5; i++) rn[i] = as<double>(rnorm(1,0,1));
| return wrap(rn);',
| includes="using namespace Rcpp;\n", plugin="Rcpp")
|
| funk2()
In the interest of constructive criticism, my version (and I must posted
essentially the same maybe ten times already on this very list):
R> f3 <- cxxfunction(signature(), plugin="Rcpp", body='
+ RNGScope scope;
+ NumericVector rn = Rcpp::rnorm(5,0,1);
+ return wrap(rn);
+ ')
R> f3()
[1] -0.837983 -0.706765 0.551781 0.662024 0.495562
R>
Changes:
-- no 'using namespace Rcpp;' as inline already does it; I still don't use
it as I prefer to be explicit about provenance of functions and objects
Check
-- no loop (as Doug said) --- rnorm() is a sugar functions and vectorised
Roger that
-- no wrap, we already have an Rcpp vector for which wrap is implicit
Woops!... we know, though ... "do as I say not as I do" and all that
jazz ... ;-)
-steve
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
In the interest of constructive criticism, my version (and I must posted
essentially the same maybe ten times already on this very list):
R> f3<- cxxfunction(signature(), plugin="Rcpp", body='
+ RNGScope scope;
+ NumericVector rn = Rcpp::rnorm(5,0,1);
+ return wrap(rn);
+ ')
R> f3()
[1] -0.837983 -0.706765 0.551781 0.662024 0.495562
R>
Very good. Just one quick question since I have come across
this in the documentation somewhere, but I can't seem to find
it anymore. Can you pass optimization flags like so?
f3<- cxxfunction(signature(), plugin="Rcpp", body='
RNGScope scope;
NumericVector rn = Rcpp::rnorm(5,0,1);
return wrap(rn);',
cppflags="-O2")
^^^^^^^^^^^^^^
I tried it, but it doesn't seem to make a noticeable impact on
these benchmarks. Is that because it is ignoring them or it
just has no benefit?
Rodney Sparapani, PhD Center for Patient Care and Outcomes Research
Sr. Biostatistician http://www.mcw.edu/pcor
4 wheels good, 2 wheels better! Medical College of Wisconsin (MCW)
WWLD?: What Would Lombardi Do? Milwaukee, WI, USA
On 25 September 2012 at 16:19, Rodney Sparapani wrote:
| Very good. Just one quick question since I have come across
| this in the documentation somewhere, but I can't seem to find
| it anymore. Can you pass optimization flags like so?
|
| f3<- cxxfunction(signature(), plugin="Rcpp", body='
| RNGScope scope;
| NumericVector rn = Rcpp::rnorm(5,0,1);
| return wrap(rn);',
| cppflags="-O2")
| ^^^^^^^^^^^^^^
| I tried it, but it doesn't seem to make a noticeable impact on
| these benchmarks. Is that because it is ignoring them or it
| just has no benefit?
I use and recommend ~/.R/Makevars for that --- as setting certain values at
the package level is very verboten ("non portable code" and all that) as far
as CRAN is concerned.
Now, that's the "how do you" part. As for "does it matter" that is a
different beast as -O2 (or better) is already the default for R on my systems
(running the vanilla Ubuntu packages based on my Debian packaging).
Dirk
benchmark(fun1(), fun2(), fun3(), fun4(), order = "relative",
replications = 1e5L)
test replications elapsed relative user.self sys.self user.child
sys.child
4 fun4() 100000 6.16 1.000 6.06 0 NA
NA
3 fun3() 100000 6.19 1.005 5.91 0 NA
NA
2 fun2() 100000 6.92 1.123 6.75 0 NA
NA
1 fun1() 100000 13.01 2.112 12.73 0 NA
NA
An interesting thread, thanks for the benchmarks. I just wanted to point
out that even with 100,000 replications, noise is altering the ordering:
fun3() is quicker when considering user.self instead of elapsed.
(Are there any ways, built-in to benchmark, to avoid this? E.g. do
multiple runs, and average, or just use quickest run, etc.?)
Darren