I have looked at your (i.e. Dirk and Roman's) submitted Rjournal article on Rcpp and decided that I should take the plunge. To get my feet wet I am starting with a relatively simple example interfacing to (God help us) 1960's style Fortran written by Mike Powell and incorporated in the minqa package. The value to be returned is a named list of heterogeneous values. I can see how to construct such a list in the classic Rcpp API using a RcppResultSet object but I don't know the best way to accomplish this in the more modern Rcpp namespace. Can someone point me in the right direction? Also, is there a convenient idiom for attaching an S3 class name to a list to be returned? An unrelated question -- has anyone looked at accessing S4 classed objects in Rcpp? I have had considerable experience with S4 classes and objects and could keep such a project in mind as I begin to explore Rcpp. Yet another question, if i may. Does coercion take place in constructing an object like Rcpp::NumericVector aa(a); where a is an SEXP, say an SEXP argument to a C++ function, where I accidentally passed an integer vector instead of a double vector? If so, what happens if there is no coercion? Are the contents of the SEXP copied to newly allocated storage or not? The reason that I ask is because there are some places in the lme4 code where I want to make sure that I have a pointer to the contents of the original vector because I am want to change the contents without copying.
[Rcpp-devel] Returning a named (and classed?) list from an RcppExport SEXP function
8 messages · Douglas Bates, Dirk Eddelbuettel, Romain Francois
Hi Doug, Welcome! We're honoured to have you here. And glad you picked the list.
On 12 March 2010 at 10:14, Douglas Bates wrote:
| I have looked at your (i.e. Dirk and Roman's) submitted Rjournal
| article on Rcpp and decided that I should take the plunge.
|
| To get my feet wet I am starting with a relatively simple example
| interfacing to (God help us) 1960's style Fortran written by Mike
| Powell and incorporated in the minqa package.
|
| The value to be returned is a named list of heterogeneous values. I
| can see how to construct such a list in the classic Rcpp API using a
| RcppResultSet object but I don't know the best way to accomplish this
| in the more modern Rcpp namespace. Can someone point me in the right
| direction? Also, is there a convenient idiom for attaching an S3
| class name to a list to be returned?
Yes. See eg the examples I put on my blog, eg inside of RcppArmadillo we
simply do (including actual fastLm source here modulu two ifdefs dealing with
older versions -- which means you need Armadillo 0.9.0 or later)
#include <RcppArmadillo.h>
extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
Rcpp::NumericVector yr(ys); // creates Rcpp vector from SEXP
Rcpp::NumericMatrix Xr(Xs); // creates Rcpp matrix from SEXP
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
arma::colvec y(yr.begin(), yr.size(), false);
arma::colvec coef = solve(X, y); // fit model y ~ X
arma::colvec resid = y - X*coef; // residuals
double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
// std.error of estimate
arma::colvec stderrest = sqrt( sig2 * diagvec( arma::inv(arma::trans(X)*X)) );
Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
Rcpp::Named( "stderr", stderrest));
return res;
}
[ By the way, I still think I should try to implement all four of your "how
to do lm" solutions from the older R News / R Journal piece in Armadillo one
day :-) And see what the timing impacts would be. ]
That works "on the fly" with Pairlist and named. We also have Rcpp::List
examples, but AFAICR you need to 'predeclare' how long your list is and the
set each element. The 200+ unit tests will have examples. Better docs will
appear one day...
| An unrelated question -- has anyone looked at accessing S4 classed
| objects in Rcpp? I have had considerable experience with S4 classes
| and objects and could keep such a project in mind as I begin to
| explore Rcpp.
I'd be happy to go there (time permitting) but very gently as I am still much
more firmly in S3 land. Romain may be more eager :)
| Yet another question, if i may. Does coercion take place in
| constructing an object like
|
| Rcpp::NumericVector aa(a);
|
| where a is an SEXP, say an SEXP argument to a C++ function, where I
| accidentally passed an integer vector instead of a double vector? If
Very much so. Again, see my blog and the RcppVectorExample code. Int in,
Double out. All magic hidden.
And of course you can just try it:
R> library(RcppExamples)
Loading required package: Rcpp
R> RcppVectorExample(c(1,4,9), api="new")
$result
[1] 1 2 3
$original
[1] 1 4 9
R> RcppVectorExample(c(2,5,10), api="new")
$result
[1] 1.414 2.236 3.162
$original
[1] 2 5 10
R>
At this level, behaviour is the same with 'classic' API but we know of at
least one case where 'classic' doesn't cast a Matrix right and 'new' does.
| so, what happens if there is no coercion? Are the contents of the
| SEXP copied to newly allocated storage or not?
It depends. In the new API we do fewer copies.
| The reason that I ask is because there are some places in the lme4
| code where I want to make sure that I have a pointer to the contents
| of the original vector because I am want to change the contents
| without copying.
I think we do that. If you look at the vignette and the timing example -- but
using pointer accessors (eg double *x = foo.begin()) and pointer ops (eg x++)
we get essentially the same speed as pure and ugly C code. Because we do
away with the copy, and the operator[], for purest speed.
Hope that answer the first batch of questions. Keep'em coming!
Dirk
Registration is open for the 2nd International conference R / Finance 2010 See http://www.RinFinance.com for details, and see you in Chicago in April!
On Fri, Mar 12, 2010 at 10:47 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
Hi Doug,
Welcome! We're honoured to have you here. And glad you picked the list.
On 12 March 2010 at 10:14, Douglas Bates wrote:
| I have looked at your (i.e. Dirk and Roman's) submitted Rjournal
| article on Rcpp and decided that I should take the plunge.
|
| To get my feet wet I am starting with a relatively simple example
| interfacing to (God help us) 1960's style Fortran written by Mike
| Powell and incorporated in the minqa package.
|
| The value to be returned is a named list of heterogeneous values. ?I
| can see how to construct such a list in the classic Rcpp API using a
| RcppResultSet object but I don't know the best way to accomplish this
| in the more modern Rcpp namespace. ?Can someone point me in the right
| direction? ?Also, is there a convenient idiom for attaching an S3
| class name to a list to be returned?
Yes. See eg the examples I put on my blog, eg inside of RcppArmadillo we
simply do (including actual fastLm source here modulu two ifdefs dealing with
older versions -- which means you need Armadillo 0.9.0 or later)
#include <RcppArmadillo.h>
extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
? ?Rcpp::NumericVector yr(ys); ? ? ? ? ? ? ? ? // creates Rcpp vector from SEXP
? ?Rcpp::NumericMatrix Xr(Xs); ? ? ? ? ? ? ? ? // creates Rcpp matrix from SEXP
? ?int n = Xr.nrow(), k = Xr.ncol();
? ?arma::mat X(Xr.begin(), n, k, false); ? ? ? // reuses memory and avoids extra copy
? ?arma::colvec y(yr.begin(), yr.size(), false);
? ?arma::colvec coef = solve(X, y); ? ? ? ? ? ?// fit model y ~ X
? ?arma::colvec resid = y - X*coef; ? ? ? ? ? ?// residuals
? ?double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?// std.error of estimate
? ?arma::colvec stderrest = sqrt( sig2 * diagvec( arma::inv(arma::trans(X)*X)) );
? ?Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
? ? ? ? ? ? ? ? ? ? ? Rcpp::Named( "stderr", stderrest));
? ?return res;
}
[ By the way, I still think I should try to implement all four of your "how
to do lm" solutions from the older R News / R Journal piece in Armadillo one
day :-) ? And see what the timing impacts would be. ]
That works "on the fly" with Pairlist and named. ?We also have Rcpp::List
examples, but AFAICR you need to 'predeclare' how long your list is and the
set each element. ?The 200+ unit tests will have examples. Better docs will
appear one day...
Nice approach. I presume that the Pairlist object is converted to an VECSXP in the process of being packaged for return. I will play around a bit with expressions like Rcpp::List res(Rcpp::PairList(Rcpp::Named(...), ...)) I would like to have the capability of attaching an attribute named "class" to the VECSXP before returning it.
| An unrelated question -- has anyone looked at accessing S4 classed | objects in Rcpp? ?I have had considerable experience with S4 classes | and objects and could keep such a project in mind as I begin to | explore Rcpp. I'd be happy to go there (time permitting) but very gently as I am still much more firmly in S3 land. ?Romain may be more eager :)
One of the nice properties of S4 classed objects is that you can define validity checks for classes and bypass some of the code that does validity checking of arguments.
| Yet another question, if i may. ? Does coercion take place in | constructing an object like | | Rcpp::NumericVector aa(a); | | where a is an SEXP, say an SEXP argument to a C++ function, where I | accidentally passed an integer vector instead of a double vector? ?If Very much so. ?Again, see my blog and the RcppVectorExample code. ?Int in, Double out. All magic hidden. And of course you can just try it: R> library(RcppExamples) Loading required package: Rcpp R> RcppVectorExample(c(1,4,9), api="new") $result [1] 1 2 3 $original [1] 1 4 9 R> RcppVectorExample(c(2,5,10), api="new") $result [1] 1.414 2.236 3.162 $original [1] ?2 ?5 10 R>
Perhaps you are assuming that c(1,4,9) generates an integer vector in this example. In fact, c(1,4,9) is a numeric (in the sense of double precision) vector. You need to write c(1L, 4L, 9L) to ensure you have an integer vector or coerce it with as.integer().
At this level, behaviour is the same with 'classic' API but we know of at least one case where 'classic' doesn't cast a Matrix right and 'new' does. | so, what happens if there is no coercion? ?Are the contents of the | SEXP copied to newly allocated storage or not? It depends. In the new API we do fewer copies. | The reason that I ask is because there are some places in the lme4 | code where I want to make sure that I have a pointer to the contents | of the original vector because I am want to change the contents | without copying. I think we do that. If you look at the vignette and the timing example -- but using pointer accessors (eg double *x = foo.begin()) and pointer ops (eg x++) we get essentially the same speed as pure and ugly C code. ?Because we do away with the copy, and the operator[], for purest speed.
Yes, I like that idea of being able to use foo.begin() to access the pointer. Part of my question related to whether I could count on getting the pointer to the original contents from R. In some way I would need to check whether the R object had been coerced, and hence allocated new storage for its contents, or not. That's not an immediate problem, though. Thanks for the response.
Hope that answer the first batch of questions. ?Keep'em coming! Dirk -- ?Registration is open for the 2nd International conference R / Finance 2010 ?See http://www.RinFinance.com for details, and see you in Chicago in April!
On 12 March 2010 at 11:23, Douglas Bates wrote:
| Nice approach. I presume that the Pairlist object is converted to an | VECSXP in the process of being packaged for return. I will play | around a bit with expressions like | | Rcpp::List res(Rcpp::PairList(Rcpp::Named(...), ...)) Not sure if that works with Rcpp::List as a returned object. | I would like to have the capability of attaching an attribute named | "class" to the VECSXP before returning it. We do see attr() on RObject types as a quick grep on unitTests/* revelas | > | An unrelated question -- has anyone looked at accessing S4 classed | > | objects in Rcpp? ?I have had considerable experience with S4 classes | > | and objects and could keep such a project in mind as I begin to | > | explore Rcpp. | > | > I'd be happy to go there (time permitting) but very gently as I am still much | > more firmly in S3 land. ?Romain may be more eager :) | | One of the nice properties of S4 classed objects is that you can | define validity checks for classes and bypass some of the code that | does validity checking of arguments. I include the S4 unit test file below. That's the closest we've come to docs :) | Perhaps you are assuming that c(1,4,9) generates an integer vector in | this example. In fact, c(1,4,9) is a numeric (in the sense of double | precision) vector. You need to write c(1L, 4L, 9L) to ensure you have | an integer vector or coerce it with as.integer(). Yes, I am still trying to get used to that. So in that sense my test will have been less rigid. In this particular example it does work for both class and new without fail when I submit c(1L, 4L, 9L) | Yes, I like that idea of being able to use foo.begin() to access the | pointer. Part of my question related to whether I could count on | getting the pointer to the original contents from R. In some way I | would need to check whether the R object had been coerced, and hence | allocated new storage for its contents, or not. That's not an | immediate problem, though. A fairly firm 'yes'. The classic API did more copying, the new API does a lot more SEXP preservation. You should be good. Dirk inst/unitTests/runit.S4.R below: #!/usr/bin/r -t # # Copyright (C) 2010 Dirk Eddelbuettel and Romain Francois # # This file is part of Rcpp. # # Rcpp is free software: you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # Rcpp is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with Rcpp. If not, see <http://www.gnu.org/licenses/>. .setUp <- function(){ suppressMessages( require( inline ) ) } test.S4 <- function(){ funx <- cfunction(signature(x = "ANY" ), ' RObject y(x) ; List res(5) ; res[0] = y.isS4() ; res[1] = y.hasSlot("x") ; res[2] = y.hasSlot("z") ; res[3] = y.slot("x") ; res[4] = y.slot("y") ; return res ; ', Rcpp=TRUE, verbose=FALSE, includes = "using namespace Rcpp;" ) setClass("track", representation(x="numeric", y="numeric")) tr <- new( "track", x = 2, y = 2 ) checkEquals( funx(tr), list( TRUE, TRUE, FALSE, 2.0, 2.0 ) , msg = "slot management" ) funx <- cfunction(signature(x = "ANY" ), ' RObject y(x) ; y.slot( "x" ) = 10.0 ; y.slot( "y" ) = 20.0 ; return R_NilValue ; ', Rcpp=TRUE, verbose=FALSE, includes = "using namespace Rcpp;" ) funx( tr ) checkEquals( tr at x, 10.0 , msg = "slot('x') = 10" ) checkEquals( tr at y, 20.0 , msg = "slot('y') = 20" ) funx <- cfunction(signature(x = "ANY" ), ' RObject y(x) ; y.slot( "foo" ) = 10.0 ; return R_NilValue ; ', Rcpp=TRUE, verbose=FALSE, includes = "using namespace Rcpp;" ) checkException( funx( tr ), msg = "slot does not exist" ) funx <- cfunction(signature(x = "ANY" ), ' RObject y(x) ; y.slot( "foo" ) ; return R_NilValue ; ', Rcpp=TRUE, verbose=FALSE, includes = "using namespace Rcpp;" ) checkException( funx( tr ), msg = "slot does not exist" ) }
Registration is open for the 2nd International conference R / Finance 2010 See http://www.RinFinance.com for details, and see you in Chicago in April!
Hello Doug, Le 12/03/10 17:14, Douglas Bates a ?crit :
I have looked at your (i.e. Dirk and Roman's) submitted Rjournal article on Rcpp and decided that I should take the plunge. To get my feet wet I am starting with a relatively simple example interfacing to (God help us) 1960's style Fortran written by Mike Powell and incorporated in the minqa package. The value to be returned is a named list of heterogeneous values. I can see how to construct such a list in the classic Rcpp API using a RcppResultSet object but I don't know the best way to accomplish this in the more modern Rcpp namespace.
That is tricky. We have Pairlist has Dirk suggested, but as the name implies they manage LISTSXP objects and not VECSXP. Rcpp::List (which manages VECSXP) has less convenient semantics (but this is on my todo list), you can do : List x ; x["foo"] = 1 ; x["bar"] = "something" ; which is kind of convenient, but that is not cheap as the VECSXP is reallocated each time. Otherwise, it is possible to allocate it first and manage the names separately: List x(2); x[0] = 1 ; x[1] = "something" ; CharacterVector y(2) ; y = "foo", "bar" ; x.names() = y ; but this is too verbose. what we need is something like this: List x( Named( "foo" ) = 1, Named( "bar" ) = "something" ) ;
Can someone point me in the right direction?
We need to get better at documenting Rcpp, now that the code is getting stable. For now I think the best place to look for examples is the unit tests. > system.file( "unitTests", package = "Rcpp" ) They are vaguely organized in terms of classes.
Also, is there a convenient idiom for attaching an S3 class name to a list to be returned?
yes. x.attr( "class" ) = "whatever" ;
An unrelated question -- has anyone looked at accessing S4 classed objects in Rcpp? I have had considerable experience with S4 classes and objects and could keep such a project in mind as I begin to explore Rcpp.
The slot method of the RObject class (which is the parent class of all our classes) allows read/write of slots. x.slot( "foo" ) = 3 ; this throws exception if x is not an S4 object or if its class does not have a "foo" attribute. this then uses the R function "slot<-" (there might be a better way with the R api, but I did not find it at the time)
Yet another question, if i may. Does coercion take place in constructing an object like Rcpp::NumericVector aa(a); where a is an SEXP, say an SEXP argument to a C++ function, where I accidentally passed an integer vector instead of a double vector? If so, what happens if there is no coercion? Are the contents of the SEXP copied to newly allocated storage or not?
There is coercion. for example a Rcpp::NumericVector is only allowed to store a REALSXP, if you give it a INTSXP, coercion happens. There is no copy. The objects are essentially proxy objects to the SEXP they encapsulate. We have the clone template function to explicitely copy when this is needed, for example : Rcpp::NumericVector aa(a) ; Rcpp::NumericVector b = Rcpp::clone(aa) ;
The reason that I ask is because there are some places in the lme4 code where I want to make sure that I have a pointer to the contents of the original vector because I am want to change the contents without copying.
yes. we want that too. but this is somewhat orthogonal to the "copy on change" semantics of R ...
Le 12/03/10 18:23, Douglas Bates a ?crit :
On Fri, Mar 12, 2010 at 10:47 AM, Dirk Eddelbuettel<edd at debian.org> wrote:
Hi Doug,
Welcome! We're honoured to have you here. And glad you picked the list.
On 12 March 2010 at 10:14, Douglas Bates wrote:
| I have looked at your (i.e. Dirk and Roman's) submitted Rjournal
| article on Rcpp and decided that I should take the plunge.
|
| To get my feet wet I am starting with a relatively simple example
| interfacing to (God help us) 1960's style Fortran written by Mike
| Powell and incorporated in the minqa package.
|
| The value to be returned is a named list of heterogeneous values. I
| can see how to construct such a list in the classic Rcpp API using a
| RcppResultSet object but I don't know the best way to accomplish this
| in the more modern Rcpp namespace. Can someone point me in the right
| direction? Also, is there a convenient idiom for attaching an S3
| class name to a list to be returned?
Yes. See eg the examples I put on my blog, eg inside of RcppArmadillo we
simply do (including actual fastLm source here modulu two ifdefs dealing with
older versions -- which means you need Armadillo 0.9.0 or later)
#include<RcppArmadillo.h>
extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
Rcpp::NumericVector yr(ys); // creates Rcpp vector from SEXP
Rcpp::NumericMatrix Xr(Xs); // creates Rcpp matrix from SEXP
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
arma::colvec y(yr.begin(), yr.size(), false);
arma::colvec coef = solve(X, y); // fit model y ~ X
arma::colvec resid = y - X*coef; // residuals
double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
// std.error of estimate
arma::colvec stderrest = sqrt( sig2 * diagvec( arma::inv(arma::trans(X)*X)) );
Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
Rcpp::Named( "stderr", stderrest));
return res;
}
[ By the way, I still think I should try to implement all four of your "how
to do lm" solutions from the older R News / R Journal piece in Armadillo one
day :-) And see what the timing impacts would be. ]
That works "on the fly" with Pairlist and named. We also have Rcpp::List
examples, but AFAICR you need to 'predeclare' how long your list is and the
set each element. The 200+ unit tests will have examples. Better docs will
appear one day...
Nice approach. I presume that the Pairlist object is converted to an VECSXP in the process of being packaged for return.
No, they are shipped to R as a LISTSXP.
I will play around a bit with expressions like Rcpp::List res(Rcpp::PairList(Rcpp::Named(...), ...))
This should work since the List( SEXP ) constructor uses coercion. Let us know otherwise.
I would like to have the capability of attaching an attribute named "class" to the VECSXP before returning it.
res.attr( "class" ) = "foo" ;
| An unrelated question -- has anyone looked at accessing S4 classed | objects in Rcpp? I have had considerable experience with S4 classes | and objects and could keep such a project in mind as I begin to | explore Rcpp. I'd be happy to go there (time permitting) but very gently as I am still much more firmly in S3 land. Romain may be more eager :)
One of the nice properties of S4 classed objects is that you can define validity checks for classes and bypass some of the code that does validity checking of arguments.
For now the only things we have in the api to deal with S4'ness is these methods : "isS4", "slot", "hasSlot". Not sure right now how we could formalize S4 objects in the api, but this would definitely be a fine addition.
| Yet another question, if i may. Does coercion take place in | constructing an object like | | Rcpp::NumericVector aa(a); | | where a is an SEXP, say an SEXP argument to a C++ function, where I | accidentally passed an integer vector instead of a double vector? If Very much so. Again, see my blog and the RcppVectorExample code. Int in, Double out. All magic hidden. And of course you can just try it: R> library(RcppExamples) Loading required package: Rcpp R> RcppVectorExample(c(1,4,9), api="new") $result [1] 1 2 3 $original [1] 1 4 9 R> RcppVectorExample(c(2,5,10), api="new") $result [1] 1.414 2.236 3.162 $original [1] 2 5 10 R>
Perhaps you are assuming that c(1,4,9) generates an integer vector in this example. In fact, c(1,4,9) is a numeric (in the sense of double precision) vector. You need to write c(1L, 4L, 9L) to ensure you have an integer vector or coerce it with as.integer().
At this level, behaviour is the same with 'classic' API but we know of at least one case where 'classic' doesn't cast a Matrix right and 'new' does. | so, what happens if there is no coercion? Are the contents of the | SEXP copied to newly allocated storage or not? It depends. In the new API we do fewer copies. | The reason that I ask is because there are some places in the lme4 | code where I want to make sure that I have a pointer to the contents | of the original vector because I am want to change the contents | without copying. I think we do that. If you look at the vignette and the timing example -- but using pointer accessors (eg double *x = foo.begin()) and pointer ops (eg x++) we get essentially the same speed as pure and ugly C code. Because we do away with the copy, and the operator[], for purest speed.
Yes, I like that idea of being able to use foo.begin() to access the pointer. Part of my question related to whether I could count on getting the pointer to the original contents from R.
Yes. The objects of the new api have the SEXP as their data member and methods applied on objects are relayed back to the SEXP in terms of calls t the R API. We use R memory management all the way. There are some times though where memory is reallocated, because there is no other way. for example if you add values to a vector (push_back, push_front, etc ...) In some way I
would need to check whether the R object had been coerced, and hence allocated new storage for its contents, or not. That's not an immediate problem, though.
There is something that might be helpful (or not). In the file RcppCommon.h, we have the RCPP_RETURN_VECTOR macro, this is an initial attempt to avoid writing the same switch( TYPEOF(x), case REALSXP: ... code over and over again. There is an example use in RcppCommon.cpp
Thanks for the response.
Hope that answer the first batch of questions. Keep'em coming! Dirk
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/OIXN : raster images and RImageJ |- http://tr.im/OcQe : Rcpp 0.7.7 `- http://tr.im/O1wO : highlight 0.1-5
On Fri, Mar 12, 2010 at 2:23 PM, Romain Francois
<romain at r-enthusiasts.com> wrote:
Le 12/03/10 18:23, Douglas Bates a ?crit :
On Fri, Mar 12, 2010 at 10:47 AM, Dirk Eddelbuettel<edd at debian.org> ?wrote:
Hi Doug,
Welcome! We're honoured to have you here. And glad you picked the list.
On 12 March 2010 at 10:14, Douglas Bates wrote:
| I have looked at your (i.e. Dirk and Roman's) submitted Rjournal
| article on Rcpp and decided that I should take the plunge.
|
| To get my feet wet I am starting with a relatively simple example
| interfacing to (God help us) 1960's style Fortran written by Mike
| Powell and incorporated in the minqa package.
|
| The value to be returned is a named list of heterogeneous values. ?I
| can see how to construct such a list in the classic Rcpp API using a
| RcppResultSet object but I don't know the best way to accomplish this
| in the more modern Rcpp namespace. ?Can someone point me in the right
| direction? ?Also, is there a convenient idiom for attaching an S3
| class name to a list to be returned?
Yes. See eg the examples I put on my blog, eg inside of RcppArmadillo we
simply do (including actual fastLm source here modulu two ifdefs dealing
with
older versions -- which means you need Armadillo 0.9.0 or later)
#include<RcppArmadillo.h>
extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
? ?Rcpp::NumericVector yr(ys); ? ? ? ? ? ? ? ? // creates Rcpp vector
from SEXP
? ?Rcpp::NumericMatrix Xr(Xs); ? ? ? ? ? ? ? ? // creates Rcpp matrix
from SEXP
? ?int n = Xr.nrow(), k = Xr.ncol();
? ?arma::mat X(Xr.begin(), n, k, false); ? ? ? // reuses memory and
avoids extra copy
? ?arma::colvec y(yr.begin(), yr.size(), false);
? ?arma::colvec coef = solve(X, y); ? ? ? ? ? ?// fit model y ~ X
? ?arma::colvec resid = y - X*coef; ? ? ? ? ? ?// residuals
? ?double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?// std.error of estimate
? ?arma::colvec stderrest = sqrt( sig2 * diagvec(
arma::inv(arma::trans(X)*X)) );
? ?Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
? ? ? ? ? ? ? ? ? ? ? Rcpp::Named( "stderr", stderrest));
? ?return res;
}
[ By the way, I still think I should try to implement all four of your
"how
to do lm" solutions from the older R News / R Journal piece in Armadillo
one
day :-) ? And see what the timing impacts would be. ]
That works "on the fly" with Pairlist and named. ?We also have Rcpp::List
examples, but AFAICR you need to 'predeclare' how long your list is and
the
set each element. ?The 200+ unit tests will have examples. Better docs
will
appear one day...
Nice approach. ?I presume that the Pairlist object is converted to an VECSXP in the process of being packaged for return.
No, they are shipped to R as a LISTSXP.
Hmm. That may cause problems in the long run. IIRC, having LISTSXPs, other than those related to language objects, loose in R is frowned upon. I'll check before saying anything definite.
I will play around a bit with expressions like Rcpp::List res(Rcpp::PairList(Rcpp::Named(...), ...))
This should work since the List( SEXP ) constructor uses coercion. Let us know otherwise.
I tried it and was terribly frustrated that my example didn't work and I couldn't decide why. After much exploration I found that it had nothing to do with that construction and was in fact due to my using 6 SEXPs in a Pairlist constructor. Apparently my version is not compiled with HAS_VARIADIC_TEMPLATES set and that means the constructors only go up to 5. I see now that because I am using g++ 4.4.1 I should probably set the compiler flag -std=c++0x I have tried it again initializing a list from the Pairlist object and that works. Do you happen to know if that conversion determines the length of the result before allocating it? I tried to read the code for internal::ListInitialization but I am not sufficiently fluent in C++ to decide what is going on.
I would like to have the capability of attaching an attribute named "class" to the VECSXP before returning it.
res.attr( "class" ) = "foo" ;
| An unrelated question -- has anyone looked at accessing S4 classed | objects in Rcpp? ?I have had considerable experience with S4 classes | and objects and could keep such a project in mind as I begin to | explore Rcpp. I'd be happy to go there (time permitting) but very gently as I am still much more firmly in S3 land. ?Romain may be more eager :)
One of the nice properties of S4 classed objects is that you can define validity checks for classes and bypass some of the code that does validity checking of arguments.
For now the only things we have in the api to deal with S4'ness is these methods : "isS4", "slot", "hasSlot". Not sure right now how we could formalize S4 objects in the api, but this would definitely be a fine addition.
| Yet another question, if i may. ? Does coercion take place in | constructing an object like | | Rcpp::NumericVector aa(a); | | where a is an SEXP, say an SEXP argument to a C++ function, where I | accidentally passed an integer vector instead of a double vector? ?If Very much so. ?Again, see my blog and the RcppVectorExample code. ?Int in, Double out. All magic hidden. And of course you can just try it: R> ?library(RcppExamples) Loading required package: Rcpp R> ?RcppVectorExample(c(1,4,9), api="new") $result [1] 1 2 3 $original [1] 1 4 9 R> ?RcppVectorExample(c(2,5,10), api="new") $result [1] 1.414 2.236 3.162 $original [1] ?2 ?5 10 R>
Perhaps you are assuming that c(1,4,9) generates an integer vector in this example. ?In fact, c(1,4,9) is a numeric (in the sense of double precision) vector. ?You need to write c(1L, 4L, 9L) to ensure you have an integer vector or coerce it with as.integer().
At this level, behaviour is the same with 'classic' API but we know of at least one case where 'classic' doesn't cast a Matrix right and 'new' does. | so, what happens if there is no coercion? ?Are the contents of the | SEXP copied to newly allocated storage or not? It depends. In the new API we do fewer copies. | The reason that I ask is because there are some places in the lme4 | code where I want to make sure that I have a pointer to the contents | of the original vector because I am want to change the contents | without copying. I think we do that. If you look at the vignette and the timing example -- but using pointer accessors (eg double *x = foo.begin()) and pointer ops (eg x++) we get essentially the same speed as pure and ugly C code. ?Because we do away with the copy, and the operator[], for purest speed.
Yes, I like that idea of being able to use foo.begin() to access the pointer. ?Part of my question related to whether I could count on getting the pointer to the original contents from R.
Yes. The objects of the new api have the SEXP as their data member and methods applied on objects are relayed back to the SEXP in terms of calls t the R API. We use R memory management all the way. There are some times though where memory is reallocated, because there is no other way. for example if you add values to a vector (push_back, push_front, etc ...) ?In some way I
would need to check whether the R object had been coerced, and hence allocated new storage for its contents, or not. ?That's not an immediate problem, though.
There is something that might be helpful (or not). In the file RcppCommon.h, we have the RCPP_RETURN_VECTOR macro, this is an initial attempt to avoid writing the same switch( TYPEOF(x), case REALSXP: ... code over and over again. There is an example use in RcppCommon.cpp
Thanks for the response.
Hope that answer the first batch of questions. ?Keep'em coming! Dirk
-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/OIXN : raster images and RImageJ |- http://tr.im/OcQe : Rcpp 0.7.7 `- http://tr.im/O1wO : highlight 0.1-5
Le 12/03/10 22:00, Douglas Bates a ?crit :
On Fri, Mar 12, 2010 at 2:23 PM, Romain Francois <romain at r-enthusiasts.com> wrote:
Le 12/03/10 18:23, Douglas Bates a ?crit :
On Fri, Mar 12, 2010 at 10:47 AM, Dirk Eddelbuettel<edd at debian.org> wrote:
Hi Doug,
Welcome! We're honoured to have you here. And glad you picked the list.
On 12 March 2010 at 10:14, Douglas Bates wrote:
| I have looked at your (i.e. Dirk and Roman's) submitted Rjournal
| article on Rcpp and decided that I should take the plunge.
|
| To get my feet wet I am starting with a relatively simple example
| interfacing to (God help us) 1960's style Fortran written by Mike
| Powell and incorporated in the minqa package.
|
| The value to be returned is a named list of heterogeneous values. I
| can see how to construct such a list in the classic Rcpp API using a
| RcppResultSet object but I don't know the best way to accomplish this
| in the more modern Rcpp namespace. Can someone point me in the right
| direction? Also, is there a convenient idiom for attaching an S3
| class name to a list to be returned?
Yes. See eg the examples I put on my blog, eg inside of RcppArmadillo we
simply do (including actual fastLm source here modulu two ifdefs dealing
with
older versions -- which means you need Armadillo 0.9.0 or later)
#include<RcppArmadillo.h>
extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
Rcpp::NumericVector yr(ys); // creates Rcpp vector
from SEXP
Rcpp::NumericMatrix Xr(Xs); // creates Rcpp matrix
from SEXP
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and
avoids extra copy
arma::colvec y(yr.begin(), yr.size(), false);
arma::colvec coef = solve(X, y); // fit model y ~ X
arma::colvec resid = y - X*coef; // residuals
double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
// std.error of estimate
arma::colvec stderrest = sqrt( sig2 * diagvec(
arma::inv(arma::trans(X)*X)) );
Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
Rcpp::Named( "stderr", stderrest));
return res;
}
[ By the way, I still think I should try to implement all four of your
"how
to do lm" solutions from the older R News / R Journal piece in Armadillo
one
day :-) And see what the timing impacts would be. ]
That works "on the fly" with Pairlist and named. We also have Rcpp::List
examples, but AFAICR you need to 'predeclare' how long your list is and
the
set each element. The 200+ unit tests will have examples. Better docs
will
appear one day...
Nice approach. I presume that the Pairlist object is converted to an VECSXP in the process of being packaged for return.
No, they are shipped to R as a LISTSXP.
Hmm. That may cause problems in the long run. IIRC, having LISTSXPs, other than those related to language objects, loose in R is frowned upon. I'll check before saying anything definite.
I agree. Nobody would do something like this in R: > pairlist( foo = 1, bar = 2 ) so I would be in favour of giving less exposure to the Pairlist class. IIRC the only reason why Dirk uses them is because of the convenience of creating the object in one call... so we just need to have similar semantics with the List class.
I will play around a bit with expressions like Rcpp::List res(Rcpp::PairList(Rcpp::Named(...), ...))
This should work since the List( SEXP ) constructor uses coercion. Let us know otherwise.
I tried it and was terribly frustrated that my example didn't work and I couldn't decide why. After much exploration I found that it had nothing to do with that construction and was in fact due to my using 6 SEXPs in a Pairlist constructor. Apparently my version is not compiled with HAS_VARIADIC_TEMPLATES set and that means the constructors only go up to 5. I see now that because I am using g++ 4.4.1 I should probably set the compiler flag -std=c++0x
This should enable c++0x : RCPP_CXX0X="yes" R CMD INSTALL Rcpp (we don't use the PKG_CXXFLAGS for this because otherwise R CMD check is unhappy about non portable flags) Ah yes. 5 here is arbitrary, and it would not be too hard to make it 10, 20 or something. Now that I am on OSX, I am stuck with gcc 4.2 :-( so no c++0x for me anymore (well only when I use my fedora machine). I took the precaution of wrapping the code bloat code into pseudo xml : /* <code-bloat> */ /* </code-bloat> */ but wrote that manually. Should not be hard to come up with some R code to generate the code bloat... maybe tomorrow.
I have tried it again initializing a list from the Pairlist object and that works. Do you happen to know if that conversion determines the length of the result before allocating it?
We outsource this to the as.list function of R. This is probably not the
easiest piece of code, and it surely deserve some more documentation. In
a nutshell :
The constructor :
List(SEXP) aka Vector<VECSXP>( SEXP )
calls r_cast<VECSXP> which calls r_true_cast if the object is not a
VECSXP, which looks like this:
template<> SEXP r_true_cast<VECSXP>(SEXP x){
return convert_using_rfunction(x, "as.list" ) ;
}
This is somewhat lazy and we could to some switch( TYPEOF( )) to keep
some things internal for the easy cases (LISTSXP, EXPRSXP, LANGSXP) and
use as.list for the other cases to avoid the price of the round trip to
the R side.
I tried to read the code for internal::ListInitialization but I am not sufficiently fluent in C++ to decide what is going on.
ListInitialization might not be what you think. It is only there to
support this notation.
NumericVector x(4) ;
x = 1.0, 2.0, 3.0, 4.0 ;
This is a trick we borrow from blitz++
but C++0x gives this notation that I find much nicer:
NumericVector x = {1.0, 2.0, 3.0; 4.0} ;
I would like to have the capability of attaching an attribute named "class" to the VECSXP before returning it.
res.attr( "class" ) = "foo" ;
| An unrelated question -- has anyone looked at accessing S4 classed | objects in Rcpp? I have had considerable experience with S4 classes | and objects and could keep such a project in mind as I begin to | explore Rcpp. I'd be happy to go there (time permitting) but very gently as I am still much more firmly in S3 land. Romain may be more eager :)
One of the nice properties of S4 classed objects is that you can define validity checks for classes and bypass some of the code that does validity checking of arguments.
For now the only things we have in the api to deal with S4'ness is these methods : "isS4", "slot", "hasSlot". Not sure right now how we could formalize S4 objects in the api, but this would definitely be a fine addition.
| Yet another question, if i may. Does coercion take place in | constructing an object like | | Rcpp::NumericVector aa(a); | | where a is an SEXP, say an SEXP argument to a C++ function, where I | accidentally passed an integer vector instead of a double vector? If Very much so. Again, see my blog and the RcppVectorExample code. Int in, Double out. All magic hidden. And of course you can just try it: R> library(RcppExamples) Loading required package: Rcpp R> RcppVectorExample(c(1,4,9), api="new") $result [1] 1 2 3 $original [1] 1 4 9 R> RcppVectorExample(c(2,5,10), api="new") $result [1] 1.414 2.236 3.162 $original [1] 2 5 10 R>
Perhaps you are assuming that c(1,4,9) generates an integer vector in this example. In fact, c(1,4,9) is a numeric (in the sense of double precision) vector. You need to write c(1L, 4L, 9L) to ensure you have an integer vector or coerce it with as.integer().
At this level, behaviour is the same with 'classic' API but we know of at least one case where 'classic' doesn't cast a Matrix right and 'new' does. | so, what happens if there is no coercion? Are the contents of the | SEXP copied to newly allocated storage or not? It depends. In the new API we do fewer copies. | The reason that I ask is because there are some places in the lme4 | code where I want to make sure that I have a pointer to the contents | of the original vector because I am want to change the contents | without copying. I think we do that. If you look at the vignette and the timing example -- but using pointer accessors (eg double *x = foo.begin()) and pointer ops (eg x++) we get essentially the same speed as pure and ugly C code. Because we do away with the copy, and the operator[], for purest speed.
Yes, I like that idea of being able to use foo.begin() to access the pointer. Part of my question related to whether I could count on getting the pointer to the original contents from R.
Yes. The objects of the new api have the SEXP as their data member and methods applied on objects are relayed back to the SEXP in terms of calls t the R API. We use R memory management all the way. There are some times though where memory is reallocated, because there is no other way. for example if you add values to a vector (push_back, push_front, etc ...) In some way I
would need to check whether the R object had been coerced, and hence allocated new storage for its contents, or not. That's not an immediate problem, though.
There is something that might be helpful (or not). In the file RcppCommon.h, we have the RCPP_RETURN_VECTOR macro, this is an initial attempt to avoid writing the same switch( TYPEOF(x), case REALSXP: ... code over and over again. There is an example use in RcppCommon.cpp
Thanks for the response.
Hope that answer the first batch of questions. Keep'em coming! Dirk
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/OIXN : raster images and RImageJ |- http://tr.im/OcQe : Rcpp 0.7.7 `- http://tr.im/O1wO : highlight 0.1-5