Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc <- '
class length {
public:
R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
};
'
code <- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst <- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll <- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template <int RTYPE>
Vector<RTYPE> sapply(List ll, ??) {
Vector<RTYPE> ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> (is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
[Rcpp-devel] Rcpp internal versions of lapply and sapply
7 messages · Douglas Bates, Romain Francois
Le 16/06/10 14:58, Douglas Bates a ?crit :
Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc<- '
class length {
public:
R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
};
'
code<- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll<- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template<int RTYPE>
Vector<RTYPE> sapply(List ll, ??) {
Vector<RTYPE> ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> (is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
I recently added Vector::import_transform which might be all you need.
NumericVector x = NumericVector::import_transform(
y.begin(), y.end(), f) ;
This will work as long as f is of some type acceptable by std::transform.
But let's keep going anyway. One way (I'll write later why this is not
fully satisfactory) is here:
require( Rcpp )
require( inline )
inc <- '
template <typename OBJECT, typename FUN>
SEXP sapply(const OBJECT& object, FUN fun) {
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template <int RT, typename FUN>
SEXP sapply( const Rcpp::Vector<RT>& object, FUN fun){
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( T& t ){ return t*t ; }
} ;
'
code <- '
NumericVector xx( x );
return sapply( xx, square<double>() ) ;
'
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
> fx( seq(1,10, length = 5 ) )
[1] 1.0000 10.5625 30.2500 60.0625 100.0000
I'll try to explain bit by bit. Please let me know if something is too
cryptic. The good thing about templates is that as long as they work
they don't complain. The bad thing is that when they start complaining,
you need large screens to accomodate all the warnings/erros you get...
Starting from the last thing, the square struct.
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( const T& t ){ return t*t ; }
} ;
we need the function that goes into sapply to help, i.e we need it to be
aware of the result type. inheriting from std::unary_function makes this
easy. It essentially just adds some typedef. See
http://www.cplusplus.com/reference/std/functional/unary_function/
So when we do square<double>(), the function is self aware that it
returns a double. We need that to decide which kind of R vectors we want
to create. Which is the job of the Rcpp::traits::r_sexptype_traits
trait. So:
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
it is very important that this is const, because we are using this as a
template argument later and this needs to be compile time constant.
Then, we just delegate to std::transform through Vector::import_transform:
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
For now the sapply just return a SEXP because that is easy, but we could
have the result type deduced as well (fasten your seatbelt):
template <typename OBJECT, typename FUN>
Rcpp::Vector< Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype >
sapply(const OBJECT& object, FUN fun) {
Anyway, why is there two sapply ? because the first one is more generic,
you could sapply on a std::vector<double> for example, the only thing we
do with the object is to call begin() and end(), so as long as they
exist and they produce something that makes sense for std::transform, we
are fine (this is why templates are just so great).
The second version is specific to Rcpp vectors, on which we can call
names().
Now why is not satisfactory ? This is not lazy enough for me. With a
little bit more work, instead of returning a Rcpp::Vector which
allocates all of its memory right now (not lazy), we could have sapply
returning some sort of proxy object. The proxy object would expose a
similar interface : operator[] and iterator, but the result will only be
calculated when truly necessary.
Why should we care about that ? Imagine, instead of just calculating the
square, we want to find out if all the squared values are below 15, ie.
the equivalent of the R code "all( x^2 < 15 )" ? We only need to
calculate three values to make our decision, so why bother calculating
the 2 last ones. (lazy)
This is the kind of thing I want to bring to Rcpp with the "sugar" piece
of the puzzle.
We would then have this expression:
code <- '
Rcpp::NumericVector xx( x) ;
return Rcpp::all( sapply( xx, square<double>() ) < 15 ) ;
'
It almost work right now (I need to add some code so that <15 makes
sense, I only added Vector < Vector so far), but currently requires to
compute all the 5 values before handling the data to all.
Please come back with questions if some gaps need to be filled.
Romain
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/98Uf7u : Rcpp 0.8.1 |- http://bit.ly/c6YnCi : graph gallery collage `- http://bit.ly/bZ7ltC : inline 0.3.5
Small update incorporating the use of all (and using namespace Rcpp to
make things slightly less cluterred).
require( Rcpp )
require( inline )
inc <- '
using namespace Rcpp ;
template <typename OBJECT, typename FUN>
Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype >
sapply(const OBJECT& object, FUN fun) {
const int RTYPE = traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template <int RT, typename FUN>
Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype >
sapply( const Vector<RT>& object, FUN fun){
const int RTYPE = traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( const T& t ){ return t*t ; }
} ;
'
code <- '
NumericVector xx( x) ;
return all( sapply( xx, square<double>() ) < 15 ) ;
'
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
This needs rev 1555 of Rcpp for the "< 15" part. I'll try to come up
with a lazy version of this for the Rcpp sugar.
I'll probably recycle some of this thread into a blog post (with nicer
formatting).
Also worth noting the existence of Rcpp::unary_call, Rcpp::fixed_call
and Rcpp::binary_call that wrap up an R function in a strong type
illusion (powered by the twins Rcpp::as and Rcpp::wrap).
So for example (this will be less efficient for obvious reasons):
code <- '
NumericVector xx( x) ;
return all( sapply( xx, unary_call<double,double>( Function("square") )
) < 15 ) ;
'
square <- function(x) x^2
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
Romain
PS : More examples of these in the runit.Language.R file.
Le 16/06/10 16:18, Romain Francois a ?crit :
Le 16/06/10 14:58, Douglas Bates a ?crit :
Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc<- '
class length {
public:
R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
};
'
code<- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll<- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template<int RTYPE>
Vector<RTYPE> sapply(List ll, ??) {
Vector<RTYPE> ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> (is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
I recently added Vector::import_transform which might be all you need.
NumericVector x = NumericVector::import_transform(
y.begin(), y.end(), f) ;
This will work as long as f is of some type acceptable by std::transform.
But let's keep going anyway. One way (I'll write later why this is not
fully satisfactory) is here:
require( Rcpp )
require( inline )
inc <- '
template <typename OBJECT, typename FUN>
SEXP sapply(const OBJECT& object, FUN fun) {
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template <int RT, typename FUN>
SEXP sapply( const Rcpp::Vector<RT>& object, FUN fun){
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( T& t ){ return t*t ; }
} ;
'
code <- '
NumericVector xx( x );
return sapply( xx, square<double>() ) ;
'
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
> fx( seq(1,10, length = 5 ) )
[1] 1.0000 10.5625 30.2500 60.0625 100.0000
I'll try to explain bit by bit. Please let me know if something is too
cryptic. The good thing about templates is that as long as they work
they don't complain. The bad thing is that when they start complaining,
you need large screens to accomodate all the warnings/erros you get...
Starting from the last thing, the square struct.
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( const T& t ){ return t*t ; }
} ;
we need the function that goes into sapply to help, i.e we need it to be
aware of the result type. inheriting from std::unary_function makes this
easy. It essentially just adds some typedef. See
http://www.cplusplus.com/reference/std/functional/unary_function/
So when we do square<double>(), the function is self aware that it
returns a double. We need that to decide which kind of R vectors we want
to create. Which is the job of the Rcpp::traits::r_sexptype_traits
trait. So:
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
it is very important that this is const, because we are using this as a
template argument later and this needs to be compile time constant.
Then, we just delegate to std::transform through Vector::import_transform:
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
For now the sapply just return a SEXP because that is easy, but we could
have the result type deduced as well (fasten your seatbelt):
template <typename OBJECT, typename FUN>
Rcpp::Vector< Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype >
sapply(const OBJECT& object, FUN fun) {
Anyway, why is there two sapply ? because the first one is more generic,
you could sapply on a std::vector<double> for example, the only thing we
do with the object is to call begin() and end(), so as long as they
exist and they produce something that makes sense for std::transform, we
are fine (this is why templates are just so great).
The second version is specific to Rcpp vectors, on which we can call
names().
Now why is not satisfactory ? This is not lazy enough for me. With a
little bit more work, instead of returning a Rcpp::Vector which
allocates all of its memory right now (not lazy), we could have sapply
returning some sort of proxy object. The proxy object would expose a
similar interface : operator[] and iterator, but the result will only be
calculated when truly necessary.
Why should we care about that ? Imagine, instead of just calculating the
square, we want to find out if all the squared values are below 15, ie.
the equivalent of the R code "all( x^2 < 15 )" ? We only need to
calculate three values to make our decision, so why bother calculating
the 2 last ones. (lazy)
This is the kind of thing I want to bring to Rcpp with the "sugar" piece
of the puzzle.
We would then have this expression:
code <- '
Rcpp::NumericVector xx( x) ;
return Rcpp::all( sapply( xx, square<double>() ) < 15 ) ;
'
It almost work right now (I need to add some code so that <15 makes
sense, I only added Vector < Vector so far), but currently requires to
compute all the 5 values before handling the data to all.
Please come back with questions if some gaps need to be filled.
Romain
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/98Uf7u : Rcpp 0.8.1 |- http://bit.ly/c6YnCi : graph gallery collage `- http://bit.ly/bZ7ltC : inline 0.3.5
Thank you very much, Romain. This certainly helps my understanding of unary_function, traits, etc. These are the sorts of things I can read about but until I see a non-trivial usage it is hard to keep these ideas straight. Two minor questions occur to me at this point. In my code I assigned the names to be the names of the input list but I think now that I should have assigned a clone of the object names instead. To be safe I should have cloned before assigning, right? I also checked the length before assigning but you didn't. Am I reflecting too much of my C programming background here? In other words, is there some magic in the Rcpp names assignment that does the checking? Thanks again. On Wed, Jun 16, 2010 at 9:55 AM, Romain Francois
<romain.francois at dbmail.com> wrote:
Small update incorporating the use of all (and using namespace Rcpp to make
things slightly less cluterred).
require( Rcpp )
require( inline )
inc <- '
using namespace Rcpp ;
template <typename OBJECT, typename FUN>
Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype >
sapply(const OBJECT& object, FUN fun) {
? ? ? ?const int RTYPE = traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
? ? ? ?Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
? ? ? ? ? ? ? ?object.begin(), object.end(), fun
? ? ? ? ? ? ? ?) ;
? ? ? ?return ans ;
}
template <int RT, typename FUN>
Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype >
sapply( const Vector<RT>& object, FUN fun){
? ? ? ?const int RTYPE = traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
? ? ? ?Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
? ? ? ? ? ? ? ?object.begin(), object.end(), fun
? ? ? ? ? ? ? ?) ;
? ? ? ?ans.names() = object.names() ;
? ? ? ?return ans ;
}
template <typename T>
struct square : std::unary_function<T,T> {
? ? ? ?inline T operator()( const T& t ){ return t*t ; }
} ;
'
code <- '
? ? ? ?NumericVector xx( x) ;
? ? ? ?return all( sapply( xx, square<double>() ) < 15 ) ;
'
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
This needs rev 1555 of Rcpp for the "< 15" part. I'll try to come up with a
lazy version of this for the Rcpp sugar.
I'll probably recycle some of this thread into a blog post (with nicer
formatting).
Also worth noting the existence of Rcpp::unary_call, Rcpp::fixed_call and
Rcpp::binary_call that wrap up an R function in a strong type illusion
(powered by the twins Rcpp::as and Rcpp::wrap).
So for example (this will be less efficient for obvious reasons):
code <- '
? ? ? ?NumericVector xx( x) ;
? ? ? ?return all( sapply( xx, unary_call<double,double>( Function("square")
) ) < 15 ) ;
'
square <- function(x) x^2
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
Romain
PS : More examples of these in the runit.Language.R file.
Le 16/06/10 16:18, Romain Francois a ?crit :
Le 16/06/10 14:58, Douglas Bates a ?crit :
Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc<- '
class length {
public:
R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
};
'
code<- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll<- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template<int RTYPE>
Vector<RTYPE> sapply(List ll, ??) {
Vector<RTYPE> ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> (is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
I recently added Vector::import_transform which might be all you need.
NumericVector x = NumericVector::import_transform(
y.begin(), y.end(), f) ;
This will work as long as f is of some type acceptable by std::transform.
But let's keep going anyway. One way (I'll write later why this is not
fully satisfactory) is here:
require( Rcpp )
require( inline )
inc <- '
template <typename OBJECT, typename FUN>
SEXP sapply(const OBJECT& object, FUN fun) {
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template <int RT, typename FUN>
SEXP sapply( const Rcpp::Vector<RT>& object, FUN fun){
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( T& t ){ return t*t ; }
} ;
'
code <- '
NumericVector xx( x );
return sapply( xx, square<double>() ) ;
'
fx <- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
?> fx( seq(1,10, length = 5 ) )
[1] 1.0000 10.5625 30.2500 60.0625 100.0000
I'll try to explain bit by bit. Please let me know if something is too
cryptic. The good thing about templates is that as long as they work
they don't complain. The bad thing is that when they start complaining,
you need large screens to accomodate all the warnings/erros you get...
Starting from the last thing, the square struct.
template <typename T>
struct square : std::unary_function<T,T> {
inline T operator()( const T& t ){ return t*t ; }
} ;
we need the function that goes into sapply to help, i.e we need it to be
aware of the result type. inheriting from std::unary_function makes this
easy. It essentially just adds some typedef. See
http://www.cplusplus.com/reference/std/functional/unary_function/
So when we do square<double>(), the function is self aware that it
returns a double. We need that to decide which kind of R vectors we want
to create. Which is the job of the Rcpp::traits::r_sexptype_traits
trait. So:
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
it is very important that this is const, because we are using this as a
template argument later and this needs to be compile time constant.
Then, we just delegate to std::transform through Vector::import_transform:
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
For now the sapply just return a SEXP because that is easy, but we could
have the result type deduced as well (fasten your seatbelt):
template <typename OBJECT, typename FUN>
Rcpp::Vector< Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype >
sapply(const OBJECT& object, FUN fun) {
Anyway, why is there two sapply ? because the first one is more generic,
you could sapply on a std::vector<double> for example, the only thing we
do with the object is to call begin() and end(), so as long as they
exist and they produce something that makes sense for std::transform, we
are fine (this is why templates are just so great).
The second version is specific to Rcpp vectors, on which we can call
names().
Now why is not satisfactory ? This is not lazy enough for me. With a
little bit more work, instead of returning a Rcpp::Vector which
allocates all of its memory right now (not lazy), we could have sapply
returning some sort of proxy object. The proxy object would expose a
similar interface : operator[] and iterator, but the result will only be
calculated when truly necessary.
Why should we care about that ? Imagine, instead of just calculating the
square, we want to find out if all the squared values are below 15, ie.
the equivalent of the R code "all( x^2 < 15 )" ? We only need to
calculate three values to make our decision, so why bother calculating
the 2 last ones. (lazy)
This is the kind of thing I want to bring to Rcpp with the "sugar" piece
of the puzzle.
We would then have this expression:
code <- '
Rcpp::NumericVector xx( x) ;
return Rcpp::all( sapply( xx, square<double>() ) < 15 ) ;
'
It almost work right now (I need to add some code so that <15 makes
sense, I only added Vector < Vector so far), but currently requires to
compute all the 5 values before handling the data to all.
Please come back with questions if some gaps need to be filled.
Romain
-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/98Uf7u : Rcpp 0.8.1 |- http://bit.ly/c6YnCi : graph gallery collage `- http://bit.ly/bZ7ltC : inline 0.3.5
Le 16/06/10 17:44, Douglas Bates a ?crit :
Thank you very much, Romain. This certainly helps my understanding of unary_function, traits, etc. These are the sorts of things I can read about but until I see a non-trivial usage it is hard to keep these ideas straight. Two minor questions occur to me at this point. In my code I assigned the names to be the names of the input list but I think now that I should have assigned a clone of the object names instead.
I have been lazy (the wrong kind of lazyness) when implementing names<-,
and just used a callback to the R function names<- . The relevant chunk
is in Vector.h in the NamesProxy class:
void set(SEXP x) const {
SEXP new_vec = PROTECT( internal::try_catch(
Rf_lcons( Rf_install("names<-"),
Rf_cons( parent, Rf_cons( x , R_NilValue) )))) ;
/* names<- makes a new vector, so we have to change
the SEXP of the parent of this proxy, it might be
worth to work directly with the names attribute instead
of using the names<- R function, but then we need to
take care of coercion, recycling, etc ... we cannot just
brutally assign the names attribute */
const_cast<Vector&>(parent).setSEXP( new_vec ) ;
UNPROTECT(1) ; /* new_vec */
}
So with this implementation, you don't need to worry about copying
names. But it might not be the best implementation.
To be safe I should have cloned before assigning, right? I also checked the length before assigning but you didn't. Am I reflecting too much of my C programming background here? In other words, is there some magic in the Rcpp names assignment that does the checking?
The magic comes from R.
> x <- 1:10
> names(x) <- c("a", "b")
> x
a b <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
1 2 3 4 5 6 7 8 9 10
In your example ll.names() is either NULL or a vector of the appropriate
size. I think (not sure though) that names<- is restrictive about the rhs:
> names(x) <- rnorm
Erreur dans as.character(function (n, mean = 0, sd = 1) :
cannot coerce type 'closure' to vector of type 'character'
Romain
Thanks again. On Wed, Jun 16, 2010 at 9:55 AM, Romain Francois <romain.francois at dbmail.com> wrote:
Small update incorporating the use of all (and using namespace Rcpp to make
things slightly less cluterred).
require( Rcpp )
require( inline )
inc<- '
using namespace Rcpp ;
template<typename OBJECT, typename FUN>
Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype>
sapply(const OBJECT& object, FUN fun) {
const int RTYPE = traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template<int RT, typename FUN>
Vector< traits::r_sexptype_traits< typename FUN::result_type>::rtype>
sapply( const Vector<RT>& object, FUN fun){
const int RTYPE = traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Vector<RTYPE> ans = Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template<typename T>
struct square : std::unary_function<T,T> {
inline T operator()( const T& t ){ return t*t ; }
} ;
'
code<- '
NumericVector xx( x) ;
return all( sapply( xx, square<double>() )< 15 ) ;
'
fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
This needs rev 1555 of Rcpp for the "< 15" part. I'll try to come up with a
lazy version of this for the Rcpp sugar.
I'll probably recycle some of this thread into a blog post (with nicer
formatting).
Also worth noting the existence of Rcpp::unary_call, Rcpp::fixed_call and
Rcpp::binary_call that wrap up an R function in a strong type illusion
(powered by the twins Rcpp::as and Rcpp::wrap).
So for example (this will be less efficient for obvious reasons):
code<- '
NumericVector xx( x) ;
return all( sapply( xx, unary_call<double,double>( Function("square")
) )< 15 ) ;
'
square<- function(x) x^2
fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
Romain
PS : More examples of these in the runit.Language.R file.
Le 16/06/10 16:18, Romain Francois a ?crit :
Le 16/06/10 14:58, Douglas Bates a ?crit :
Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc<- '
class length {
public:
R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
};
'
code<- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll<- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template<int RTYPE>
Vector<RTYPE> sapply(List ll, ??) {
Vector<RTYPE> ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> (is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
I recently added Vector::import_transform which might be all you need.
NumericVector x = NumericVector::import_transform(
y.begin(), y.end(), f) ;
This will work as long as f is of some type acceptable by std::transform.
But let's keep going anyway. One way (I'll write later why this is not
fully satisfactory) is here:
require( Rcpp )
require( inline )
inc<- '
template<typename OBJECT, typename FUN>
SEXP sapply(const OBJECT& object, FUN fun) {
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template<int RT, typename FUN>
SEXP sapply( const Rcpp::Vector<RT>& object, FUN fun){
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template<typename T>
struct square : std::unary_function<T,T> {
inline T operator()( T& t ){ return t*t ; }
} ;
'
code<- '
NumericVector xx( x );
return sapply( xx, square<double>() ) ;
'
fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
> fx( seq(1,10, length = 5 ) )
[1] 1.0000 10.5625 30.2500 60.0625 100.0000
I'll try to explain bit by bit. Please let me know if something is too
cryptic. The good thing about templates is that as long as they work
they don't complain. The bad thing is that when they start complaining,
you need large screens to accomodate all the warnings/erros you get...
Starting from the last thing, the square struct.
template<typename T>
struct square : std::unary_function<T,T> {
inline T operator()( const T& t ){ return t*t ; }
} ;
we need the function that goes into sapply to help, i.e we need it to be
aware of the result type. inheriting from std::unary_function makes this
easy. It essentially just adds some typedef. See
http://www.cplusplus.com/reference/std/functional/unary_function/
So when we do square<double>(), the function is self aware that it
returns a double. We need that to decide which kind of R vectors we want
to create. Which is the job of the Rcpp::traits::r_sexptype_traits
trait. So:
const int RTYPE = Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype ;
it is very important that this is const, because we are using this as a
template argument later and this needs to be compile time constant.
Then, we just delegate to std::transform through Vector::import_transform:
Rcpp::Vector<RTYPE> ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
For now the sapply just return a SEXP because that is easy, but we could
have the result type deduced as well (fasten your seatbelt):
template<typename OBJECT, typename FUN>
Rcpp::Vector< Rcpp::traits::r_sexptype_traits< typename
FUN::result_type>::rtype>
sapply(const OBJECT& object, FUN fun) {
Anyway, why is there two sapply ? because the first one is more generic,
you could sapply on a std::vector<double> for example, the only thing we
do with the object is to call begin() and end(), so as long as they
exist and they produce something that makes sense for std::transform, we
are fine (this is why templates are just so great).
The second version is specific to Rcpp vectors, on which we can call
names().
Now why is not satisfactory ? This is not lazy enough for me. With a
little bit more work, instead of returning a Rcpp::Vector which
allocates all of its memory right now (not lazy), we could have sapply
returning some sort of proxy object. The proxy object would expose a
similar interface : operator[] and iterator, but the result will only be
calculated when truly necessary.
Why should we care about that ? Imagine, instead of just calculating the
square, we want to find out if all the squared values are below 15, ie.
the equivalent of the R code "all( x^2< 15 )" ? We only need to
calculate three values to make our decision, so why bother calculating
the 2 last ones. (lazy)
This is the kind of thing I want to bring to Rcpp with the "sugar" piece
of the puzzle.
We would then have this expression:
code<- '
Rcpp::NumericVector xx( x) ;
return Rcpp::all( sapply( xx, square<double>() )< 15 ) ;
'
It almost work right now (I need to add some code so that<15 makes
sense, I only added Vector< Vector so far), but currently requires to
compute all the 5 values before handling the data to all.
Please come back with questions if some gaps need to be filled.
Romain
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/98Uf7u : Rcpp 0.8.1 |- http://bit.ly/c6YnCi : graph gallery collage `- http://bit.ly/bZ7ltC : inline 0.3.5
By the way, you were correct in your original response that all I really need is input_transform but it is certainly a worthwhile exercise to see how traits, etc. are used and I appreciate your kind explanations. On Wed, Jun 16, 2010 at 10:58 AM, Romain Francois
<romain at r-enthusiasts.com> wrote:
Le 16/06/10 17:44, Douglas Bates a ?crit :
Thank you very much, Romain. ?This certainly helps my understanding of unary_function, traits, etc. ?These are the sorts of things I can read about but until I see a non-trivial usage it is hard to keep these ideas straight. Two minor questions occur to me at this point. ?In my code I assigned the names to be the names of the input list but I think now that I should have assigned a clone of the object names instead.
I have been lazy (the wrong kind of lazyness) when implementing names<-, and
just used a callback to the R function names<- . The relevant chunk is in
Vector.h in the NamesProxy class:
? ? ? ? ? ? ? ?void set(SEXP x) const {
? ? ? ? ? ? ? ? ? ? ? ?SEXP new_vec = PROTECT( internal::try_catch(
? ? ? ? ? ? ? ? ? ? ? ?Rf_lcons( Rf_install("names<-"),
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Rf_cons( parent, Rf_cons( x ,
R_NilValue) )))) ;
? ? ? ? ? ? ? ? ? ? ? ?/* names<- makes a new vector, so we have to change
? ? ? ? ? ? ? ? ? ? ? ? ? the SEXP of the parent of this proxy, it might be
? ? ? ? ? ? ? ? ? ? ? ? ? worth to work directly with the names attribute
instead
? ? ? ? ? ? ? ? ? ? ? ? ? of using the names<- R function, but then we need
to
? ? ? ? ? ? ? ? ? ? ? ? ? take care of coercion, recycling, etc ... we
cannot just
? ? ? ? ? ? ? ? ? ? ? ? ? brutally assign the names attribute */
? ? ? ? ? ? ? ? ? ? ? ?const_cast<Vector&>(parent).setSEXP( new_vec ) ;
? ? ? ? ? ? ? ? ? ? ? ?UNPROTECT(1) ; /* new_vec */
? ? ? ? ? ? ? ?}
So with this implementation, you don't need to worry about copying names.
But it might not be the best implementation.
To be safe I should have cloned before assigning, right? ?I also checked the length before assigning but you didn't. ?Am I reflecting too much of my C programming background here? ?In other words, is there some magic in the Rcpp names assignment that does the checking?
The magic comes from R.
x <- 1:10
names(x) <- c("a", "b")
x
? a ? ?b <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> ? 1 ? ?2 ? ?3 ? ?4 ? ?5 ? ?6 ? ?7 ? ?8 ? ?9 ? 10 In your example ll.names() is either NULL or a vector of the appropriate size. I think (not sure though) that names<- is restrictive about the rhs:
names(x) <- rnorm
Erreur dans as.character(function (n, mean = 0, sd = 1) ?: ?cannot coerce type 'closure' to vector of type 'character' Romain
Thanks again. On Wed, Jun 16, 2010 at 9:55 AM, Romain Francois <romain.francois at dbmail.com> ?wrote:
Small update incorporating the use of all (and using namespace Rcpp to
make
things slightly less cluterred).
require( Rcpp )
require( inline )
inc<- '
using namespace Rcpp ;
template<typename OBJECT, typename FUN>
Vector< ?traits::r_sexptype_traits< ?typename FUN::result_type>::rtype>
sapply(const OBJECT& ?object, FUN fun) {
? ? ? ?const int RTYPE = traits::r_sexptype_traits< ?typename
FUN::result_type>::rtype ;
? ? ? ?Vector<RTYPE> ?ans = Vector<RTYPE>::import_transform(
? ? ? ? ? ? ? ?object.begin(), object.end(), fun
? ? ? ? ? ? ? ?) ;
? ? ? ?return ans ;
}
template<int RT, typename FUN>
Vector< ?traits::r_sexptype_traits< ?typename FUN::result_type>::rtype>
sapply( const Vector<RT>& ?object, FUN fun){
? ? ? ?const int RTYPE = traits::r_sexptype_traits< ?typename
FUN::result_type>::rtype ;
? ? ? ?Vector<RTYPE> ?ans = Vector<RTYPE>::import_transform(
? ? ? ? ? ? ? ?object.begin(), object.end(), fun
? ? ? ? ? ? ? ?) ;
? ? ? ?ans.names() = object.names() ;
? ? ? ?return ans ;
}
template<typename T>
struct square : std::unary_function<T,T> ?{
? ? ? ?inline T operator()( const T& ?t ){ return t*t ; }
} ;
'
code<- '
? ? ? ?NumericVector xx( x) ;
? ? ? ?return all( sapply( xx, square<double>() )< ?15 ) ;
'
fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
This needs rev 1555 of Rcpp for the "< ?15" part. I'll try to come up
with a
lazy version of this for the Rcpp sugar.
I'll probably recycle some of this thread into a blog post (with nicer
formatting).
Also worth noting the existence of Rcpp::unary_call, Rcpp::fixed_call and
Rcpp::binary_call that wrap up an R function in a strong type illusion
(powered by the twins Rcpp::as and Rcpp::wrap).
So for example (this will be less efficient for obvious reasons):
code<- '
? ? ? ?NumericVector xx( x) ;
? ? ? ?return all( sapply( xx, unary_call<double,double>(
Function("square")
) )< ?15 ) ;
'
square<- function(x) x^2
fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
fx( seq(1,10, length = 5 ) )
Romain
PS : More examples of these in the runit.Language.R file.
Le 16/06/10 16:18, Romain Francois a ?crit :
Le 16/06/10 14:58, Douglas Bates a ?crit :
Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc<- '
class length {
public:
R_len_t operator() (RObject const& ?x) {return Rf_length(SEXP(x));}
};
'
code<- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll<- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template<int RTYPE>
Vector<RTYPE> ?sapply(List ll, ??) {
Vector<RTYPE> ?ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> ?(is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
I recently added Vector::import_transform which might be all you need.
NumericVector x = NumericVector::import_transform(
y.begin(), y.end(), f) ;
This will work as long as f is of some type acceptable by
std::transform.
But let's keep going anyway. One way (I'll write later why this is not
fully satisfactory) is here:
require( Rcpp )
require( inline )
inc<- '
template<typename OBJECT, typename FUN>
SEXP sapply(const OBJECT& ?object, FUN fun) {
const int RTYPE = Rcpp::traits::r_sexptype_traits< ?typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ?ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
return ans ;
}
template<int RT, typename FUN>
SEXP sapply( const Rcpp::Vector<RT>& ?object, FUN fun){
const int RTYPE = Rcpp::traits::r_sexptype_traits< ?typename
FUN::result_type>::rtype ;
Rcpp::Vector<RTYPE> ?ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
ans.names() = object.names() ;
return ans ;
}
template<typename T>
struct square : std::unary_function<T,T> ?{
inline T operator()( T& ?t ){ return t*t ; }
} ;
'
code<- '
NumericVector xx( x );
return sapply( xx, square<double>() ) ;
'
fx<- cxxfunction( signature( x = "numeric" ), code, "Rcpp", inc )
?> ?fx( seq(1,10, length = 5 ) )
[1] 1.0000 10.5625 30.2500 60.0625 100.0000
I'll try to explain bit by bit. Please let me know if something is too
cryptic. The good thing about templates is that as long as they work
they don't complain. The bad thing is that when they start complaining,
you need large screens to accomodate all the warnings/erros you get...
Starting from the last thing, the square struct.
template<typename T>
struct square : std::unary_function<T,T> ?{
inline T operator()( const T& ?t ){ return t*t ; }
} ;
we need the function that goes into sapply to help, i.e we need it to be
aware of the result type. inheriting from std::unary_function makes this
easy. It essentially just adds some typedef. See
http://www.cplusplus.com/reference/std/functional/unary_function/
So when we do square<double>(), the function is self aware that it
returns a double. We need that to decide which kind of R vectors we want
to create. Which is the job of the Rcpp::traits::r_sexptype_traits
trait. So:
const int RTYPE = Rcpp::traits::r_sexptype_traits< ?typename
FUN::result_type>::rtype ;
it is very important that this is const, because we are using this as a
template argument later and this needs to be compile time constant.
Then, we just delegate to std::transform through
Vector::import_transform:
Rcpp::Vector<RTYPE> ?ans = Rcpp::Vector<RTYPE>::import_transform(
object.begin(), object.end(), fun
) ;
For now the sapply just return a SEXP because that is easy, but we could
have the result type deduced as well (fasten your seatbelt):
template<typename OBJECT, typename FUN>
Rcpp::Vector< ?Rcpp::traits::r_sexptype_traits< ?typename
FUN::result_type>::rtype>
sapply(const OBJECT& ?object, FUN fun) {
Anyway, why is there two sapply ? because the first one is more generic,
you could sapply on a std::vector<double> ?for example, the only thing
we
do with the object is to call begin() and end(), so as long as they
exist and they produce something that makes sense for std::transform, we
are fine (this is why templates are just so great).
The second version is specific to Rcpp vectors, on which we can call
names().
Now why is not satisfactory ? This is not lazy enough for me. With a
little bit more work, instead of returning a Rcpp::Vector which
allocates all of its memory right now (not lazy), we could have sapply
returning some sort of proxy object. The proxy object would expose a
similar interface : operator[] and iterator, but the result will only be
calculated when truly necessary.
Why should we care about that ? Imagine, instead of just calculating the
square, we want to find out if all the squared values are below 15, ie.
the equivalent of the R code "all( x^2< ?15 )" ? We only need to
calculate three values to make our decision, so why bother calculating
the 2 last ones. (lazy)
This is the kind of thing I want to bring to Rcpp with the "sugar" piece
of the puzzle.
We would then have this expression:
code<- '
Rcpp::NumericVector xx( x) ;
return Rcpp::all( sapply( xx, square<double>() )< ?15 ) ;
'
It almost work right now (I need to add some code so that<15 makes
sense, I only added Vector< ?Vector so far), but currently requires to
compute all the 5 values before handling the data to all.
Please come back with questions if some gaps need to be filled.
Romain
-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/98Uf7u : Rcpp 0.8.1 |- http://bit.ly/c6YnCi : graph gallery collage `- http://bit.ly/bZ7ltC : inline 0.3.5
2 days later
Hello,
As a follow up on this thread, I've now commited a lazy version of
sapply. It can be used with a functor that has a nested result_type
type, or raw functions as illustrated by these examples from the unit
tests:
test.sugar.sapply <- function( ){
inc <- '
template <typename T>
class square : public std::unary_function<T,T> {
public:
T operator()( T t) const { return t*t ; }
} ;
'
fx <- cxxfunction( signature( x = "numeric" ), '
NumericVector xx(x) ;
NumericVector res = sapply( xx, square<double>() );
return res ;
', include = inc, plugin = "Rcpp" )
checkEquals( fx(1:10) , (1:10)^2 )
}
test.sugar.sapply.rawfun <- function( ){
inc <- '
double square( double x){ return x*x; }
'
fx <- cxxfunction( signature( x = "numeric" ), '
NumericVector xx(x) ;
NumericVector res = sapply( xx, square );
return res ;
', include = inc, plugin = "Rcpp" )
checkEquals( fx(1:10) , (1:10)^2 )
}
There is one more thing since this thread, sapply uses the new
Rcpp::traits::result_of trait to find out the result type of the function.
namespace Rcpp{
namespace traits{
template <typename T>
struct result_of{
typedef typename T::result_type type ;
} ;
template <typename RESULT_TYPE, typename INPUT_TYPE>
struct result_of< RESULT_TYPE (*)(INPUT_TYPE) >{
typedef RESULT_TYPE type ;
} ;
}
}
This version of sapply is lazy, so for example this works :
test.sugar.sapply.square <- function( ){
inc <- '
template <typename T>
class square : public std::unary_function<T,T> {
public:
T operator()( T t) const { return t*t ; }
} ;
'
fx <- cxxfunction( signature( x = "numeric" ), '
NumericVector xx(x) ;
return all( sapply( xx * xx , square<double>() ) < 10 );
', include = inc, plugin = "Rcpp" )
checkTrue( fx(1:10) )
}
and sapply only needs to calculate square( xx[0] * xx[0] ) and square(
xx[1] * xx[1] ) to make its final decision.
Romain
Le 16/06/10 14:58, Douglas Bates a ?crit :
Ever since I read Phil Spector's book on S I have been a fan of
functional programming in S and R. When Jose Pinheiro and I were
working on the nlme package there was a joke between us that you could
tell which of us wrote which parts of the code because his parts
always had an object named "aux" and my parts always had
"unlist(lapply(list(...)))".
Even within C++ I like to use the std:: algorithms like
std::transform, but, of course, there are differences between a
strongly typed language like C++ and a dynamically typed language like
S. Templates can get around these differences to some extent but I am
still a bit of a novice regarding templates.
Currently I want to apply some functions to lists but entirely within
the C++ code (i.e. I don't want to create Rcpp Function objects and
call back to R). For the sake of argument, consider a function that
extracts the lengths of the components of the lists.
library(Rcpp)
library(inline)
inc<- '
class length {
public:
R_len_t operator() (RObject const& x) {return Rf_length(SEXP(x));}
};
'
code<- '
List lst(ll);
IntegerVector ans(lst.size());
std::transform(lst.begin(), lst.end(), ans.begin(), length());
return ans;
'
ltst<- cxxfunction(signature(ll = "list"), code, "Rcpp", inc)
ll<- list(a = numeric(0), b = LETTERS[6:9], c = c)
ltst(ll)
sapply(ll, length)
I would like to create a templated sapply function like
template<int RTYPE>
Vector<RTYPE> sapply(List ll, ??) {
Vector<RTYPE> ans(ll.size());
CharacterVector nms = ll.names();
if (nms.size() == ll.size()) ans.names() = nms;
std::transform(ll.begin(), ll.end(), ans.begin(), ??);
return ans;
}
but I don't know how to specify the second argument that is a function
that returns the atomic element type of a Vector<RTYPE> (is this as
simple as Vector<RTYPE>::value_type?) and has a single argument which
probably should be an RObject (although might be an SEXP, if that was
more convenient). Can someone (probably Romain) provide some
guidance?
_______________________________________________ Rcpp-devel mailing list Rcpp-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/98Uf7u : Rcpp 0.8.1 |- http://bit.ly/c6YnCi : graph gallery collage `- http://bit.ly/bZ7ltC : inline 0.3.5