[Rcpp-devel] Understanding the behaviour of const CharacterVector as a function parameter

13 messages · Simon Zehnder, Dirk Eddelbuettel, Romain Francois

#
Dear Rcpp::Users and Rcpp::Devels,

I would like to understand a certain behaviour of my code I encountered lately. 

I am working with CharacterVector and the following behaviour occurred:

void test1 (Rcpp::CharacterVector &charv)
{
	Rprintf("test1: %s\n", (char*) charv(0));
}

void test2 (const Rcpp::CharacterVector &str)
{
	Rprintf("test2: %s\n", (char*) charv(0));
}

Using a string like "2013-05-04 20:23:21" for the Rcpp::CharacterVector gives the following outputs:

test1: 2013-05-04 20:23:21

test2:  `

This also does not change if I use a cast to const char* in test2. I tried something similar with std::string, printing the result of c_str(); there the 'const' keyword makes no difference and the correct string is always printed.

Is this something specific to the Rcpp::CharacterVector, that uses a string_proxy for its elements returned by the operator ()? Is there a way to use const Rcpp::CharacterVector and get the behaviour of test1? 


Best

Simon
#
On 29 September 2013 at 14:06, Simon Zehnder wrote:

Looks like a bug. (But note that const correctness of types built around SEXP
is at best a promise -- we cannot undo the fact that the SEXP _is_ a pointer.)
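
The remark that const is only a promise for pointer-based types can be illustrated outside Rcpp. The following is a minimal standalone sketch, assuming a hypothetical Handle struct as a stand-in for a class that wraps a pointer the way Rcpp classes wrap a SEXP; Handle and overwrite are illustration names, not Rcpp code.

```cpp
#include <string>

// Hypothetical stand-in for a type built around a pointer handle.
// This is NOT an Rcpp class; it only mimics "wrapper around a pointer".
struct Handle {
    std::string* data;   // the wrapped pointer
};

// `h` is const, so h.data cannot be re-pointed here. The string it
// points to is not const, though, and can still be modified.
inline void overwrite(const Handle& h, const std::string& value) {
    *h.data = value;
}
```

The const on the parameter protects the wrapper object itself, not the data behind the pointer, which is why const on such types expresses intent rather than a guarantee.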

But hold on for a day til Rcpp 0.10.5 reaches your mirror, or grab it from
CRAN in Vienna. It brings a lot of excellent changes, among them some fine
work by Romain dealing with exactly that. Full announcement coming later once
I am back from running and kid's soccer game.

Dirk
#
Hi Dirk,

thanks for the quick response! 

I do not yet understand your comment about const correctness with regard to SEXP, but I will take a closer look at the doxygen documentation of Rcpp. 

A new version of Rcpp: cool! Thanks for the tip. Since I am curious now, I will grab it from the Vienna CRAN mirror. I assume the changes from Romain are the ones discussed lately on the rcpp-devel list regarding the shallow copy. 

Enjoy your day with your family!


Best

Simon
On Sep 29, 2013, at 2:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:

            
#
On 29/09/13 14:06, Simon Zehnder wrote:
Try actually using the variable you pass in, as in:

void test2 (const Rcpp::CharacterVector &str)
{
	Rprintf("test2: %s\n", (char*) str(0));
}

Although it still exposes the bug.

You can use something like this in the meantime:

void test2 (const Rcpp::CharacterVector& charv)
{
     // copy the element into an Rcpp::String instead of casting the proxy
     Rcpp::String x = charv[0] ;
     Rprintf("test2: %s\n", x.get_cstring());
}

It looks like the bug is about converting the result of charv(0) to a 
char*. Probably worth looking at the string_proxy class.

Romain

  
    
#
Hello,

What acts as a proxy for const CharacterVector& does not do its proxy 
job. Instead it gives direct access to the underlying array of SEXP that 
character vectors use.

This has been fixed in Rcpp11. The relevant addition is the 
const_string_proxy class. See this commit which can easily be applied to 
Rcpp. 
https://github.com/romainfrancois/Rcpp11/commit/f5e1600f7acbf3bef39325c06ef3ac5ddf8dc66a

The commit in Rcpp11 has also removed a few things from the proxy class 
that I no longer consider necessary, because I am cleaning things up. This 
might not apply to Rcpp with its stricter compatibility requirements.
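
The idea of a dedicated read-only proxy for const access can be sketched in plain C++. This is only an illustration under assumptions: ToyStrings, const_proxy and first_element are hypothetical names, and the code does not reproduce the const_string_proxy class from the commit above. It shows const access returning a proxy object that performs the conversion itself, rather than handing out the raw storage.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration; these are not the Rcpp classes.
class ToyStrings {
public:
    // Read-only proxy: remembers owner and index, offers a conversion only.
    class const_proxy {
    public:
        const_proxy(const ToyStrings& owner, std::size_t i)
            : owner_(owner), i_(i) {}
        // The proxy decides how an element becomes a C string.
        operator const char*() const { return owner_.data_[i_].c_str(); }
    private:
        const ToyStrings& owner_;
        std::size_t i_;
    };

    explicit ToyStrings(std::vector<std::string> v) : data_(std::move(v)) {}

    // Const access returns the proxy, never the underlying storage.
    const_proxy operator()(std::size_t i) const {
        return const_proxy(*this, i);
    }

private:
    std::vector<std::string> data_;
};

// Reads the first element through a const reference.
inline std::string first_element(const ToyStrings& v) {
    return std::string(static_cast<const char*>(v(0)));
}
```

Because the const overload has its own proxy type, a cast applied to the result goes through the conversion operator instead of reinterpreting whatever the storage happens to hold.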

Romain

On 29/09/13 15:24, Romain Francois wrote:

  
    
#
On Sep 29, 2013, at 3:24 PM, Romain Francois <romain at r-enthusiasts.com> wrote:

            
My fault! :)
I tried to post a simpler example than my actual code and made this silly error.
This was my first approach. I am writing a date parser for Rcpp that converts strings to seconds since 1970. I then decided to convert my code to use char pointers, which I assumed would be much faster. It works with non-const parameters and is indeed 1 second faster on 10e+6 values (it takes 0.836 secs in total). Given the better performance with char pointers, I want to stick with them and accept using non-const parameters here.
I took a look at the string_proxy class and other related classes. It will take me some more time to understand them, but step by step I am getting a better insight into what is going on under the hood. 

Thank you for the quick response. 

Best

Simon
#
Hi Romain,

thanks for this fix!
On Sep 29, 2013, at 5:26 PM, Romain Francois <romain at r-enthusiasts.com> wrote:

            
I need some time to trace this in the Rcpp source code to see exactly what is going on there - I don't have such a deep understanding of the class structure yet.
Your new repo Rcpp11 looks very interesting! I will not distribute my package on CRAN, but I do have to give it to colleagues working on Windows machines with the Rtools package. Rtools relies on gcc 4.6, and from http://gcc.gnu.org/projects/cxx0x.html I conclude that it does not support all C++11 features - I guess Rcpp11 needs at least gcc 4.7? 
On my own machine I can use it without a problem, but for distribution to the Windows machines I fear I have to rely on the CRAN version of Rcpp and use the non-const reference as the parameter. 

I am curious: which C++11 feature enables the const_string_proxy?
I see that the class 'generic_proxy' has gone. What was its purpose in Rcpp?


Best

Simon
#
On 29/09/13 20:36, Simon Zehnder wrote:
You don't really need to understand how it is implemented.
I don't think so. These classes implement a proxy pattern. That is a 
classic C++ pattern.

In essence, when you have a List and you call its operator[], what you 
get is a generic_proxy. This class's job is to define getters and 
setters in terms of operator= and implicit conversion operators so that 
you can do things like this:

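List z(1) ;          // a list with one element, so z[0] exists
RObject x = z[0] ;   // getter: the proxy converts implicitly to RObject
z[0] = 2 ;           // setter: the proxy's operator= stores the value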

The proxy classes take care of all the plumbing here.

But again, you should not need to know about this.
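
For readers who do want to see the shape of the pattern, here is a minimal standalone sketch. It assumes a hypothetical ToyList holding plain ints and is not how Rcpp::List or generic_proxy are implemented; it only shows operator[] returning a proxy whose operator= acts as the setter and whose conversion operator acts as the getter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy container; it only mimics the shape of the pattern.
class ToyList {
public:
    class proxy {
    public:
        proxy(ToyList& owner, std::size_t i) : owner_(owner), i_(i) {}
        // Setter: `z[0] = 2` ends up here.
        proxy& operator=(int value) {
            owner_.slots_[i_] = value;
            return *this;
        }
        // Getter: `int x = z[0]` ends up here.
        operator int() const { return owner_.slots_[i_]; }
    private:
        ToyList& owner_;
        std::size_t i_;
    };

    explicit ToyList(std::size_t n) : slots_(n, 0) {}

    // Indexing hands out a proxy instead of a reference to the storage.
    proxy operator[](std::size_t i) { return proxy(*this, i); }

private:
    std::vector<int> slots_;
};
```

With this in place, `ToyList z(1); z[0] = 2; int x = z[0];` reads and writes through the proxy, which is the place where a real library can hook in conversions or memory protection.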

  
    
#
On 29/09/13 20:36, Simon Zehnder wrote:
This is not yet my concern. For now I'm focused on developing.
Releasing and supporting Windows will come later.

Right now, I'm using clang on OSX to develop.
This has nothing to do with C++11. It is just that I fixed this issue in 
Rcpp11 and while I was there I let people know how to fix it in Rcpp. I 
might do it myself, but I might also forget to do it.

  
    
#
Hi Romain,

thanks for the quick explanation of the idea behind the generic_proxy! In addition, I am very glad that I do not have to understand everything that is going on under the hood, so I can concentrate on what I can do with it! 

Best

Simon
On Sep 29, 2013, at 8:46 PM, Romain Francois <romain at r-enthusiasts.com> wrote:

            
#
Hi Dirk,

this is where I took the idea from. The fasttime package works with delimited date patterns, which we often do not have. We rather have something like 20130405 instead of 2013-04-05 (and I don't want to start manipulating data in Perl or Python before reading it into R; this usually gets messy). Without a delimiter the code in fasttime does not work. I use the basic principle of Simon's code and extend it to undelimited date patterns with four-digit and two-digit years, but using Rcpp instead of C and SEXPs. I see, though, that the Date::mktime00 function does something very similar?
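
As an illustration of parsing an undelimited pattern such as 20130405 from a char pointer, here is a minimal standalone sketch. It is not the parser discussed in this thread and not the fasttime code: parse_yyyymmdd is a hypothetical helper that assumes exactly eight digits (four-digit year only), interprets the date as UTC midnight, and uses the widely known days-from-civil arithmetic for the calendar conversion.

```cpp
#include <cstdint>

// Days between 1970-01-01 and the given proleptic Gregorian date,
// using the well-known days-from-civil arithmetic.
inline std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d) {
    y -= (m <= 2) ? 1 : 0;   // treat Jan/Feb as months of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Hypothetical helper: parses exactly eight digits "YYYYMMDD" and writes
// seconds since 1970-01-01 00:00:00 UTC to `seconds`. Returns false on
// malformed input. It does not check the day against the month length.
inline bool parse_yyyymmdd(const char* s, std::int64_t& seconds) {
    if (s == nullptr) return false;
    unsigned v[8];
    for (int i = 0; i < 8; ++i) {
        // A premature '\0' also fails this test, so we never read past it.
        if (s[i] < '0' || s[i] > '9') return false;
        v[i] = static_cast<unsigned>(s[i] - '0');
    }
    if (s[8] != '\0') return false;   // reject trailing characters
    const std::int64_t year = v[0] * 1000 + v[1] * 100 + v[2] * 10 + v[3];
    const unsigned month = v[4] * 10 + v[5];
    const unsigned day = v[6] * 10 + v[7];
    if (month < 1 || month > 12 || day < 1 || day > 31) return false;
    seconds = days_from_civil(year, month, day) * 86400;
    return true;
}
```

For the input "20130405" this yields 1365120000 seconds, and a delimited string such as "2013-04-05" is rejected. Time-of-day fields and two-digit years would need additional fixed-position digits and a pivot rule, which are left out here.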

Does it even make sense to think about working with POSIXct in Rcpp or is this task already completed/unnecessary?


Best

Simon
On Sep 29, 2013, at 8:49 PM, Dirk Eddelbuettel <edd at debian.org> wrote:

            
#
Hi Romain,
On Sep 29, 2013, at 8:51 PM, Romain Francois <romain at r-enthusiasts.com> wrote:

            
Me too, and I can use it :)
Nice. This then should work on the Windows machines, too. Cool!