
[Rcpp-devel] Using sugar expressions for evaluating deviance residuals

Hi,

Could you provide a minimal example using inline, etc.?

Here are a few hints in the meantime. In the next version of Rcpp, I've 
added a few things that help generate sugar functions. For example, this 
is how the sugar version of choose is currently implemented:

SUGAR_BLOCK_2(choose      , ::Rf_choose     )

The SUGAR_BLOCK_2 macro (perhaps I should prefix it with RCPP_) lives in 
Rcpp/sugar/SugarBlock.h

You can promote y_log_y to a sugar function like this :

inc <- '
double y_log_y(double y, double mu){
     return (y) ? (y * log(y/mu)) : 0;
}
SUGAR_BLOCK_2( y_log_y , ::y_log_y )
'

fx <- cxxfunction( signature( y = "numeric", mu = "numeric" ), '
	NumericVector res = y_log_y(
		NumericVector(y),
		NumericVector(mu)
	) ;
	return res ;
', plugin = "Rcpp", includes = inc )
fx( runif(11) , seq(0, 1, .1 ) )


This gives you 3 Rcpp::y_log_y functions :
- one for when y is a double and mu is a NumericVector (or any numeric 
sugar expression)
- one for when y is a NumericVector and mu is a double
- one for when both are NumericVector.
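
To make the three overloads concrete, here is a rough plain C++ sketch of 
what they compute. It uses std::vector instead of NumericVector and 
evaluates eagerly, so it is only an illustration of the behaviour and not 
what SUGAR_BLOCK_2 actually expands to (sugar expressions are lazy). The 
vector overloads and their assert on the lengths are my own assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar kernel, same as in the example above.
double y_log_y(double y, double mu) {
    return (y) ? (y * std::log(y / mu)) : 0;
}

// Both arguments are vectors: lengths must match, there is no recycling.
std::vector<double> y_log_y(const std::vector<double>& y,
                            const std::vector<double>& mu) {
    assert(y.size() == mu.size());
    std::vector<double> res(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        res[i] = y_log_y(y[i], mu[i]);
    }
    return res;
}

// y is a single double, mu is a vector.
std::vector<double> y_log_y(double y, const std::vector<double>& mu) {
    std::vector<double> res(mu.size());
    for (std::size_t i = 0; i < mu.size(); ++i) {
        res[i] = y_log_y(y, mu[i]);
    }
    return res;
}

// y is a vector, mu is a single double.
std::vector<double> y_log_y(const std::vector<double>& y, double mu) {
    std::vector<double> res(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        res[i] = y_log_y(y[i], mu);
    }
    return res;
}
```

Calling y_log_y(y, mu) with two vectors of the same length applies the 
scalar kernel element by element; the mixed overloads reuse the single 
double for every element.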

However, at the moment it does not take care of recycling, so it is up 
to the user to make sure that y and mu have the same length. I'm 
currently thinking about how to deal with recycling.
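
For reference, this is roughly what R-style recycling would mean for a 
two-argument function, again in plain C++ with std::vector. The names 
y_log_y_scalar and y_log_y_recycled are made up for this sketch and are 
not part of Rcpp; it is one possible approach, not a decision about how 
sugar will handle it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar kernel (hypothetical name, same body as y_log_y above).
double y_log_y_scalar(double y, double mu) {
    return (y) ? (y * std::log(y / mu)) : 0;
}

// Hypothetical helper: the result has the length of the longer input and
// the shorter input is reused cyclically via the modulo index.
std::vector<double> y_log_y_recycled(const std::vector<double>& y,
                                     const std::vector<double>& mu) {
    // This sketch requires non-empty inputs so the modulo is well defined.
    assert(!y.empty() && !mu.empty());
    const std::size_t n = std::max(y.size(), mu.size());
    std::vector<double> res(n);
    for (std::size_t i = 0; i < n; ++i) {
        res[i] = y_log_y_scalar(y[i % y.size()], mu[i % mu.size()]);
    }
    return res;
}
```

With y of length 4 and mu of length 2, mu[0] and mu[1] are each used 
twice. Note that the modulo is computed for every element, which is the 
kind of per-element cost that makes the design worth thinking about.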



rep is a possibility here, but consider this hint in the TODO file:

     o	not sure rep should be lazy, i.e. rep( x, 4 ) fetches x[i] 4 times,
	maybe we should use LazyVector like in outer to somehow cache the
	result when it is judged expensive to calculate
		


One other thing about sugar is that it has to do many checks for missing 
values, so if you called this expression (with the y_log_y defined above):

2 * wt * (y_log_y(y, mu) + y_log_y(1.-y, 1.-mu))

(and again, currently binary operators +,*,.. don't take care of recycling)

you would have many checks for missing values. That is fine if you may 
have missing values, because it will propagate them correctly, but it 
can slow things down because they are tested for at every step.

If you are sure that y and mu don't contain missing values, then perhaps 
one thing we can do is embed that information in the data so that sugar 
does not have to check for missing values because it just assumes there 
are not any. Most of the code in sugar has a version that ignores 
missing values, controlled by a template parameter. For example seq_len 
creates a sugar expression where we know for sure that it does not 
contain missing values.
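
The idea of a template parameter that switches the checks on and off can 
be sketched in plain C++ like this. The names y_log_y_checked and 
y_log_y_all are hypothetical, and NaN stands in for R's NA; this is not 
the actual sugar code, just the general shape of the technique.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical kernel. NA = true tests both inputs for missing values,
// NA = false assumes there are none and skips the tests entirely.
template <bool NA>
double y_log_y_checked(double y, double mu) {
    if (NA && (std::isnan(y) || std::isnan(mu))) {
        return std::nan("");  // propagate the missing value
    }
    return (y) ? (y * std::log(y / mu)) : 0;
}

// Hypothetical elementwise wrapper; NA is fixed at compile time, so the
// NA = false instantiation carries no per-element test.
template <bool NA>
std::vector<double> y_log_y_all(const std::vector<double>& y,
                                const std::vector<double>& mu) {
    assert(y.size() == mu.size());
    std::vector<double> res(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        res[i] = y_log_y_checked<NA>(y[i], mu[i]);
    }
    return res;
}
```

The two versions differ when y is 0 and mu is missing: the checked 
version returns a missing value, while the unchecked one takes the 
y == 0 branch and returns 0. So the unchecked version is only safe when 
the data really has no missing values.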


One way perhaps to short circuit this is to first write the code with 
doubles and then promote it to sugar:

double resid( double y, double mu, double wt){
	return 2 * wt * (y_log_y(y, mu) + y_log_y(1.-y, 1.-mu)) ;
}

But then you need someone to write SUGAR_BLOCK_3 or write it manually.


Another idea would be to have something like 
NumericVector::import_transform but that would take 3 vector parameters 
instead of 1.
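
As a sketch of the "three vectors in, one vector out" idea, here is a 
plain C++ version. transform3 is a made-up name and is not an existing 
Rcpp function; resid is written with the weight parameter named wt and 
balanced parentheses, which is what I assume was intended. It uses 
std::vector and does no recycling and no missing value checks.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

double y_log_y(double y, double mu) {
    return (y) ? (y * std::log(y / mu)) : 0;
}

// Scalar residual term: everything is computed on doubles in one pass.
double resid(double y, double mu, double wt) {
    return 2 * wt * (y_log_y(y, mu) + y_log_y(1. - y, 1. - mu));
}

// Hypothetical helper: apply a three-argument scalar function elementwise
// to three vectors of equal length.
template <typename F>
std::vector<double> transform3(const std::vector<double>& a,
                               const std::vector<double>& b,
                               const std::vector<double>& c,
                               F f) {
    assert(a.size() == b.size() && b.size() == c.size());
    std::vector<double> res(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        res[i] = f(a[i], b[i], c[i]);
    }
    return res;
}
```

Usage would be transform3(y, mu, wt, resid). Each element is visited 
once and no intermediate vectors are created for the two y_log_y terms.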


Sorry if this email is a bit of a mess, I sort of wrote the ideas as 
they came.

Romain

On 14/08/10 20:28, Douglas Bates wrote: