Message-ID: <e1361350-bb9a-ca23-896f-2d3b02325e05@kruisselbrink.eu>
Date: 2016-12-14T06:26:39Z
From: Johannes Kruisselbrink
Subject: [Rcpp-devel] Rcpp ISNAN slower than C ISNAN?
In-Reply-To: <CADNH-Pt5_cQFV28Uhmiuuz8cJLWbf=7d1kQqgpT0p7gGBq8g6g@mail.gmail.com>

Moving the call outside the main loop would be effective for some
scenarios (i.e., those where the data objects do not contain NaNs).
However, when they do, we still want to compute a distance based on
the values and "correct" for the NaNs in some way, so skipping the
entire object is not really an option. Adding a switch between
objects with and objects without NaNs is probably worthwhile (that,
and using more Rcpp sugar).
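
To make the idea of such a switch concrete, here is a rough sketch in
plain C++ (std::isnan rather than R's ISNAN, so it compiles without R).
It assumes the same row-major layout as the quoted loop
(data[i * numVars + k]). The names rowHasNaN, sqDist and allDistances
are made up for illustration, and the rescaling by numVars / used is
only one assumed way to "correct" for NaNs, not necessarily what our
code does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: true if any of the numVars values of one object is NaN.
static bool rowHasNaN(const double* row, std::size_t numVars) {
    for (std::size_t k = 0; k < numVars; ++k) {
        if (std::isnan(row[k])) return true;
    }
    return false;
}

// Squared Euclidean distance between one object and one code vector.
// hasNaN is decided once per object, so the NaN test stays out of the
// inner loop for objects that are complete.
static double sqDist(const double* obj, const double* code,
                     std::size_t numVars, bool hasNaN) {
    double dist = 0.0;
    if (!hasNaN) {
        // Fast path: no NaN check per element.
        for (std::size_t k = 0; k < numVars; ++k) {
            const double tmp = obj[k] - code[k];
            dist += tmp * tmp;
        }
        return dist;
    }
    // Slow path: skip NaN entries and count how many values were used.
    std::size_t used = 0;
    for (std::size_t k = 0; k < numVars; ++k) {
        if (!std::isnan(obj[k])) {
            const double tmp = obj[k] - code[k];
            dist += tmp * tmp;
            ++used;
        }
    }
    if (used == 0) return std::numeric_limits<double>::quiet_NaN();
    // ASSUMPTION: correct for missing values by rescaling to numVars terms.
    return dist * static_cast<double>(numVars) / static_cast<double>(used);
}

// Distances from every object to every code; result is row-major,
// element (i, j) stored at index i * numCodes + j.
std::vector<double> allDistances(const std::vector<double>& data,
                                 const std::vector<double>& codes,
                                 std::size_t numObjects,
                                 std::size_t numCodes,
                                 std::size_t numVars) {
    assert(data.size() == numObjects * numVars);
    assert(codes.size() == numCodes * numVars);
    std::vector<double> dists(numObjects * numCodes);
    for (std::size_t i = 0; i < numObjects; ++i) {
        const double* obj = &data[i * numVars];
        const bool hasNaN = rowHasNaN(obj, numVars);  // once per object
        for (std::size_t j = 0; j < numCodes; ++j) {
            dists[i * numCodes + j] =
                sqDist(obj, &codes[j * numVars], numVars, hasNaN);
        }
    }
    return dists;
}
```

The NaN scan costs one extra pass per object, but it is amortised over
all numCodes distance computations for that object.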

Nevertheless, the question remains why the Rcpp ISNAN call is so
much slower.

On 12/13/2016 2:04 PM, xian at unm.edu (Christian Gunning) wrote:
>> |    for (i = 0; i < numObjects; i++) {
>> |      for (j = 0; j < numCodes; j++) {
>> |        dist = 0;
>> |        for (k = 0; k < numVars; k++) {
>> |          if (!ISNAN(data[i * numVars + k])) {
>> |            tmp = data[i * numVars + k] - codes[j * numVars + k];
>>
>> Why not drop data and codes and use  sData1(i,k) - sData2(j,k)  ?
> Or better yet, just use the original code with NumericMatrix:
> sData1[i * numVars + k] does the right thing.
> I don't get any timing difference based on this change.
>
> Using Rcpp sugar
> (https://cran.r-project.org/package=Rcpp/vignettes/Rcpp-sugar.pdf),
> and moving the call outside the loop, appears to do the right thing.
>
> ## modified example
> ## see edits here:
> https://github.com/helmingstay/rcpp-timings/blob/master/diff/rcppdist.cpp#L24
> git clone https://github.com/helmingstay/rcpp-timings
> cd rcpp-timings/diff
> R --vanilla < glue.R