Skip to content
Prev 385998 / 398528 Next

which() vs. just logical selection in df

Hi Bert,

Thank you very much! I was unaware that .Internal() referred to C code.

I figured out the difference. which() dimensions the object returned
to be only the relevant records first. Logical indexing dimensions
last.
[1] 2000000
[1] 666667
length(dat[index1,])
[1] 666667
length(dat[index2,])
[1] 666667

microbenchmark(index1<-dat$gender2=="other", times=100L) # 2e6 records, ~ 13ms.
microbenchmark(index2<-which(index1), times=100L) # Extra time for
which() ~ 5ms.
microbenchmark(dat[index1,], times=100L) # Time to return just TRUE
records using the whole 2e6 index. ~99ms
microbenchmark(dat[index2,], times=100L) # Time to return all records
from shorter index ~64ms.

Cheers,
Keith
On Wed, Oct 14, 2020 at 4:42 PM Bert Gunter <bgunter.4567 at gmail.com> wrote: