when to use `which'?

6 messages · Sam Steingold, Bert Gunter, David Winsemius +3 more

#
when do I need to use which()?
[1] 1 2 3 4 5 6
[1] 4
[1] 4
[1] 4
[1] 3 4 5 6
[1] 3 4 5 6
seems unnecessary...
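
The commands that produced the output above did not survive in the archive. The following is only a guess at what they may have looked like, assuming `a` is `1:6` and the conditions were `a == 4` and `a > 2` (both are assumptions inferred from the printed results, not the poster's actual code):

```r
# Assumed input; the original commands are not preserved in the archive.
a <- 1:6

idx <- which(a == 4)     # integer position(s) where the condition is TRUE

# With no NAs in `a`, logical indexing and which()-based indexing agree.
same_eq <- identical(a[which(a == 4)], a[a == 4])
same_gt <- identical(a[which(a > 2)], a[a > 2])

print(a[a > 2])          # [1] 3 4 5 6
```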
#
Well ...
which(a==4)^2

??
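
One reading of this hint, offered as an interpretation rather than a quote: which() returns integer positions, so you can do arithmetic on them, whereas the bare comparison returns a logical vector. A small sketch, again assuming `a` is `1:6`:

```r
a <- 1:6                      # assumed toy vector

sq_pos <- which(a == 4)^2     # position 4, squared: 16
sq_lgl <- (a == 4)^2          # logicals coerced to 0/1: 0 0 0 1 0 0
```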

-- Bert
On Tue, Jul 12, 2011 at 1:17 PM, Sam Steingold <sds at gnu.org> wrote:

  
    
#
On Jul 12, 2011, at 4:17 PM, Sam Steingold wrote:

            
It is unnecessary when `a` is a toy case and has no NA's. And you will  
find some of the cognoscenti trying to correct you when you do use  
which().

a <- c(1,2, NA ,3,4, NaN, 5,6)

 > data.frame(lets= letters[1:8], stringsAsFactors=FALSE)[a>0, ]
[1] "a" "b" NA  "d" "e" NA  "g" "h"
 > data.frame(lets= letters[1:8], stringsAsFactors=FALSE)[which(a>0), ]
[1] "a" "b" "d" "e" "g" "h"

If you have millions of records and tens of thousands of NA's (say ~  
1% of the data), imagine what your console looks like if you try to  
pick out records from one day and get 10,000 where you were expecting  
100. A real PITA when you are doing real work.
#
See ?which

For examples, try:

example(which)
Yes, it can be used as a redundant wrapper, as you have demonstrated in
your examples.  In those cases, it is most certainly unnecessary.

  
    
#
On 12/7/2011 17:29, David Winsemius wrote:
[snipped]
I nominate this snippet of experience and wisdom to become a fortune :-)

--
Cesar Rabak
#
x[which(condition)], like the subset function, treats NAs in
condition as FALSE and hence does not output NAs for them.
I was also surprised to see that it runs a trifle faster than x[condition]
in R 2.13.0 if there are few TRUEs in condition and a trifle slower
if there are many TRUEs.
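
A small sketch of the NA handling described above, using a made-up vector (the values are illustrative only; no timing claim is tested here):

```r
x <- c(5, NA, -1, 7)          # hypothetical data
cond <- x > 0                 # TRUE NA FALSE TRUE

by_logical <- x[cond]         # 5 NA 7  (NA in the condition yields NA)
by_which   <- x[which(cond)]  # 5 7     (NA treated as FALSE)
by_subset  <- subset(x, cond) # 5 7     (same treatment as which)
```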

A danger of the x[which(condition)] approach is the case
where you are trying to omit some entries by using a negative
integer subscript, as in
    x[-which(is.na(x))]
That is equivalent to
    x[!is.na(x)]
if there are any NAs in x, but if there are no NAs in x then
its output is a zero-length vector, because -integer(0) is still
integer(0) and so selects nothing.
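
To make the pitfall concrete, here is a minimal sketch with two made-up vectors, one without NAs and one with:

```r
x <- c(1, 2, 3)                      # hypothetical vector with no NAs
dropped_all <- x[-which(is.na(x))]   # numeric(0): everything is lost
kept_all    <- x[!is.na(x)]          # 1 2 3

y <- c(1, NA, 3)                     # hypothetical vector with one NA
neg_which <- y[-which(is.na(y))]     # 1 3
neg_lgl   <- y[!is.na(y)]            # 1 3
```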

For complicated conditions I find it easier to understand code
using logical operators
    x[!is.na(x) & x>0 & x<10]
than code applying set operators to the output of which
    x[intersect( setdiff( which(x>0), which(is.na(x))), which(x<10))]
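
As a quick check that the two forms select the same elements, here is a sketch on a made-up vector (the data are hypothetical and chosen to include NA and NaN):

```r
x <- c(-3, 0.5, NA, 4, 12, 9, NaN)   # hypothetical data

by_logical <- x[!is.na(x) & x > 0 & x < 10]
by_sets    <- x[intersect(setdiff(which(x > 0), which(is.na(x))),
                          which(x < 10))]

same <- identical(by_logical, by_sets)   # TRUE; both give 0.5 4 9
```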

Bill Dunlap
TIBCO Spotfire