subset and missing value indexing

Tue, Aug 17, 2004 12:56 PM

Out of curiosity, is this a bug or a feature of "=="?

m <- matrix( 1:12, ncol=4 )
f <- c("A", NA, "B", "A")

f == "A"
[1]  TRUE    NA FALSE  TRUE

m[ , f == "A"]         # equivalent to m[ , c(1, NA, 4) ]
     [,1] [,2] [,3]
[1,]    1   NA   10
[2,]    2   NA   11
[3,]    3   NA   12

m[ , which(f == "A")]
     [,1] [,2]
[1,]    1   10
[2,]    2   11
[3,]    3   12


In the Arguments section of help("which") it says that
   'NA's are allowed and omitted (treated as if 'FALSE').

After some thought, I think this might be due to subsetting with an index
that includes a missing value. help("[") appears not to say what happens
when one of the indexing elements is a missing value, only that the index
can be logical (and NA is logical).
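
A minimal sketch on a plain vector may help separate the two effects
discussed here: "==" propagates NA, and "[" then returns NA for each NA
index. The vector x is made up for illustration; f is the same as in the
example above.

```r
x <- c(10, 20, 30, 40)        # illustrative data, not from the post
f <- c("A", NA, "B", "A")

class(NA)                     # a bare NA is a logical value

idx <- f == "A"               # the comparison yields NA where f is NA
by_logical <- x[idx]          # NA in a logical index gives an NA element
by_integer <- x[c(1, NA, 4)]  # NA in a numeric index does the same
by_which   <- x[which(idx)]   # which() drops the NA before indexing

by_logical
by_integer
by_which
```

Under these assumptions the first two results both have an NA in the
second position, while the which() version has only two elements.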

Is there any reason for allowing NA when subsetting?

Regards, Adai

Brian Ripley

Tue, Aug 17, 2004 1:14 PM

On Tue, 17 Aug 2004, Adaikalavan Ramasamy wrote:

A documented feature.

Yes: it is part of the S language and widely used to avoid special cases
when programming.  Since you don't know what the index value is, you don't
know whether the column (in your case) is included or not, and the only
thing to do is to return NA.
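
When the NA should instead count as "not selected", the index can be made
free of NAs before subsetting. The sketch below shows two common ways of
doing that besides which(); the names keep_in and keep_explicit are
illustrative only, and m and f are the objects from the original post.

```r
m <- matrix(1:12, ncol = 4)
f <- c("A", NA, "B", "A")

# %in% never returns NA: an NA in f simply fails to match "A"
keep_in <- f %in% "A"

# Explicit version: require a non-missing value that equals "A"
keep_explicit <- !is.na(f) & f == "A"

# Both indices are plain TRUE/FALSE, so no NA column appears
sub_in       <- m[, keep_in, drop = FALSE]
sub_explicit <- m[, keep_explicit, drop = FALSE]

sub_in
```

Which form to use is a matter of taste; the explicit version states the
intent, while %in% is shorter. drop = FALSE is only there to keep the
result a matrix even if a single column is selected.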

There are a lot of things help("[") does not say, such as what happens to
out-of-range indices.  It is in the reference given there (p.358), as
well as in the R Language Reference, section 3.4.1 in the version I looked
at.
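
For the out-of-range case mentioned above, a short sketch of what current
versions of R appear to do may be useful. The vector x is made up for
illustration, and the behaviour shown should be checked against the
references cited rather than taken from this example alone.

```r
x <- c(10, 20, 30)            # illustrative data
m <- matrix(1:12, ncol = 4)

past_end <- x[5]              # index beyond the length of a vector
mixed    <- x[c(2, 5)]        # valid and out-of-range indices together
zero     <- x[0]              # a zero index selects nothing

# For a matrix, an out-of-range column index is an error, so it is
# caught here and its message kept instead of stopping the script.
res <- tryCatch(m[, 5], error = function(e) conditionMessage(e))

past_end
mixed
zero
res
```

So a vector quietly yields NA for a position past its end, whereas matrix
indexing by row or column signals an error.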

Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595