
NA, deleting rows

4 messages · juli g. pausas, Achim Zeileis, Sundar Dorai-Raj +1 more

#
Dear colleagues,
I do not understand the following behaviour:
R> aa
   a1 a2
1   1 NA
2   2 NA
3   3 NA
4   4 NA
5   5 NA
6   6  1
7   7  2
8   8  3
9   9  4
10 10  5
R> aa[!aa$a2==1, ]
     a1 a2
NA   NA NA
NA.1 NA NA
NA.2 NA NA
NA.3 NA NA
NA.4 NA NA
7     7  2
8     8  3
9     9  4
10   10  5

I didn't expect a1 to be affected.
Is aa[!aa$a2==1, ] an incorrect way to remove rows?
Is there another way?

(R 1.8.1 for Windows)
Thanks in advance

Juli
#
On Thu, 18 Dec 2003 15:03:04 +0100 juli g. pausas wrote:

            
It leads to the behaviour above if there are NAs in the logical vector
used for indexing:

R> !aa$a2==1
 [1]    NA    NA    NA    NA    NA FALSE  TRUE  TRUE  TRUE  TRUE

Several other ways are conceivable to treat the NA rows differently.
This precise problem is solved, e.g., by

R> aa[-which(aa$a2==1), ]
   a1 a2
1   1 NA
2   2 NA
3   3 NA
4   4 NA
5   5 NA
7   7  2
8   8  3
9   9  4
10 10  5
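
For anyone who wants to reproduce this, here is a minimal sketch. The construction of aa is an assumption reconstructed from the printed output; the original post does not show how the data frame was built, and the names keep, bad, drop and good are only illustrative.

```r
# Assumed reconstruction of the data frame from the printed output
aa <- data.frame(a1 = 1:10, a2 = c(rep(NA, 5), 1:5))

# The comparison is NA wherever a2 is NA, and so is its negation
keep <- !aa$a2 == 1
keep

# An NA in a logical row index yields a row filled with NA,
# which is why a1 appears to be affected as well
bad <- aa[keep, ]
bad

# which() drops the NA entries and returns only the TRUE positions,
# so the negative index removes exactly the matching row
drop <- which(aa$a2 == 1)
good <- aa[-drop, ]
good
```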

hth,
Z
#
Take a look at what (aa$a2 == 1) returns and it may clear things up.

Try

aa[-which(aa$a2 == 1), ]

or

subset(aa, a2 != 1 | is.na(a2))
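
A short sketch comparing these options, again assuming aa is built as below (the original post does not show its construction). It also illustrates one caveat with the -which() form: if no row matches, which() returns an empty vector and the negative index selects no rows at all, so a logical index may be the safer choice in general. The value 99 is just an arbitrary non-matching example.

```r
# Assumed reconstruction of the data frame from the thread
aa <- data.frame(a1 = 1:10, a2 = c(rep(NA, 5), 1:5))

# Logical index that keeps rows where a2 is NA or differs from 1
res_logical <- aa[is.na(aa$a2) | aa$a2 != 1, ]

# The subset() call from the message, for comparison
res_subset <- subset(aa, a2 != 1 | is.na(a2))

# Caveat: when nothing matches, which() returns integer(0), and
# negating an empty index selects no rows instead of all rows
res_empty <- aa[-which(aa$a2 == 99), ]
nrow(res_empty)   # 0

# The logical form is unaffected by an empty match
res_all <- aa[is.na(aa$a2) | aa$a2 != 99, ]
nrow(res_all)     # 10
```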

HTH,
Sundar
#
On Thu, 18 Dec 2003, juli g. pausas wrote:

            
You should think of NA as being pronounced "Don't Know".  That is, you are
asking for all rows where a2==1 to be removed, and you don't know whether
to remove the first five rows. The result is that you don't know what the
result is in the first five rows.

There is a good case for making this give an error or warning.
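
The "Don't Know" reading can be checked directly at the prompt. This is only a small sketch; the vector a2 and the variable names are made up for illustration.

```r
# Comparing an unknown value with anything gives an unknown answer
eq  <- NA == 1        # NA
neg <- !NA            # NA

# The answer is known only when the unknown operand cannot change it
or_true   <- NA | TRUE    # TRUE
and_false <- NA & FALSE   # FALSE
and_true  <- NA & TRUE    # NA

# Treating "don't know" as "not equal to 1" has to be stated explicitly
a2 <- c(NA, 1, 2)
is_one <- !is.na(a2) & a2 == 1
is_one                # FALSE TRUE FALSE
```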

	-thomas