R-alpha: x[NA]

5 messages · Thomas Lumley, Peter Dalgaard

#
Just curious. Current logic in both R and Splus has
[1]  1 NA

Does anyone know *why*? It makes sense for integer indexing, but for
logicals it seems to have some strange consequences - if you divide a
dataframe according to some criterion, you get as many NA-filled
records as there are missing values of the criterion added to *both*
the group that satisfies the criterion and to those that do not.

One reason could be that, since NA is logical by default, you would be
getting awkward consequences of the type x[NA] == real(0) but
x[c(NA,1)] == c(NA,5.3). But why is NA logical by default, then?
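
To make the data-frame consequence concrete, here is a small sketch run against current R (the 1997 R-alpha may have differed in details). The data frame `df` and its columns are made up for illustration.

```r
# Hypothetical data: one missing value in the criterion column.
df <- data.frame(id = 1:4, score = c(10, NA, 30, 40))
crit <- df$score > 20            # FALSE NA TRUE TRUE

high <- df[crit, ]               # one all-NA row, then rows 3 and 4
low  <- df[!crit, ]              # row 1, then one all-NA row

# Together the two groups hold more rows than the original.
nrow(high) + nrow(low)           # 5, although nrow(df) is 4

# which() drops the NA positions, so no NA-filled records appear.
high_clean <- df[which(crit), ]
low_clean  <- df[which(!crit), ]
```

With `which()`, the row whose criterion is missing ends up in neither group, which may or may not be what one wants.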
#
On 22 Sep 1997, Peter Dalgaard BSA wrote:

            
<snip>
The alternative would be worse.  If one element of the index is not
logical then the whole index is converted, so
R> x<-1:4 
R> x[c(F,T,F,T)]
[1] 2 4
R> x[c(F,T,F,NA)]
[1]  2 NA
but
R> x[c(F,T,F,4)]
[1] 1 4

That is, if NA were not logical, the presence of any NA would turn the
whole index numeric, so each T would select only element
as.numeric(T)==1.
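
A short sketch of that conversion, assuming current R semantics; the explicit `as.numeric()` call stands in for what would happen implicitly if NA were numeric.

```r
x <- 1:4
lgl <- c(FALSE, TRUE, FALSE, TRUE)

x[lgl]                           # logical index: elements 2 and 4

# Converted to numeric, TRUE becomes 1 and FALSE becomes 0.
num <- as.numeric(lgl)           # 0 1 0 1
x[num]                           # zeros are dropped, each 1 picks element 1

# The same conversion applied to an index containing NA.
lgl_na <- c(FALSE, TRUE, FALSE, NA)
x[lgl_na]                        # logical index: 2 NA
x[as.numeric(lgl_na)]            # numeric index 0 1 0 NA: 1 NA
```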

BTW, while you obviously can't mix logical and numerical references I
think it's unfortunate that you can't mix numeric and name-based
references like
R> names(x)<-letters[1:4]
R> x
a b c d 
1 2 3 4 
R> x[c("a",2)]
Error: subscript out of bounds

If you want a real example, suppose you had a model frame from which you
wanted to extract the response (in position 1) and the variables whose
names were in a vector nn
	mf[,c(1,nn)] 
appears to be the obvious solution.
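
For what it's worth, here is a sketch of two ways to get the same columns without mixing index types, assuming current R. The model frame `mf` and the vector `nn` below are made-up stand-ins.

```r
# c() coerces a mixed index to character, so 2 becomes the name "2".
c("a", 2)                        # "a" "2"

# Hypothetical model frame with the response in position 1.
mf <- data.frame(y = c(1.2, 3.4), age = c(30, 41),
                 sex = c("f", "m"), dose = c(5, 10))
nn <- c("age", "dose")

# Option 1: turn the names into positions with match().
by_position <- mf[, c(1, match(nn, names(mf)))]

# Option 2: turn the position into a name.
by_name <- mf[, c(names(mf)[1], nn)]

names(by_position)               # "y" "age" "dose"
```

Note that `match()` returns NA for a name that is not present, so option 1 needs a check if `nn` might contain unknown names.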


Thomas Lumley
------------------------------------------------------+------
Biostatistics		: "Never attribute to malice what  :
Uni of Washington	:  can be adequately explained by  :
Box 357232		:  incompetence" - Hanlon's Razor  :
Seattle WA 98195-7232	:				   :
------------------------------------------------------------

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#
Thomas Lumley <thomas@biostat.washington.edu> writes:
Aha... Of course that only applies when you enter logical vectors
using c(...), if they result from a comparison, then the type would be
logical NA anyway. So the problem is that c(...) converts to numeric
if just one element is.numeric() even if that element is NA. Hmm. I
suppose that could be handled differently by having c() explicitly
ignoring NA when deciding on type, but that probably leads to more
problems elsewhere?
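
A small illustration of where the two kinds of NA come from, assuming current R; the vector `y` is made up.

```r
y <- c(5.3, NA)                  # numeric vector with a missing value

typeof(NA)                       # "logical": a bare NA is logical
typeof(y > 1)                    # "logical": comparisons give a logical NA
typeof(y[2])                     # "double": this NA is numeric

# c() promotes to the highest type present, even if that element is NA.
typeof(c(TRUE, NA))              # "logical"
typeof(c(TRUE, y[2]))            # "double"
```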

I think that the possibility of setting names(x)<-4:1 will demonstrate
why that can't work... Besides, can you be sure that the dependent
variable is the first one in a model frame any more?
#
On 23 Sep 1997, Peter Dalgaard BSA wrote:

            
<snip>
Why?
R> x<-1:4
R> names(x)<-4:1
R> x[1]
4 
1 
R> x["1"]
1 
4 
So why not x[c(1,"1")] giving 1,4?


	-thomas


#
Thomas Lumley <thomas@biostat.washington.edu> writes:
Because
[1] "1" "1"

so we'd have to introduce mixed-type vectors first. However, I suppose
that one might make
work.
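
The expression referred to above is missing from the archive, and the following is not a reconstruction of it. It is only a sketch, assuming current R, of why `c()` cannot carry a mixed index and of one way a list could. The helper `pick()` is hypothetical.

```r
x <- 1:4
names(x) <- 4:1                  # names are stored as "4" "3" "2" "1"

# c() coerces everything to one type, so position and name look alike.
c(1, "1")                        # "1" "1"

# A list keeps each element's type.
idx <- list(1, "1")

# Hypothetical helper: index x once per list element and combine.
pick <- function(x, idx) unlist(lapply(idx, function(i) x[i]))

pick(x, idx)                     # values 1 and 4, with names "4" and "1"
```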