Hi all,
In some circumstances, as.character applied to a list converts real
NA's into the string "NA". Propagation of NAs is something R does
very well and unless there are good reasons for losing the NA, it
would improve the consistency w.r.t. NA handling for as.character to
behave differently.
Here's an example:
> ## Create a list with character, logical, and integer NA elements
> v <- list(a=as.character(NA), b=NA, c=as.integer(NA))
> sapply(v, is.na)
   a    b    c
TRUE TRUE TRUE
> sapply(as.character(v), is.na)
 <NA>    NA    NA
 TRUE FALSE FALSE
Thoughts?
+ seth
--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
NA handling in as.character applied to a list
3 messages · Seth Falcon, Peter Dalgaard
Seth Falcon <sfalcon at fhcrc.org> writes:
> Hi all,
> In some circumstances, as.character applied to a list converts real
> NA's into the string "NA". Propagation of NAs is something R does
> very well and unless there are good reasons for losing the NA, it
> would improve the consistency w.r.t. NA handling for as.character to
> behave differently.
> Here's an example:
> > ## Create a list with character, logical, and integer NA elements
> > v <- list(a=as.character(NA), b=NA, c=as.integer(NA))
> > sapply(v, is.na)
>    a    b    c
> TRUE TRUE TRUE
> > sapply(as.character(v), is.na)
>  <NA>    NA    NA
>  TRUE FALSE FALSE
> Thoughts?
Hmm...
> as.character(v)
[1] NA   "NA" "NA"

This does look like a leftover from times when there was no character
NA in the language. It is the kind of thing you need to be very careful
about fixing though. (I have a couple of scars from as.character on
formulas when introducing backtick quoting.)

BTW, another little bit of nastiness popped up when playing around with
this:

> dput(v,control="all")
structure(list(a = NA, b = NA, c = as.integer(NA)), .Names = c("a",
"b", "c"))
> sapply(v,mode)
          a           b           c
"character"   "logical"   "numeric"
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
Peter Dalgaard <p.dalgaard at biostat.ku.dk> writes:
> Hmm...
> > as.character(v)
> [1] NA   "NA" "NA"
>
> This does look like a leftover from times when there was no character
> NA in the language. It is the kind of thing you need to be very careful
> about fixing though. (I have a couple of scars from as.character on
> formulas when introducing backtick quoting.)
Well, I guess that's an argument for leaving the inconsistent
behavior in place. In case there is interest in fixing it, here is a
patch I was playing with. It doesn't address the nasties with dput.
index 8eec5c3..787c230 100644
--- a/src/main/coerce.c
+++ b/src/main/coerce.c
@@ -1041,6 +1041,14 @@ #if 0
            else if (isSymbol(VECTOR_ELT(v, i)))
                SET_STRING_ELT(rval, i, PRINTNAME(VECTOR_ELT(v, i)));
 #endif
+           else if ((length(VECTOR_ELT(v, i)) == 1) &&
+                    (isInteger(VECTOR_ELT(v, i)) ||
+                     isReal(VECTOR_ELT(v, i)) ||
+                     isLogical(VECTOR_ELT(v, i)))) {
+               SET_STRING_ELT(rval, i,
+                              STRING_ELT(coerceVector(VECTOR_ELT(v, i),
+                                                      STRSXP), 0));
+           }
            else
                SET_STRING_ELT(rval, i,
                               STRING_ELT(deparse1line(VECTOR_ELT(v, i), 0), 0));
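
For anyone who wants to try the idea without rebuilding R, here is a rough
R-level sketch of what the patch is aiming for. It is only an approximation
under stated assumptions: the helper name as_character_keep_na is
hypothetical (it is not part of R), and it uses deparse() collapsed to one
string as a stand-in for the internal deparse1line() call. Length-one
integer, double, logical, and character elements go through ordinary vector
coercion, so an NA stays a character NA; everything else is deparsed.

```r
## Hypothetical helper, not part of R: an R-level approximation of the
## patched list branch of as.character().
as_character_keep_na <- function(x) {
    stopifnot(is.list(x))
    vapply(x, function(el) {
        if (length(el) == 1L &&
            (is.integer(el) || is.double(el) ||
             is.logical(el) || is.character(el))) {
            ## Ordinary vector coercion: NA becomes NA_character_,
            ## not the string "NA".
            as.character(el)
        } else {
            ## Everything else is deparsed into a single string
            ## (assumption: stands in for deparse1line()).
            paste(deparse(el), collapse = " ")
        }
    }, character(1), USE.NAMES = FALSE)
}

v <- list(a = as.character(NA), b = NA, c = as.integer(NA))
is.na(as_character_keep_na(v))   # all three elements are NA
```

Doing this per element at the R level is slower than the C change and does
nothing about the dput output, so it is only meant to illustrate the
intended result.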