Consider this example code:
c1 <- letters[1:7]; c2 <- LETTERS[1:7]
c1[2] <- c2[3:4] <- NA
rbind(c1,c2)
##    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## c1 "a"  NA   "c"  "d"  "e"  "f"  "g"
## c2 "A"  "B"  NA   NA   "E"  "F"  "G"
paste(c1,c2)
## -> [1] "a A" "NA B" "c NA" "d NA" "e E" "f F" "g G"
where a more logical result would have entries 2:4 equal to NA,
i.e., as.character(NA), aka NA_character_.
Is this worth pursuing, or does anyone see why not?
Regards,
Martin
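For illustration, the behavior asked about above can be sketched as a small wrapper around paste(). The name paste_na() is hypothetical (it is not a function in base R), and the sketch assumes every argument has length at least one.

```r
# Hypothetical helper, not part of base R: behaves like paste(), except that
# any result element built from a missing input becomes NA_character_.
# Assumption: every argument has length >= 1.
paste_na <- function(..., sep = " ") {
  args <- lapply(list(...), as.character)
  out <- do.call(paste, c(args, list(sep = sep)))
  # Recycle each argument's NA flags to the result length and combine them.
  has_na <- Reduce(`|`, lapply(args, function(a) rep_len(is.na(a), length(out))))
  out[has_na] <- NA_character_
  out
}

c1 <- letters[1:7]; c2 <- LETTERS[1:7]
c1[2] <- c2[3:4] <- NA

paste(c1, c2)     # current behavior: entries 2:4 contain the string "NA"
paste_na(c1, c2)  # entries 2:4 are NA_character_
```

A wrapper like this leaves paste() itself untouched, so code that relies on the current behavior keeps working.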
paste() with NAs .. change worth pursuing?
On 8/22/2007 11:50 AM, Martin Maechler wrote:
> Consider this example code:
> c1 <- letters[1:7]; c2 <- LETTERS[1:7]
> c1[2] <- c2[3:4] <- NA
> rbind(c1,c2)
> ##    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
> ## c1 "a"  NA   "c"  "d"  "e"  "f"  "g"
> ## c2 "A"  "B"  NA   NA   "E"  "F"  "G"
> paste(c1,c2)
> ## -> [1] "a A" "NA B" "c NA" "d NA" "e E" "f F" "g G"
> where a more logical result would have entries 2:4 equal to NA,
> i.e., as.character(NA), aka NA_character_.
> Is this worth pursuing, or does anyone see why not?
A fairly common use of paste is to put together reports for human
consumption. Currently we have
> p <- as.character(NA)
> paste("the value of p is", p)
[1] "the value of p is NA"
which looks reasonable. Would this become
> p <- as.character(NA)
> paste("the value of p is", p)
[1] NA
under your proposal? (In a quick search I was unable to find a real
example where this would happen, but it would worry me...)
Duncan Murdoch
On 22 Aug 2007, at 20:16, Duncan Murdoch wrote:
> A fairly common use of paste is to put together reports for human
> consumption. Currently we have
>
>     p <- as.character(NA)
>     paste("the value of p is", p)
>     [1] "the value of p is NA"
>
> which looks reasonable. Would this become
>
>     p <- as.character(NA)
>     paste("the value of p is", p)
>     [1] NA
>
> under your proposal? (In a quick search I was unable to find a real
> example where this would happen, but it would worry me...)
At least stop() seems to include such a case:

    message <- paste(args, collapse = "")

and we may expect there are NAs sometimes in stop().

cheers, jazza
--
Jari Oksanen, Oulu, Finland
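A minimal sketch of the stop() case, assuming stop() still concatenates its arguments into one message string as described above. The variable names are made up for the example.

```r
# Capture the error message instead of aborting, so it can be inspected.
p <- NA
msg <- tryCatch(
  stop("the value of p is ", p),
  error = function(e) conditionMessage(e)
)
msg
# [1] "the value of p is NA"
```

If paste() returned NA whenever an argument is NA, a message built this way would presumably collapse to NA and lose the readable text.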
On Wed, Aug 22, 2007 at 08:53:39PM +0300, Jari Oksanen wrote:
> At least stop() seems to include such a case:
>
>     message <- paste(args, collapse = "")
>
> and we may expect there are NAs sometimes in stop().
The examples show that changing the behavior of paste in general may not be appropriate. On the other hand, if we concatenate character vectors that are part of data, then is.na(paste(...,NA,...)) makes sense.

Character vectors in data are usually represented by factors, whereas factors are not typical in cases where paste is used to produce a readable message. Hence, it could be possible to have is.na(u[i]) for those i for which at least one of the vectors v1, ..., vn in u <- paste(v1, ..., vn) is a factor and has NA at the i-th position.

Petr Savicky.
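The factor-only rule suggested above could be sketched as follows. paste_factor_na() is a hypothetical name used only for this illustration, not an existing R function, and the sketch assumes every argument has length at least one.

```r
# Hypothetical helper sketching the suggestion above; not an existing R function.
# Only NAs that come from factor arguments are propagated to the result.
# Assumption: every argument has length >= 1.
paste_factor_na <- function(..., sep = " ") {
  args <- list(...)
  out <- do.call(paste, c(lapply(args, as.character), list(sep = sep)))
  for (a in args) {
    if (is.factor(a)) {
      # Recycle the factor's NA flags to the result length.
      out[rep_len(is.na(a), length(out))] <- NA_character_
    }
  }
  out
}

f <- factor(c("x", NA, "z"))  # data-like input
s <- c("a", "b", NA)          # plain character input

paste_factor_na(f, s)
# [1] "x a"  NA     "z NA"
```

The helper has to inspect its arguments before converting them, because the factor/character distinction is gone once as.character() has been applied.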
"PS" == Petr Savicky <savicky at cs.cas.cz>
on Thu, 23 Aug 2007 15:49:32 +0200 writes:
PS> The examples show that changing the behavior of paste in general
PS> may not be appropriate. On the other hand, if we concatenate
PS> character vectors that are part of data, then is.na(paste(...,NA,...))
PS> makes sense. Character vectors in data are usually represented
PS> by factors, whereas factors are not typical in cases
PS> where paste is used to produce a readable message. Hence, it
PS> could be possible to have is.na(u[i]) for those i for which
PS> at least one of the vectors v1, ..., vn in
PS> u <- paste(v1, ..., vn)
PS> is a factor and has NA at the i-th position.
You are right. But I no longer think that it is sensible
to make paste() complicated like that.
Also note that currently, the first step in paste()
is to apply as.character(.) to all of its arguments
(and its help page says so too),
so that later you can no longer distinguish between
an "original character NA"
and an "original numeric/factor NA".
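A short illustration of that point, using made-up variable names: once as.character() has been applied, NAs of different origin are identical.

```r
na_chr <- NA_character_  # "original character NA"
na_num <- NA_real_       # "original numeric NA"
na_fac <- factor(NA)     # "original factor NA"

# After conversion, all three are the same NA_character_ value.
identical(as.character(na_chr), as.character(na_num))  # TRUE
identical(as.character(na_num), as.character(na_fac))  # TRUE

# Consequently paste() treats them all alike.
paste("x", na_chr)  # "x NA"
paste("x", na_num)  # "x NA"
paste("x", na_fac)  # "x NA"
```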
Thanks to all the respondents.
I've now been convinced that the answer to my original question
is "no" {i.e., it's not worth pursuing a change to paste() here}.
I will add a note to paste()'s help page mentioning the
somewhat undesired behavior for the case where one is really just
thinking of character string manipulations.
Martin