paste() with NAs .. change worth pursuing? - R-devel

Wed, Aug 22, 2007 8:50 AM

Consider this example code

 c1 <- letters[1:7]; c2 <- LETTERS[1:7]
 c1[2] <- c2[3:4] <- NA
 rbind(c1,c2)

  ##   [,1] [,2] [,3] [,4] [,5] [,6] [,7]
  ## c1 "a"  NA   "c"  "d"  "e"  "f"  "g" 
  ## c2 "A"  "B"  NA   NA   "E"  "F"  "G" 

  paste(c1,c2)

  ## -> [1] "a A"  "NA B" "c NA" "d NA" "e E"  "f F"  "g G" 

where a more logical result would have entries 2:4 equal to
      NA 
i.e.,  as.character(NA)
aka    NA_character_

Is this worth pursuing, or does anyone see why not?
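
For illustration only, the NA-propagating result described above can be
sketched as a wrapper around the current paste(). The name paste_na is
hypothetical (it is not part of R), and the sketch assumes non-empty
vector arguments and no 'collapse':

```r
# Hypothetical helper, not part of base R: paste element-wise, but return
# NA_character_ wherever any (recycled) argument is NA at that position.
# Assumes every argument is a non-empty vector.
paste_na <- function(..., sep = " ") {
  args <- list(...)
  res <- paste(..., sep = sep)
  # recycle each argument's NA pattern to the result length and OR them
  any_na <- Reduce(`|`, lapply(args, function(x) rep_len(is.na(x), length(res))))
  res[any_na] <- NA_character_
  res
}

c1 <- letters[1:7]; c2 <- LETTERS[1:7]
c1[2] <- c2[3:4] <- NA
paste_na(c1, c2)
## -> [1] "a A" NA    NA    NA    "e E" "f F" "g G"
```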

Regards,
Martin

Duncan Murdoch

Wed, Aug 22, 2007 10:16 AM

On 8/22/2007 11:50 AM, Martin Maechler wrote:

A fairly common use of paste is to put together reports for human 
consumption.  Currently we have

 > p <- as.character(NA)
 > paste("the value of p is", p)
[1] "the value of p is NA"

which looks reasonable. Would this become

 > p <- as.character(NA)
 > paste("the value of p is", p)
[1] NA

under your proposal?  (In a quick search I was unable to find a real 
example where this would happen, but it would worry me...)

Duncan Murdoch

Jari Oksanen

Wed, Aug 22, 2007 10:53 AM

On 22 Aug 2007, at 20:16, Duncan Murdoch wrote:

At least stop() seems to include such a case:

  message <- paste(args, collapse = "")

and we may expect there are NAs sometimes in stop().
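
A small sketch of what this looks like from the user's side under the
current behavior. It only calls stop() and inspects the resulting
message; it does not depend on how stop() is implemented internally:

```r
# An NA passed to stop() ends up in the error message as the text "NA".
p <- NA_character_
msg <- tryCatch(
  stop("the value of p is ", p),
  error = function(e) conditionMessage(e)
)
msg
## -> [1] "the value of p is NA"
```

If paste() returned NA here instead, the message would presumably be
lost, which is the concern.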

cheers, jazza
--
Jari Oksanen, Oulu, Finland

Petr Savicky

Thu, Aug 23, 2007 6:49 AM

On Wed, Aug 22, 2007 at 08:53:39PM +0300, Jari Oksanen wrote:

The examples show that changing the behavior of paste in general
may not be appropriate. On the other hand, if we concatenate
character vectors that are part of data, then is.na(paste(...,NA,...))
makes sense. Character vectors in data are usually represented
by factors, whereas factors are not typical in cases
where paste is used to produce a readable message. Hence, it
could be possible to have is.na(u[i]) for those i for which
at least one of the vectors v1, ..., vn in
  u <- paste(v1, ..., vn)
is a factor and has NA at the i-th position.
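
A rough sketch of this rule, written as a wrapper rather than as a change
to paste() itself. The name paste_factor_na is hypothetical, and the
sketch assumes non-empty arguments and no 'collapse':

```r
# Hypothetical helper, not part of base R: the result is NA only at
# positions where some *factor* argument is NA; NAs in plain character
# (or numeric) arguments are still pasted as the text "NA".
paste_factor_na <- function(..., sep = " ") {
  args <- list(...)
  res <- paste(..., sep = sep)
  for (a in args) {
    if (is.factor(a)) {
      # recycle the factor's NA pattern to the length of the result
      res[rep_len(is.na(a), length(res))] <- NA_character_
    }
  }
  res
}

f <- factor(c("x", NA, "z"))   # data-like argument
s <- c("a", "b", NA)           # plain character argument
paste_factor_na(f, s)
## -> [1] "x a"  NA     "z NA"
```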

Petr Savicky.

Martin Maechler

Fri, Aug 24, 2007 12:22 AM

    PS> On Wed, Aug 22, 2007 at 08:53:39PM +0300, Jari Oksanen wrote:
    >>
    >> On 22 Aug 2007, at 20:16, Duncan Murdoch wrote:
    >>
    >> > A fairly common use of paste is to put together reports for human
    >> > consumption.  Currently we have
    >> >
    >> >> p <- as.character(NA)
    >> >> paste("the value of p is", p)
    >> > [1] "the value of p is NA"
    >> >
    >> > which looks reasonable. Would this become
    >> >
    >> >> p <- as.character(NA)
    >> >> paste("the value of p is", p)
    >> > [1] NA
    >> >
    >> > under your proposal?  (In a quick search I was unable to find a real
    >> > example where this would happen, but it would worry me...)
    >> 
    >> At least stop() seems to include such a case:
    >> 
    >> message <- paste(args, collapse = "")
    >> 
    >> and we may expect there are NAs sometimes in stop().

    PS> The examples show that changing the behavior of paste in general
    PS> may not be appropriate. On the other hand, if we concatenate
    PS> character vectors that are part of data, then is.na(paste(...,NA,...))
    PS> makes sense. Character vectors in data are usually represented
    PS> by factors, whereas factors are not typical in cases
    PS> where paste is used to produce a readable message. Hence, it
    PS> could be possible to have is.na(u[i]) for those i for which
    PS> at least one of the vectors v1, ..., vn in
    PS> u <- paste(v1, ..., vn)
    PS> is a factor and has NA at the i-th position.

You are right.  But I don't think any longer that it is sensible
to make paste() complicated like that.

Also note that currently, the first step in paste()
is to apply as.character(.) to all of its arguments
(and its help page does say so too),
so that later you can no longer distinguish between
    "original character NA"
and "original numeric/factor NA".
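
A short illustration of that point, using only base R calls. After
as.character(), the three kinds of NA below are identical, so nothing
downstream can tell them apart:

```r
# Three NAs of different origin ...
x_chr <- NA_character_
x_num <- NA_real_
x_fac <- factor(NA)

# ... all become the same character NA after as.character()
conv <- list(as.character(x_chr), as.character(x_num), as.character(x_fac))
same <- vapply(conv, identical, logical(1), NA_character_)
same
## -> [1] TRUE TRUE TRUE

# and paste() therefore treats them all alike
paste("value:", x_chr)
paste("value:", x_num)
paste("value:", x_fac)
## each gives [1] "value: NA"
```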

Thanks to all the respondents,
I've now been convinced that the answer to my original question
is "no" (i.e., it's not worth pursuing a change to paste() here).

I will add a note to paste()'s help page mentioning the
somewhat undesired behavior for the case where one is really just
thinking of character string manipulations.

Martin