Message-ID: <77EB52C6DD32BA4D87471DCD70C8D700E8EC77@NA-PA-VBE03.na.tibco.com>
Date: 2009-03-19T15:20:58Z
From: William Dunlap
Subject: sprintf("%d", integer(0)) aborts
In-Reply-To: <alpine.LFD.2.00.0903191024300.12166@gannet.stats.ox.ac.uk>
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Thursday, March 19, 2009 3:34 AM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] sprintf("%d", integer(0)) aborts
>
> On Wed, 18 Mar 2009, William Dunlap wrote:
>
> > In R's sprintf() if any of the arguments has length 0
> > the function aborts. E.g.,
> >
> > > sprintf("%d", integer(0))
> > Error in sprintf("%d", integer(0)) : zero-length argument
> > > sprintf(character(), integer(0))
> > Error in sprintf(character(), integer(0)) :
> > 'fmt' is not a non-empty character vector
> >
> > This comes up in code like
> > x[nchar(x)==0] <- sprintf("No. %d", seq_along(x)[nchar(x)==0])
> > which works if x contains any empty strings
> > x<-c("One","Two","") # changes "" -> "No. 3"
> > but not if it doesn't
> > x<-c("One","Two","Three") # throws error instead of doing nothing
> >
> > When I wrote S+'s sprintf() I had it act like the binary
> > arithmetic operators, returning a zero-length result if any
> > argument was zero-length. (Otherwise its result is as long
> > as the longest input.) I think it would be nice if R's
> > sprintf did this also.
> >
> > Currently you must add defensive code (if (any(nchar(x)==0))...)
> > to make functions using sprintf work in all cases, and that
> > muddies up the code and slows things down.
> >
> > Do you think this is a reasonable thing to do? I've attached
> > a possible patch to src/main/sprintf.c that makes the examples above
> > return character(0).
>
> Yes. It was deliberate that it works (and is documented) the way it
> is, and I've not previously seen any problematic examples.
I was prompted to suggest the change by a note from Jim Holtman
in yesterday's R-help:
> system.time({
+ x <- sample(50000) # test data
+ x[sample(50000,10000)] <- 'asdfasdf' # character strings
+ which.num <- grep("^[ 0-9]+$", x) # find numbers
+ # convert to leading 0
+ x[which.num] <- sprintf("%018.0f", as.numeric(x[which.num]))
+ x[-which.num] <- toupper(x[-which.num])
+ })
This code failed when I converted it to a function to run
through sapply because then which.num was often integer(0).
When used in production it would probably work for a long time
before seeing a sample in which which.num was integer(0).
(Of course, it would then silently mess up on the next line,
x[-which.num]<-...)
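To make the second problem concrete: when which.num is integer(0),
-which.num is also integer(0), so x[-which.num] selects nothing
rather than everything. Below is a minimal sketch of one way to
sidestep both issues. The function name pad_or_upper is made up for
illustration; it uses a logical index from grepl() instead of the
positions from grep(), and guards the sprintf() call so it is never
handed a zero-length argument.

```r
# Hypothetical helper, not part of the original R-help code.
pad_or_upper <- function(x) {
  x <- as.character(x)
  is.num <- grepl("^[ 0-9]+$", x)   # logical index, same length as x
  if (any(is.num)) {
    # only call sprintf() when there is something to format
    x[is.num] <- sprintf("%018.0f", as.numeric(x[is.num]))
  }
  # !is.num is all TRUE when there are no numbers, so every string
  # is uppercased, unlike x[-integer(0)], which selects nothing
  x[!is.num] <- toupper(x[!is.num])
  x
}

# The pitfall with integer positions:
y <- c("abc", "def")
which.num <- grep("^[ 0-9]+$", y)   # integer(0)
length(y[-which.num])               # 0, not 2
```

The guard around sprintf() is exactly the kind of defensive code
the proposed change would make unnecessary.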
> But at
> least for the ... args, allowing zero-length arguments seems very
> reasonable. I'm less convinced by zero-length formats, but the rule
> may be easier to explain if we allow them.
Those were my thoughts as well.
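
In the meantime, the rule can be approximated at the R level with a
small wrapper. This is only a sketch of the behaviour described
above (zero-length result if any argument, including the format, is
zero-length), not the attached C patch; the name sprintf0 is
hypothetical.

```r
# Hypothetical wrapper illustrating the proposed recycling rule.
sprintf0 <- function(fmt, ...) {
  lens <- vapply(list(fmt, ...), length, integer(1))
  if (any(lens == 0L)) {
    return(character(0))   # act like the binary arithmetic operators
  }
  sprintf(fmt, ...)        # otherwise defer to the usual recycling
}

x <- c("One", "Two", "Three")
empty <- nchar(x) == 0
# No empty strings here, so this is a no-op instead of an error.
x[empty] <- sprintf0("No. %d", seq_along(x)[empty])
```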
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com