timezone attribute lost
From: r-devel-bounces at r-project.org
[mailto:r-devel-bounces at r-project.org] On Behalf Of Thomas Mang
Sent: Monday, November 24, 2008 1:02 AM
To: r-devel at stat.math.ethz.ch
Subject: [Rd] timezone attribute lost
Hi,
As I didn't get any response on the general help list and I don't know
if there is a bug in action, I am trying my luck here.
I was highly surprised to find out that during simple operations (see
code below) the timezone attribute for POSIXct data is lost and then,
upon the next interpretation, the system settings are used (which are
plain wrong in my case).
I have used R 2.8.0 under Windows XP with the system timezone (managed
by Windows) set to CET. I suppose, however, that all other timezones,
with the exception of GMT, will show similar surprising behavior (and
those who live in the GMT zone: if you change your timezone setting, please
restart R, otherwise the change won't take effect).
# input data
# note that the timezone is deliberately set to GMT, and of course I
# want the operations below to take place in GMT-time
Time = as.POSIXct(strptime(c("2007-12-12 14:30:15", "2008-04-14 15:31:34",
    "2008-04-14 15:31:34"), format = "%Y-%m-%d %H:%M:%S", tz = "GMT"))
Time # OK, time zone is GMT
attr(Time, "tzone") # OK, time zone is GMT
# Surprise 1:
TApply = tapply(1:3, Time, max)
names(TApply) # wrong, names are converted to time zone of system
# Surprise 2:
UTime = unique(Time)
UTime # wrong, again time zone of system is used
attr(UTime, "tzone") # == NULL
I know how to "solve" the problem (for example by setting the environment
variable TZ to GMT), but I wonder why this mess is happening at all.
Moreover, is this behavior considered to be a feature, or a plain bug?
All of those problems are due to a problem in unique.default(), which
sends the numeric data in POSIXct, Date, and factor objects through a
.Internal() and then tries to reconstruct the original sort of object from
the bare numeric output of that .Internal():
z <- .Internal(unique(x, incomparables, fromLast))
if (is.factor(x))
    factor(z, levels = seq_len(nlevels(x)), labels = levels(x),
           ordered = is.ordered(x))
else if (inherits(x, "POSIXct") || inherits(x, "Date"))
    structure(z, class = class(x))
Your immediate problem could be solved by adding tzone=attr(x,"tzone")
to the structure() call, but I'm not familiar enough with classes inheriting
from POSIXct and Date to know if that is sufficient. There is no reason
someone won't make a new subclass where another attribute is essential.
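
The tzone suggestion above can be tried without touching base R. The
following is only a sketch: unique_keep_tz() is a hypothetical helper name,
not an existing function, and it assumes the input is a POSIXct vector whose
only essential attribute besides the class is "tzone".

```r
# Hypothetical helper (not part of base R): de-duplicate a POSIXct vector
# and copy the "tzone" attribute from the input onto the result.
unique_keep_tz <- function(x, ...) {
    stopifnot(inherits(x, "POSIXct"))
    # as.numeric() drops the class and all other attributes
    z <- unique(as.numeric(x), ...)
    # rebuild the object, this time carrying the time zone along
    structure(z, class = class(x), tzone = attr(x, "tzone"))
}

Time <- as.POSIXct(c("2007-12-12 14:30:15", "2008-04-14 15:31:34",
                     "2008-04-14 15:31:34"), tz = "GMT")
UTime <- unique_keep_tz(Time)
attr(UTime, "tzone")   # "GMT"
```

Any further attribute that a subclass relies on would still be lost, which
is the limitation noted above.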
Since .Internal used the equivalent of as.numeric(x) to extract numeric
codes, it might be nice to have an as.numeric<-(x,value) function that could
insert numeric codes back into a dataset, so you could avoid reconstructing
an object of unknown structure by writing as.numeric(x)<-z (or perhaps
as.vector<- should be used so you don't have to know what the integer type
is). In S and S+ one can use x@.Data<-newNumericCodes for this sort of
thing, but that can be dangerous because it lets you stick in inappropriate
types.
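
To make the replacement-function idea concrete, here is a rough sketch. No
as.numeric<- exists in base R; the definition below is an illustration
only. It assumes that every attribute except the length-dependent ones
(names, dim, dimnames) can simply be copied onto the new codes.

```r
# Illustrative only: base R has no `as.numeric<-`.
# Puts new numeric codes into x while keeping x's class and other attributes.
`as.numeric<-` <- function(x, value) {
    keep <- attributes(x)
    # these attributes depend on the length, so they cannot be carried over
    keep[c("names", "dim", "dimnames")] <- NULL
    # match the storage type of the original (e.g. integer for factors)
    storage.mode(value) <- storage.mode(x)
    attributes(value) <- keep
    value
}

Time <- as.POSIXct(c("2007-12-12 14:30:15", "2008-04-14 15:31:34",
                     "2008-04-14 15:31:34"), tz = "GMT")
z <- unique(as.numeric(Time))   # bare numeric codes, no attributes
UTime <- Time
as.numeric(UTime) <- z          # class and tzone come along from Time
attr(UTime, "tzone")            # "GMT"
```

The caller never has to name the attributes, which is the appeal. The cost
is that nothing checks whether the new codes are valid for the class.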
One might think that adding a new unique method for POSIXct or Date or
things subclassed from them would be the right way to structure things, but
factor() explicitly calls unique.default().
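
A toy example shows why a method alone does not help code that calls
unique.default() directly. The class "tagged" and its "unit" attribute are
invented for this sketch and stand in for any class with an essential
attribute.

```r
# Made-up class with one essential attribute
x <- structure(c(1, 2, 2), class = "tagged", unit = "kg")

# S3 method that restores the attribute after de-duplication
unique.tagged <- function(x, incomparables = FALSE, ...) {
    z <- unique.default(unclass(x), incomparables = incomparables, ...)
    structure(z, class = class(x), unit = attr(x, "unit"))
}

via_generic <- unique(x)           # dispatches to unique.tagged
via_default <- unique.default(x)   # bypasses the method entirely

attr(via_generic, "unit")   # "kg"
attr(via_default, "unit")   # NULL
```

Any caller that names unique.default() explicitly, as factor() does, gets
the second result no matter which methods are defined.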
Thanks, Thomas
Bill Dunlap TIBCO Software Inc - Spotfire Division wdunlap tibco.com