Time Zone problems: midnight goes in; 8am comes out

2 messages · Boylan, Ross, Andrew Simmons

#
I'm having problems with timezones using lubridate, but it's not clear to me the difficulty is in lubridate.
---------------------------------
[1] "1970-01-01 08:01:00 PST"  ## Oops: midnight has turned into 8am
[1] 28860
[1] 28800
------------------------------------
lubridate accepts PST as the time zone, and the result prints "PST" for timezone.  Further, lubridate seems to be using the tz properly since it gets the 8 hour offset from UTC correct.

The problem is the value that is printed gives a UTC time of 08:01 despite having the PST suffix.  So the time appears to have jumped 8 hours ahead from the value parsed.

PST appears not to be a legal timezone (in spite of lubridate inferring the correct offset from it):
---------------------------------------------------
[1] "America/Los_Angeles"
[1] "PST8PDT"         "SystemV/PST8"    "SystemV/PST8PDT"
-------------------------------------
https://www.r-bloggers.com/2018/07/a-tour-of-timezones-troubles-in-r/ says lubridate will complain if given an invalid tz, though I don't see that explicitly in the current man page https://lubridate.tidyverse.org/reference/parse_date_time.html.  As shown above, parse_date_time() does not complain about the timezone, and does use it to get the correct offset.
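
One way to catch an unrecognized zone name before parsing is to compare it against OlsonNames(), which lists the zone names known to R on the current system. The following is a minimal base R sketch; check_tz is a hypothetical helper written for illustration, not part of lubridate or base R, and it assumes a single zone name is passed in.

```r
# Hypothetical helper (not from lubridate or base R): stop early when `tz`
# is not one of the zone names R knows about on this system.
check_tz <- function(tz) {
  if (!tz %in% OlsonNames()) {
    stop("unknown time zone: ", tz)
  }
  invisible(tz)
}

check_tz("America/Los_Angeles")  # a full Olson name, expected to pass

# Whether an abbreviation such as "PST" passes depends on the system's
# zone database, so capture the outcome instead of assuming it.
pst_ok <- tryCatch({ check_tz("PST"); TRUE }, error = function(e) FALSE)
print(pst_ok)
```

Note that the empty string, which R accepts as "the current time zone", is not listed by OlsonNames() and would be rejected by this sketch.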

Using America/Los_Angeles produces the expected results:
---------------------------------------
[1] "1970-01-01 00:01:00 PST"  # still prints PST.  This time it's true!
[1] 28860
----------------------------------------------------

I suppose I can just use "America/Los_Angeles" as the time zone; this would have the advantage of making all my timezones the same, which is apparently what R requires for a vector of datetimes.  But the behavior seems odd, and the "fix" also requires me to ignore the time zone specified in my inputs, which look like "2022-03-01 15:54:30 PST" or PDT, depending on time of year.
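
A minimal sketch of that workaround in base R, assuming every input ends in a three- or four-letter zone abbreviation and that all inputs really are Pacific time. The sample strings and variable names are illustrative only.

```r
# Illustrative inputs in the format described above.
raw <- c("2022-03-01 15:54:30 PST", "2022-07-01 15:54:30 PDT")

# Drop the trailing abbreviation; the Olson name supplies the zone rules.
stamp <- sub("[[:space:]]+[A-Z]{3,4}$", "", raw)

parsed <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S",
                     tz = "America/Los_Angeles")

print(parsed)
print(format(parsed, "%Z"))  # abbreviation R derives from the zone rules
```

The trade-off is that the abbreviation in the input is discarded, so a local time inside the repeated hour at the autumn switch from PDT to PST can no longer be told apart by its suffix.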

1. Why this strange behavior in which PST or PDT is used to construct the proper offset from UTC, and then kind of forgotten on output?
2. Is this a bug in lubridate or base POSIXct, particularly its print routine?

My theory on 1 is that lubridate understands PST and constructs an appropriate UTC time.  POSIXct time does not understand a tz of "PST" and so prints out the UTC value for the time, "decorating" it with the not understood tz value.  

For 2, on one hand, lubridate is constructing POSIXct dates with invalid tz values; lubridate probably shouldn't.  On the other hand, POSIXct is printing a UTC time but labeling it with a tz it doesn't understand, so it looks as if it's in that local time even though it isn't.  In the context above that seems like a bug, but it's possible that a lot of code depends on it.

Under these theories, the problems only arise because the set of tz values understood by lubridate differs from the set of tz values understood by POSIXct.
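
The arithmetic behind the numbers above can be checked in base R without lubridate. This sketch only shows that 28860 seconds after the epoch is a single instant that reads as 08:01 in UTC and 00:01 in America/Los_Angeles; it does not show what lubridate does internally.

```r
# 28860 s = 8 hours (28800 s) + 1 minute (60 s) after 1970-01-01 00:00:00 UTC.
instant <- as.POSIXct(28860, origin = "1970-01-01", tz = "UTC")

# The same instant rendered in two zones.
as_utc   <- format(instant, tz = "UTC", usetz = TRUE)
as_local <- format(instant, tz = "America/Los_Angeles", usetz = TRUE)

print(as_utc)    # "1970-01-01 08:01:00 UTC"
print(as_local)  # "1970-01-01 00:01:00 PST"
```

So a printed "08:01:00" for this value is the UTC rendering of the instant, whatever label is attached to it.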

Versions:
R 3.5.2
lubridate 1.7.4
Debian GNU/Linux 10 aka buster (amd64 flavor)

Thanks.
Ross Boylan
#
It seems like the current version of lubridate is 1.8.0, which does
raise a warning for an invalid timezone, just like as.POSIXct. This is
what I tried:


print(lubridate::parse_date_time("1970-01-01 00:01:00", "ymd HMS", tz = "PST"))
print(as.POSIXct("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S", tz = "PST"))


outputs:
[1] "1970-01-01 08:01:00 GMT"
Warning message:
In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'PST'
[1] "1970-01-01 00:01:00 GMT"
Warning messages:
1: In strptime(x, format, tz = tz) : unknown timezone 'PST'
2: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone 'PST'
3: In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'PST'
But I don't see the same problem when using `tz =
"America/Los_Angeles"` or `tz = "PST8PDT"`.


print(lubridate::parse_date_time("1970-01-01 00:01:00", "ymd HMS", tz = "PST8PDT"))
print(as.POSIXct("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S", tz = "PST8PDT"))


outputs:
[1] "1970-01-01 00:01:00 PST"
[1] "1970-01-01 00:01:00 PST"
I would hesitate to use `tz = Sys.timezone()` because someone from
another province/state might not be able to use your code. Depends on
whether this work is being shared with other people though, up to you.

On Tue, Mar 1, 2022 at 8:51 PM Boylan, Ross via R-help
<r-help at r-project.org> wrote: