Hi all,
I have found another POSIXlt bug while I've been fiddling around with it.
This one only appears on specific OSes, because it has to do with the fact
that the `gmtoff` field is optional, and isn't always used on all OSes. It
also doesn't seem to be specific to r-devel; I think it has been there
for a while.
Here is the bug:
```
x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
# Oh no!
x[1] <- NA
#> Error in x[[n]][i] <- value[[n]] : replacement has length zero
```
If you look at the objects, you can see that `x` has a `gmtoff` field, but
`NA` (when converted to POSIXlt, which is what `[<-.POSIXlt` does) does not:
```
unclass(x)
#> $sec
#> [1] 0
#>
#> $min
#> [1] 0
#>
#> $hour
#> [1] 0
#>
#> $mday
#> [1] 31
#>
#> $mon
#> [1] 0
#>
#> $year
#> [1] 113
#>
#> $wday
#> [1] 4
#>
#> $yday
#> [1] 30
#>
#> $isdst
#> [1] 0
#>
#> $zone
#> [1] "CST"
#>
#> $gmtoff
#> [1] -21600
#>
#> attr(,"tzone")
#> [1] "America/Chicago" "CST" "CDT"
unclass(as.POSIXlt(NA))
#> $sec
#> [1] NA
#>
#> $min
#> [1] NA
#>
#> $hour
#> [1] NA
#>
#> $mday
#> [1] NA
#>
#> $mon
#> [1] NA
#>
#> $year
#> [1] NA
#>
#> $wday
#> [1] NA
#>
#> $yday
#> [1] NA
#>
#> $isdst
#> [1] -1
#>
#> attr(,"tzone")
#> [1] "UTC"
```
The problem seems to be that `[<-.POSIXlt` assumes that if the field was
there in `x` then it must also be there in `value`:
https://github.com/wch/r-source/blob/e10a971dee6a0ab851279c183cc21954d66b3be4/src/library/base/R/datetime.R#L1303-L1304
But this isn't the case for the `NA` value that was converted to POSIXlt.
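To see the failure mode in isolation, here is a minimal sketch that uses plain lists instead of real POSIXlt objects. The field values are copied from the output above, and `assign_fields()` is a hypothetical helper that only mirrors the loop in `[<-.POSIXlt`; it is not the base R code. Indexing a list by a name it lacks gives `NULL`, and assigning `NULL` into an element of an atomic vector is what raises the error:

```r
# Plain-list stand-ins (assumption: a subset of the fields is enough to show
# the problem) for unclass(x) and unclass(as.POSIXlt(NA)).
x_fields     <- list(sec = 0, isdst = 0L, zone = "CST", gmtoff = -21600L)
value_fields <- list(sec = NA_real_, isdst = -1L)  # no zone, no gmtoff

# Hypothetical helper mirroring `for (n in names(x)) x[[n]][i] <- value[[n]]`.
assign_fields <- function(x, i, value) {
  for (n in names(x)) x[[n]][i] <- value[[n]]
  x
}

# value_fields[["zone"]] is NULL, so the assignment into x[["zone"]][1] fails.
msg <- tryCatch(
  {
    assign_fields(x_fields, 1L, value_fields)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(is.null(value_fields[["zone"]]))
#> [1] TRUE
print(msg)
#> [1] "replacement has length zero"
```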
I can't reproduce this on my personal Mac, but it affects the Linux, Mac,
and Windows machines we use for the lubridate CI checks through GitHub
Actions.
Thanks,
Davis
Bug with `[<-.POSIXlt` on specific OSes
5 messages · Davis Vaughan, Kurt Hornik, Martin Maechler
5 days later
I've got a bit more information about this one. It seems like it (only? not
sure) appears when `TZ = "UTC"`, which is why I didn't see it before on my
Mac, which defaults to `TZ = ""`. I think this is at least explainable by
the fact that those "optional" fields aren't technically needed when the
time zone is UTC.
I can reproduce this now on my personal Mac:
```
x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
Sys.setenv(TZ = "")
x[1] <- NA
x
#> [1] NA
x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
Sys.setenv(TZ = "America/New_York")
x[1] <- NA
x
#> [1] NA
x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
Sys.setenv(TZ = "UTC")
x[1] <- NA
#> Error in x[[n]][i] <- value[[n]] : replacement has length zero
x
#> [1] "2013-01-31 CST"
```
Here are `sessionInfo()` and `Sys.getenv("TZ")` outputs for 3 GitHub
Actions platforms where the bug exists (note they all set `TZ = "UTC"`!):
Linux:
```
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS
Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
loaded via a namespace (and not attached):
[1] compiler_4.2.1
> Sys.getenv("TZ")
[1] "UTC"
```
Mac:
```
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16
Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
loaded via a namespace (and not attached):
[1] compiler_4.2.1
> Sys.getenv("TZ")
[1] "UTC"
```
Windows:
This is the best I can get you, sorry (remote worker issues), but note that
it does also say `tz UTC` like the others.
```
 version  R version 4.2.1 (2022-06-23 ucrt)
 os       Windows Server x64 (build 20348)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_United States.utf8
 ctype    English_United States.utf8
 tz       UTC
 date     2022-10-11
```
And here is my Mac where the bug doesn't show up by default because `TZ = ""`:
```
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16
Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
loaded via a namespace (and not attached):
[1] compiler_4.2.1
> Sys.getenv("TZ")
[1] ""
> Sys.timezone()
[1] "America/New_York"
```
-Davis
On Thu, Oct 6, 2022 at 9:33 AM Davis Vaughan <davis at rstudio.com> wrote:
> Hi all,
> I have found another POSIXlt bug while I've been fiddling around with it.
> [...]
Davis Vaughan writes:
> I've got a bit more information about this one. It seems like it
> (only? not sure) appears when `TZ = "UTC"`, which is why I didn't see
> it before on my Mac, which defaults to `TZ = ""`. I think this is at
> least explainable by the fact that those "optional" fields aren't
> technically needed when the time zone is UTC.
Exactly. Debugging `[<-.POSIXlt` with
    x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
    Sys.setenv(TZ = "UTC")
    x[1] <- NA
shows we get into
    value <- unclass(as.POSIXlt(value))
    if (ici) {
        for (n in names(x)) names(x[[n]]) <- nms
    }
    for (n in names(x)) x[[n]][i] <- value[[n]]
where
    Browse[2]> names(value)
    [1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"  "isdst"
    Browse[2]> names(x)
     [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"
     [9] "isdst"  "zone"   "gmtoff"
Without having looked at the code, the docs say
    ‘zone’ (Optional.) The abbreviation for the time zone in force at
          that time: ‘""’ if unknown (but ‘""’ might also be used for
          UTC).
    ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
          values are East of the meridian.  Usually ‘NA’ if unknown,
          but ‘0’ could mean unknown.
so perhaps we should fill with the values for the unknown case?
-k
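One way to picture the "fill with the values for the unknown case" idea is the sketch below. It works on plain lists rather than real POSIXlt objects, and `fill_optional()` is a hypothetical helper invented for illustration, not a base R function or the fix that was eventually applied. It adds `zone = ""` and `gmtoff = NA` (the documented "unknown" values quoted above) when those fields are absent, so that the field-by-field assignment loop has something to copy:

```r
# Hypothetical helper: add the optional fields with their documented
# "unknown" values when an unclassed POSIXlt-like list lacks them.
fill_optional <- function(value) {
  n <- length(value[["sec"]])
  if (is.null(value[["zone"]]))   value[["zone"]]   <- rep("", n)
  if (is.null(value[["gmtoff"]])) value[["gmtoff"]] <- rep(NA_integer_, n)
  value
}

# Plain-list stand-ins (assumption: a subset of the fields is enough here).
x_fields     <- list(sec = 0, isdst = 0L, zone = "CST", gmtoff = -21600L)
value_fields <- fill_optional(list(sec = NA_real_, isdst = -1L))

# The same loop shape as in `[<-.POSIXlt` now finds every field in value_fields.
for (n in names(x_fields)) x_fields[[n]][1L] <- value_fields[[n]]

str(x_fields)
```

With the optional fields filled, every name in `x_fields` has a counterpart in `value_fields`, so the loop no longer hits a `NULL` replacement.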
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Kurt Hornik
on Tue, 11 Oct 2022 16:44:13 +0200 writes:
> so perhaps we should fill with the values for the unknown case?
> -k
Well,
I think you both know I'm in the midst of dealing with these
issues, to fix both
[.POSIXlt and
[<-.POSIXlt
Yes, one needs a way to not only "fill" the partially filled
entries but also to *normalize* out-of-range values
(say negative seconds, minutes > 60, etc)
All this is available in our C code, but not on the R level,
so yesterday, I wrote a C function to be called via .Internal(.)
from a new R function that provides this.
Provisionally called
    balancePOSIXlt()
because it both balances the 9 to 11 list-components of POSIXlt
and it also puts all numbers of (sec, min, hour, mday, mon)
into a correct range (and also computes wday and yday numbers correctly).
but I'm happy for proposals of better names.
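As an illustration of what "putting numbers into a correct range" means, the sketch below builds a POSIXlt with an out-of-range minute field and normalizes it by a round trip through POSIXct. This is only an assumption-laden stand-in to show the effect: it is not the `balancePOSIXlt()` function described here, and it does not address the "fill" part for ragged components.

```r
# Start from a valid UTC time, then push the minutes out of range.
lt <- as.POSIXlt("2013-01-31 00:00:00", tz = "UTC")
lt$min <- 75L  # 75 minutes past midnight, stored as hour = 0, min = 75

# Converting to POSIXct and back lets R's C-level date-time code carry the
# overflow into the hour field.
normalized <- as.POSIXlt(as.POSIXct(lt), tz = "UTC")

format(normalized, "%Y-%m-%d %H:%M:%S")
#> [1] "2013-01-31 01:15:00"
c(hour = normalized$hour, min = normalized$min)
```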
I had contemplated validatePOSIXlt() as alternative, but then
dismissed that as in some sense we now do agree that
"imbalanced" POSIXlt's are not really invalid ..
.. and yes, to Davis: Even though I've spent so many hours with
POSIXlt, POSIXct and Date during the last week, I'm still
surprised more often than I like by the effects of timezone
settings there.
Martin
Martin Maechler
on Wed, 12 Oct 2022 10:17:28 +0200 writes:
> Provisionally called
> balancePOSIXlt()
I have committed the new R and C code now, defining balancePOSIXlt(),
to get feedback from the community.
I've extended the documentation in help(DateTimeClasses),
and notably factored out the description
of POSIXlt mentioning the "ragged" and "out-of-range" cases.
This needs more testing and experiments, and I have not
announced it in NEWS yet.
Planned next is to use it in [.POSIXlt and [<-.POSIXlt
so they will work correctly.
But please share your thoughts, propositions, ...
Martin