Wrong length of POSIXt vectors (PR#10507)
Jeffrey J. Hallman wrote:
Duncan Murdoch <murdoch at stats.uwo.ca> writes:
One reason I don't want to work on this is because the appropriate
action depends on what "length(x)" is intended to mean. Currently for
POSIXlt objects, it gives the physical length of the underlying basic
type (the list). This is the same behaviour as we have for matrices,
data frames and every other object without a specific length method, so
it's not outrageous.
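A minimal sketch of the "physical length" behaviour described above. It calls unclass() explicitly so the result does not depend on whether the running version of R defines a length method for POSIXlt; the variable names are illustrative only, and the exact number of list components is left unstated because it may vary between R versions.

```r
# Physical length: what length() reports when no class-specific method applies.
m  <- matrix(1:6, nrow = 2)
df <- data.frame(a = 1:3, b = letters[1:3])
length(m)    # number of elements in the underlying vector (2 * 3)
length(df)   # number of columns (list components), not rows

# Two time points stored as a POSIXlt object.
lt <- as.POSIXlt(c("2008-01-01 00:00:00", "2008-01-02 00:00:00"), tz = "UTC")

# Physical length: number of components (sec, min, hour, ...) in the list.
phys <- length(unclass(lt))

# Logical length: number of time points, read off one of the components.
logical_len <- length(unclass(lt)$sec)
```

Here phys counts the fields of the underlying list, while logical_len is 2, the number of date-times a user would normally think of as the length of the object.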
The proposed change is to have it return the logical length of the
object, which also seems quite reasonable. I don't think matrices and
data frames have a "logical length", so there would be no contradiction
in those examples. The thing that worries me is that there are probably
objects in packages where both logical length and physical length make
sense but are different. I don't have any expectation that length(x) on
those currently is consistent in which type of value it returns.
If we were to decide that "length(x)" *always* meant logical length,
then we would have a problem: matrices and data frames don't have a
logical length, so we shouldn't be getting an answer there. Changing
length(x) for those is not acceptable.
On the other hand, if we decide that "length(x)" *always* means physical
length, we don't need to do anything to the POSIXlt or matrices or data
frames, but there may well be other kinds of objects out there that
violate this rule.
We could leave the meaning of length(x) ambiguous. If you want to know
what it does for a POSIXlt object, you need to read the documentation or
look at the source code. As a policy, this isn't particularly
appealing, but I could probably live with it if someone else did the
research and showed that current usage is ambiguous.
Physical length and logical length are, as you say, two different things. So
why not two functions? Keep length() for physical length, as it is now, and
maybe Length() for logical length. The latter could be defined as
    Length <- function(x, ...) UseMethod("Length")
    Length.default <- function(x, ...) length(x)
and then add methods for classes that want something else.
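As a sketch of how such a method might look, assume a hypothetical list-based S3 class "obs" (not from any package; the constructor new_obs() is also made up for illustration) whose logical length is the number of observations rather than the number of list components:

```r
# The generic and default proposed above.
Length <- function(x, ...) UseMethod("Length")
Length.default <- function(x, ...) length(x)

# Hypothetical class "obs": a list of equal-length fields.
new_obs <- function(time, value) {
  stopifnot(length(time) == length(value))
  structure(list(time = time, value = value), class = "obs")
}

# Logical length: number of observations, taken from one field.
Length.obs <- function(x, ...) length(unclass(x)$time)

x <- new_obs(time = 1:5, value = c(2.1, 3.4, 1.9, 4.0, 2.7))
length(x)    # physical length: 2 (the list has two components)
Length(x)    # logical length: 5 (five observations)
Length(1:3)  # falls through to Length.default: 3
```

With this arrangement length() keeps its existing meaning everywhere, and only classes that opt in by defining a Length method report something different.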
A very reasonable suggestion, but I'd also put this in the "next time we design a language" category. The current system in R seems workable to me, if one knows that vector-like classes that have an S3 list-based implementation need to have methods defined for 'c', 'length', '[', etc., and that if these methods aren't defined, then you'll be operating on the underlying list structure. Where these methods are defined, one can get at the underlying structure by unclassing first, and that's OK.

However, classes that have some of these methods defined but not others seem to me to be needlessly confusing. It's not like there is any great benefit in length() always returning the length of the underlying list for POSIXlt: if there were a length() method, one could still get at the underlying length using length(unclass(x)). It just seems like a design oversight that makes using such classes unnecessarily difficult and error-prone.

Hence my proposal (in a new thread) for coding & documentation guidelines that would:
(1) suggest consistency is a good thing
(2) suggest compliance or deviation should be documented
(3) define what consistency is (and here it's not so important to get absolutely the right set of consistency definitions as it is to get a reasonable set that people agree on).

-- Tony Plate