
Surprising length() of POSIXlt vector (PR#14073)

5 messages · Steven McKinney, Benilton Carvalho, William Dunlap +1 more

#
I've checked the archives, and this problem crops up every
few months going back for years.

What I was not able to find was an explanation of why a
function such as 
  length.POSIXlt <- function(x) { length(x$sec) }
is a Bad Idea, or what it would break.  listserv threads
seem to end without presenting an answer.  R News 2001
Vol 1/2 discusses that "lots of methods are needed..."
(page 11) but I haven't found discussion of why a length
method isn't feasible.

Can anyone clarify this, or point me at the right
archive or documentation source that discusses why
objects of class POSIXlt always need to return a
length of 9?

Thanks
Steve McKinney
#
Steve,

I'm no expert on this, but my understanding is that the choice was to  
stick to the definition.

The help file for length() [1] says:

"For vectors (including lists) and factors the length is the number of  
elements."

The help file for POSIXlt [2] (for example) says:

"Class "POSIXlt" is a named list of vectors representing (...)"

and then lists the 9 elements (sec / min / hour / mday / mon / year /  
wday / yday / isdst).

So, by [1] length of POSIXlt objects is 9, because it "is a named list  
of vectors representing (...)".
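
To make the "named list of vectors" description concrete, here is a minimal R sketch that strips the class and inspects the underlying list. It deliberately does not print length(x) itself, since that value depends on whether the R version in use has a length method for POSIXlt. The nine documented components are checked by name only, on the assumption that some R versions may store additional components.

```r
# A single date-time, converted to the list-based POSIXlt class.
x <- as.POSIXlt("2009-11-20 00:19:00", tz = "UTC")

# Strip the class to look at the raw list representation.
parts <- unclass(x)
is.list(parts)
names(parts)

# The nine components listed in the help file should all be present;
# we do not assume they are the only ones.
documented <- c("sec", "min", "hour", "mday", "mon", "year",
                "wday", "yday", "isdst")
all(documented %in% names(parts))

# Each component holds one value per date-time, so this counts dates.
length(parts$sec)
```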

b
On Nov 20, 2009, at 12:19 AM, Steven McKinney wrote:

            
#
Thanks, a most sensible description.
After how many bug reports does it qualify for addition to the FAQ?!

Steve McKinney
#
Before data.frames existed (c. 1991) the S help files
probably would have described 'dim()' in a
similar way for matrices, but it made sense to extend
it and its help file to work on data.frames after they
were invented.  Aren't the real questions how much code
would break, how much code would start working, and how
easy or hard would it be for a user to make sense of it
if length(POSIXlt.thing) reported how many dates were
in POSIXlt.thing instead of reporting how many components
were in its representation?

R's rep method for POSIXlt has a length argument that
represents the number of dates, as it must.  Its subscript
operator for POSIXlt accepts an index in the range
1:numberOfDates.  I.e., lots of its methods act like its
length is the number of dates.  However POSIXlt is not
vector-like enough to make a matrix out of or to attach
names to its dates.
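
The behaviour of the subscript and rep methods described above can be sketched as follows. This is an illustration only: the number of dates is read from the sec component of the unclassed list rather than from length(), so the result does not depend on which length method the running R version provides.

```r
# Three dates stored in one POSIXlt object.
x <- as.POSIXlt(c("2009-11-18", "2009-11-19", "2009-11-20"), tz = "UTC")

# The "[" method indexes dates, not list components.
second <- x[2]
format(second, "%Y-%m-%d")

# Number of dates, read from one component of the representation.
length(unclass(x)$sec)

# rep() also counts dates: length.out is the number of dates wanted.
y <- rep(x, length.out = 5)
length(unclass(y)$sec)
```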

I don't think a possibly out-of-date help file
is the definitive answer to the question of whether
or not there should be a length method for POSIXlt.

S+ has a timeDate class (represented as 2 vectors
of integers and some scalar attributes) with a length
method that gives the number of dates.  I think the
main problem with the method is that the C-level get_length
function returns a different value than the SV4 method
does.

S+ also has a numRows function which is documented
to return the 'number of cases' in a data object, with
methods for lots of classes (vector, matrix, timeSeries,
data.frame, etc.).  Users can call that and know it
never represents some accident of implementation as length
might.  Then users could abandon the use of length in
favor of numRows except when writing low-level code that
deals with the representation of things.  Does R have a
similar high level function?  (In that same family of
functions S+ has rowIds, numCols, and colIds to supplant
rownames, ncol, and colnames, respectively.) 

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Benilton Carvalho writes:
Thanks, all.  Yes, I'd already read both, and it's obviously
true that a length() of 9 is correct (as I said up-front).

The difficulty is that some functions -- importantly
including "[" -- already have methods which make POSIXlt
behave like a vector.  The documentation for POSIXlt just
says it's a list of 9 elements: it mentions methods for
addition etc, but AFAICT it doesn't say that subsetting won't
behave as "["'s help says for a list-like object.

In the end, "[" sees a different length to "[[" and "$"
here, so a length.POSIXlt() just shuffles the issue around.
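
A rough way to see the "logical vs physical length" mismatch without depending on any particular R version is a toy S3 class. Everything named toylt below is hypothetical and is not part of R; it only mimics the shape of the problem (a list of parallel vectors with a vector-like "[" method and a length method in the spirit of the one proposed at the top of the thread).

```r
# Hypothetical toy class (not part of R): a list of two parallel vectors.
toy <- structure(list(a = c(10, 20, 30), b = c("x", "y", "z")),
                 class = "toylt")

# A "logical length" method: the number of records, not of components.
length.toylt <- function(x) length(unclass(x)$a)

# A vector-like "[" method that subsets every component in parallel.
`[.toylt` <- function(x, i) {
  structure(lapply(unclass(x), `[`, i), class = "toylt")
}

length(toy)           # logical length: number of records
length(unclass(toy))  # physical length: number of components

toy[3]$a              # "[" accepts indices up to the logical length
toy[[2]]              # "[[" has no method here and indexes the components
toy$b                 # "$" also works on the components
```

With the length method defined, length(toy) agrees with "[", but "[[" and "$" still operate on the two-component list, which is the mismatch being described.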

Anyhow, I somehow missed there have been other PRs on this,
including discussion on r-devel of "[" and logical vs physical
length() under PR#10507.  I'm sorry for being repetitive.

Mark <><