rownames, colnames, and date and time
I haven't been following all of this thread, but it reminds me of a bug that was in S-PLUS not too long ago where dimnames could sometimes be numeric. This caused some problems that were very hard to track down because there were no visual clues of what was really wrong. I've been pleased not to encounter that in R and hope it continues.

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Prof Brian Ripley wrote:
Looking at the code it occurs to me that there is another case you have
not considered, namely dimnames().
rownames<- and colnames<- are just wrappers for dimnames<-, so consistency
does mean that all three should behave the same.
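
The wrapper relationship for a matrix can be sketched as follows. The object names m1 and m2 are illustrative only, and the comparison assumes a matrix that starts without dimnames:

```r
# Two identical matrices without dimnames
m1 <- matrix(1:4, nrow = 2, ncol = 2)
m2 <- m1

# Set the row names through the wrapper ...
rownames(m1) <- c("a", "b")

# ... and through dimnames<- directly, leaving the column names unset
dimnames(m2) <- list(c("a", "b"), NULL)

# Both routes should leave the same dimnames on the matrix
print(identical(dimnames(m1), dimnames(m2)))
```
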
For arrays (including matrices), dimnames<- is primitive. It coerces
factors to character, and says in the C code
/* if (isObject(val1)) dispatch on as.character.foo, but we don't
have the context at this point to do so */
so someone considered this before now.
For data frames, dimnames<-.data.frame is used. That calls row.names<-
and names<-, and the first has a data.frame method. Only the row.names<-
method is documented to coerce its value to character, and I think it _is_
all quite consistent. The basic rule is that all these functions coerce
for data frames, and none do for arrays.
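
A small sketch of that rule, for illustration only. The tz = "UTC" argument is an assumption added here so that the printed dates do not depend on the local time zone, and the exact strings printed for the matrix may differ between R versions:

```r
dates <- as.POSIXct(c("2001-01-24", "2005-12-25"), tz = "UTC")
m <- matrix(1:4, nrow = 2, ncol = 2)
df <- data.frame(m)

# Data frame: the row.names<- method calls as.character() on the value,
# so the date-time class is respected
row.names(df) <- dates
print(row.names(df))

# Matrix: the primitive dimnames<- does not dispatch on the class,
# so the names come from the underlying numbers
rownames(m) <- dates
print(rownames(m))

# Factors are the documented exception for arrays:
# they are converted to their labels
colnames(m) <- factor(c("a", "b"))
print(colnames(m))
```
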
However, there was a problematic assumption in the row.names<-.data.frame
and dimnames<-.data.frame methods, which tested the length of 'value'
before coercion. That sounds reasonable, but in unusual cases such as
POSIXlt, coercion changes the length, and I have swapped the lines around.
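
To illustrate why the order matters: a POSIXlt object is stored as a list of components (seconds, minutes, hours and so on), not as one element per date. The sketch below uses unclass() to look at that list, because later R versions are understood to give length() its own POSIXlt method. The helper coerce_then_check() is hypothetical and only mimics the "coerce first, then test the length" order; it is not the actual code of the data frame methods.

```r
lt <- as.POSIXlt(c("2001-01-24", "2005-12-25"), tz = "UTC")

# Length of the underlying list of components versus
# length of the character representation
n_components <- length(unclass(lt))
n_names <- length(as.character(lt))
print(c(components = n_components, names = n_names))

# Hypothetical helper: coerce first, then compare against the
# expected number of rows
coerce_then_check <- function(value, n) {
  value <- as.character(value)
  if (length(value) != n) stop("invalid 'value' length")
  value
}

print(coerce_then_check(lt, 2L))
```
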
What you expected was that dimnames<-() would coerce to character,
although I can find no support for that expectation in the documentation.
If it were not a primitive function that would be easy to achieve, but as
it is, it would need an expert in the internal code to change. There is
also the risk of inconsistency, since as the comment says, the C code is
used in places where the context is not known. I think this is probably
best left alone.
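
Given that dimnames<- is left as it is, one way to get readable date names on a matrix is to coerce explicitly before assigning. This is only a sketch of a workaround, not something proposed in the thread. The tz = "UTC" argument and the "%Y-%m-%d" format string are assumptions chosen to make the result predictable:

```r
dates <- as.POSIXct(c("2001-01-24", "2005-12-25"), tz = "UTC")
m <- matrix(1:4, nrow = 2, ncol = 2)

# Convert to character explicitly, so the primitive dimnames<-
# receives plain strings and has nothing left to coerce
rownames(m) <- format(dates, "%Y-%m-%d")

# The same approach works for POSIXlt, since format() returns
# one string per date-time
colnames(m) <- format(as.POSIXlt(dates), "%Y-%m-%d")

print(m)
```
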
On Wed, 29 Mar 2006, Prof Brian Ripley wrote:
Yet again, this is the wrong list for suggesting changes to R. Please do use R-devel for that purpose (and I have moved this).

If this bothers you (it all works as documented, so why not use it as documented?), please supply a suitable patch to the current R-devel sources and it will be considered.

And BTW, row.names is the canonical accessor function for data frames, and its 'value' argument is documented differently from that for rownames for an array. Cf:

Details:
The extractor functions try to do something sensible for any matrix-like object 'x'. If the object has 'dimnames' the first component is used as the row names, and the second component (if any) is used for the col names. For a data frame, 'rownames' and 'colnames' are equivalent to 'row.names' and 'names' respectively.

Note:
'row.names' is similar to 'rownames' for arrays, and it has a method that calls 'rownames' for an array argument.

I am not sure why R decided to add rownames for the same purpose as row.names: eventually they were made equivalent.

On Tue, 21 Mar 2006, Erich Neuwirth wrote:
I noticed something surprising (in R 2.2.1 on WinXP)
According to the documentation, rownames and colnames are character
vectors.
Assigning a vector of class POSIXct or POSIXlt as rownames or colnames
therefore is not strictly according to the rules.
In some cases, R performs a reasonable typecast, but in some other cases
where the same typecast also would be possible, it does not.
Assigning a vector of class POSIXct to the rownames or names of a
dataframe creates a reasonable string representation of the dates (and
possibly times).
Assigning such a vector to the rownames or colnames of a matrix produces
rownames or colnames consisting of the integer representation of the
date-time value.
Trying to assign a vector of class POSIXlt in all cases
(dataframes and matrices, rownames, colnames, names)
produces an error.
Demonstration code is given below.
This is somewhat inconsistent.
Perhaps a reasonable solution could be that the typecast
used for POSIXct and dataframes is used in all the other cases also.
Code:
mymat<-matrix(1:4,nrow=2,ncol=2)
mydf<-data.frame(mymat)
mydates<-as.POSIXct(c("2001-1-24","2005-12-25"))
rownames(mydf)<-mydates
names(mydf)<-mydates
rownames(mymat)<-mydates
colnames(mymat)<-mydates
print(deparse(mydates))
print(deparse(rownames(mydf)))
print(deparse(names(mydf)))
print(deparse(rownames(mymat)))
print(deparse(colnames(mymat)))
mydates1<-as.POSIXlt(mydates)
# the following lines will not work and
# produce errors
rownames(mydf)<-mydates1
names(mydf)<-mydates1
rownames(mymat)<-mydates1
colnames(mymat)<-mydates1