as.data.frame.matrix() returns an invalid object

3 messages · Hervé Pagès, Bert Gunter, Peter Dalgaard

#
Hi,

Two ways to create what should normally be the same data frame:

   > df1 <- data.frame(a=character(0), b=character(0))
   > df1
   [1] a b
   <0 rows> (or 0-length row.names)

   > df2 <- as.data.frame(matrix(character(0), ncol=2,
   +                             dimnames=list(NULL, letters[1:2])))
   > df2
   [1] a b
   <0 rows> (or 0-length row.names)

unique() works as expected except that I get a warning on 'df2':

   > unique(df1)
   [1] a b
   <0 rows> (or 0-length row.names)

   > unique(df2)
   [1] a b
   <0 rows> (or 0-length row.names)
   Warning message:
   In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL'

Looks like the two data frames are not identical:

   > identical(df1, df2)
   [1] FALSE

   > all.equal(df1, df2)
   [1] "Attributes: < Length mismatch: comparison on first 1 components >"

   > attributes(df1)
   $names
   [1] "a" "b"

   $row.names
   integer(0)

   $class
   [1] "data.frame"

   > attributes(df2)
   $names
   [1] "a" "b"

   $class
   [1] "data.frame"

Actually 'df2' is considered broken by validObject():

   > validObject(df1)
   [1] TRUE

   > validObject(df2)
   Error in validObject(df2) :
     invalid class “data.frame” object: slots in class definition but not in object: "row.names"
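
For readers who want to test for this condition directly, here is a minimal sketch. The helper name has_row_names_attr is hypothetical, and the malformed object is built by hand with structure() so that the example does not depend on whether as.data.frame.matrix() still behaves as reported here. The repair at the end is an assumed workaround, not an official fix.

```r
# Hypothetical helper: TRUE when a "row.names" attribute is actually stored.
has_row_names_attr <- function(x) "row.names" %in% names(attributes(x))

ok <- data.frame(a = character(0), b = character(0))

# Deliberately malformed: class "data.frame" but no row.names attribute,
# mirroring the attributes(df2) output shown above.
broken <- structure(list(a = character(0), b = character(0)),
                    class = "data.frame")

has_row_names_attr(ok)      # TRUE
has_row_names_attr(broken)  # FALSE

# Assumed repair: store zero-length row names explicitly.
attr(broken, "row.names") <- integer(0)
has_row_names_attr(broken)  # TRUE
```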

This is with R 2.15 and recent R devel.

Cheers,
H.
#
... and further
[1] TRUE

in R 2.15.0

Not sure whether these sorts of degenerate cases are of much value,
though. But I'll leave that for the wizards.

-- Bert
On Wed, Oct 10, 2012 at 11:22 PM, Hervé Pagès <hpages at fhcrc.org> wrote:

#
On Oct 11, 2012, at 16:02 , Bert Gunter wrote:

Looks like this is easier to fix than to argue pro/con fixing it...

AFAICS, there's a gap in the logic in as.data.frame.matrix:

    if (length(row.names) != nrows) 
        row.names <- .set_row_names(nrows)

but length(NULL) is 0, so with zero rows the condition is false, and we can end up leaving row.names at NULL and eventually nulling it in the result. An explicit check for is.null(row.names) should help.
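
A minimal sketch of the suggested check, written as a standalone function rather than as a patch to as.data.frame.matrix itself. The name fix_row_names is hypothetical and this is not the actual R source; only .set_row_names() is the real base function quoted above.

```r
# Hypothetical stand-in for the row-name fallback; not the actual R source.
fix_row_names <- function(row.names, nrows) {
  # The added is.null() test covers the zero-row case, where
  # length(NULL) != nrows is FALSE and the old condition never fires.
  if (is.null(row.names) || length(row.names) != nrows)
    row.names <- .set_row_names(nrows)
  row.names
}

length(NULL) != 0L        # FALSE: why the original check misses this case
fix_row_names(NULL, 0L)   # integer(0)
fix_row_names(NULL, 3L)   # c(NA, -3L), the compact form of 3 automatic row names
```

With this guard, a zero-row matrix gets integer(0) row names instead of NULL, which matches what data.frame() stores for 'df1' above.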

-pd