Bug in print for data frames?

On 25/10/2023 2:18 a.m., Christian Asseburg wrote:
y[1] is a dataframe with one column, i.e. it is identical to y.  To get 
the result you expected, you should have used y[[1]], to extract column 1.
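A minimal sketch of the difference between the two extraction operators, assuming y was created as a one-column data frame as in the tibble example further down (the original definition of y is not quoted here, so that is an assumption):

```r
# Assumed definition of y; the original post's code is not shown in this reply.
y <- data.frame(A = 1)

# Single brackets keep the data frame structure: the result is still a data frame.
sub_df <- y[1]
class(sub_df)    # "data.frame"

# Double brackets extract the column itself, here a plain numeric vector.
col_vec <- y[[1]]
class(col_vec)   # "numeric"
```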

Since dataframes are lists, you can assign them as columns of other 
dataframes, and you'll create a single column in the result that itself 
holds the columns of the dataframe you're assigning.  This means that

  x$C <- y[1]

replaces the C column of x with a dataframe.  The column retains the name 
C (you can see this if you print names(x) ), but since it contains a 
dataframe, the print method uses the column name from y instead.
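The following sketch checks this without relying on the printed output. The definitions of x and y are assumed to match the ones used in the tibble example below, with data.frame() in place of tibble():

```r
# Assumed definitions, mirroring the tibble example but with base data frames.
x <- data.frame(A = 1, B = 2, C = 3)
y <- data.frame(A = 1)

# Assigning y[1] stores a whole data frame in column C.
x_nested <- x
x_nested$C <- y[1]
names(x_nested)      # "A" "B" "C": the outer name is still C
class(x_nested$C)    # "data.frame"
names(x_nested$C)    # "A": the inner name, which is what shows up when printing

# Assigning y[[1]] stores the plain vector, which is probably what was intended.
x_flat <- x
x_flat$C <- y[[1]]
class(x_flat$C)      # "numeric"
```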

If you try

  x$D <- x

you'll see it generate new names when printing, but the names within x 
remain as A, B, C, D.
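To look at the stored names rather than the printed ones, something like the following may help. It is only a sketch: it starts from a fresh x with an ordinary C column (an assumption made to keep the example small), and it uses str() to show the nesting instead of relying on how print() labels the columns:

```r
# Fresh data frame, assumed for illustration; C is an ordinary numeric column here.
x <- data.frame(A = 1, B = 2, C = 3)

# Store a copy of x inside itself as column D.
x$D <- x

names(x)     # "A" "B" "C" "D": the names within x
names(x$D)   # "A" "B" "C": the names of the nested data frame
str(x)       # shows D as a data frame nested inside x
```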

This is a situation where tibbles do a better job than dataframes:  if 
you created x and y as tibbles instead of dataframes and executed your 
code, you'd see this:

   library(tibble)
   x <- tibble(A = 1, B = 2, C = 3)
   y <- tibble(A = 1)
   x$C <- y[1]
   x
   #> # A tibble: 1 × 3
   #>       A     B   C$A
   #>   <dbl> <dbl> <dbl>
   #> 1     1     2     1

Duncan Murdoch