Unexpected alteration of data frame column names

Marc Schwartz · 2007-05-15T18:25:39Z

On Mon, 2007-05-14 at 23:59 -0700, Herve Pages wrote:
> Hi,
>
> I'm using data.frame(..., check.names=FALSE), because I want to create
> a data frame with duplicated column names (in the real life you can get such
> data frame as the result of an SQL query):
>
>   > df
>   > df
>     aa aa
>   1  1  9
>   2  2  8
>   3  3  7
>   4  4  6
>   5  5  5
>
> Why is [.data.frame changing my column names?
>
>   > df[1:3, ]
>     aa aa.1
>   1  1    9


Herve,

I had not seen a reply to your post, but you can review the code for
"[.data.frame" by using:

  getAnywhere("[.data.frame")

and see where there are checks for duplicate column names in the
function.

That is going to be the default behavior for data frame
subsetting/extraction and in fact is noted in the 'ONEWS' file for R
version 1.8.0:

 - Subsetting a data frame can no longer produce duplicate
   column names.

So it has been around for some time (October of 2003).
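
For illustration, the suffix seen in the question (the second "aa" becoming "aa.1") is what base R's make.unique() produces. This is a minimal sketch, assuming the duplicate-name check in "[.data.frame" relies on make.unique(); that is worth confirming against the source printed by getAnywhere() for your R version:

```r
# Build a data frame with duplicated column names, as in the question.
df <- data.frame(aa = 1:5, aa = 9:5, check.names = FALSE)
names(df)               # "aa" "aa"

# make.unique() appends .1, .2, ... to repeated entries.
make.unique(names(df))  # "aa" "aa.1"
```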

In terms of avoiding it, I suspect that you would have to create your
own version of the function, perhaps with an additional argument that
enables/disables the duplicate column name checks.

I have not, however, considered the broader functional implications of
doing so, so be vewwy vewwy careful here.
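
A lighter alternative to copying and modifying "[.data.frame" might be a small wrapper that subsets the rows and then restores the original names. This is only a sketch: subset_rows_keep_names is a hypothetical helper, not part of base R, and it covers row subsetting only.

```r
# Hypothetical helper (not part of base R): subset rows, then put the
# original column names back, duplicates included.
subset_rows_keep_names <- function(x, i) {
  out <- x[i, , drop = FALSE]
  names(out) <- names(x)   # all columns are kept, so the lengths match
  out
}

df <- data.frame(aa = 1:5, aa = 9:5, check.names = FALSE)
res <- subset_rows_keep_names(df, 1:3)
names(res)   # "aa" "aa"
```

Note that code which later selects a column by name, such as res[["aa"]], will only see the first "aa" column, so the same caution applies.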

HTH,

Marc Schwartz
