row.names(), rownames(), colnames(), names() ...?

3 messages · Boris Steipe, Jeff Newmiller

#
The help text for row+colnames {base} states:

  "For a data frame, rownames and colnames eventually call row.names
   and names respectively, but the latter are preferred."

Why are they "preferred"?
Why is it names(), not col.names()?
I have only ever used names() for vectors, so I'm surprised it works on data frames. IMO this is not great for code readability, so I am thinking of requiring rownames() and colnames() for all 2D objects, and names() for vectors and lists. Are there any problems with this approach?


Thanks for some insight!
Boris
#
Data frames are lists of columns. The names() function is appropriate for lists. 
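
To make that concrete, here is a minimal sketch; the data frame `df` and the matrix `m` are made-up examples, not anything from the original question. It checks that a data frame is a list whose elements are the columns, that names() and colnames() agree on it, and that the same is not true of a matrix.

```r
# A small data frame used only for illustration (hypothetical data).
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

is.list(df)                             # TRUE: a data frame is a list of columns
length(df)                              # 2: the number of columns, not rows
identical(names(df), colnames(df))      # TRUE: both give "id" "score"
identical(row.names(df), rownames(df))  # TRUE: both give "1" "2" "3"

# A matrix is not a list of columns, so names() does not see its column names.
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
names(m)     # NULL
colnames(m)  # "a" "b"
```

So a rule like "colnames() for anything 2D" is safe for both matrices and data frames, while names() only does what you expect on the data frame.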

It doesn't pay to fall into the trap of thinking that data frames are truly symmetric between columns and rows, because accessing a row costs more than accessing a column. With that in mind, it is better to think of data frames as lists, which is why names is preferred over colnames.
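
A rough sketch of the asymmetry, again with a made-up data frame: pulling out a column just returns one element of the list, whereas pulling out a row has to take one value from every column and assemble a new data frame. It illustrates the structural difference only and does not measure the cost.

```r
# Hypothetical data; stringsAsFactors = FALSE keeps 'label' as character
# regardless of the R version.
df <- data.frame(id = 1:3,
                 score = c(2.5, 3.0, 4.5),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

col <- df[["score"]]  # one list element: a plain numeric vector
row <- df[2, ]        # a new one-row data frame built from all three columns

is.numeric(col)       # TRUE
is.data.frame(row)    # TRUE
sapply(row, class)    # one class per column, since a row mixes types
```

If you want to see the cost on your own data, wrapping a loop over rows and a loop over columns in system.time() should show it; the size of the difference will depend on the number of columns and their types.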
#
Ah, that makes immediate sense.
On Apr 2, 2016, at 9:11 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

            
Interesting, I didn't know that.
I see. Thinking about data frames like that has the added benefit that it matches how we describe entities in relational data models. Both then turn out to be the transpose of the typical spreadsheet.

Thanks Jeff