ncol() vs. length() on data.frames

2 messages · Hervé Pagès, Ivan Calandra

#
Hi Ivan,
On 3/31/20 06:44, Ivan Calandra wrote:
Not that I know. It's mostly a matter of taste and code readability.

Either use the 2D interface:

    ncol(df), colnames(df), df[ , "somecol"], cbind(), etc...

or the list interface:

    length(df), names(df), df[["somecol"]], c(), etc...

to operate on your data.frames. For an ordinary data.frame they give the
same results, except that c() returns a plain list where cbind() returns
a data.frame. One advantage of using the latter though is that your code
would also work on list objects that are not data.frames. But maybe you
don't need or care about that, in which case using one interface or the
other makes no difference.
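
To illustrate the point about plain lists, here is a minimal sketch. The helper describe_elements() and the sample objects are hypothetical, made up for this example only. It is written with the list interface alone, so it should accept a data.frame and an ordinary list alike:

```r
# Hypothetical helper that only uses the list interface:
# is.list(), names(), and element-wise iteration with vapply().
describe_elements <- function(x) {
  stopifnot(is.list(x))
  data.frame(
    name  = names(x),
    class = unname(vapply(x, function(el) class(el)[1L], character(1))),
    stringsAsFactors = FALSE
  )
}

# Made-up sample data: the same content as a data.frame and as a plain list.
df  <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)
lst <- list(id = 1:3, label = c("a", "b", "c"))

res_df  <- describe_elements(df)
res_lst <- describe_elements(lst)

# The 2D accessors have nothing to report on a plain list.
ncol_of_list <- ncol(lst)   # NULL, because a list has no dim attribute
```

A version written with ncol() and colnames() would not carry over to lst, since both return NULL on an object without dimensions.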

Note that the 2D interface is richer because it has nrow(), rownames(),
and rbind(), which have no counterpart in the list interface.
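
A short sketch of those row-oriented functions, using made-up sample data (the object names are assumptions for illustration):

```r
# Made-up sample data with explicit row names.
df <- data.frame(x = 1:3, y = c("a", "b", "c"),
                 row.names = c("r1", "r2", "r3"),
                 stringsAsFactors = FALSE)

n_rows <- nrow(df)        # number of rows
row_ids <- rownames(df)   # the row names

# Append one more row with rbind().
new_row <- data.frame(x = 4L, y = "d", stringsAsFactors = FALSE)
rownames(new_row) <- "r4"
df2 <- rbind(df, new_row)

# Seen as a list, the same data only exposes its columns:
# length() counts columns, and nrow() has nothing to report.
n_elements <- length(df)
rows_of_list <- nrow(as.list(df))   # NULL
```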

From a code readability point of view, I think one should be consistent 
and avoid mixing the 2 styles. For example IMO using length(df) and 
colnames(df) in the same function body is not good style. Either use 
length(df) and names(df), or use ncol(df) and colnames(df). If in the 
same function body you need to also access the rownames() then it would 
make sense to stick to the 2D interface throughout the entire body of 
your function.
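
As an illustration of keeping to one style per function, here is the same small task written twice. Both function names and the sample data are hypothetical:

```r
# 2D style throughout: ncol() and colnames().
col_labels_2d <- function(df) {
  paste0(colnames(df), " (", seq_len(ncol(df)), "/", ncol(df), ")")
}

# List style throughout: length() and names().
col_labels_list <- function(x) {
  paste0(names(x), " (", seq_along(x), "/", length(x), ")")
}

# Made-up sample data.
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

labels_2d   <- col_labels_2d(df)
labels_list <- col_labels_list(df)
```

On a data.frame the two give the same result; the choice mainly signals to the reader whether the function thinks of its argument as a table or as a list.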

Cheers,
H.

#
Dear Hervé,

This is indeed a wise recommendation; I hadn't thought about colnames()
vs. names(), and in general 2D vs. list notations.
I will have to edit a bit more than I thought.

Thank you all for all these hints!

Best,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
On 07/04/2020 02:56, Hervé Pagès wrote: