
why doesn't table() have a data= argument?

3 messages · Michael Friendly, Marc Schwartz, Achim Zeileis

#
In an Rweave tutorial written for possibly naive R users, I felt it
necessary to explain why table() had to be used inside with(), whereas
other tools like xtabs() had a data= argument.

with() is quite nice for such cases, but it seems an unnecessary thing
to learn right off. Before I turn this question into a request for
R-devel, is there any inherent reason why it might be hard to add a
data= argument to table()?  I've looked at the code, but am not
enlightened on this question.

\emph{Example}: Convert the \code{Arthritis} data in case form to a
3-way table of
\code{Treatment} $\times$ \code{Sex} $\times$ \code{Improved}.%
\footnote{
Unfortunately, \codefun{table} does not allow a \code{data} argument to
provide an environment in which the table variables are to be found.  In the
examples in \secref{sec:table} I used \code{attach(mydata)} for this purpose,
but \codefun{attach} leaves the variables on the search path until detached,
while \codefun{with} just evaluates the \codefun{table} expression in a
temporary environment of the data.
}
<<convert-ex2,results=verbatim>>=
Art.tab <- with(Arthritis, table(Treatment, Sex, Improved))
str(Art.tab)
ftable(Art.tab)
@
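
For readers comparing the two idioms, here is a minimal sketch. It uses
the built-in mtcars data rather than Arthritis (which comes from an
add-on package), and the columns cyl and gear are chosen only for
illustration. with() evaluates the table() call with the data frame's
columns visible, while xtabs() reaches the same columns through its
formula and data= argument.

```r
# with(): columns of mtcars are visible only while table() is evaluated
tab_with <- with(mtcars, table(cyl, gear))

# xtabs(): the formula names the columns, data= says where to find them
tab_xtabs <- xtabs(~ cyl + gear, data = mtcars)

# Compare the cell counts produced by the two calls
same_counts <- all(as.vector(tab_with) == as.vector(tab_xtabs))
print(same_counts)

# Nothing was attached, so the table covers every row and no copy of
# the columns is left on the search path
print(sum(tab_with) == nrow(mtcars))
```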
#
On 02/20/2009 09:11 AM Michael Friendly wrote:
xtabs and other functions (such as modeling and plot functions) that
include a 'data' argument have a formula based argument as the means by
which you indicate the columns of the data frame to be included.

The formula and 'data' are then passed to model.frame() internally to
create the data frame that the function works with afterwards. Since
model.frame() looks up the variables named in the formula in the data
frame given by the 'data' argument (if present), that argument is needed
there.
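
A small sketch of that mechanism, calling model.frame() directly. The
data frame mydata and its columns x and y are made up for illustration;
this shows the general pattern, not the exact internals of any
particular function.

```r
# Hypothetical toy data, for illustration only
mydata <- data.frame(
  x = factor(c("a", "b", "a", "b")),
  y = factor(c("u", "u", "v", "v")),
  z = 1:4
)

# The formula names the variables; 'data' says where to look them up.
# Only the variables in the formula end up in the model frame.
mf <- model.frame(~ x + y, data = mydata)
str(mf)

# xtabs() takes the same two pieces of information
tab <- xtabs(~ x + y, data = mydata)
print(tab)
```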

It's always dangerous to say always, but in my experience, functions
that have a 'data' argument fit the above profile.

If table() had a formula method, then having a 'data' argument would
make sense.

HTH,

Marc Schwartz
#
On Fri, 20 Feb 2009, Michael Friendly wrote:

I think the simple rule is: If there is a "formula", there is (typically) 
also a "data" argument, otherwise not.
There are probably exceptions but

So the question should probably be: Why is there xtabs() with interface
   xtabs(~ x + y, data = mydata)
instead of a "formula" method for table() that would allow
   table(~ x + y, data = mydata)

Well, table() is not generic, to some extent probably due to historical 
reasons. But maybe it would be worth considering a change to this?
Z
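
To make the idea concrete, here is a rough sketch of what a formula
interface in front of table() could look like. The helper table_f() and
the data frame mydata are hypothetical and not part of base R; it simply
builds a model frame from the formula and hands it to table(). It is not
a proposal for how a real method would be written.

```r
# table_f() is a hypothetical wrapper, not a base R function.
table_f <- function(formula, data, ...) {
  # Look up the formula's variables in 'data'.
  # Note: model.frame() drops rows with NAs by default (na.action).
  mf <- model.frame(formula, data = data)
  # table() accepts a data frame and cross-classifies its columns
  table(mf, ...)
}

# Hypothetical toy data, for illustration only
mydata <- data.frame(
  x = factor(c("a", "b", "a", "b", "a")),
  y = factor(c("u", "u", "v", "v", "u"))
)

tab_f <- table_f(~ x + y, data = mydata)
tab_x <- xtabs(~ x + y, data = mydata)
print(tab_f)
```

On data without missing values this should agree with the counts from
xtabs(~ x + y, data = mydata); the two may differ in how NAs and
unused factor levels are handled.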