
Problem with table

3 messages · Brian Ripley, Terry Therneau

R version 2.14.0, started with --vanilla

 > table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
    1    3    4 <NA>
    1    1    1    2

This came from a local user who wanted to remove one particular response 
from some tables, but who also wants NA always reported for data-checking 
purposes.
   I don't think the above is what anyone would want: the excluded 2 is 
counted as NA.

PS.
This is against the background of our local preference, which is to have 
the default action of the table command be to report NA, if present.  
(It's one of the few commands that we globally override at Mayo.)  The 
user had added only the exclude=2 argument; the useNA value is our default.

The above makes this harder to do without rewriting the command 
wholesale, which is ok (we've done it before at various times in R and 
Splus), but we would avoid it if possible.  Please, no wars about whether 
this is the "right" decision or not; we've done it for 10+ years and 
quite firmly believe the extra robustness gained by having NA appear is 
worth the maintenance bother, correctness being paramount in medical 
research.  We're not trying to convert anyone else, just get feedback on 
the best way to approach this.

Terry T.
7 days later
On 19/03/2012 17:01, Terry Therneau wrote:
You have not told us what you want!

Try

 >  table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')

    1    3    4 <NA>
    1    1    1    1

Note carefully how 'exclude' is defined:

  exclude: levels to remove from all factors in '...'. If set to 'NULL',
           it implies 'useNA="always"'.

As you did not specify a factor, 'exclude' was used in forming the 'levels'.
Most likely, feed table() a factor with the properties you want.
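
The mechanism described above can be reproduced by hand. The following is a minimal sketch, assuming that table() coerces a non-factor argument with factor(x, exclude = exclude) and then, for useNA = 'ifany', adds an NA level in the manner of addNA(); the variable names are illustrative only, and the internals of table() itself may differ between R versions, so the sketch relies only on factor() and addNA().

```r
x <- c(1, 2, 3, 4, NA)

# Step 1: 'exclude' is applied while the levels are formed.
# The value 2 gets no level, so its integer code becomes NA,
# while the genuine NA keeps a level of its own.
f <- factor(x, exclude = 2)
print(levels(f))      # "1" "3" "4" NA
print(as.integer(f))  # 1 NA 2 3 4

# Step 2: adding an NA level (as assumed for useNA = "ifany")
# folds the excluded 2 into the same NA level as the real NA.
g <- addNA(f, ifany = TRUE)

# Count the observations per level without calling table().
counts <- tabulate(as.integer(g), nbins = nlevels(g))
names(counts) <- levels(g)
print(counts)         # 1 1 1 2, the NA level counted twice
```

Under these assumptions the <NA> count of 2 in the original report is the real NA plus the excluded 2.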

On 03/27/2012 02:05 AM, Prof Brian Ripley wrote:
Want: that the resulting table exclude values of "2" from the printout, 
while still reporting NA.  This is what the local user expected, the one 
who came to me with their query.

There are lots of ways to get the program to do the right thing; the 
simplest is
      table(c(1,2,3,4,NA), exclude=2)     # keeping the default for useNA

You show another below.
That is almost a "legal loophole" reading of the manual.  I would never 
have seen through to that level of subtlety.  A primary reason is that a 
simple test shows that exclude works on non-factors.

I'm not sure what the best course of action is.  What I've reported is a 
case where use of the options in a fairly obvious way gives an 
unexpected answer.  On the other hand, I have never before seen or 
considered the case where someone wanted to exclude an actual data level 
from table: I myself would always have removed a column from the 
result.  If fixing this causes other problems, then perhaps we just 
give up on this rare case.
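
For reference, removing a column from the result might look like the sketch below. It is only an illustration of that approach, not the solution we adopted locally; the names tab, keep and tab2 are made up for the example.

```r
x <- c(1, 2, 3, 4, NA)

# Tabulate everything first, reporting NA if present.
tab <- table(x, useNA = "ifany")

# Then drop the unwanted column by name.  %in% returns FALSE for the
# NA name rather than NA, so the <NA> column is kept.
keep <- !(names(tab) %in% "2")
tab2 <- tab[keep]
print(tab2)
```

Because 'exclude' is never passed to table(), the NA count is not affected by the value being dropped.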

As to our local choices, we figured out a way to make display of NA the 
default without causing the above problem.   As is often the case, a 
fairly simple solution became obvious to us about 30 minutes after 
submitting a question to the list.

Terry T.