R version 2.14.0, started with --vanilla
> table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    2
This came from a local user who wanted to remove one particular response
from some tables, but also wants to have NA always reported for data
checking purposes.
I don't think the above is what anyone would want.
PS.
This is on a background of our local desires, which is to have the
default action of the table command be
to report NA, if present. (It's one of the only commands that we
globally override at Mayo.) The user had
added only the exclude=2 argument, and the useNA value is our default.
The above makes this harder to do without rewriting the command
wholesale, which is ok (we've done it before at
various times in R and Splus) but we would avoid it if possible. Please
no wars about whether this is the "right" decision or not, we've done it
for 10+ years and quite firmly believe the extra robustness gained by
having NA appear
is worth the maintenance bother, correctness being paramount in medical
research. We're not trying to convert anyone
else, just get feedback on the best way to approach this.
Terry T.
Problem with table
3 messages · Brian Ripley, Terry Therneau
On 19/03/2012 17:01, Terry Therneau wrote:
R version 2.14.0, started with --vanilla
> table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    2
This came from a local user who wanted to remove one particular response from some tables, but also wants to have NA always reported for data checking purposes.
I don't think the above is what anyone would want.
You have not told us what you want!
Try
> table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    1
Note carefully how 'exclude' is defined:
exclude: levels to remove from all factors in '...'. If set to 'NULL',
it implies 'useNA="always"'.
As you did not specify a factor, 'exclude' was used in forming the 'levels'.
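The mechanism can be illustrated outside of table(). The following is only a sketch, assuming (as the sentence above states) that for non-factor input table() forms its levels with something like factor(x, exclude = exclude). The names x, f, recovered and n_missing are illustrative. The output of table() itself may differ between R versions, so the sketch inspects only factor().

```r
# Assumption: for non-factor input, table() builds its levels roughly as
# factor(x, exclude = exclude), as described in the reply above.
x <- c(1, 2, 3, 4, NA)

# Excluding 2 while forming the levels leaves the value 2 without a level,
# so that observation is coded as missing.
f <- factor(x, exclude = 2)

# Converting back to character shows two missing entries:
# the original NA and the former 2 are no longer distinguishable.
recovered <- as.character(f)
n_missing <- sum(is.na(recovered))
print(recovered)
print(n_missing)
```

Once the excluded value has become a missing value, any later step that counts NAs will count it too, which is consistent with the count of 2 reported under <NA> in the original post.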
PS. This is on a background of our local desires, which is to have the default action of the table command be to report NA, if present. (It's one of the only commands that we globally override at Mayo.) The user had added only the exclude=2 argument, and the useNA value is our default. The above makes this harder to do without rewriting the command wholesale, which is ok (we've done it before at various times in R and Splus) but we would avoid it if possible. Please no wars about whether this is the "right" decision or not, we've done it for 10+ years and quite firmly believe the extra robustness gained by having NA appear is worth the maintenance bother, correctness being paramount in medical research. We're not trying to convert anyone else, just get feedback on the best way to approach this.
Most likely, feed table() a factor with the properties you want.
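One possible reading of this suggestion, given only as a sketch and not as the method either poster had in mind: drop the unwanted response before building the factor, so that genuine missing values stay separate from the excluded value. The names x, kept, f and tab are illustrative.

```r
x <- c(1, 2, 3, 4, NA)

# Drop the unwanted response first, keeping genuine missing values.
# is.na(x) | x != 2 is TRUE for the NA entry because TRUE | NA is TRUE.
kept <- x[is.na(x) | x != 2]

# factor() with its default exclude = NA leaves NA out of the levels;
# useNA = "ifany" then adds an <NA> cell only when NAs are present.
f <- factor(kept)
tab <- table(f, useNA = "ifany")
print(tab)
```

Because exclude is not passed to table() at all here, the result does not depend on how exclude and useNA interact.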
Terry T.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
On 03/27/2012 02:05 AM, Prof Brian Ripley wrote:
On 19/03/2012 17:01, Terry Therneau wrote:
R version 2.14.0, started with --vanilla
table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    2
This came from a local user who wanted to remove one particular response from some tables, but also wants to have NA always reported for data checking purposes.
I don't think the above is what anyone would want.
You have not told us what you want!
Want: that the resulting table exclude values of "2" from the printout,
while still reporting NA. This is what the local user expected, the one
who came to me with their query.
There are lots of ways to get the program to do the right thing, the
simplest is
table(c(1,2,3,4,NA), exclude=2) # keeping the default for useNA
You show another below.
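A further option is to tabulate everything and then drop the unwanted cell from the result, which avoids the exclude argument entirely. This is a minimal sketch; the names x, full, keep and trimmed are illustrative.

```r
x <- c(1, 2, 3, 4, NA)

# Tabulate all values, reporting NA when present.
full <- table(x, useNA = "ifany")

# %in% never returns NA, so the <NA> cell (whose name is NA) is kept,
# while the cell named "2" is dropped.
keep <- !(names(full) %in% "2")
trimmed <- full[keep]
print(trimmed)
```

A plain comparison such as names(full) != "2" would yield NA for the <NA> cell and so would not select it cleanly, which is why %in% is used here.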
Try
table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    1
Note carefully how 'exclude' is defined:
exclude: levels to remove from all factors in '...'. If set to 'NULL',
it implies 'useNA="always"'.
As you did not specify a factor, 'exclude' was used in forming the
'levels'.
That is almost a "legal loophole" reading of the manual. I would never have seen through to that level of subtlety. A primary reason is that a simple test shows that exclude works on non-factors.
I'm not sure what the best course of action is. What I've reported is a case where use of the options in a fairly obvious way gives an unexpected answer. On the other hand, I have never before seen or considered the case where someone wanted to exclude an actual data level from table: I myself would always have removed a column from the result. If fixing this causes other problems, then perhaps we just give up on this rare case.
As to our local choices, we figured out a way to make display of NA the default without causing the above problem. As is often the case, a fairly simple solution became obvious to us about 30 minutes after submitting a question to the list.
Terry T.