Message-ID: <BEABE04F-E083-4DEF-9487-E72B4E887243@comcast.net>
Date: 2008-12-20T23:58:03Z
From: David Winsemius
Subject: NA, where no NA should (could!) be!
In-Reply-To: <loom.20081220T232310-637@post.gmane.org>
On Dec 20, 2008, at 6:26 PM, Oliver Bandel wrote:
> Hello,
>
>
> ok, I found the problem!
>
>
>
> I now have:
>
> res_size_by_host <- tapply( selected$size, factor(selected$host),
> sum)
>
>
> instead of
>
>
> res_size_by_host <- tapply( selected$size, selected$host, sum)
>
>
> and now it works.
>
> IMHO this is strange, because selected$host is already a factor!
>
>
> I don't know why this must be done...
> ...one of the R experts might know...
> ...and may explain it...?!
It does not take an expert. All you need to do is read the help page.
Dalgaard already diagnosed the problem. Look at his example and see
what your "solution" does to it.
    > x <- factor(1, levels = 1:2)
    > tapply(1, x, sum)
     1  2
     1 NA
    > x <- factor(1, levels = 1:2)
    > tapply(1, factor(x), sum)
    1
    1
The function, factor, applied to a factor with unused levels discards
those levels.
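As a minimal sketch of that point (the variable names with_unused and without_unused are mine, chosen only for illustration), comparing the levels before and after the call to factor() shows where the extra NA cell comes from:

```r
# A factor with one observed value but two declared levels
x <- factor(1, levels = 1:2)

levels(x)          # both declared levels, including the unused "2"
levels(factor(x))  # only the level that actually occurs in x

# tapply() returns one cell per level, so an unused level yields an NA cell
with_unused    <- tapply(1, x, sum)
without_unused <- tapply(1, factor(x), sum)

names(with_unused)     # one name per declared level
names(without_unused)  # one name per level that is in use
```

Inspecting levels() on the grouping variable is usually the quickest way to check whether an unexpected NA from tapply() is an empty group rather than missing data.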
From the factor help page:
"Normally the 'levels' used as an attribute of the result are the
reduced set of levels after removing those in exclude, but this can be
altered by supplying labels."
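When no levels are supplied, factor() rebuilds them from the values
actually present in its argument, and since NA is the default for
exclude, NA is left out as well. That results in the "trimming
down" that you see with the application of factor(.)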
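The same thing can be reproduced with a data frame shaped like the one in your post. This is only a sketch under assumptions: the data frame log_data, its values, and the filter on size are made up for illustration, and only the names selected, host, size and res_size_by_host come from your code. I am also assuming that selected was produced by subsetting a larger data frame, which would leave the unused levels in place.

```r
# Hypothetical data; only the column names come from the thread
log_data <- data.frame(
  host = factor(c("alpha", "beta", "alpha", "gamma")),
  size = c(10, 200, 30, 400)
)

# Subsetting rows keeps every level of host, even those with no rows left
selected <- log_data[log_data$size < 100, ]
levels(selected$host)

# One cell per level: hosts without remaining rows come back as NA
res_with_unused <- tapply(selected$size, selected$host, sum)

# Re-applying factor() rebuilds the levels from the values that remain
res_size_by_host <- tapply(selected$size, factor(selected$host), sum)
```

If the NA cells are meaningful to you (a host with no traffic, say), keeping the original levels and handling the NA afterwards may be the better choice; dropping levels is only appropriate when the empty groups are not wanted.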
--
David Winsemius
>
>
>
> Ciao,
> Oliver
>