tapply changing order of factor levels?
7 messages · jim holtman, Chirantan Kundu, Alain Guillet +1 more
Hi, I don't believe the problem is related to tapply. I would say it is because of the factor. In fact, the order of a factor is given by the alphanumerical order of its levels. You can see it with levels(myfactor). If you want to change the order, redefine the levels of myfactor with the expected order or use the function ordered. Alain
Chirantan Kundu wrote:
Hi, Does tapply change the order when applied on a factor? Below is the code I tried.
mylevels<-c("IN0020020155","IN0019800021","IN0020020064")
mydata<-c("IN0020020155","IN0019800021","IN0020020064","IN0020020155","IN0019800021","IN0019800021","IN0020020064","IN0020020064","IN0019800021")
myfactor<-factor(mydata,levels=mylevels)
myfactor
[1] IN0020020155 IN0019800021 IN0020020064 IN0020020155 IN0019800021 IN0019800021 IN0020020064 IN0020020064 IN0019800021
Levels: IN0020020155 IN0019800021 IN0020020064
summary(myfactor)
IN0020020155 IN0019800021 IN0020020064
2 4 3
# Everything is fine up to this point. The order of levels is maintained as is.
mysummary<-tapply(myfactor,mydata,length)
mysummary
IN0019800021 IN0020020064 IN0020020155
4 3 2
# Now the order has changed.
Is this the expected behavior? Any idea on how to avoid the change in order?
Regards,
Chirantan
____________________________________ Visit us at http://www.2pirad.com [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Alain Guillet Statistician and Computer Scientist SMCS - Institut de statistique - Université catholique de Louvain Bureau d.126 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50
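To make the point about level order concrete, here is a minimal sketch, assuming nothing beyond base R. It compares the levels that factor() picks by default with the levels it keeps when they are given explicitly. The variable names default_levels and explicit_levels are made up for illustration and do not come from the thread.

```r
# Three identifiers, deliberately not in sorted order.
mydata <- c("IN0020020155", "IN0019800021", "IN0020020064")

# With no 'levels' argument, factor() sorts the unique values.
default_levels <- levels(factor(mydata))

# With an explicit 'levels' argument, the given order is kept as is.
explicit_levels <- levels(factor(mydata, levels = mydata))

print(default_levels)
print(explicit_levels)
```

So the alphanumerical order applies only when the levels are left for R to choose; a factor built with levels= keeps whatever order it was given.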
Jim's advice is patently false. Please read ?tapply for correct details. Counterexample:
y <- rnorm(6)
x <- factor(rep(factor(letters[1:3], levels = letters[3:1]), 2))
x
[1] a b c a b c Levels: c b a
tapply(y,x,mean)
         c          b          a
 0.4545897 -1.0544782  0.4682773
Bert Gunter
Genentech Nonclinical Statistics
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of jim holtman
Sent: Wednesday, May 06, 2009 6:57 AM
To: Chirantan Kundu
Cc: r-help at r-project.org
Subject: Re: [R] tapply changing order of factor levels?
The result of 'tapply' is just a named vector and the names are in alphabetical order. If you want them printed in a different order, then you have to specify it. Since you have the order in 'mylevels', this will work:
str(mysummary)
 int [1:3(1d)] 4 3 2
 - attr(*, "dimnames")=List of 1
  ..$ : chr [1:3] "IN0019800021" "IN0020020064" "IN0020020155"
mysummary[mylevels]
IN0020020155 IN0019800021 IN0020020064
2 4 3
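As a complement to indexing by name, the following sketch (base R only) suggests where the reordering comes from in the original call. The INDEX argument there is the character vector mydata, and ?tapply documents that INDEX elements are coerced with as.factor, which sorts the values. Passing the factor itself as INDEX should keep its level order. The names by_character and by_factor are hypothetical, chosen for this example.

```r
mylevels <- c("IN0020020155", "IN0019800021", "IN0020020064")
mydata <- c("IN0020020155", "IN0019800021", "IN0020020064",
            "IN0020020155", "IN0019800021", "IN0019800021",
            "IN0020020064", "IN0020020064", "IN0019800021")
myfactor <- factor(mydata, levels = mylevels)

# Character INDEX: coerced with as.factor(), so the groups come out sorted.
by_character <- tapply(myfactor, mydata, length)

# Factor INDEX: the groups follow the factor's own level order.
by_factor <- tapply(myfactor, myfactor, length)

print(names(by_character))
print(names(by_factor))
```

If only the counts are needed, table(myfactor) is another option that follows the level order of the factor.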
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
Hi,
I meant that your problem occurred because the levels in mylevels are not
sorted, whereas tapply uses the sorted levels for printing. If you
sort them (see below), you can see that the result of tapply has the
same order as the levels of myfactor.
> mydata<-c("IN0020020155","IN0019800021","IN0020020064","IN0020020155","IN0019800021","IN0019800021","IN0020020064","IN0020020064","IN0019800021")
> mylevels<-c("IN0020020155","IN0019800021","IN0020020064")
> myfactor<-factor(mydata,levels=mylevels)
> myfactor
[1] IN0020020155 IN0019800021 IN0020020064 IN0020020155 IN0019800021
[6] IN0019800021 IN0020020064 IN0020020064 IN0019800021
Levels: IN0020020155 IN0019800021 IN0020020064
> levels(myfactor) <- sort(mylevels)
> myfactor
[1] IN0019800021 IN0020020064 IN0020020155 IN0019800021 IN0020020064
[6] IN0020020064 IN0020020155 IN0020020155 IN0020020064
Levels: IN0019800021 IN0020020064 IN0020020155
> tapply(myfactor,mydata,length)
IN0019800021 IN0020020064 IN0020020155
4 3 2
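One caveat about the transcript above, offered as a hedged sketch rather than a correction of the author's intent: assigning to levels() renames the existing levels position by position, which is why the printed values of myfactor change after the assignment. Rebuilding the factor with factor(..., levels = ...) reorders the levels and leaves the data untouched. The names relabelled and reordered are made up for this illustration.

```r
mylevels <- c("IN0020020155", "IN0019800021", "IN0020020064")
mydata   <- c("IN0020020155", "IN0019800021", "IN0020020064")

# levels<- renames level 1, 2, 3 in place, so the underlying values change.
relabelled <- factor(mydata, levels = mylevels)
levels(relabelled) <- sort(mylevels)

# Rebuilding the factor changes only the level order, not the values.
reordered <- factor(mydata, levels = sort(mylevels))

print(as.character(relabelled))
print(as.character(reordered))
```

The tapply call in the transcript still reports the right counts because it groups by the untouched character vector mydata, not by the relabelled factor.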
Chirantan Kundu wrote:
Hi Alain, I tried levels(myfactor) as you suggested.
levels(myfactor)
[1] "IN0020020155" "IN0019800021" "IN0020020064"
The order is preserved, no alphanumerical sorting done here.
Regards.
On Wed, May 6, 2009 at 7:35 PM, Alain Guillet <alain.guillet at uclouvain.be> wrote:
Hi,
I don't believe the problem is related to tapply. I would say it
is because of the factor. In fact, the order of a factor is given
by the alphanumerical order of its levels. You can see it with
levels(myfactor).
If you want to change the order, redefine the levels of myfactor
with the expected order or use the function ordered.
Alain
Alain Guillet Statistician and Computer Scientist SMCS - Institut de statistique - Université catholique de Louvain Bureau d.126 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50