Collapse factor levels
Peter Dalgaard wrote:
Kevin E. Thorpe wrote:
I'm sure this is simple enough, but an R site search on my subject terms did not suggest a solution. I have a numeric vector with many values, from which I wish to create a factor having only a few levels. Here is a toy example.
> x <- 1:10
> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
> x
[1] A A A B B B C C C C
Levels: A A A B B B C C C C
> summary(x)
A A A B B B C C C C
3 0 0 3 0 0 4 0 0 0
So, there are clearly still 10 underlying levels. The results I would like to see from printing the value and summary(x) are:
> x
[1] A A A B B B C C C C
Levels: A B C
> summary(x)
A B C
3 3 4
Hopefully this makes sense.
Thanks,
Kevin
It's an anomaly inherited from S-PLUS (or so I have been told). Actually, with the current R, you should get a warning:
> x <- 1:10
> x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C", :
duplicated levels will not be allowed in factors anymore
This works (as documented on the help page for levels!):
> x <- 1:10
> x <- factor(x,levels=1:10)
> levels(x) <- c("A","A","A","B","B","B","C","C","C","C")
> table(x)
x
A B C
3 3 4
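For reference, the help page for levels also describes a list form of the same assignment, which spells out which old levels go into which new one. The following is only a minimal sketch on the toy vector above. The cut() call is an alternative that was not mentioned in this thread, and its break points are assumptions chosen for this example.

```r
# Toy data from the thread: ten values to be grouped into three levels.
x <- factor(1:10, levels = 1:10)

# List form of levels<-: each name is a new level, and each element
# lists the old levels (as character strings) that it absorbs.
levels(x) <- list(A = as.character(1:3),
                  B = as.character(4:6),
                  C = as.character(7:10))
table(x)

# Alternative (not from the thread): bin the numeric values directly.
# The breaks define the intervals (0,3], (3,6] and (6,10].
y <- cut(1:10, breaks = c(0, 3, 6, 10), labels = c("A", "B", "C"))
table(y)
```

The list form may be easier to read when the grouping is irregular, while cut() only fits when the groups are contiguous numeric ranges.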
Thanks. That's exactly what I need. I knew it was simple. I've even used levels() before, but it just didn't occur to me this time. I'm clearly not on current R. :-) When I have some time, I'll upgrade.
Kevin
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016