knowing the code-number of factors in a vector

4 messages · Eduardo Klein, Rainer Schuermann, Ben Bolker +1 more

#
Hi,

I would like to know how R assigns the numeric codes to the levels of a 
factor. For example, I have a factor with 5 different levels in random 
order, and I want a plot color-coded by level:

rfactor=as.factor(sample(letters[1:5], 50, replace=T))
rfactor
  [1] c c c d b a b d d a a e e b b e c e e a a b b b a b a e a a b d b b c a b b
[39] d c a e c d e d a a a a
Levels: a b c d e

x=rnorm(50)
boxplot(x~rfactor, col=1:5)

So, are colors 1 to 5 assigned to the levels alphabetically (a-1, b-2, ..., 
e-5), or by order of appearance (c-1, d-2, b-3, a-4, e-5)? Is it 
possible to control that?

Saludos, EKS
#
It seems to depend on the character itself ("assigned alphabetically"), which I found out by manually changing the order of appearance of the characters in rfactor; the colour would stick to the character "a", whether or not it appears first in rfactor.
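
One way to check this without plotting is to look at the integer codes directly. The following is only a minimal sketch on a small made-up vector (the names v, f and codes are illustrative, not from the original post); it assumes the default behaviour of as.factor(), where the levels are the sorted unique values:

```r
# Hypothetical data: "c" appears first, "a" only in fourth position.
v <- c("c", "d", "b", "a", "e", "a", "c")
f <- as.factor(v)

levels(f)      # sorted unique values: "a" "b" "c" "d" "e"
as.integer(f)  # code of each element:  3 4 2 1 5 1 3

# One row per level with its integer code (illustrative helper table).
codes <- data.frame(level = levels(f), code = seq_along(levels(f)))
codes
```

Here "c" gets code 3 even though it comes first in v, which matches the observation that the colour follows the character rather than its position.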

I was able to control the colours being used by
Rgds,
Rainer

Disclaimer:
I'm new to R and very early on the learning curve - just trying my best...
On Saturday 04 December 2010 01:22:56 Eduardo Klein wrote:
#
Eduardo Klein <eklein <at> usb.ve> writes:
[snip]
It is indeed alphabetical by default.  To get it in order of
appearance you could do something like

set.seed(1001)
x <- sample(letters[1:5],50,replace=TRUE)
f <- factor(x,levels=unique(x))
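
To see that this gives order-of-appearance codes, the levels can be compared with unique(x). This is a sketch under assumptions: the response y below is made up for the plot (the snippet above reuses the name x for the character vector), and the colours are simply 1, 2, ... in level order:

```r
set.seed(1001)
x <- sample(letters[1:5], 50, replace = TRUE)
f <- factor(x, levels = unique(x))

# The levels now follow first appearance in x, not sorted order.
stopifnot(identical(levels(f), unique(x)))

# The first element of x therefore always has code 1.
as.integer(f)[1]

# Hypothetical response variable, only there to have something to plot.
y <- rnorm(50)
boxplot(y ~ f, col = seq_along(levels(f)))
```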
#
Hi!

As Ben Bolker told you already, the levels are alphabetically ordered by 
default.
When you print rfactor, the last line shows you the different levels, in 
the saved order.
rfactor
[1] c c c d b a b d d a a e e b b e c e e a a b b b a b a e a a b d b b c a b b
[39] d c a e c d e d a a a a
Levels: a b c d e                   ## here

If you want to change the order manually, do this:
rfactor=as.factor(sample(letters[1:5], 50, replace=T))
rfactor
  [1] e e a a e a a a b a a d a b a b b c d c b e b e d d c a a e a b c e b c b e
[39] a e e d b e c b e d a e   ## different values because of sample()
Levels: a b c d e
rfactor2 <- factor(rfactor, levels=c("e","d","c","b","a"))
rfactor2
  [1] e e a a e a a a b a a d a b a b b c d c b e b e d d c a a e a b c e b c b e
[39] a e e d b e c b e d a e  ## but the values here are the same as for my rfactor
Levels: e d c b a                  ## note that the order here is the one you typed in your call to factor(, levels=)
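
If the aim is for each level to keep its own colour whatever the level order, one possible approach is a named colour vector indexed by levels(). This is only a sketch: the palette pal, its colour choices and the seed are my assumptions, not something from the thread:

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
rfactor <- as.factor(sample(letters[1:5], 50, replace = TRUE))
x <- rnorm(50)

# Hypothetical palette, keyed by level name.
pal <- c(a = "red", b = "orange", c = "gold", d = "green", e = "blue")

# Reverse the level order; the values themselves are unchanged.
rfactor2 <- factor(rfactor, levels = c("e", "d", "c", "b", "a"))

# Indexing by level name gives the colours in the order the boxes are drawn,
# so "e" is blue whether it is the first or the last box.
boxplot(x ~ rfactor2, col = pal[levels(rfactor2)])
```

With col=1:5 the colours go with the position of the box, so after reordering the levels "e" would get colour 1; with the named vector the colour goes with the level itself.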

HTH,
Ivan


On 12/4/2010 02:04, Rainer Schuermann wrote: