Repeated factor levels - inconsistency of factor and levels<- functions?

3 messages · Honza Hucin, Peter Dalgaard

#
Hello,

I have a vector x containing letters ("a", "b" etc.). Now I want to
convert it to a factor and group some letters into one common level. If I do
this with the factor function, giving the same label to all the values I want
to group, it doesn't work:
[1] "a" "b" "c" "d" "e"
labels=c("vowel","consonant","consonant","consonant","vowel"))
[1] "vowel"     "consonant" "consonant" "consonant" "vowel"

But afterwards, if I update the level names by a single assignment, levels with
the same name are grouped, even when I don't change all of them:
levels to group
[1] "vowel"     "consonant"
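
The commands that produced the first output above did not survive in this archived copy. A plausible reconstruction of the factor call is sketched below; the variable name f and the use of levels = x are assumptions, not taken from the original post. Note that the result depends on the R version: the post describes duplicated labels being kept as separate levels, whereas newer R releases (3.4.0 and later, as far as I know) merge them directly.

```r
# Assumed reconstruction of the setup described in the post.
x <- c("a", "b", "c", "d", "e")

# Same label given to every value that should end up in one level.
f <- factor(x, levels = x,
            labels = c("vowel", "consonant", "consonant", "consonant", "vowel"))

# Whether this shows two levels or five depends on the R version in use.
levels(f)
```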

I'm rather confused! I think this behavior is doubly inconsistent. First,
the labeling in the factor function should work the same way as in levels<- ,
i.e. levels with the same name should be grouped either in BOTH or in NEITHER.
Second, if I change only one vector item, it should not change anything
else; in particular, it should not do any "invisible" grouping.

Or am I wrong? Or is it a bug?

Jan Hucin
#
Honza Hucin wrote:
I asked Brian Ripley the same thing half a year ago and his answer was:
"Back compatibility ...."

I'm at a loss trying to figure out what kind of code would depend on
current behaviour, but the workaround is rather obvious, so the
motivation for fixing (changing!) it is not too great.
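
The workaround is not spelled out in the message. Presumably it means building the factor first and then grouping through levels<- , which merges levels that receive the same name. A minimal sketch with two variants follows; the variable names are illustrative.

```r
x <- c("a", "b", "c", "d", "e")
f <- factor(x, levels = x)   # one level per letter to start with

# Variant 1: assign a full vector of new names; repeated names are merged.
f1 <- f
levels(f1) <- c("vowel", "consonant", "consonant", "consonant", "vowel")

# Variant 2: a named list mapping each new level to the old levels it absorbs.
f2 <- f
levels(f2) <- list(vowel = c("a", "e"), consonant = c("b", "c", "d"))

levels(f1)         # "vowel" "consonant"
as.character(f2)   # "vowel" "consonant" "consonant" "consonant" "vowel"
```

The list form states the grouping explicitly and does not rely on the order of the existing levels, which can make it easier to read when there are many levels.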
#
Thank you for the explanation, I understand. Maybe the fix would be
simple: give the factor function a new parameter, say
"unique.levels" or "group.levels", with a default value of FALSE. This way
backward compatibility would be preserved. I should rather send this to the
R-devel mailing list, shouldn't I? :)
Jan Hucin
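
The proposed switch can be approximated in user code without touching factor itself. The sketch below is hypothetical: group_factor is not a base R function, and group.levels is simply the name suggested above. As a choice made for this sketch, duplicated labels raise an error when group.levels is FALSE instead of reproducing the old duplicated-level result.

```r
# Hypothetical helper, not part of base R.
group_factor <- function(x, levels = sort(unique(x)), labels = levels,
                         group.levels = FALSE) {
  stopifnot(length(labels) == length(levels))
  if (!group.levels && anyDuplicated(labels)) {
    stop("duplicated labels; set group.levels = TRUE to merge them")
  }
  f <- factor(x, levels = levels)
  levels(f) <- labels   # levels<- merges levels that receive the same name
  f
}

g <- group_factor(c("a", "b", "c", "d", "e"),
                  levels = c("a", "b", "c", "d", "e"),
                  labels = c("vowel", "consonant", "consonant", "consonant", "vowel"),
                  group.levels = TRUE)
levels(g)   # "vowel" "consonant"
```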