Why do we have to turn factors into characters for various functions?

At 12.12.2010 00:48 +0200, Tal Galili wrote:
In my view the answer can be found implicitly in the language definition.

"Factors are currently implemented using an integer array to specify 
the actual levels and a second array of names that are mapped to the 
integers. Rather unfortunately users often make use of the 
implementation in order to make some calculations easier."

It is this "unfortunate" use of the implementation that seems to be 
generally accepted, even though the language definition continues:

"This, however, is an implementation issue and is not guaranteed to 
hold in all implementations of R."
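
As a small illustration of the implementation the definition describes, 
the sketch below builds a factor and looks at its two parts. The 
variable name f and the example labels are made up for illustration, 
and the behaviour shown is that of current GNU R, which the definition 
itself says is not guaranteed elsewhere.

```r
# A factor built from character labels (labels chosen for illustration)
f <- factor(c("low", "high", "low", "medium"))

typeof(f)      # storage type of the object: "integer"
levels(f)      # the array of names, sorted: "high" "low" "medium"
as.integer(f)  # the integer codes, i.e. positions in levels(f): 2 1 2 3

# Indexing the levels with the factor rebuilds the original labels
levels(f)[f]   # "low" "high" "low" "medium"
```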

Personally, like some others, I avoid factors except in cases where 
they represent a statistical concept.

Certainly I would agree with you that anyone who reads only the "R 
Language Definition", and not the documentation of the function 
factor, would expect functions like as.numeric or strsplit to 
operate on the levels of a factor and not on the underlying, 
implementation-specific integer array.
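
To make the difference concrete, here is a minimal sketch of what 
as.numeric and strsplit do with a factor in current GNU R, and of the 
usual conversions through the labels. The variables x, g and parts and 
their values are invented for the example.

```r
# A factor whose labels happen to look like numbers
x <- factor(c("10", "20", "10", "30"))

# as.numeric returns the internal codes, not the printed labels
as.numeric(x)                 # 1 2 1 3

# Converting to character first recovers the labels as numbers
as.numeric(as.character(x))   # 10 20 10 30

# Same result, converting each level only once
as.numeric(levels(x))[x]      # 10 20 10 30

# strsplit refuses a factor outright ("non-character argument"),
# so the factor has to be turned into characters first
g <- factor(c("a-b", "c-d"))
parts <- strsplit(as.character(g), "-", fixed = TRUE)
parts[[1]]                    # "a" "b"
```

The levels(x)[x] form is the one suggested in the documentation of 
factor; it only pays off when there are many more values than levels.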

Heinz