Memory consumption, integer versus factor

Ajay Narottam Shah wrote:
Most numeric variables are stored as 8-byte doubles.  Factors are stored 
as 4-byte integer codes, plus a table giving the factor levels.
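
A rough way to check this yourself is to store the same values three ways 
and compare the sizes.  This is only a sketch: the variable names and the 
vector length are made up for illustration, and the exact byte counts 
depend on your R version and platform, so none are shown.

```r
# Store the same values as double, integer, and factor.
n <- 100000
vals <- rep(1:10, length.out = n)     # integer vector, values 1..10

x_dbl <- as.numeric(vals)             # doubles: 8 bytes per element
x_int <- as.integer(vals)             # integers: 4 bytes per element
x_fac <- factor(vals)                 # integer codes plus a levels attribute

typeof(x_dbl)                         # "double"
typeof(x_int)                         # "integer"
typeof(x_fac)                         # "integer"

# Approximate memory used by each object, in bytes.
sizes <- c(double  = as.numeric(object.size(x_dbl)),
           integer = as.numeric(object.size(x_int)),
           factor  = as.numeric(object.size(x_fac)))
print(sizes)
```

The factor should come out only slightly larger than the integer vector, 
since the levels table is stored once and not per element.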
You will sometimes find what you want in the R Language Definition, for 
example here:

"Factors are currently implemented using an integer array to specify the
actual levels and a second array of names that are mapped to the
integers. Rather unfortunately users often make use of the
implementation in order to make some calculations easier. This, however,
is an implementation issue and is not guaranteed to hold in all
implementations of R."

For more details, there are some implementation documents on 
developer.r-project.org, but in general the only sure way to find out 
how something is implemented is to look at the source code.

Usually it's a bad idea to rely on the implementation details, as the 
last sentence quoted above says.  If it's not documented, it's subject 
to change without warning.
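
As a small illustration of why leaning on the integer codes can go wrong, 
here is a sketch with a made-up factor f whose levels happen to look like 
numbers.  The codes are positions in the levels table, not the values 
themselves; the two conversions at the end are the ones described in 
?factor.

```r
# A factor whose levels look numeric (hypothetical example data).
f <- factor(c("10", "20", "20", "5"))

levels(f)                      # levels are sorted as strings

# Relying on the implementation: this returns the internal codes,
# i.e. positions in levels(f), not the original numbers.
codes <- as.integer(f)

# Conversions that go through the levels instead of the codes.
via_char   <- as.numeric(as.character(f))
via_levels <- as.numeric(levels(f))[f]

print(codes)
print(via_char)
print(via_levels)
```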
See the man pages ?gc, ?Memory, and the source code.

Duncan Murdoch