max length of a factor variable

4 messages · Michael Bedward, jim holtman, Richard Mott

#
Hi

Is there a maximum length for the character string representing a level 
of a factor?  I have a set of several million variables, each a factor 
of length 19. Each factor level is a character string which in some 
cases can be many thousands of characters long.  I am trying to find out 
why my analysis fails - I just wanted to rule out the possibility that 
the internal factor conversion has a problem parsing long strings.

Thanks

Richard
#
Hello Richard,

Since no one else has answered yet I'll venture a guess.

The following works on my little macbook...

x <- as.factor(sapply(letters[1:26], function(x) paste(rep(x, 100000),
collapse="")))

So each of the 26 factor levels in x has a string representation of
100,000 chars.  So I'm *guessing* the limit is only that imposed by
system memory.
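
A quick way to check that nothing gets truncated along the way is to compare the levels against the input. This is only a sketch: the names `long_strings` and `f` are made up for the example, and `strrep()` assumes R 3.3.0 or later.

```r
# Build 26 strings of 100,000 characters each, one per letter.
long_strings <- vapply(letters, function(ch) strrep(ch, 100000),
                       character(1), USE.NAMES = FALSE)
f <- factor(long_strings)

# Every level should still have its full length ...
all(nchar(levels(f)) == 100000)

# ... and converting back to character should reproduce the input exactly.
identical(as.character(f), long_strings)
```

If both checks come back TRUE, the long strings survived the factor conversion intact, at least at this size.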

Hopefully if that's wrong it will provoke someone to correct me :)

Michael
On 27 September 2010 19:15, Richard Mott <rmott at well.ox.ac.uk> wrote:
#
You have provided no information as to what you mean by "my analysis
fails".  Exactly what error message are you getting, what operating
system do you have, how much memory do you have, how much are you
using for all the other objects in your address space, etc.?
Information like this would help you get an answer.
On Mon, Sep 27, 2010 at 5:15 AM, Richard Mott <rmott at well.ox.ac.uk> wrote:
#
Thanks

I eventually tracked down the problem to something unrelated to this 
question: one of the millions of character strings happened to be 
"NA" by chance, which of course was parsed as a missing value, breaking 
the code a long way downstream.

Richard
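
For anyone who hits the same thing, here is a minimal sketch of how a literal "NA" string can turn into a missing value. It assumes the data are read with the `read.table()` family (the thread does not say how the strings were loaded), and the column names `id` and `allele` and the sample values are invented for the example.

```r
# Three records; the second value is the literal two-letter string "NA".
txt <- "id,allele\n1,AC\n2,NA\n3,GT\n"

# Default behaviour: na.strings = "NA", so the string becomes a missing value.
default <- read.csv(text = txt, stringsAsFactors = TRUE)
is.na(default$allele)

# Treat only empty fields as missing, so "NA" survives as an ordinary level.
literal <- read.csv(text = txt, stringsAsFactors = TRUE, na.strings = "")
levels(literal$allele)

# factor() itself does not convert the string "NA" to a missing value.
anyNA(factor(c("AC", "NA", "GT")))
```

With the default settings the second entry is flagged as missing, whereas `na.strings = ""` keeps "NA" as a regular factor level. The last line suggests the conversion happens when the text is read in, not in the factor step itself.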
On 28/09/2010 04:01, Michael Bedward wrote: