Hi

Is there a maximum length for the character string representing a level of a factor? I have a set of several million variables, each a factor of length 19. Each factor level is a character string which in some cases can be many thousands of characters long. I am trying to find out why my analysis fails - I just wanted to rule out the possibility that the internal factor conversion has a problem parsing long strings.

Thanks

Richard
--
----------------------------------------------------
Richard Mott     | Wellcome Trust Centre
tel 01865 287588 | for Human Genetics
fax 01865 287697 | Roosevelt Drive, Oxford OX3 7BN

Hello Richard,

Since no one else has answered yet I'll venture a guess. The following works on my little macbook...

    x <- as.factor(sapply(letters[1:26], function(x) paste(rep(x, 100000), collapse="")))

So each of the 26 factor levels in x has a string representation of 100,000 chars. So I'm *guessing* the limit is only that imposed by system memory. Hopefully if that's wrong it will provoke someone to correct me :)

Michael

On 27 September 2010 19:15, Richard Mott <rmott at well.ox.ac.uk> wrote:
> Is there a maximum length for the character string representing a level of a
> factor? [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

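The one-liner in Michael's reply builds the factor but does not check that the levels survive intact. A minimal sketch of such a check follows, assuming R 3.3.0 or later for `strrep()`; the names `long_strings` and `level_lengths` are illustrative and not from the thread. For context, R's own documentation gives 2^31 - 1 bytes as the ceiling for a single character string, which is far above the "many thousands of characters" in the question, so this only confirms the round trip for one test case rather than proving anything about the failing analysis.

```r
# Build 26 strings of 100,000 characters each, one per lowercase letter,
# mirroring the test in the reply above.
long_strings <- vapply(letters, function(ch) strrep(ch, 100000L), character(1))

# Convert to a factor; each distinct string becomes one level.
x <- factor(long_strings)

# Every level should still be exactly 100,000 characters long.
level_lengths <- nchar(levels(x))
stopifnot(nlevels(x) == 26L, all(level_lengths == 100000L))

# Converting back to character should reproduce the input exactly.
stopifnot(identical(unname(as.character(x)), unname(long_strings)))
```

If both `stopifnot()` calls pass silently, the factor conversion has neither truncated nor altered the long strings in this test.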
You have provided no information as to what you mean by "my analysis fails". Exactly what error message are you getting, what operating system do you have, how much memory do you have, how much are you using for all the other objects in your address space, etc.? Information like this would help you get an answer.

On Mon, Sep 27, 2010 at 5:15 AM, Richard Mott <rmott at well.ox.ac.uk> wrote:
> Is there a maximum length for the character string representing a level of a
> factor? [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
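The details Jim asks for (error message, operating system, memory in use) can mostly be collected from within R itself. The following is a rough sketch only: `failing_step()` is a hypothetical stand-in for whatever call actually fails, and `obj_sizes` and `err_msg` are illustrative names, none of which appear in the thread.

```r
# Hypothetical placeholder for the step that fails in the real analysis.
failing_step <- function() stop("placeholder failure")

# Platform, R version, and attached packages.
print(sessionInfo())

# Size in bytes of each object in the global environment, largest first.
obj_names <- ls(envir = globalenv())
obj_sizes <- vapply(
  obj_names,
  function(nm) as.numeric(object.size(get(nm, envir = globalenv()))),
  numeric(1)
)
print(head(sort(obj_sizes, decreasing = TRUE), 10))

# Memory currently used by R (gc() also triggers a garbage collection).
print(gc())

# Capture the exact error message instead of letting the script stop.
err_msg <- tryCatch(
  {
    failing_step()
    NA_character_  # reached only if no error occurs
  },
  error = function(e) conditionMessage(e)
)
print(err_msg)
```

Pasting this output into a follow-up post, together with a small reproducible example, would give the list what it needs to diagnose the failure.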