David.Duffy at qimr.edu.au
2008-May-21 08:05 UTC
[Rd] table(factor(x), exclude=NULL) (PR#11494)
Hi. I don't know if this is a bug or just annoying to me:

> x <- c(1,2,3,NA)
> table(x, exclude=NULL)
x
   1    2    3 <NA>
   1    1    1    1
> table(factor(x), exclude=NULL)

1 2 3
1 1 1

I don't think many people use factor(x, exclude=NULL): it is not the
default handling of character data by read.table().

Cheers, David Duffy.

-- 
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v
David.Duffy at qimr.edu.au wrote:
> Hi. I don't know if this is a bug or just annoying to me:
>
> > x <- c(1,2,3,NA)
> > table(x, exclude=NULL)
> x
>    1    2    3 <NA>
>    1    1    1    1
> > table(factor(x), exclude=NULL)
>
> 1 2 3
> 1 1 1
>
> I don't think many people use factor(x, exclude=NULL): it is not the
> default handling of character data by read.table().
>
> Cheers, David Duffy.

I've moved this to "wishlist" in the bug repository. It's a documented
annoyance, but we might be able to do better.

The underlying issue is the following:

> f <- factor(c(1:3,NA),labels=letters[3:1])
> f
[1] c    b    a    <NA>
Levels: c b a
> factor(f)
[1] c    b    a    <NA>
Levels: c b a
> factor(f,exclude=NULL)
[1] c    b    a    <NA>
Levels: c b a <NA>
> factor(f,levels=levels(f),exclude=NULL)
[1] c    b    a    <NA>
Levels: c b a
> factor(f,levels=c(levels(f),NA),exclude=NULL)
[1] c    b    a    <NA>
Levels: c b a <NA>

That is, exclude=NULL only yields an <NA> level if NA is also among the
levels supplied. This code in table() suggests that the last form is what
was intended (since we actually try to pass exclude=NULL):

        cat <- if (is.factor(a)) {
            if (!missing(exclude)) {
                ll <- levels(a)
                factor(a, levels = ll[!(ll %in% exclude)],
                       exclude = if (is.null(exclude)) NULL else NA)
            } else a

Changing the levels argument to

        levels = if (is.null(exclude)) c(ll, NA) else ll[!(ll %in% exclude)]

should do. This will change the behaviour in the case where you have mixed
factors and non-factors, so I'm not just implementing it right away.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
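
The levels expression suggested in the reply can be tried outside table(). The following is only a sketch: relevel_for_table() is a hypothetical helper name, not part of base R. It copies the factor branch quoted from table() with the proposed levels argument substituted, and is not necessarily what was eventually committed. Counts are taken with tabulate() on the integer codes so that the result does not depend on how a particular R version's table() treats NA levels.

```r
# Hypothetical helper (not part of base R): rebuilds a factor the way the
# quoted table() branch does, but with the suggested `levels` expression.
relevel_for_table <- function(a, exclude) {
  stopifnot(is.factor(a))
  ll <- levels(a)
  # With exclude = NULL, append NA so that it becomes an explicit level;
  # otherwise drop the excluded levels, as in the original code.
  new_levels <- if (is.null(exclude)) c(ll, NA) else ll[!(ll %in% exclude)]
  factor(a, levels = new_levels,
         exclude = if (is.null(exclude)) NULL else NA)
}

f <- factor(c(1:3, NA), labels = letters[3:1])

g <- relevel_for_table(f, exclude = NULL)  # NA is kept as a level
h <- relevel_for_table(f, exclude = "b")   # level "b" is dropped

levels(g)
levels(h)

# One count per level of g, including the NA level.
counts_g <- tabulate(as.integer(g), nbins = nlevels(g))
counts_g
```

With exclude = NULL the NA observation gets its own level and is counted like any other; with a non-NULL exclude the helper behaves as the quoted branch already does.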