Suharto Anggono Suharto Anggono
2012-Dec-06 05:39 UTC
[Rd] factor(x, exclude=y) if x is a factor
I found this part in the documentation of 'factor'.

    'factor(x, exclude=NULL)' applied to a factor is a no-operation unless there are unused levels: in that case, a factor with the reduced level set is returned. If 'exclude' is used it should also be a factor with the same level set as 'x' or a set of codes for the levels to be excluded.

Regarding the last sentence, this is the actual behavior.

> x <- factor(c("a","b"), levels=c("a","b"))
> x
[1] a b
Levels: a b
> factor(x, exclude=factor("a", levels=c("a","b")))
[1] a b
Levels: a b
> factor(x, exclude=1L)
[1] a b
Levels: a b

I expect "a" to be removed from levels.

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.15.2

The results are the same in R 2.5.1. In R 2.5.1, if function 'match' did not apply 'as.character' to factor (and used internal code of factor instead), it would work to set 'exclude' as in the above quotation of the documentation. In the example above, "a" would be removed from levels.

One cause of the trouble is this code in the definition of function 'factor', in R 2.15.2 or in R 2.5.1.

exclude <- as.vector(exclude, typeof(x))

What is the intent actually?
Suharto,

I think that the key is to read the definition of exclude in the Arguments section:

    a vector of values to be excluded when forming the set of levels. This should be of the same type as x, and will be coerced if necessary.

Because the levels already exist for x as a factor, they are not formed or revised, except to drop unused levels in the case where exclude=NULL (or the default value).

To drop level a from x use:

factor(as.character(x), exclude="a")

or, on creation:

x <- factor(c("a", "b"), exclude="a")

Dave

On Wed, Dec 5, 2012 at 11:39 PM, Suharto Anggono Suharto Anggono <suharto_anggono@yahoo.com> wrote:
> factor(x, exclude=factor("a", levels=c("a","b")))
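The two approaches described in the reply above can be tried with a minimal sketch. The variable names 'x', 'y' and 'z' are illustrative only. Converting to character first means the levels are formed afresh, so 'exclude' is compared against the labels themselves.

```r
# A factor with two levels, as in the original report
x <- factor(c("a", "b"), levels = c("a", "b"))

# Convert to character first so that the levels are formed again;
# 'exclude' is then compared with the labels, not with integer codes
y <- factor(as.character(x), exclude = "a")
levels(y)   # "b"
is.na(y)    # TRUE FALSE: the excluded value becomes NA

# Excluding the value when the factor is created
z <- factor(c("a", "b"), exclude = "a")
levels(z)   # "b"
```

Note that in both cases the excluded value is not silently removed from the data: it stays in the vector as NA.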
Suharto Anggono Suharto Anggono
2012-Dec-10 07:46 UTC
[Rd] factor(x, exclude=y) if x is a factor
After searching, I see that https://stat.ethz.ch/pipermail/r-help/2011-April/276274.html has mentioned this issue, perhaps more clearly.

Thanks for pointing out the "Arguments" section about 'exclude'. That documents the code

exclude <- as.vector(exclude, typeof(x))

A note: if x is a factor, factor(x, exclude=y) does not always leave x unchanged apart from dropping unused levels.

> x <- 2:3
> x
[1] 2 3
> xf <- factor(x, levels=x)
> xf
[1] 2 3
Levels: 2 3
> factor(xf, exclude=2)
[1] <NA> 3
Levels: 3
> x <- c(2:3, "a")
> x
[1] "2" "3" "a"
> xf <- factor(x, levels=x)
> xf
[1] 2 3 a
Levels: 2 3 a
> factor(xf, exclude=2)
[1] <NA> 3 a
Levels: 3 a

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

--- On Thu, 6/12/12, Lorenz, David <lorenz at usgs.gov> wrote:

From: Lorenz, David <lorenz at usgs.gov>
Subject: Re: [Rd] factor(x, exclude=y) if x is a factor
To: "Suharto Anggono Suharto Anggono" <suharto_anggono at yahoo.com>
Cc: R-devel at r-project.org
Date: Thursday, 6 December, 2012, 10:12 PM

> Suharto,
>
> I think that the key is to read the definition of exclude in the Arguments section:
>
>     a vector of values to be excluded when forming the set of levels. This should be of the same type as x, and will be coerced if necessary.
>
> Because the levels already exist for x as a factor, they are not formed or revised, except to drop unused levels in the case where exclude=NULL (or the default value).
>
> To drop level a from x use:
>
> factor(as.character(x), exclude="a")
>
> or, on creation:
>
> x <- factor(c("a", "b"), exclude="a")
>
> Dave
>
> On Wed, Dec 5, 2012 at 11:39 PM, Suharto Anggono Suharto Anggono <suharto_anggono at yahoo.com> wrote:
>
> > factor(x, exclude=factor("a", levels=c("a","b")))
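The difference between the "a"/"b" case and the "2"/"3" case can be traced, as a sketch, to how the coerced 'exclude' is compared with the level labels. This assumes, from the 'as.vector' line quoted in this message and the remark about 'match' earlier in the thread, that the labels are compared with 'exclude' through 'match'. The variable names are illustrative, and the rest of the body of 'factor' is not reproduced here.

```r
xf <- factor(c("a", "b"), levels = c("a", "b"))

# A factor is stored as integer codes, so typeof() reports "integer"
type_of_xf <- typeof(xf)

# Coercing a character 'exclude' to that type therefore yields NA
coerced <- suppressWarnings(as.vector("a", type_of_xf))

# match() compares character labels with an integer by treating the
# integer as a string, so the code 1L never equals the label "a"
m_letters <- match(c("a", "b"), 1L)   # NA NA: nothing is excluded

# With number-like labels, the string "2" does match the integer 2L
m_digits <- match(c("2", "3"), 2L)    # 1 NA: level "2" is excluded
```

Under this reading, 'exclude' only has an effect on an existing factor when a level label happens to look like the number that 'exclude' was coerced to.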