Suharto Anggono Suharto Anggono
2012-Dec-06 05:39 UTC
[Rd] factor(x, exclude=y) if x is a factor
I found this part in the documentation of 'factor'.
'factor(x, exclude=NULL)' applied to a factor is a no-operation
unless there are unused levels: in that case, a factor with the
reduced level set is returned. If 'exclude' is used it should
also be a factor with the same level set as 'x' or a set of codes
for the levels to be excluded.
Regarding the last sentence, this is the actual behavior.
> x <- factor(c("a","b"), levels=c("a","b"))
> x
[1] a b
Levels: a b
> factor(x, exclude=factor("a", levels=c("a","b")))
[1] a b
Levels: a b
> factor(x, exclude=1L)
[1] a b
Levels: a b
I expect "a" to be removed from levels.
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.15.2
The results are the same in R 2.5.1.
In R 2.5.1, if function 'match' did not apply 'as.character' to a
factor (and used the factor's internal codes instead), setting 'exclude'
as described in the documentation quoted above would work. In the example
above, "a" would be removed from the levels.
One cause of the trouble is this code in the definition of function
'factor', in R 2.15.2 or in R 2.5.1.
exclude <- as.vector(exclude, typeof(x))
What is the actual intent?
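The effect of that coercion can be sketched by evaluating the relevant steps by hand, without calling 'factor' itself, so the result should not depend on how a particular R version implements 'factor'. This is only an illustration of the behaviour reported above; the names 'ex' and 'coerced' are made up for the example.

```r
x  <- factor(c("a", "b"), levels = c("a", "b"))
ex <- factor("a", levels = c("a", "b"))

# A factor is stored as integer codes, so typeof(x) is "integer".
typeof(x)

# Coercing the factor 'exclude' to that type keeps only its code (1),
# not the label "a".
coerced <- as.vector(ex, typeof(x))
coerced

# The character levels "a" and "b" are then compared with "1",
# so nothing matches and no level would be dropped.
match(levels(x), coerced)
```

The same reasoning applies to exclude=1L, which is already an integer and likewise never equals a label such as "a".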
Suharto,
I think that the key is to read the definition of exclude in the
Arguments section:
a vector of values to be excluded when forming the set of levels. This
should be of the same type as x, and will be coerced if necessary.
Because the levels already exist for x as a factor, they are not formed
or revised, except to drop unused levels in the case where exclude=NULL (or
the default value). To drop level a from x use:
factor(as.character(x), exclude="a")
or, on creation:
x <- factor(c("a", "b"), exclude="a")
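A small sketch of the first suggestion, checking both the remaining levels and what happens to the elements that had the excluded level. The name 'y' is only illustrative.

```r
x <- factor(c("a", "b"), levels = c("a", "b"))

# Convert to character first so that 'exclude' is compared with the labels.
y <- factor(as.character(x), exclude = "a")

levels(y)   # only "b" remains
is.na(y)    # the element that was "a" is now NA
```

Note that excluding a level in this way turns the affected elements into NA rather than removing them from the vector.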
Dave
On Wed, Dec 5, 2012 at 11:39 PM, Suharto Anggono Suharto Anggono <
suharto_anggono@yahoo.com> wrote:
> factor(x, exclude=factor("a", levels=c("a","b")))
Suharto Anggono Suharto Anggono
2012-Dec-10 07:46 UTC
[Rd] factor(x, exclude=y) if x is a factor
After searching, I see that
https://stat.ethz.ch/pipermail/r-help/2011-April/276274.html has mentioned this
issue, perhaps more clearly.
Thanks for pointing out "Arguments" section about 'exclude'.
That documents the code
exclude <- as.vector(exclude, typeof(x))
A note: if x is a factor, factor(x, exclude=y) can do more than just
drop unused levels.
> x <- 2:3
> x
[1] 2 3
> xf <- factor(x, levels=x)
> xf
[1] 2 3
Levels: 2 3
> factor(xf, exclude=2)
[1] <NA> 3
Levels: 3
> x <- c(2:3, "a")
> x
[1] "2" "3" "a"
> xf <- factor(x, levels=x)
> xf
[1] 2 3 a
Levels: 2 3 a
> factor(xf, exclude=2)
[1] <NA> 3 a
Levels: 3 a
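One way to read the two transcripts above, sketched step by step: the numeric 'exclude' happens to equal a level label once both sides are compared as character strings. This is an interpretation of the observed output, not a statement of the intended design; the name 'coerced' is made up for the example.

```r
xf <- factor(2:3, levels = 2:3)

# The numeric 'exclude' coerced to the storage type of the factor.
coerced <- as.vector(2, typeof(xf))

# match() compares a character vector with an integer as strings,
# so the label "2" matches the value 2.
match(levels(xf), coerced)

# Hence level "2" is dropped even though it is in use.
levels(factor(xf, exclude = 2))
```

So a number passed as 'exclude' acts on labels that look like that number, not on the position of a level.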
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
--- On Thu, 6/12/12, Lorenz, David <lorenz at usgs.gov> wrote:
From: Lorenz, David <lorenz at usgs.gov>
Subject: Re: [Rd] factor(x, exclude=y) if x is a factor
To: "Suharto Anggono Suharto Anggono" <suharto_anggono at yahoo.com>
Cc: R-devel at r-project.org
Date: Thursday, 6 December, 2012, 10:12 PM
Suharto,
I think that the key is to read the definition of exclude in the
Arguments section:
a vector of values to be excluded when forming the set of levels. This
should be of the same type as x, and will be coerced if necessary.
Because the levels already exist for x as a factor, they are not formed
or revised, except to drop unused levels in the case where exclude=NULL (or
the default value). To drop level a from x use:
factor(as.character(x), exclude="a")
or, on creation:
x <- factor(c("a", "b"), exclude="a")
Dave
On Wed, Dec 5, 2012 at 11:39 PM, Suharto Anggono Suharto Anggono
<suharto_anggono at yahoo.com> wrote:
factor(x, exclude=factor("a", levels=c("a","b")))