Suharto Anggono
2017-Jun-08 16:43 UTC
[Rd] [bug] droplevels() also drop object attributes (comment…)
* Be careful with "contrasts" attribute. If the number of levels is reduced, the original contrasts matrix is no longer valid.
Example case:
x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
contrasts(x) <- contr.treatment(levels(x), contrasts=FALSE)[, -2, drop=FALSE]
droplevels(x)

* If function 'factor' is changed, make sure that as.factor(x) and factor(x) is the same for 'x' where is.integer(x) is TRUE. Currently, as.factor(<integer>) is treated specially.

* It is possible that names(x) is not attr(x, "names"). For example, 'x' is a "POSIXlt" object.
Look at this example, which works in R 3.3.2.
x <- as.POSIXlt("2017-01-01", tz="UTC")
factor(x, levels=x)

By the way, in NEWS, in "CHANGES IN R 3.4.0", in "SIGNIFICANT USER-VISIBLE CHANGES", there is "factor() now uses order() to sort its levels". It is false. Code of function 'factor' in R 3.4.0 (https://svn.r-project.org/R/tags/R-3-4-0/src/library/base/R/factor.R) still uses 'sort.list', not 'order'.

--------------------------------
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Tue, 16 May 2017 11:01:23 +0200 writes:

>>>>> Serge Bibauw <sbibauw at gmail.com>
>>>>>     on Mon, 15 May 2017 11:59:32 -0400 writes:

>> Hi,
>> Just reporting a small bug... not really a big deal, but I
>> don't think that is intended: droplevels() also drops all
>> object's attributes.

> Yes. The help page for droplevels (or the simple
> definition of 'droplevels.factor') clearly indicate that
> the method for factors is really just a call to factor(x,
> exclude = *)

> and that _is_ quite an important base function whose
> semantic should not be changed lightly. Still, let's
> continue :

> Looking a bit, I see that the current behavior of factor()
> {and hence droplevels} has been unchanged in this respect
> for the whole history of R, well, at least for more than
> 17 years (R 1.0.1, April 2000).

> I'd agree there _is_ a bug, at least in the documentation
> which does *not* mention that currently, all attributes
> are dropped but "names", "levels" (and "class").

> OTOH, factor() would only need a small change to make it
> preserve all attributes (but "class" and "levels" which
> are set explicitly).

> I'm sure this will break some checks in some packages. Is
> it worth it?

> e.g., our own R QC checks currently check (the printing of) the
> following (in tests/reg-tests-2.R ):
>
> > ## some tests of factor matrices
> > A <- factor(7:12)
> > dim(A) <- c(2, 3)
> > A
>      [,1] [,2] [,3]
> [1,] 7    9    11
> [2,] 8    10   12
> Levels: 7 8 9 10 11 12
> > str(A)
>  factor [1:2, 1:3] 7 8 9 10 ...
>  - attr(*, "levels")= chr [1:6] "7" "8" "9" "10" ...
> > A[, 1:2]
>      [,1] [,2]
> [1,] 7    9
> [2,] 8    10
> Levels: 7 8 9 10 11 12
> > A[, 1:2, drop=TRUE]
> [1] 7  8  9  10
> Levels: 7 8 9 10
>
> with the proposed change to factor(),
> the last call would change its result:
>
> > A[, 1:2, drop=TRUE]
>      [,1] [,2]
> [1,] 7    9
> [2,] 8    10
> Levels: 7 8 9 10

> because 'drop=TRUE' calls factor(..) and that would also
> preserve the "dim" attribute. I would think that the
> changed behavior _is_ better, and is also according to
> documentation, because the help page for [.factor explains
> that 'drop = TRUE' drops levels, but _not_ that it
> transforms a factor matrix into a factor (vector).

> Martin

I'm finally coming back to this.
It still seems to make sense to change factor() and hence
droplevels() behavior here, and plan to commit this change
within a day.

Martin Maechler
ETH Zurich
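
For readers who want to see which attributes droplevels() drops in their own R version, here is a minimal sketch. The "note" attribute and the helper name droplevels_keep are hypothetical, made up for illustration; neither is part of base R or of the change discussed above. The helper deliberately skips "contrasts", since the contrasts matrix may no longer be valid once levels are removed. Which attributes actually survive droplevels() depends on the R version, so the sketch prints the difference rather than assuming it.

```r
# Factor with an unused level "c" and one custom attribute.
x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
attr(x, "note") <- "hypothetical custom attribute"  # illustration only

y <- droplevels(x)

# Compare attribute names before and after; the result is version-dependent.
before  <- names(attributes(x))
after   <- names(attributes(y))
dropped <- setdiff(before, after)
print(dropped)

# Hypothetical wrapper (not base R): drop unused levels, then copy back
# any attribute that droplevels() did not keep. "levels" and "class" are
# set by droplevels() itself; "contrasts" is skipped on purpose because
# the old matrix may not match the reduced set of levels.
droplevels_keep <- function(f) {
  out <- droplevels(f)
  extra <- attributes(f)
  extra[c("levels", "class", "contrasts")] <- NULL
  for (nm in setdiff(names(extra), names(attributes(out)))) {
    attr(out, nm) <- extra[[nm]]
  }
  out
}

z <- droplevels_keep(x)
print(levels(z))
print(attr(z, "note"))
```

The wrapper leaves factor() itself untouched, so it avoids the package-check breakage mentioned in the quoted message, at the cost of being something each caller has to opt into.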