Hi,
consider the following

> a<-gl(3,3,9)
> a
[1] 1 1 1 2 2 2 3 3 3
Levels: 1 2 3
> levels(a)<-3:1
> a
[1] 3 3 3 2 2 2 1 1 1
Levels: 3 2 1
> a<-gl(3,3,9)
> factor(a,levels=3:1)
[1] 1 1 1 2 2 2 3 3 3
Levels: 3 2 1

It is probably something obvious I missed, but reading the documentation
of factor and levels, I would have thought that both should produce the
same output as

factor(a,levels=3:1)
[1] 1 1 1 2 2 2 3 3 3
Levels: 3 2 1

The closest I could find in a quick search was this:
http://tolstoy.newcastle.edu.au/R/e5/help/08/09/2503.html

Thanks
Nicholas

sessionInfo()
R version 2.10.1 Patched (2009-12-20 r50794)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   tcltk     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] mvtnorm_0.9-9      latticeExtra_0.6-9 RColorBrewer_1.0-2 lattice_0.18-3
[5] nlme_3.1-96        XML_2.6-0          gsubfn_0.5-0       proto_0.3-8

loaded via a namespace (and not attached):
[1] grid_2.10.1  tools_2.10.1
On Mar 1, 2010, at 12:07 PM, Nicholas Lewin-Koh wrote:

> Hi,
> consider the following
>> a<-gl(3,3,9)
>> a
> [1] 1 1 1 2 2 2 3 3 3
> Levels: 1 2 3
>> levels(a)<-3:1

That may look like the same factor, re-ordered, but you have instead
merely re-labeled each level; the internal numbers that represent the
factor values stayed the same.

>> a
> [1] 3 3 3 2 2 2 1 1 1
> Levels: 3 2 1
>> a<-gl(3,3,9)
>> factor(a,levels=3:1)

That is, IMO, the right way to safely change the ordering of the levels
without changing the "semantics" or the "meaning" of the factor level
assignments. Try:

> levels(a) <- letters[4:6]
> a
[1] d d d e e e f f f
Levels: d e f
> a <- factor(a, levels=letters[1:3])
> a
[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Levels: a b c

Using the second form sets any factor values that do not exist in the
new level vector to NA, in this case all of them. To my mind it is
better to get assignments to NA than assignments to incorrect levels.

> b <- factor(c(0,0,0,0, 1, 1))
> b
[1] 0 0 0 0 1 1
Levels: 0 1
> levels(b) <- c(1,0)
> b
[1] 1 1 1 1 0 0     # No longer the same "meaning"
Levels: 1 0
> b <- factor(c(0,0,0,0, 1, 1))
> b <- factor(b, levels=c(1,0))
> b
[1] 0 0 0 0 1 1
Levels: 1 0         # Only the ordering has changed but the meaning is the same

The distinction matters especially when working with factors as
components of data.frames.
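
One way to see the difference between the two forms is to put the integer codes and the labels side by side. The following is only a minimal sketch; the names a1, a2 and the codes_*/labels_* variables are made up for this illustration and do not appear in the original messages.

```r
# Start from the same factor twice.
a1 <- gl(3, 3, 9)
a2 <- gl(3, 3, 9)

# Form 1: levels<- relabels. The integer codes stay put, the labels change.
levels(a1) <- 3:1

# Form 2: factor(levels=) reorders. The labels stay put, the codes change.
a2 <- factor(a2, levels = 3:1)

codes_relabel  <- as.integer(a1)     # 1 1 1 2 2 2 3 3 3
labels_relabel <- as.character(a1)   # "3" "3" "3" "2" "2" "2" "1" "1" "1"

codes_reorder  <- as.integer(a2)     # 3 3 3 2 2 2 1 1 1
labels_reorder <- as.character(a2)   # "1" "1" "1" "2" "2" "2" "3" "3" "3"

print(data.frame(codes_relabel, labels_relabel, codes_reorder, labels_reorder))
```

Comparing as.character() before and after is probably the quickest check that an operation has kept the "meaning" of each observation.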
--
David.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Nicholas Lewin-Koh
> Sent: Monday, March 01, 2010 9:08 AM
> To: r-help at r-project.org
> Subject: [R] Thougt I understood factors but??
>
> Hi,
> consider the following
> > a<-gl(3,3,9)
> > a
> [1] 1 1 1 2 2 2 3 3 3
> Levels: 1 2 3
> > levels(a)<-3:1
> > a
> [1] 3 3 3 2 2 2 1 1 1
> Levels: 3 2 1

A very similar question came up last week. Your call to levels<-, with a
[non-recursive] vector on the right side, just relabels the levels. It
does not alter the integer codes in the factor, and it does not compare
the new level labels with the existing ones. E.g.,

   > a <- gl(3,2)
   > str(a)
    Factor w/ 3 levels "1","2","3": 1 1 2 2 3 3
   > levels(a) <- c("Dog", "Cat", "Chicken")
   > str(a)
    Factor w/ 3 levels "Dog","Cat","Chicken": 1 1 2 2 3 3
   > # you can increase number of levels (but not decrease)
   > levels(a) <- c("Dog", "Cat", "Chicken", "Goat")
   > str(a)
    Factor w/ 4 levels "Dog","Cat","Chicken",..: 1 1 2 2 3 3

You can use a list on the right side of levels<- to reorder or combine
levels (and thus change the integer codes).

   > # combine levels
   > levels(a) <- list(small=c("Cat", "Chicken"), large=c("Dog","Goat"))
   > str(a)
    Factor w/ 2 levels "small","large": 2 2 1 1 1 1
   > a
   [1] large large small small small small
   Levels: small large
   > # next reorder
   > levels(a) <- list(largeFirst = "large", smallLast = "small")
   > str(a)
    Factor w/ 2 levels "largeFirst","smallLast": 1 1 2 2 2 2

You might find it easier to use factor(), along with ifelse() and
is.element(), for this sort of thing.
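
That last suggestion could look something like the sketch below. It is only an illustration under assumptions: the character vector `animals` and the result names `size` and `size_f` are hypothetical, and the labels are borrowed from the example above.

```r
# Hypothetical raw data, using the same labels as the example above.
animals <- c("Dog", "Dog", "Cat", "Cat", "Chicken", "Chicken")

# Combine labels: anything in the "small" set becomes "small", the rest "large".
size <- ifelse(is.element(animals, c("Cat", "Chicken")), "small", "large")

# Build the factor with an explicit level order, so "large" gets code 1.
size_f <- factor(size, levels = c("large", "small"))

str(size_f)
```

Because the grouping is done on the character values before factor() is called, the mapping from old labels to new ones is spelled out in one place and does not depend on the current order of the levels.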
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com