Dear all,

I've taken a subset of data from a data frame using

crb<-subset(all.raw, creek %in% c("CR") & year %in% c(2000,2001) & substrate %in% ("b"))

This works fine, except that all of the original factor levels are maintained. This results in NA's for these empty levels when I try to do summaries based on factors using by(). Is there a simple way to drop the factor levels that are no longer represented? I've used na.omit on the results from by(), but then I have to deal with the attr setting, which catches me too. Probably a silly question, but I've done a search and couldn't find anything. Can someone help me please?

Regards,
Nick

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Nick Bond
Department of Biological Sciences
Monash University (Clayton Campus)
Victoria, Australia, 3800
Ph: +61 3 9905 5606
Fax: +61 3 9905 5613
Email: Nick.Bond at sci.monash.edu.au
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>-----Original Message-----
>From: r-help-bounces at stat.math.ethz.ch
>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Nick Bond
>Sent: Thursday, June 26, 2003 10:08 PM
>To: r-help at stat.math.ethz.ch
>Subject: [R] dropping factor levels in subset
>
>
>Dear all,
>I've taken a subset of data from a data frame using
>
>crb<-subset(all.raw, creek %in% c("CR") & year %in% c(2000,2001) & substrate %in% ("b"))
>
>this works fine, except that all of the original factor levels are
>maintained. This results in NA's for these empty levels when I try to do
>summaries based on factors using by(). Is there a simple way to drop the
>factor levels that are no longer represented. I've used na.omit on the
>results from by, but then I have to deal with the attr setting, which
>catches me too. Probably a silly question, but I've done a search and
>couldn't find anything. Can someone help me please.
>Regards
>Nick

See ?factor for additional information, but here is a quick example showing that factor(old.factor) returns the factor with unused levels dropped.

# Create a factor
> old.factor <- factor(c("One", "Two", "Three", "Four"))
> old.factor
[1] One   Two   Three Four
Levels: Four One Three Two

# Create a subset of three, noting that all four
# levels are retained
> new.factor <- old.factor[1:3]
> new.factor
[1] One   Two   Three
Levels: Four One Three Two

# Drop unused level
> new.factor2 <- factor(new.factor)
> new.factor2
[1] One   Two   Three
Levels: One Three Two

HTH,
Marc Schwartz
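For a whole data frame, as in the original question, the same factor() trick can be applied to every factor column after subsetting. The following is only a minimal sketch: the data frame below is a made-up stand-in for all.raw (its rows and the `count` column are assumptions for illustration, not the poster's data), and `is.fac` is a hypothetical helper name.

```r
# Hypothetical stand-in for the poster's all.raw data frame
all.raw <- data.frame(
  creek     = factor(c("CR", "CR", "CR", "BK", "BK", "CR")),
  year      = c(2000, 2001, 2002, 2000, 2001, 2001),
  substrate = factor(c("b", "b", "s", "b", "s", "b")),
  count     = c(4, 7, 3, 9, 2, 5)
)

# Same subset call as in the question
crb <- subset(all.raw, creek %in% c("CR") & year %in% c(2000, 2001) &
                substrate %in% ("b"))

# The unused levels are still attached to the factor columns
levels(crb$creek)
levels(crb$substrate)

# Re-apply factor() to each factor column to drop the unused levels
is.fac <- sapply(crb, is.factor)
crb[is.fac] <- lapply(crb[is.fac], factor)

levels(crb$creek)
levels(crb$substrate)

# Summaries by factor no longer include empty groups
by(crb$count, crb$creek, sum)
```

Non-factor columns such as year are left untouched, since only the columns flagged by is.factor() are rebuilt.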
Another option is pruneLevels() in library nlme.

x <- factor( c( 0,1,2,1,2 ) )
> x
[1] 0 1 2 1 2
Levels: 0 1 2
> pruneLevels( x[-1] )
[1] 1 2 1 2
Levels: 1 2

-----Original Message-----
From: Marc Schwartz [mailto:mschwartz at medanalytics.com]
Sent: Saturday, June 28, 2003 2:31 AM
To: 'Prof Brian Ripley'
Cc: r-help at stat.math.ethz.ch; 'Nick Bond'
Subject: RE: [R] dropping factor levels in subset

>-----Original Message-----
>From: r-help-bounces at stat.math.ethz.ch
>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Prof Brian Ripley
>Sent: Friday, June 27, 2003 12:11 PM
>To: Marc Schwartz
>Cc: r-help at stat.math.ethz.ch; 'Nick Bond'
>Subject: RE: [R] dropping factor levels in subset
>
>
>Re: [, drop=TRUE] for factors
>
>It's been in S-PLUS (but not S I believe) for a long time, probably since
>before 1994: it is in S+3.4, 1996 vintage.
>
>It appears to have been added to R around August 1998.
>
>Yes, Frank Harrell argues for the default to be true and I believe his
>Hmisc package overrides this. Although less unsafe than it used to be (a
>lot more consistency checking of factor levels has been added) it is still
>I believe undesirable. The argument `drop.unused.levels' to model.frame
>will usually do all that is required. (That's another thing that is
>very little known.)

Thanks for the clarifications.

SNIP

>> I now note that for factor objects, this is included in MASS 4 (pg
>> 19), whereas it is a footnote in MASS 3 (pg 20) and I could not find
>> it in MASS 1 (I don't have a copy of MASS 2 to review). It is also a
>> footnote in S Programming (pg 14). Not sure if any significance should
>> be attached to being a footnote versus being in the body of the text.
>
>None.

OK. I initially had the impression that it may have been either chronologically associated with the addition of this method for factors or the greater emphasis on R in MASS 4, since it moved from a footnote to the body. I was wrong.
Also, I noticed a typo in the MASS 4 page number I gave above; it should be 16.

Regards,

Marc

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
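The two mechanisms mentioned in this exchange, the `drop = TRUE` argument to `[` for factors and the `drop.unused.levels` argument to model.frame(), can be sketched roughly as follows. The names `f`, `f3`, `d` and `mf` and the numeric values are invented for illustration only.

```r
# Factor from the earlier example in this thread
f <- factor(c("One", "Two", "Three", "Four"))

# Plain subsetting keeps all four levels
f3 <- f[1:3]
levels(f3)

# Subsetting with drop = TRUE discards the unused level
levels(f3[, drop = TRUE])

# Hypothetical data frame with a made-up response y
d <- data.frame(y = c(1.2, 2.3, 3.1, 4.8), g = f)

# model.frame() can drop unused levels while it subsets
mf <- model.frame(y ~ g, data = d, subset = g != "Four",
                  drop.unused.levels = TRUE)
levels(mf$g)
```

With `drop.unused.levels = FALSE` (the default), the level "Four" would remain in `mf$g` even though no row carries it.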
"Adaikalavan Ramasamy" <gisar at nus.edu.sg> writes:> Another option is pruneLevels() in library nlme. > > x <- factor( c( 0,1,2,1,2 ) ) > > x > [1] 0 1 2 1 2 > Levels: 0 1 2 > > pruneLevels( x[-1] ) > [1] 1 2 1 2 > Levels: 1 2That function has been removed from the latest release of the nlme package because it not needed. All uses of pruneLevels in nlme were replaced by code of the form myfactor[] = myfactor[, drop = TRUE]