Hello,

I hope this question is not too stupid. I would like to know how to update levels after subsetting data from a data.frame.

df <- data.frame(factor(c("a","a","c","b","b")), c(4,5,6,7,8), c(9,1,2,3,4))
names(df) <- c("X1","X2","X3")

my.sub <- subset(df, X1 == "a" | X1 == "b")
levels(my.sub$X1)

# still gives me "a","b","c", though the subset does not contain entries
# with "c" anymore

I guess the solution is rather simple, but I cannot find it.

Antje
try this:

> df <- data.frame(factor(c("a","a","c","b","b")), c(4,5,6,7,8), c(9,1,2,3,4))
> names(df) <- c("X1","X2","X3")
>
> my.sub <- subset(df, X1 == "a" | X1 == "b")
> levels(my.sub$X1)
[1] "a" "b" "c"
> my.sub$X1 <- factor(my.sub$X1)
> levels(my.sub$X1)
[1] "a" "b"

On Fri, Dec 5, 2008 at 7:50 AM, Antje <niederlein-rstat at yahoo.de> wrote:
> I would like to know how to update levels after subsetting data from a
> data.frame.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
I do the following for a subsetted data frame:

cleanfactors <- function(mydf) {
  outdf <- mydf
  # Rebuild each factor column so that levels no longer in use are dropped.
  # seq_len() keeps the loop safe for a data frame with zero columns.
  for (i in seq_len(ncol(mydf))) {
    if (is.factor(mydf[[i]])) outdf[[i]] <- factor(mydf[[i]])
  }
  outdf
}

Antje wrote:
> I would like to know how to update levels after subsetting data from a
> data.frame.

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
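For what it is worth, later versions of R (2.12.0 onward, as far as I know) include a base function, droplevels(), that does for a whole data frame what the loop above does by hand. A minimal sketch with the data from the original question; the column names are set directly in data.frame() only for brevity, and my.clean is a name made up for this example:

```r
# Same data as in the original question, with the column names set directly
df <- data.frame(X1 = factor(c("a", "a", "c", "b", "b")),
                 X2 = c(4, 5, 6, 7, 8),
                 X3 = c(9, 1, 2, 3, 4))

my.sub <- subset(df, X1 == "a" | X1 == "b")

# droplevels() rebuilds every factor column and leaves other columns alone
my.clean <- droplevels(my.sub)

levels(my.sub$X1)    # the unused level "c" is still listed
levels(my.clean$X1)  # only the levels that actually occur
```

On an older R without droplevels(), the hand-written loop remains the way to go.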
On Fri, 5 Dec 2008, jim holtman wrote:
> try this:
>
> > my.sub$X1 <- factor(my.sub$X1)

I find

my.sub$X1 <- my.sub$X1[drop=TRUE]

a lot more self-explanatory. See ?"[.factor". However, if you find yourself wanting to do this, ask why you have a factor (rather than a character vector) in the first place.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
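To see why stale levels matter in practice, and that the drop=TRUE form and factor() end up with the same levels, here is a small sketch on a bare factor. The names x, sub.x and dropped are made up for this example and are not from the thread:

```r
x <- factor(c("a", "a", "c", "b", "b"))
sub.x <- x[x != "c"]

# Unused levels still show up in summaries such as table(),
# here as an extra cell for "c" with a count of zero
table(sub.x)

# Drop them through the subsetting method for factors
dropped <- sub.x[drop = TRUE]
table(dropped)

# Compare with the levels produced by the factor() approach
identical(levels(dropped), levels(factor(sub.x)))
```

The same zero-count cells tend to appear in plots and model summaries, which is usually where the leftover level gets noticed.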
On Fri, Dec 5, 2008 at 6:50 AM, Antje <niederlein-rstat at yahoo.de> wrote:
> I would like to know how to update levels after subsetting data from a
> data.frame.

You might find it easier just to work with character vectors:

options(stringsAsFactors = FALSE)

Hadley

--
http://had.co.nz/
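A sketch of what working with character vectors looks like for this data. Two caveats: the option only affects character columns that data.frame() would otherwise convert, so a column wrapped in factor() explicitly, as in the original question, stays a factor; and since R 4.0.0 the default is already FALSE. Passing stringsAsFactors per call, as below, avoids changing a global setting. The names df.chr and sub.chr are made up for this example:

```r
# Build the data frame with a character column instead of a factor
df.chr <- data.frame(X1 = c("a", "a", "c", "b", "b"),
                     X2 = c(4, 5, 6, 7, 8),
                     X3 = c(9, 1, 2, 3, 4),
                     stringsAsFactors = FALSE)

sub.chr <- subset(df.chr, X1 == "a" | X1 == "b")

is.character(sub.chr$X1)  # a plain character column, no levels to go stale
unique(sub.chr$X1)        # only the values actually present

# Convert to a factor only at the point where one is needed
levels(factor(sub.chr$X1))
```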
Thanks a lot!!!

The "drop" thing was exactly what I was looking for (I already used it some time ago but forgot about it).

Thanks to everybody else too.

Antje

Prof Brian Ripley wrote:
> I find
>
> my.sub$X1 <- my.sub$X1[drop=TRUE]
>
> a lot more self-explanatory. See ?"[.factor".
> I hope this question is not too stupid. I would like to know how to update
> levels after subsetting data from a data.frame.
>
> my.sub <- subset(df, X1 == "a" | X1 == "b")
> levels(my.sub$X1)
>
> # still gives me "a","b","c", though the subset does not contain entries
> with "c" anymore

Two questions in one afternoon; aren't I good to you!

levels(my.sub$X1[,drop=TRUE])
[1] "a" "b"

levels(factor(my.sub$X1))
[1] "a" "b"

Regards,
Richie.

Mathematical Sciences Unit
HSL