Hello,
I hope this question is not too stupid. I would like to know how to update
levels after subsetting data from a data.frame.
df <- data.frame(factor(c("a","a","c","b","b")),
                 c(4,5,6,7,8), c(9,1,2,3,4))
names(df) <- c("X1","X2","X3")

my.sub <- subset(df, X1 == "a" | X1 == "b")
levels(my.sub$X1)

# still gives me "a","b","c", though the subset does not contain
# entries with "c" anymore
I guess the solution is rather simple, but I cannot find it.
Antje
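
For anyone reading along, a minimal sketch of what is going on, using the data from the question. The only changes are assumptions for brevity: the columns are named directly in data.frame(), and `counts` is a name introduced here. The levels are stored as an attribute of the factor, so subsetting rows does not change them, and a tabulation makes the leftover level visible as a zero count.

```r
# Rebuild the data from the question, naming the columns directly.
df <- data.frame(X1 = factor(c("a", "a", "c", "b", "b")),
                 X2 = c(4, 5, 6, 7, 8),
                 X3 = c(9, 1, 2, 3, 4))

my.sub <- subset(df, X1 == "a" | X1 == "b")

# The set of levels is an attribute of the factor and survives subsetting.
levels(my.sub$X1)

# The unused level "c" is still tabulated, with a count of zero.
counts <- table(my.sub$X1)
counts
```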
try this:

> df <- data.frame(factor(c("a","a","c","b","b")), c(4,5,6,7,8), c(9,1,2,3,4))
> names(df) <- c("X1","X2","X3")
>
> my.sub <- subset(df, X1 == "a" | X1 == "b")
> levels(my.sub$X1)
[1] "a" "b" "c"
> my.sub$X1 <- factor(my.sub$X1)
> levels(my.sub$X1)
[1] "a" "b"

On Fri, Dec 5, 2008 at 7:50 AM, Antje <niederlein-rstat at yahoo.de> wrote:
> I hope this question is not too stupid. I would like to know how to update
> levels after subsetting data from a data.frame.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

I do the following for a subsetted dataframe:
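cleanfactors <- function(mydf) {
  # Re-create every factor column so that unused levels are dropped
  outdf <- mydf
  for (i in seq_len(ncol(mydf))) {  # seq_len() is safe with zero columns
    if (is.factor(mydf[[i]]))
      outdf[[i]] <- factor(mydf[[i]])
  }
  outdf
}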
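
As a possible alternative to the loop above: later versions of base R (2.12.0 onward, as far as I recall) ship a droplevels() function with a data-frame method that does the same job in one call. The sketch below assumes such a version is installed; the column names given in data.frame() and the name `cleaned` are just for illustration.

```r
# Data from the question, with the columns named directly.
df <- data.frame(X1 = factor(c("a", "a", "c", "b", "b")),
                 X2 = c(4, 5, 6, 7, 8),
                 X3 = c(9, 1, 2, 3, 4))

my.sub <- subset(df, X1 == "a" | X1 == "b")

# droplevels() on a data frame re-levels every factor column and
# leaves the non-factor columns as they are.
cleaned <- droplevels(my.sub)

levels(cleaned$X1)
```

The original my.sub is left unchanged, so the result has to be assigned, as it is with cleanfactors().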
--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

On Fri, 5 Dec 2008, jim holtman wrote:

> try this:
>
>> df <- data.frame(factor(c("a","a","c","b","b")), c(4,5,6,7,8), c(9,1,2,3,4))
>> names(df) <- c("X1","X2","X3")
>>
>> my.sub <- subset(df, X1 == "a" | X1 == "b")
>> levels(my.sub$X1)
> [1] "a" "b" "c"
>> my.sub$X1 <- factor(my.sub$X1)

I find

my.sub$X1 <- my.sub$X1[drop=TRUE]

a lot more self-explanatory. See ?"[.factor". However, if you find
yourself wanting to do this, ask why you have a factor (rather than a
character vector) in the first place.

>> levels(my.sub$X1)
> [1] "a" "b"

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
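
To make the comparison concrete, here is a small sketch that puts the two idioms from this thread side by side on a bare factor rather than a data-frame column. The names `x`, `kept`, `by_factor` and `by_drop` are introduced only for this illustration.

```r
x <- factor(c("a", "a", "c", "b", "b"))

# Plain subsetting keeps all three levels.
kept <- x[x != "c"]
levels(kept)

# Idiom 1: re-create the factor from the values that remain.
by_factor <- factor(kept)

# Idiom 2: use the drop argument of the "[.factor" method.
by_drop <- kept[drop = TRUE]

levels(by_factor)
levels(by_drop)
```

Both results should carry only the levels that actually occur in the data; which spelling reads better is a matter of taste.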

On Fri, Dec 5, 2008 at 6:50 AM, Antje <niederlein-rstat at yahoo.de> wrote:
> I guess, the solution is rather simple, but I cannot find it.

You might find it easier just to work with character vectors:

options(stringsAsFactors = FALSE)

Hadley

--
http://had.co.nz/
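
A rough sketch of the character-vector route, under two assumptions that are not in the original post: X1 starts out as plain strings (the question wraps it in factor() explicitly, which a global option would not undo), and stringsAsFactors is passed as an argument to data.frame() instead of being set through options(). From R 4.0.0 onward FALSE is the default anyway, so the argument mainly matters on older versions.

```r
# Same data as in the question, but X1 is kept as character.
df <- data.frame(X1 = c("a", "a", "c", "b", "b"),
                 X2 = c(4, 5, 6, 7, 8),
                 X3 = c(9, 1, 2, 3, 4),
                 stringsAsFactors = FALSE)

my.sub <- subset(df, X1 == "a" | X1 == "b")

# A character column has no levels, so nothing stale is left behind.
unique(my.sub$X1)

# Convert to a factor only at the point where one is needed;
# the levels then reflect the subset.
f <- factor(my.sub$X1)
levels(f)
```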

Thanks a lot!!!
The "drop" thing was exactly what I was looking for (I already used it some time ago but forgot about it).
Thanks to everybody else too.

Antje

Prof Brian Ripley wrote:
> I find
>
> my.sub$X1 <- my.sub$X1[drop=TRUE]
>
> a lot more self-explanatory. See ?"[.factor".

> I hope this question is not too stupid. I would like to know how to update
> levels after subsetting data from a data.frame.
>
> my.sub <- subset(df, X1 == "a" | X1 == "b")
> levels(my.sub$X1)
>
> # still gives me "a","b","c", though the subset does not contain entries
> with "c" anymore

Two questions in one afternoon; aren't I good to you!

levels(my.sub$X1[,drop=TRUE])
[1] "a" "b"
levels(factor(my.sub$X1))
[1] "a" "b"

Regards,
Richie.

Mathematical Sciences Unit
HSL