If I have a column with 2 levels, but one level has no remaining observations, can I remove the level?

I had intended to do it as listed below, but soon realized that even though there are no observations, the level is still there.

For instance,

    summary(dbs3.train.sans.influential.obs$HAC)

yields

    0   ,1
    4685,0

and

    nlevels(dbs3.train.sans.influential.obs$HAC)

yields

    [1] 2

so

    drop.list <- NULL
    for (i in 1:ncol(dbs3.train.sans.influential.obs)) {
        if (nlevels(dbs3.train.sans.influential.obs[,i]) < 2) {
            drop.list <- cbind(drop.list, i)
        }
    }

yields nothing, because HAC still has two levels, even though there aren't any observations in one of the levels.

What I want to do is loop through all columns that are factors and create a list of items to drop, because there will subsequently be < 2 levels when I try to run a linear model.

--
View this message in context: http://r.789695.n4.nabble.com/REmove-level-with-zero-observations-tp2312553p2312553.html
Sent from the R help mailing list archive at Nabble.com.
GL wrote:
> If I have a column with 2 levels, but one level has no remaining
> observations. Can I remove the level?

What is a 'column'? An element of a data.frame? Does the following help?

    f1 <- factor("L1", levels = c("L1", "L2"))
    levels(f1)
    f1 <- factor(f1)
    levels(f1)

In the absence of a reproducible example, as the posting guide requests, I cannot tell exactly what you're after here.
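If 'column' does mean a factor inside a data.frame, the factor() re-encoding shown above can be applied to every factor column at once. A minimal sketch, assuming a small made-up data frame `d` in place of the poster's data (the column `y` and the three rows are invented for illustration; only the name HAC comes from the question):

```r
# Hypothetical stand-in for the poster's data frame.
d <- data.frame(
  y   = c(1.2, 3.4, 2.2),
  HAC = factor(c("0", "0", "0"), levels = c("0", "1"))
)

# Find the factor columns, then re-encode each one with factor(),
# which keeps only the levels that actually occur in the data.
is_fac <- vapply(d, is.factor, logical(1))
d[is_fac] <- lapply(d[is_fac], factor)

nlevels(d$HAC)
```

After the re-encoding, nlevels() counts only observed levels, so a test such as `nlevels(x) < 2` behaves the way the original loop expected.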
Actually, you probably want to remove the remaining level -- that is, remove the variable altogether, since if it has only a single value its effect is indistinguishable from the overall mean.

Again, complying with the posting guide would be advisable.

Bert Gunter
Genentech Nonclinical Biostatistics

On Tue, Aug 3, 2010 at 1:50 PM, GL <pflugg at shands.ufl.edu> wrote:
> If I have a column with 2 levels, but one level has no remaining
> observations. Can I remove the level?
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
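To illustrate the point about a single-valued factor, here is a small sketch on made-up data (the data frame `d` and its columns `y` and `f` are hypothetical, not taken from the thread). It shows lm() refusing a factor that has only one observed level, and that the model without it is simply the overall mean:

```r
# Hypothetical data: f declares two levels, but only "A" occurs.
d <- data.frame(
  y = c(1.0, 2.5, 3.1, 2.2),
  f = factor(rep("A", 4), levels = c("A", "B"))
)

# lm() drops unused levels when it builds the model frame, which leaves f
# with one level, so contrasts cannot be constructed and the fit fails.
fit_with_f <- tryCatch(lm(y ~ f, data = d), error = function(e) e)
inherits(fit_with_f, "error")

# Without f, the model reduces to an intercept, which is the mean of y.
fit_mean <- lm(y ~ 1, data = d)
all.equal(unname(coef(fit_mean)), mean(d$y))
```

This is why dropping the whole variable, rather than just the empty level, is the step that lets the model run.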
Ended up working as follows:

    dbs3.train.sans.influential.obs <- drop.levels(dbs3.train.sans.influential.obs)

    drop.list <- NULL
    for (i in 4:ncol(dbs3.train.sans.influential.obs)) {
        if (nlevels(dbs3.train.sans.influential.obs[,i]) < 2) {
            drop.list <- cbind(drop.list, i)
        }
    }

    dbs3.train.sans.influential.obs <- dbs3.train.sans.influential.obs[-c(drop.list)]
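Two caveats on the code above, with a hedged alternative. First, drop.levels() is not in base R; it presumably comes from an add-on package (gdata provides a function of that name), whereas base R offers droplevels() from version 2.12.0 on. Second, if no column is flagged, drop.list stays NULL and indexing with `-c(NULL)` selects zero columns rather than all of them. Note also that nlevels() returns 0 for non-factor columns, so the loop flags any numeric column it visits. The sketch below avoids these issues; the data frame `d` and its columns are made up for illustration:

```r
# Hypothetical stand-in for dbs3.train.sans.influential.obs.
d <- data.frame(
  y   = c(1.2, 3.4, 2.2, 4.1),
  HAC = factor(rep("0", 4), levels = c("0", "1")),
  grp = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c")),
  x   = c(10, 20, 30, 40)
)

# Base R (2.12.0 and later): drop unused levels in every factor column.
d <- droplevels(d)

# Flag only factor columns left with fewer than two levels.
# The is.factor() check matters: nlevels() is 0 for numeric columns.
too_few <- vapply(d, function(col) is.factor(col) && nlevels(col) < 2,
                  logical(1))

# Logical indexing stays correct even when nothing is flagged.
d <- d[, !too_few, drop = FALSE]
names(d)
```

Here HAC is removed because only one level remains, grp is kept with its unused level "c" dropped, and the numeric columns are left alone.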
On 03/08/10 21:50, GL wrote:
> If I have a column with 2 levels, but one level has no remaining
> observations. Can I remove the level?

Like this?

    d <- data.frame(a = factor(rep("A", 3), levels = c("A", "B")))
    levels(d$a)
    # [1] "A" "B"
    d$a <- d$a[, drop = TRUE]
    levels(d$a)
    # [1] "A"

Hope this helps

Allan