Hello,
I would like to remove factors from all the columns of a data frame. I can do it one column at a time with
```
df <- data.frame(region=factor(c('A', 'B', 'C', 'D', 'E')),
                 sales = c(13, 16, 22, 27, 34),
                 country=factor(c('a', 'b', 'c', 'd', 'e')))

new_df$region <- droplevels(new_df$region)
```
What is the syntax to remove all factors at once (from all columns)? For this does not work:
```
> str(df)
'data.frame': 5 obs. of 3 variables:
 $ region : Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
 $ sales  : num 13 16 22 27 34
 $ country: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
> df = droplevels(df)
> str(df)
'data.frame': 5 obs. of 3 variables:
 $ region : Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
 $ sales  : num 13 16 22 27 34
 $ country: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
```
Thank you
On Sun, 19 Sep 2021 10:17:51 +0200 Luigi Marongiu <marongiu.luigi at gmail.com> wrote:

> Hello,
> I would like to remove factors from all the columns of a data frame.

What on earth do you mean by that? After struggling with your (inadequate) example for a while, I conjecture that what you want to do is to drop unused levels from all factor columns in a data frame. Is that correct?

> I can do it one column at a time with
> ```
>
> df <- data.frame(region=factor(c('A', 'B', 'C', 'D', 'E')),
>                  sales = c(13, 16, 22, 27, 34),
>                  country=factor(c('a', 'b', 'c', 'd', 'e')))
>
> new_df$region <- droplevels(new_df$region)
> ```

Before executing the foregoing command, you would have to create new_df. *Perhaps* you intended to do "new_df <- df" initially. If this is the case, then new_df will be exactly the same as df after you've applied droplevels() to new_df$region.

Note that droplevels() removes unused levels from the levels of a factor. The factor df$region in your confusing example has no unused levels, so droplevels() has no effect upon it.

> What is the syntax to remove all factors at once (from all columns)?
> For this does not work:
> ```
> > str(df)
> 'data.frame': 5 obs. of 3 variables:
>  $ region : Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
>  $ sales  : num 13 16 22 27 34
>  $ country: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
> > df = droplevels(df)
> > str(df)
> 'data.frame': 5 obs. of 3 variables:
>  $ region : Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
>  $ sales  : num 13 16 22 27 34
>  $ country: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
> ```
> Thank you

I believe the reason you think "this does not work" is that your example is inadequate. If the factors in "df" actually had any unused levels, then droplevels(df) would indeed remove them.

(a) In future please present your questions in a comprehensible manner.

(b) Also please construct your examples so that they are actually capable of illustrating what you are trying to accomplish. You are asking others for help. Have a little consideration for the helpers, who are giving of their time and effort free of charge!

(c) Note that "df" is a lousy name for a data frame, since it is the name of a base R function (the density function for the F distribution). No harm is done in the current context, but such nomenclature can at times lead to the error "object of type 'closure' is not subsettable", which mystifies most users.

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
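
To make Rolf's two points concrete, here is a minimal sketch. It assumes the goal really is dropping unused levels. The names `sales_df`, `subset_df`, `cleaned_df` and `closure_error` are made up for illustration, and the row filter is just one hypothetical way to end up with unused levels.

```r
# Hypothetical data: the five-row example from the question, under a name
# that does not clash with the function stats::df.
sales_df <- data.frame(
  region  = factor(c("A", "B", "C", "D", "E")),
  sales   = c(13, 16, 22, 27, 34),
  country = factor(c("a", "b", "c", "d", "e"))
)

# Keep only the first three rows; levels "D", "E" (and "d", "e") are now unused.
subset_df <- sales_df[sales_df$sales < 25, ]
nlevels(subset_df$region)    # still 5: subsetting rows keeps unused levels

# droplevels() applied to the data frame handles every factor column at once
# and leaves non-factor columns such as `sales` alone.
cleaned_df <- droplevels(subset_df)
levels(cleaned_df$region)    # "A" "B" "C"
levels(cleaned_df$country)   # "a" "b" "c"

# Point (c): when no data frame called `df` exists, the name resolves to the
# F-density function, and subsetting a function raises an error.
closure_error <- tryCatch(stats::df$region,
                          error = function(e) conditionMessage(e))
```

With data like this, `str(cleaned_df)` should report three levels for each factor, which is the visible effect the original example could not show.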
Hi Luigi,
It's easy:

df1<-df[,!unlist(lapply(df,is.factor))]

_except_ when there is only one column left, as in your example. In that case, you will have to coerce the resulting vector back into a one-column data frame.

Jim

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
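
Jim's reply reads the question as removing the factor columns themselves. A possible variant of his one-liner, sketched below, passes `drop = FALSE` so the result stays a data frame even when a single column is left. The second half shows a different reading that may or may not be what was wanted: keeping the columns but converting them from factor to character. All object names (`sales_df`, `is_fac`, `no_factors`, `as_char`) are invented for this example.

```r
# Hypothetical data mirroring the question, under a non-clashing name.
sales_df <- data.frame(
  region  = factor(c("A", "B", "C", "D", "E")),
  sales   = c(13, 16, 22, 27, 34),
  country = factor(c("a", "b", "c", "d", "e"))
)

# Logical vector, one entry per column: TRUE where the column is a factor.
is_fac <- vapply(sales_df, is.factor, logical(1))

# Reading 1: drop the factor columns. drop = FALSE prevents the result from
# collapsing to a bare vector when only one column remains.
no_factors <- sales_df[, !is_fac, drop = FALSE]
str(no_factors)

# Reading 2 (an assumption about the intent): keep every column, but store
# the former factors as plain character vectors.
as_char <- sales_df
as_char[is_fac] <- lapply(as_char[is_fac], as.character)
str(as_char)
```

Using `vapply()` with `logical(1)` rather than `unlist(lapply(...))` is only a stylistic choice here; both give the same logical vector for this data.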