Hello,

Is there a way to avoid the warning below in dplyr? I'm performing an operation within groups, and the warning says that the factors created from each group do not have the same levels, and so it coerces the factor to character. I'm using this inside a package I'm developing. I'd appreciate your recommendation on how to handle this.

library(dplyr)

set.seed(4)
df <- data.frame(pred = rnorm(100), models = gl(2, 50, 100, labels = c("model1", "model2")))

create_bins <- function (pred, nBins) {
  Breaks <- unique(quantile(pred, probs = seq(0, 1, 1/nBins)))
  bin <- data.frame(pred = pred, bin = cut(pred, breaks = Breaks, include.lowest = TRUE))
  bin
}

res_dplyr <- df %>% group_by(models) %>% do(create_bins(.$pred, 10))
Warning message:
  In rbind_all(out[[1]]) : Unequal factor levels: coercing to character

Thank you,
Axel.

> On 06 Nov 2015, at 00:59 , Axel Urbiz <axel.urbiz at gmail.com> wrote:
>
> Hello,
>
> Is there a way to avoid the warning below in dplyr? I'm performing an operation within groups, and the warning says that the factors created from each group do not have the same levels, and so it coerces the factor to character. I'm using this inside a package I'm developing. I'd appreciate your recommendation on how to handle this.

Well, what did you intend? If you cut according to quantiles, the levels of the result will reflect the values of the quantiles, as in

> y <- runif(10)
> cut(y, quantile(y,c(0,.25,.5,.75, 1)), include.lowest=T)
 [1] (0.65,0.765]  [0.108,0.281] [0.108,0.281] (0.65,0.765]  (0.281,0.528]
 [6] [0.108,0.281] (0.528,0.65]  (0.281,0.528] (0.65,0.765]  (0.528,0.65]
Levels: [0.108,0.281] (0.281,0.528] (0.528,0.65] (0.65,0.765]

If you do it in different groups, the quantiles will differ, hence the factor levels too. Concatenating the resulting factors will get you in trouble. If you don't mind losing the information about what the quantile intervals are, you could consider standardizing the levels with something like levels(bin$bin) <- 1:nBins.

-pd

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

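A minimal sketch of the level-standardizing idea above, not the poster's or dplyr's own method. It assumes that plain bin numbers are an acceptable replacement for the interval labels. It uses base R split() and lapply() as a stand-in for the grouped do() call so that it runs without dplyr. Passing labels = FALSE to cut() is a swapped-in variant of levels(bin$bin) <- 1:nBins: cut() then returns integer bin codes, and factor() gives every group the same nBins levels.

```r
# Variant of create_bins() whose factor levels do not depend on the data
create_bins <- function(pred, nBins) {
  breaks <- unique(quantile(pred, probs = seq(0, 1, 1 / nBins)))
  # labels = FALSE makes cut() return integer bin codes instead of a factor
  codes <- cut(pred, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  # Every group gets the levels "1", ..., "nBins", whatever its quantiles are
  data.frame(pred = pred, bin = factor(codes, levels = seq_len(nBins)))
}

set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))

# Base R stand-in for: df %>% group_by(models) %>% do(create_bins(.$pred, 10))
pieces <- lapply(split(df$pred, df$models), create_bins, nBins = 10)

# The two groups now share one set of levels, so combining them is safe
same_levels <- identical(levels(pieces$model1$bin), levels(pieces$model2$bin))
combined <- do.call(rbind, pieces)
```

The interval boundaries are lost with this approach. If they matter, one option is to keep the per-group breaks in a separate object.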
> On Nov 5, 2015, at 3:59 PM, Axel Urbiz <axel.urbiz at gmail.com> wrote:
>
> Hello,
>
> Is there a way to avoid the warning below in dplyr?

There is an option that lets you turn off warnings. There is also a wrapper function called, not surprisingly, ... `suppressWarnings`. This is all described on:

?warning

-- David.

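A short sketch of the two ways of silencing a warning mentioned above, plus a narrower variant that may suit package code better. The function noisy_combine() is a hypothetical stand-in that raises the same message as the dplyr call, so the example runs without dplyr. The helper muffle_factor_warning() is likewise an illustrative name, not part of any package.

```r
# Hypothetical stand-in for the grouped dplyr call that raises the warning
noisy_combine <- function() {
  warning("Unequal factor levels: coercing to character")
  "done"
}

# 1. Wrapper function: silences every warning raised inside the call
res1 <- suppressWarnings(noisy_combine())

# 2. Global option: warn = -1 turns warnings off until the option is restored
old <- options(warn = -1)
res2 <- noisy_combine()
options(old)

# 3. Narrower variant: muffle only the warning whose message matches and
#    let any other warning through
muffle_factor_warning <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("Unequal factor levels", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}
res3 <- muffle_factor_warning(noisy_combine())
```

Silencing the warning does not change the result: the combined column is still coerced to character.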
Solution is to always use the stringsAsFactors=TRUE option in your data.frame() function calls.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

> On Nov 5, 2015, at 4:58 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
> Solution is to always use the stringsAsFactors=TRUE option in your data.frame() function calls.

Since that is the default, I'm wondering if you meant to say FALSE?

-- David.

David Winsemius
Alameda, CA, USA
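A small base R illustration related to the stringsAsFactors exchange above, offered as a sketch and not as what either poster had in mind. The stringsAsFactors argument only controls whether character columns are converted to factors; a column that is already a factor, such as the result of cut(), is left as a factor either way. If a character column is what is wanted, one option (an assumption about intent) is to convert it explicitly with as.character(). The vector x and the breaks are made-up illustration values.

```r
x <- c(0.1, 0.5, 0.9)          # made-up values for illustration
brk <- c(0, 0.5, 1)

# cut() returns a factor, and stringsAsFactors = FALSE does not undo that
d1 <- data.frame(x = x, bin = cut(x, breaks = brk),
                 stringsAsFactors = FALSE)
class(d1$bin)   # "factor"

# Converting explicitly gives a character column by design
d2 <- data.frame(x = x, bin = as.character(cut(x, breaks = brk)),
                 stringsAsFactors = FALSE)
class(d2$bin)   # "character"
```

With bin stored as character in every group, the combined column is character from the start and no coercion is needed.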