Hello,

Sorry, resending this question as the prior was not sent properly.

I'm using the plyr package below to add a variable named "bin" to my original data frame "df" with the user-defined function "create_bins". I'd like to get similar results using dplyr instead, but I am failing to do so.

set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))

### Using plyr (works fine)
create_bins <- function(x, nBins)
{
  Breaks <- unique(quantile(x$pred, probs = seq(0, 1, 1/nBins)))
  dfB <- data.frame(pred = x$pred,
                    bin = cut(x$pred, breaks = Breaks, include.lowest = TRUE))
  dfB
}

nBins = 10
res_plyr <- plyr::ddply(df, plyr::.(models), create_bins, nBins)
head(res_plyr)

### Using dplyr (fails)

by_group <- dplyr::group_by(df, models)
res_dplyr <- dplyr::summarize(by_group, create_bins, nBins)
Error: not a vector

Any help would be much appreciated.

Best,
Axel.

[[alternative HTML version deleted]]
You are jumping the gun (your other email did get through) and you are posting using HTML (which does not come through on the list). Some time (re)reading the Posting Guide mentioned at the bottom of all emails on this list seems to be in order.

The error is actually quite clear. You should return a vector from your function, not a data frame.
---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

On October 29, 2015 4:55:19 PM MST, Axel Urbiz <axel.urbiz at gmail.com> wrote:
> [...]
> res_dplyr <- dplyr::summarize(by_group, create_bins, nBins)
> Error: not a vector
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
So in this case, "create_bins" returns a vector and I still get the same error.

create_bins <- function(x, nBins)
{
  Breaks <- unique(quantile(x$pred, probs = seq(0, 1, 1/nBins)))
  bin <- cut(x$pred, breaks = Breaks, include.lowest = TRUE)
  bin
}

### Using dplyr (fails)
nBins = 10
by_group <- dplyr::group_by(df, models)
res_dplyr <- dplyr::summarize(by_group, create_bins, nBins)
Error: not a vector

On Thu, Oct 29, 2015 at 8:28 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> [...]
> The error is actually quite clear. You should return a vector from your
> function, not a data frame.
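One possible way forward, offered as a sketch and not as an answer given in the thread: the call summarize(by_group, create_bins, nBins) hands summarize() the function object itself instead of the result of calling it on a column, which is likely why "not a vector" persists even after create_bins is changed to return a vector. summarize() is also meant to produce one value per group, whereas here one bin is wanted per row, so mutate() looks like the closer match to the plyr result. The helper name bin_labels is hypothetical, and wrapping cut() in as.character() is an assumption made to avoid combining factors whose levels differ between groups.

```r
library(dplyr)

set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))

# Hypothetical helper (not from the thread): takes a numeric vector and
# returns one bin label per element, using quantile-based breaks.
bin_labels <- function(pred, nBins) {
  breaks <- unique(quantile(pred, probs = seq(0, 1, 1 / nBins)))
  # Assumption: return character labels, because the factor levels
  # produced by cut() differ from group to group.
  as.character(cut(pred, breaks = breaks, include.lowest = TRUE))
}

nBins <- 10

# mutate() on a grouped data frame evaluates bin_labels() once per group
# and keeps one row per original observation.
res_dplyr <- df %>%
  group_by(models) %>%
  mutate(bin = bin_labels(pred, nBins)) %>%
  ungroup()

head(res_dplyr)
```

Unlike the plyr version, this keeps the rows in their original order and returns "bin" as a character column; converting it back with factor() afterwards is an option if a factor is needed.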