Hello,

using the `unsplit()` function with tibbles currently leads to the
following error:

> library(tibble)
> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
> s <- split(mtcars_tb, mtcars_tb$gear)
> unsplit(s, mtcars_tb$gear)
Error: Must subset rows with a valid subscript vector.
i Logical subscripts must match the size of the indexed input.
x Input has size 15 but subscript `rep(NA, len)` has size 32.
Run `rlang::last_error()` to see where the error occurred.

Tibble seems to (rightly) complain that a logical vector has been used for
subsetting which does not have the same length as the data.frame (rows).
Since `NA` is a logical value, the subscript should be changed to
`NA_integer_` in `unsplit()`:

> unsplit
function (value, f, drop = FALSE)
{
    len <- length(if (is.list(f)) f[[1L]] else f)
    if (is.data.frame(value[[1L]])) {
        x <- value[[1L]][rep(NA_integer_, len), , drop = FALSE]  # was rep(NA, len)
        rownames(x) <- unsplit(lapply(value, rownames), f, drop = drop)
    }
    else x <- value[[1L]][rep(NA, len)]
    split(x, f, drop = drop) <- value
    x
}

Cheers,
Mario
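To try the proposed change without touching base R, one option is to copy the function body into a local function. The following is only a sketch: `unsplit_int()` is a hypothetical name, not part of any package, and it uses `NA_integer_` in both branches, which goes slightly beyond the one-line change proposed in the message. It is exercised on a plain data.frame so that it runs without tibble installed.

```r
# NA on its own is a logical constant; NA_integer_ is the integer version.
typeof(NA)           # "logical"
typeof(NA_integer_)  # "integer"

# Hypothetical local copy of unsplit() that indexes with integer NAs.
# Assumption: the integer-NA change is applied to the vector branch as well.
unsplit_int <- function(value, f, drop = FALSE) {
  len <- length(if (is.list(f)) f[[1L]] else f)
  if (is.data.frame(value[[1L]])) {
    # One all-NA row per element of f, regardless of nrow(value[[1L]])
    x <- value[[1L]][rep(NA_integer_, len), , drop = FALSE]
    rownames(x) <- unsplit_int(lapply(value, rownames), f, drop = drop)
  } else {
    x <- value[[1L]][rep(NA_integer_, len)]
  }
  # Fill the NA template group by group
  split(x, f, drop = drop) <- value
  x
}

s <- split(mtcars, mtcars$gear)
restored <- unsplit_int(s, mtcars$gear)

# Compare the round trip with the original data
all.equal(restored, mtcars)
```

Whether the same function also behaves as hoped on a tibble depends on the installed tibble version, so that case is best checked interactively.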
> On Nov 21, 2020, at 10:55 AM, Mario Annau <mario.annau at gmail.com> wrote:
>
> using the `unsplit()` function with tibbles currently leads to the
> following error:
> [...]
> Since `NA` is a logical value, the subset should be changed to
> `NA_integer_` in `unsplit()`:
> [...]

Hi,

Perhaps I am missing something, but if you are using objects such as tibbles that are intended to be part of another environment, in this case the tidyverse, why would you not manipulate them with the functions that were created for that environment?

I don't use the tidyverse, but it seems to me that expecting base R functions to work with objects not created in base R is problematic, even though they may, perhaps by coincidence, work without adverse effects, as appears to be the case with split().

In other words, you should not really have had an a priori expectation that split() would work with a tibble either.

Rather than modifying base R functions like unsplit(), as you are suggesting, to be compatible with these third-party objects, the burden should be either on you to use the relevant tidyverse functions, or on the authors of the tidyverse to provide class methods that supply that functionality.

Regards,

Marc Schwartz
Yes. Never mind tibbles: the [rep(NA, len), ] construction only happens to work because len will always be >= the number of rows in value[[1L]]. Witness:

> (1:10)[rep(NA, 20)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> (1:20)[rep(NA, 10)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> (1:20)[rep(NA_integer_, 10)]
[1] NA NA NA NA NA NA NA NA NA NA
> (1:10)[rep(NA_integer_, 20)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

-pd

> On 21 Nov 2020, at 16:55, Mario Annau <mario.annau at gmail.com> wrote:
>
> using the `unsplit()` function with tibbles currently leads to the
> following error:
> [...]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
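The point about len can be made concrete with a small helper. This is a sketch only: `na_template()` is a hypothetical name introduced here for illustration. It builds an all-NA template of exactly n elements (or rows) using integer NAs, and the last lines contrast it with the logical-NA form, whose result size depends on the size of the indexed object.

```r
# Hypothetical helper: an all-NA template with exactly n elements (or rows),
# independent of how large x is.
na_template <- function(x, n) {
  idx <- rep(NA_integer_, n)
  if (is.data.frame(x)) x[idx, , drop = FALSE] else x[idx]
}

short <- na_template(1:20, 10)                   # n smaller than length(x)
long  <- na_template(1:10, 20)                   # n larger than length(x)
rows  <- na_template(data.frame(a = 1:20), 10)   # data.frame case

# Logical NAs are recycled to the length of the indexed vector, so the
# result only has n elements when n >= length(x).
logical_short <- (1:20)[rep(NA, 10)]

c(length(short), length(long), nrow(rows), length(logical_short))
# 10 20 10 20
```

With integer NAs the template size no longer relies on len being at least as large as the first group, which is the accident the current code depends on.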
I get the sentiment, but this is really just bad coding (on my own part, I suspect), so we might as well just fix it...

-pd

> On 21 Nov 2020, at 17:42, Marc Schwartz via R-devel <r-devel at r-project.org> wrote:
>
> [...]
> Rather than modifying the base R functions, like unsplit(), as you are
> suggesting, to be compatible with these third party objects, the burden
> should either be on you to use relevant tidyverse functions, or on the
> authors of the tidyverse to provide relevant class methods to provide
> that functionality.