Brian
2015-Sep-20 15:49 UTC
[R] Unexpected/undocumented behavior of 'within': dropping variable names that start with '.'
Dear List,

Somewhere I missed something, and now I'm really missing something!

> d.f <- data.frame(.id = c(TRUE, FALSE, TRUE), dummy = c(1, 2, 3),
+                   a = c(1, 2, 3), b = c(1, 2, 3) + 1)
> within(d.f, {d = a + b})
  dummy a b d
1     1 1 2 3
2     2 2 3 5
3     3 3 4 7
> d.f <- data.frame(.id = c(TRUE, FALSE, TRUE), .dummy = c(1, 2, 3),
+                   a = c(1, 2, 3), b = c(1, 2, 3) + 1)
> within(d.f, {d = a + b})
  a b d
1 1 2 3
2 2 3 5
3 3 4 7

Could somebody please explain to me why it does this? I think it could be
considered a feature (for lots of calculations within a data frame you
don't have to remove all the extra variables at the end). I just wish it
were documented.

Cheers,
Brian

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   grid      stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] scales_0.2.4       plyr_1.8.3         reshape2_1.4       ccchDataProc_0.7
 [5] ccchTools_0.6      xtable_1.7-4       tables_0.7.79      Hmisc_3.14-5
 [9] Formula_1.1-2      survival_2.37-7    ggplot2_1.0.1      IDPmisc_1.1.17
[13] lattice_0.20-29    myRplots_1.1       myRtools_1.2       meteoconv_0.1
[17] pixmap_0.4-11      RColorBrewer_1.0-5 maptools_0.8-30    sp_1.1-1
[21] mapdata_2.2-3      mapproj_1.2-2      maps_2.3-9         chron_2.3-45
[25] MASS_7.3-35

loaded via a namespace (and not attached):
 [1] acepack_1.3-3.3     cluster_1.15.2      colorspace_1.2-4
 [4] compiler_3.1.0      data.table_1.9.4    digest_0.6.4
 [7] foreign_0.8-61      gtable_0.1.2        labeling_0.3
[10] latticeExtra_0.6-26 munsell_0.4.2       nnet_7.3-8
[13] proto_0.3-10        Rcpp_0.12.0         rpart_4.1-8
[16] stringr_0.6.2       tools_3.1.0

> within
function (data, expr, ...)
UseMethod("within")
<bytecode: 0x26d32c8>
<environment: namespace:base>
Hadley Wickham
2015-Sep-20 18:23 UTC
[R] Unexpected/undocumented behavior of 'within': dropping variable names that start with '.'
The problem is that within.data.frame calls as.list.environment with
the default value of all.names = FALSE. I doubt this is a deliberate
feature; it is more likely a minor oversight.

Hadley

-- 
http://had.co.nz/
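The mechanism described in this reply can be seen without within() at all. The following is a minimal sketch, not the code of within.data.frame itself: it only shows how as.list() on an environment treats dot-prefixed names under the two settings of all.names. The environment e and the variables visible and everything are names made up for this illustration.

```r
# Illustration only: an environment with one ordinary binding
# and one dot-prefixed binding (names chosen for this example).
e <- new.env()
assign("a", 1:3, envir = e)
assign(".id", c(TRUE, FALSE, TRUE), envir = e)

# Default (all.names = FALSE): names starting with '.' are skipped.
visible <- names(as.list(e))

# all.names = TRUE: every binding in the environment is returned.
everything <- names(as.list(e, all.names = TRUE))

print(visible)
print(everything)
```

If within.data.frame rebuilds the data frame from such a list, any column whose name starts with '.' would be lost in the same way, which matches the output in the original post.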
Martin Maechler
2015-Sep-23 10:32 UTC
[R] Unexpected/undocumented behavior of 'within': dropping variable names that start with '.'
>>>>> Hadley Wickham <h.wickham at gmail.com>
>>>>>     on Sun, 20 Sep 2015 14:23:43 -0400 writes:

    > The problem is that within.data.frame calls as.list.environment with
    > the default value of all.names = FALSE. I doubt this is a deliberate
    > feature, and is more likely to be a minor oversight.

Indeed; thank you, Hadley (and Brian)!

It is fixed now in R-devel and will probably be ported to R-patched
tomorrow.

Martin
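For anyone on an R version that predates the fix, one possible workaround is to skip within() and assign the new column directly. This is only a sketch under that assumption, reusing the data from the original post (with the dummy column left out); the variable names kept and within_keeps_dot are made up for this illustration.

```r
# Data modelled on the original post; '.id' is the dot-prefixed column.
d.f <- data.frame(.id = c(TRUE, FALSE, TRUE),
                  a = c(1, 2, 3), b = c(1, 2, 3) + 1)

# Evaluate the expression with with() and assign the result via `$<-`,
# which adds a column without rebuilding the data frame from an environment.
d.f$d <- with(d.f, a + b)
kept <- names(d.f)
print(kept)

# Optional check of the installed R: does within() keep '.id' here?
# The result depends on whether the running version includes the fix.
within_keeps_dot <- ".id" %in% names(within(d.f, e <- a * b))
print(within_keeps_dot)
```

Direct assignment touches only the new column, so the existing columns, dot-prefixed or not, are left as they were.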