Martin Maechler
2017-Jun-26 17:04 UTC
[Rd] Odd behaviour in within.list() when deleting 2+ variables
>>>>> peter dalgaard <pdalgd at gmail.com>
>>>>>     on Mon, 26 Jun 2017 13:43:28 +0200 writes:

    > This seems to be due to changes made by Martin Maechler in
    > 2008. Presumably this fixed something, but it escapes my
    > memory.

Yes: The change set (svn -c46441) also contains the following NEWS entry

  BUG FIXES

    o  within(<dataframe>, { ... }) now also works when '...' removes
       more than one column.

    > However, it seems to have broken the equivalence
    > between within.list and within.data.frame, so now

    > within.list <- within.data.frame

    > does not suffice.

There have been many improvements since then, so maybe we can
change the code so that the above will work again.

Another problem seems to be that we had no tests of within.list()
anywhere... so we will have them now.

I've had an idea that seems to work and even simplifies the
code.... will get back to the issue later in the evening.

Martin

    > The crux of the matter seems to be that both the following
    > constructions work for data frames

    > > aq <- head(airquality)
    > > names(aq)
    > [1] "Ozone"   "Solar.R" "Wind"    "Temp"    "Month"   "Day"
    > > aq[c("Wind","Temp")] <- NULL
    > > aq
    >   Ozone Solar.R Month Day
    > 1    41     190     5   1
    > 2    36     118     5   2
    > 3    12     149     5   3
    > 4    18     313     5   4
    > 5    NA      NA     5   5
    > 6    28      NA     5   6
    > > aq <- head(airquality)
    > > aq[c("Wind","Temp")] <- vector("list",2)
    > > aq
    >   Ozone Solar.R Month Day
    > 1    41     190     5   1
    > 2    36     118     5   2
    > 3    12     149     5   3
    > 4    18     313     5   4
    > 5    NA      NA     5   5
    > 6    28      NA     5   6

    > However, for lists they differ:

    > > aq <- as.list(head(airquality))
    > > aq[c("Wind","Temp")] <- vector("list",2)
    > > aq
    > $Ozone
    > [1] 41 36 12 18 NA 28

    > $Solar.R
    > [1] 190 118 149 313  NA  NA

    > $Wind
    > NULL

    > $Temp
    > NULL

    > $Month
    > [1] 5 5 5 5 5 5

    > $Day
    > [1] 1 2 3 4 5 6

    > > aq <- as.list(head(airquality))
    > > aq[c("Wind","Temp")] <- NULL
    > > aq
    > $Ozone
    > [1] 41 36 12 18 NA 28

    > $Solar.R
    > [1] 190 118 149 313  NA  NA

    > $Month
    > [1] 5 5 5 5 5 5

    > $Day
    > [1] 1 2 3 4 5 6

    > -pd

    >> On 26 Jun 2017, at 04:40 , Hong Ooi via R-devel <r-devel at r-project.org> wrote:
    >>
    >> The behaviour of within() with list input changes if you delete
    >> 2 or more variables, compared to deleting one:
    >>
    >> l <- list(x=1, y=2, z=3)
    >>
    >> within(l,
    >> {
    >>     rm(z)
    >> })
    >> #$x
    >> #[1] 1
    >> #
    >> #$y
    >> #[1] 2
    >>
    >>
    >> within(l, {
    >>     rm(y)
    >>     rm(z)
    >> })
    >> #$x
    >> #[1] 1
    >> #
    >> #$y
    >> #NULL
    >> #
    >> #$z
    >> #NULL
    >>
    >>
    >> When 2 or more variables are deleted, the list entries are
    >> instead set to NULL. Is this intended?
    >>
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel

    > --
    > Peter Dalgaard, Professor,
    > Center for Statistics, Copenhagen Business School
    > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    > Phone: (+45)38153501
    > Office: A 4.23
    > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
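
The difference between the two list assignments discussed above can be seen in isolation with the list l from the original report. The following is a minimal sketch that uses only base R subassignment as shown in the thread; the names drop, a, b and kept are illustrative and do not come from the R sources. It also shows one way to drop several elements from a list without going through within().

```r
l <- list(x = 1, y = 2, z = 3)
drop <- c("y", "z")          # illustrative: names of the elements to delete

# Assigning NULL through `[<-` removes the named elements from a list.
a <- l
a[drop] <- NULL
names(a)                     # "x"

# Assigning a list of NULLs keeps the elements and sets each one to NULL.
# This matches the output reported for within() when 2+ variables are removed.
b <- l
b[drop] <- vector("list", length(drop))
names(b)                     # "x" "y" "z"
is.null(b$y)                 # TRUE

# Selecting the complement of the names avoids assignment altogether.
kept <- l[setdiff(names(l), drop)]
names(kept)                  # "x"
```

For a data frame, both assignments remove the columns, which is why the same code behaves differently in within.data.frame and within.list.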
Peter Dalgaard
2017-Jun-26 18:12 UTC
[Rd] Odd behaviour in within.list() when deleting 2+ variables
> On 26 Jun 2017, at 19:04 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
>>>>>> peter dalgaard <pdalgd at gmail.com>
>>>>>>     on Mon, 26 Jun 2017 13:43:28 +0200 writes:
>
>> This seems to be due to changes made by Martin Maechler in
>> 2008. Presumably this fixed something, but it escapes my
>> memory.
>
> Yes: The change set (svn -c46441) also contains the following NEWS entry
>
>   BUG FIXES
>
>     o  within(<dataframe>, { ... }) now also works when '...' removes
>        more than one column.
>

The odd thing is that the assign-NULL technique used for removing a single column, NOW also seems to work for several columns in a data frame, so I wonder what the bug was back then...

-pd

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
Martin Maechler
2017-Jun-26 19:56 UTC
[Rd] Odd behaviour in within.list() when deleting 2+ variables
>>>>> "PD" == Peter Dalgaard <pdalgd at gmail.com>
>>>>>     on Mon, 26 Jun 2017 20:12:38 +0200 writes:

    >> On 26 Jun 2017, at 19:04 , Martin Maechler
    >> <maechler at stat.math.ethz.ch> wrote:
    >>
    >> Yes: The change set (svn -c46441) also contains the
    >> following NEWS entry
    >>
    >>   BUG FIXES
    >>
    >>     o  within(<dataframe>, { ... }) now also works when '...'
    >>        removes more than one column.
    >>

    > The odd thing is that the assign-NULL technique used for
    > removing a single column, NOW also seems to work for
    > several columns in a data frame, so I wonder what the
    > bug was back then...

It did not work back then: We have had lots of improvements in
[.data.frame in these almost 9 years.

Indeed, the fix I've committed reverts almost to the previous
first version of within.data.frame (which is from Peter
Dalgaard, for those who don't know).

Martin
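
To make the idea of removing deleted variables by NULL assignment concrete, here is a minimal sketch of a within()-like function for lists. It is an illustration only and not the code committed to R: the name within_list_sketch is hypothetical, and the body simply combines the evaluate-in-an-environment approach with the list NULL assignment discussed in this thread.

```r
# Hypothetical helper, NOT the R source of within.list().
within_list_sketch <- function(data, expr) {
    parent <- parent.frame()
    # Build an environment holding the list elements, enclosed by the caller.
    e <- evalq(environment(), data, parent)
    # Run the user's expression there; rm() calls remove variables from e.
    eval(substitute(expr), e)
    l <- as.list(e, all.names = TRUE)
    nl <- names(l)
    # Variables present in the input but gone from the environment.
    del <- setdiff(names(data), nl)
    data[nl] <- l        # update existing elements, append new ones
    data[del] <- NULL    # for a list, this removes the elements
    data
}

l <- list(x = 1, y = 2, z = 3)
names(within_list_sketch(l, { rm(y); rm(z) }))   # "x"
names(within_list_sketch(l, rm(z)))              # "x" "y"
```

Because the deletion goes through a single NULL assignment, removing one variable and removing several take the same code path, which is the property the original report found missing.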