My statement "Using a single bracket '[' on a data.frame does the same as for matrices: you need to specify rows and columns" was not correct.

When you use a single bracket on a list with only one argument in between, then R extracts "elements", i.e. columns in the case of a data.frame. This explains your errors.

But it is possible to use a single bracket on a data.frame with 2 arguments (rows, columns) separated by a comma, as with matrices. This is the solution you received.

Ivan

--
Ivan Calandra, PhD
Scientific Mediator
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calandra at univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 23/06/2016 at 16:27, Ivan Calandra wrote:
> Dear Georg,
>
> You need to learn a bit more about the subsetting methods, depending
> on the object structure you're trying to subset.
>
> More specifically, when you run this:
>   ds_test[is.na(ds_test$var1)]
> you get this error:
>   "Error in `[.data.frame`(ds_test, is.na(ds_test$var1)) : undefined columns selected"
>
> This means that R does not understand which column you're trying to
> select. But you're actually trying to select rows.
>
> Using a single bracket '[' on a data.frame does the same as for
> matrices: you need to specify rows and columns, like this:
>   ds_test[is.na(ds_test$var1), ]       ## notice the last comma
>   ds_test[is.na(ds_test$var1), ] <- 0  ## works on all columns because you didn't specify any after the comma
>
> If you want it only for "var1", then you need to specify the column:
>   ds_test[is.na(ds_test$var1), "var1"] <- 0
>
> It's the same problem with your 2nd and 4th tries (4th one has other
> problems). Your 3rd try does not change ds_test at all.
>
> HTH,
> Ivan
>
> On 23/06/2016 at 15:57, G.Maubach at weinwolf.de wrote:
>> Hi All,
>>
>> I would like to recode my NAs to 0. Using a single vector everything is
>> fine.
>>
>> But if I use a data.frame things go wrong:
>>
>> -- cut --
>>
>> var1 <- c(1:3, NA, 5:7, NA, 9:10)
>> var2 <- c(1:3, NA, 5:7, NA, 9:10)
>> ds_test <- data.frame(var1, var2)
>>
>> test <- var1
>> test[is.na(test)] <- 0
>> test  # NA recoded OK
>>
>> # First try
>> ds_test[is.na(ds_test$var1)] <- 0  # duplicate subscripts WRONG
>>
>> # Second try
>> ds_test[is.na("var1")] <- 0
>> ds_test$var1  # not recoded WRONG
>>
>> # Third try: to me the most intuitive approach
>> is.na(ds_test["var1"]) <- 0  # attempt to select less than one element in integerOneIndex WRONG
>>
>> # Fourth try
>> ds_test[is.na(var1)] <- 0  # duplicate subscripts for columns WRONG
>>
>> -- cut --
>>
>> How can I do it correctly?
>>
>> Where could I have found something about it?
>>
>> Kind regards
>>
>> Georg
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
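
As a minimal sketch of the fix discussed in this message, the R code below rebuilds the data from the original question and recodes the NAs in three ways. The names first_try, one_col, one_col_alt and all_cols are introduced here only for illustration and do not appear in the thread; the whole-data-frame form at the end is a common idiom that the thread itself does not mention.

```r
# Rebuild the example data from the original question.
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# With one argument and no comma, `[` treats the index as a column selector,
# so a length-10 logical vector asks for columns that do not exist.
first_try <- try(ds_test[is.na(ds_test$var1)], silent = TRUE)
inherits(first_try, "try-error")

# Recode NAs in one column: logical row index, column name after the comma.
one_col <- ds_test
one_col[is.na(one_col$var1), "var1"] <- 0

# Alternative: extract the column as a vector first, then index the vector.
one_col_alt <- ds_test
one_col_alt$var1[is.na(one_col_alt$var1)] <- 0

# Recode NAs in every column: is.na() on a data frame returns a logical
# matrix, which the data frame assignment method accepts as an index.
all_cols <- ds_test
all_cols[is.na(all_cols)] <- 0
```

The two single-column forms leave var2 untouched, while the last form replaces the NAs in both columns. Assigning 0 converts the integer column to numeric (double); assigning 0L instead would keep it integer.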
Sorry, Ivan, your statement is incorrect:

"When you use a single bracket on a list with only one argument in between, then R extracts "elements", i.e. columns in the case of a data.frame. This explains your errors."

e.g.

> ex <- data.frame(a = 1:3, b = letters[1:3])
> a <- 1:3
> identical(ex[1], a)
[1] FALSE
> class(ex[1])
[1] "data.frame"
> class(a)
[1] "integer"

Compare:

> identical(ex[[1]], a)
[1] TRUE

Why? Single bracket extraction on a list results in a list; double bracket extraction results in the element of the list (a "column" in the case of a data frame, which is a specific kind of list). The relevant sections of ?Extract are:

"Indexing by [ is similar to atomic vectors and selects a **list** of the specified element(s).

Both [[ and $ select a **single element of the list**."

Hope this clarifies this often-confused issue.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Thu, Jun 23, 2016 at 7:34 AM, Ivan Calandra <ivan.calandra at univ-reims.fr> wrote:
> My statement "Using a single bracket '[' on a data.frame does the same as
> for matrices: you need to specify rows and columns" was not correct.
>
> When you use a single bracket on a list with only one argument in between,
> then R extracts "elements", i.e. columns in the case of a data.frame. This
> explains your errors.
>
> But it is possible to use a single bracket on a data.frame with 2 arguments
> (rows, columns) separated by a comma, as with matrices. This is the solution
> you received.
>
> Ivan
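
To make the distinction in this message concrete, here is a small sketch in R that places the three extraction operators side by side and then contrasts them with the two-argument (row, column) form. It reuses the ex data frame from the message; the names sub_df, col_vec and col_dlr are hypothetical and chosen only for this illustration.

```r
ex <- data.frame(a = 1:3, b = letters[1:3])

sub_df  <- ex["a"]    # single bracket: a one-column data frame (still a list)
col_vec <- ex[["a"]]  # double bracket: the column itself, an integer vector
col_dlr <- ex$a       # `$` also returns the column itself

class(sub_df)                # "data.frame"
class(col_vec)               # "integer"
identical(col_vec, col_dlr)  # TRUE

# With a comma, `[` uses matrix-style (row, column) indexing.
# A single selected column is dropped to a vector unless drop = FALSE.
class(ex[, "a"])                # "integer"
class(ex[, "a", drop = FALSE])  # "data.frame"
```

This is why a logical vector of row flags needs the trailing comma: without it, the one-argument form selects list elements (columns) rather than rows.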
Thank you Bert for this clarification. It is indeed an important point.

Ivan

On 23/06/2016 at 17:06, Bert Gunter wrote:
> Sorry, Ivan, your statement is incorrect:
>
> "When you use a single bracket on a list with only one argument in
> between, then R extracts "elements", i.e. columns in the case of a
> data.frame. This explains your errors."
>
> e.g.
>
>> ex <- data.frame(a = 1:3, b = letters[1:3])
>> a <- 1:3
>> identical(ex[1], a)
> [1] FALSE
>
>> class(ex[1])
> [1] "data.frame"
>> class(a)
> [1] "integer"
>
> Compare:
>
>> identical(ex[[1]], a)
> [1] TRUE
>
> Why? Single bracket extraction on a list results in a list; double
> bracket extraction results in the element of the list (a "column" in
> the case of a data frame, which is a specific kind of list). The
> relevant sections of ?Extract are:
>
> "Indexing by [ is similar to atomic vectors and selects a **list** of
> the specified element(s).
>
> Both [[ and $ select a **single element of the list**."
>
> Hope this clarifies this often-confused issue.
>
> Cheers,
> Bert