Andrew Simmons
2023-Feb-12 22:30 UTC
[R] Removing variables from data frame with a wile card
drop = FALSE means that if the indexing selects exactly one column, the result is still a data frame with one column rather than the object stored in that column. It's usually not necessary, but I've messed up some data before by assuming the indexing always returns a data frame when it doesn't, so drop = FALSE lets me be sure that I will always get a data frame.

```
x <- data.frame(V1 = 1:5, V2 = letters[1:5])
x[, "V2"]
x[, "V2", drop = FALSE]
```

You'll notice that the first returns a character vector, "a" through "e", whereas the second returns a data frame with one column containing that same character vector.

You could alternatively use

x["V2"]

which should be identical to x[, "V2", drop = FALSE], but some people don't like that because it doesn't look like matrix indexing anymore.

On Sun, Feb 12, 2023, 17:18 Steven T. Yen <styen at ntu.edu.tw> wrote:

> In the line suggested by Andrew Simmons,
>
> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>
> what does drop=FALSE do? Thanks.
>
> On 1/14/2023 8:48 PM, Steven Yen wrote:
>
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
> On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsimmo at gmail.com> wrote:
>
> You'll want to use grep() or grepl(). By default, grep() uses extended
> regular expressions to find matches, but you can also use Perl-style regular
> expressions and globbing (after converting the glob to a regular expression).
> For example:
>
> grepl("^yr", colnames(mydata))
>
> will tell you which 'colnames' start with "yr". If you'd rather use
> globbing:
>
> grepl(glob2rx("yr*"), colnames(mydata))
>
> Then you might write something like this to remove the columns starting
> with yr:
>
> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>
> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <styen at ntu.edu.tw> wrote:
>
> > > I have a data frame containing variables "yr3",...,"yr28".
> > >
> > > How do I remove them with a wild card, something similar to "del yr*"
> > > in Windows/DOS? Thank you.
> > >
> > > > colnames(mydata)
> > >  [1] "year" "weight" "confeduc" "confothr" "college"
> > >  [6] ...
> > > [41] "yr3" "yr4" "yr5" "yr6" "yr7"
> > > [46] "yr8" "yr9" "yr10" "yr11" "yr12"
> > > [51] "yr13" "yr14" "yr15" "yr16" "yr17"
> > > [56] "yr18" "yr19" "yr20" "yr21" "yr22"
> > > [61] "yr23" "yr24" "yr25" "yr26" "yr27"
> > > [66] "yr28"...
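
To tie the drop = FALSE explanation back to the column-removal line from the original question, here is a small sketch. The data frame below is a made-up stand-in (the real mydata is not shown in the thread), chosen so that only one column survives the removal, which is exactly the case where drop matters.

```r
# Hypothetical toy data; the column names only mimic those in the question.
mydata <- data.frame(year = 2001:2003, yr3 = c(0, 1, 0), yr4 = c(1, 0, 0))

# TRUE for every column whose name starts with "yr" ("year" does not match).
is_yr <- grepl("^yr", colnames(mydata))

# Only "year" is left, so the default drop = TRUE simplifies the result
# to a plain vector.
dropped <- mydata[, !is_yr]

# With drop = FALSE the result stays a one-column data frame.
kept <- mydata[, !is_yr, drop = FALSE]

class(dropped)
class(kept)
colnames(kept)
```

With two or more surviving columns both forms would return a data frame, so the difference only shows up in the single-column case.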
Jeff Newmiller
2023-Feb-12 22:57 UTC
[R] Removing variables from data frame with a wile card
x["V2"] is more efficient than using drop = FALSE, and it is perfectly normal syntax (data frames are lists of columns). I would ignore the naysayers, or put a comment in if you want to accelerate their uptake.

As I understand it, one of the main reasons tibbles exist is the drop = TRUE default. List-style (single-index) indexing works equally well with both standard data frames and tibbles.
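
As an illustration of the list-style indexing described above, the sketch below compares the two forms on the x example from earlier in the thread and then applies the same idea to the column-removal problem. The mydata frame is again a hypothetical stand-in, and the last line is one possible way to write the removal, not code taken from the thread.

```r
x <- data.frame(V1 = 1:5, V2 = letters[1:5])

# List-style indexing: a single index and no comma always yields a data frame.
a <- x["V2"]

# Matrix-style indexing needs drop = FALSE to give the same kind of result.
b <- x[, "V2", drop = FALSE]

# Check whether the two results are the same object.
identical(a, b)

# Hypothetical toy data standing in for the poster's mydata.
mydata <- data.frame(year = 2001:2003, yr3 = c(0, 1, 0), yr4 = c(1, 0, 0))

# List-style removal of the "yr" columns; no drop argument is needed.
trimmed <- mydata[!grepl("^yr", names(mydata))]
str(trimmed)
```

Because there is no comma in the brackets, the logical vector selects columns as list elements, so the result is a data frame even when a single column remains.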