Steven T. Yen
2023-Feb-12 22:18 UTC
[R] Removing variables from data frame with a wild card
In the line suggested by Andrew Simmons,

mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]

what does drop=FALSE do? Thanks.

On 1/14/2023 8:48 PM, Steven Yen wrote:
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
>> On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsimmo at gmail.com> wrote:
>>
>> You'll want to use grep() or grepl(). By default, grep() uses extended
>> regular expressions to find matches, but you can also use perl regular
>> expressions and globbing (after converting to a regular expression).
>> For example:
>>
>> grepl("^yr", colnames(mydata))
>>
>> will tell you which 'colnames' start with "yr". If you'd rather
>> use globbing:
>>
>> grepl(glob2rx("yr*"), colnames(mydata))
>>
>> Then you might write something like this to remove the columns
>> starting with yr:
>>
>> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>>
>> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <styen at ntu.edu.tw> wrote:
>>>
>>> I have a data frame containing variables "yr3",...,"yr28".
>>>
>>> How do I remove them with a wild card----something similar to "del yr*"
>>> in Windows/doc? Thank you.
>>>
>>> > colnames(mydata)
>>>  [1] "year"     "weight"   "confeduc" "confothr" "college"
>>>  [6] ...
>>> [41] "yr3"      "yr4"      "yr5"      "yr6"      "yr7"
>>> [46] "yr8"      "yr9"      "yr10"     "yr11"     "yr12"
>>> [51] "yr13"     "yr14"     "yr15"     "yr16"     "yr17"
>>> [56] "yr18"     "yr19"     "yr20"     "yr21"     "yr22"
>>> [61] "yr23"     "yr24"     "yr25"     "yr26"     "yr27"
>>> [66] "yr28"...
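The grepl() and glob2rx() approach quoted above can be tried on a small made-up data frame. This is only a sketch: the column values and the helper names is_yr, is_yr_glob and kept are assumptions for illustration, not the poster's data.

```r
# Hypothetical stand-in for mydata; only the column names matter here.
mydata <- data.frame(year = 2001:2003, weight = c(1.2, 0.8, 1.0),
                     yr3 = 0, yr4 = 1, yr28 = 0)

# TRUE for every column whose name starts with "yr"
is_yr <- grepl("^yr", colnames(mydata))

# glob2rx() (from utils) converts the wildcard "yr*" to a regular expression
is_yr_glob <- grepl(glob2rx("yr*"), colnames(mydata))

# Remove the matching columns, keeping a data frame whatever is left
kept <- mydata[, !is_yr, drop = FALSE]
colnames(kept)
```

Note that "year" is kept, because the pattern requires the name to begin with the two letters "yr".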
@vi@e@gross m@iii@g oii gm@ii@com
2023-Feb-12 22:25 UTC
[R] Removing variables from data frame with a wild card
Steven,

The default is drop=TRUE. Use drop=FALSE if you want to retain a data.frame and not have it reduced to a vector when the indexing selects a single column.

https://win-vector.com/2018/02/27/r-tip-use-drop-false-with-data-frames/
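To see why this matters for the column-removal line in this thread, here is a minimal sketch using a hypothetical data frame d in which only one column survives the filter (the names d, keep, a and b are made up for illustration):

```r
# Hypothetical data frame: after removing the yr* columns only "year" remains
d <- data.frame(year = 2001:2003, yr3 = 0, yr4 = 1)
keep <- !grepl("^yr", colnames(d))

a <- d[, keep]                # default drop = TRUE: reduced to a plain vector
b <- d[, keep, drop = FALSE]  # still a data frame, with one column

is.data.frame(a)
is.data.frame(b)
```

Code that later calls something like colnames() or nrow() on the result would behave differently for a and b, which is the kind of surprise drop = FALSE avoids.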
Andrew Simmons
2023-Feb-12 22:30 UTC
[R] Removing variables from data frame with a wild card
drop = FALSE means that, should the indexing select exactly one column, the result is a data frame with one column instead of the object in that column. It's usually not necessary, but I've messed up some data before by assuming the indexing always returns a data frame when it doesn't, so drop = FALSE lets me be sure that I will always get a data frame.

```
x <- data.frame(V1 = 1:5, V2 = letters[1:5])
x[, "V2"]
x[, "V2", drop = FALSE]
```

You'll notice that the first returns a character vector, a through e, whereas the second returns a data frame with one column, where the object in the column is that same character vector.

You could alternatively use

x["V2"]

which should be identical to x[, "V2", drop = FALSE], but some people don't like that because it doesn't look like matrix indexing anymore.
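The equivalence between the indexing forms mentioned above can be checked directly. This is a small sketch that reuses the data frame x from the example; the variable names list_style, matrix_style and as_vector are assumptions for illustration.

```r
x <- data.frame(V1 = 1:5, V2 = letters[1:5])

list_style   <- x["V2"]                   # list-style indexing, keeps a data frame
matrix_style <- x[, "V2", drop = FALSE]   # matrix-style indexing, drop suppressed
as_vector    <- x[, "V2"]                 # matrix-style indexing, default drop = TRUE

# Compare the two data-frame results
all.equal(list_style, matrix_style)

# x[["V2"]] extracts the column itself, like the default drop = TRUE form
identical(x[["V2"]], as_vector)
```

Which form to use is largely a matter of taste: x["V2"] is shorter, while the drop = FALSE form keeps the row and column positions visible.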