Steven T. Yen
2023-Feb-12 22:18 UTC
[R] Removing variables from data frame with a wile card
In the line suggested by Andrew Simmons,
mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
what does drop=FALSE do? Thanks.
On 1/14/2023 8:48 PM, Steven Yen wrote:
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
>> On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsimmo at gmail.com> wrote:
>>
>> You'll want to use grep() or grepl(). By default, grep() uses extended
>> regular expressions to find matches, but you can also use perl regular
>> expressions and globbing (after converting to a regular expression).
>> For example:
>>
>> grepl("^yr", colnames(mydata))
>>
>> will tell you which 'colnames' start with "yr". If you'd rather
>> use globbing:
>>
>> grepl(glob2rx("yr*"), colnames(mydata))
>>
>> Then you might write something like this to remove the columns
>> starting with yr:
>>
>> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>>
>> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <styen at ntu.edu.tw> wrote:
>>>
>>> I have a data frame containing variables "yr3",...,"yr28".
>>>
>>> How do I remove them with a wild card----something similar to "del yr*"
>>> in Windows/DOS? Thank you.
>>>
>>>> colnames(mydata)
>>>  [1] "year"     "weight"   "confeduc" "confothr" "college"
>>>  [6] ...
>>> [41] "yr3"      "yr4"      "yr5"      "yr6"      "yr7"
>>> [46] "yr8"      "yr9"      "yr10"     "yr11"     "yr12"
>>> [51] "yr13"     "yr14"     "yr15"     "yr16"     "yr17"
>>> [56] "yr18"     "yr19"     "yr20"     "yr21"     "yr22"
>>> [61] "yr23"     "yr24"     "yr25"     "yr26"     "yr27"
>>> [66] "yr28"...
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
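For readers following the quoted suggestion, here is a minimal sketch of what the two patterns do. The vector `nms` is a hypothetical stand-in for `colnames(mydata)`, not the poster's real data. It is meant to show that "^yr" matches only names that begin with "yr", so a column such as "year" is left alone, and that `glob2rx()` turns the wildcard into an equivalent regular expression.

```r
# Hypothetical column names standing in for colnames(mydata)
nms <- c("year", "weight", "yr3", "yr28")

# TRUE for names that start with "yr"; "year" starts with "ye", so it is FALSE
starts_with_yr <- grepl("^yr", nms)
print(starts_with_yr)

# glob2rx() (utils package) converts the wildcard "yr*" to a regular expression
rx <- utils::glob2rx("yr*")
print(rx)

# The names that would be kept after dropping the "yr" columns
kept <- nms[!starts_with_yr]
print(kept)
```

Inspecting the logical vector before using it to subset is a cheap way to confirm that the pattern does not catch more columns than intended.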
@vi@e@gross m@iii@g oii gm@ii@com
2023-Feb-12 22:25 UTC
[R] Removing variables from data frame with a wile card
Steven,
The default is drop=TRUE.
Use drop=FALSE if you want to retain a data.frame and not have it reduced to a
vector under some circumstances, such as when the indexing selects a single column.
https://win-vector.com/2018/02/27/r-tip-use-drop-false-with-data-frames/
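As a rough illustration of that circumstance, the sketch below uses a small made-up data frame (the columns `weight`, `yr3` and `yr4` and their values are assumptions for the example, not the poster's data) in which only one column survives once the "yr" columns are removed.

```r
# Hypothetical data: only one column remains after dropping the "yr" columns
mydata <- data.frame(weight = c(60, 72, 81), yr3 = 1:3, yr4 = 4:6)
keep <- !grepl("^yr", colnames(mydata))

# Default drop = TRUE: the single remaining column collapses to a plain vector
as_vector <- mydata[, keep]

# drop = FALSE: the result is still a data frame with one column
as_frame <- mydata[, keep, drop = FALSE]

print(is.data.frame(as_vector))
print(is.data.frame(as_frame))
print(colnames(as_frame))
```

Code that later calls colnames() or nrow() on the result would behave differently in the two cases, which is why drop = FALSE is a common defensive habit.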
Andrew Simmons
2023-Feb-12 22:30 UTC
[R] Removing variables from data frame with a wile card
drop = FALSE means that, should the indexing select exactly one column, the result is a data frame with one column instead of the object in that column. It's usually not necessary, but I've messed up some data before by assuming the indexing always returns a data frame when it doesn't, so drop = FALSE lets me be sure that I will always get a data frame.

```
x <- data.frame(V1 = 1:5, V2 = letters[1:5])
x[, "V2"]
x[, "V2", drop = FALSE]
```

You'll notice that the first returns a character vector, a through e, whereas the second returns a data frame with one column where the object in the column is the same character vector.

You could alternatively use x["V2"], which should be identical to x[, "V2", drop = FALSE], but some people don't like that because it doesn't look like matrix indexing anymore.