thr3ads.net - R help - [R] remove a row [Nov 2019]

Ashta

2019-Nov-28 23:16 UTC

[R] remove a row

Hi all, I want to remove rows from a data frame based on a condition on
one of its variables.
When the string is split, it should follow a 3-2-5 format: an area code
of up to 3 digits, a region of up to 2 letters, and a numeric part of up
to 5 digits (area code-region-numeric). The area code can be 1, 2, or 3
digits, but no more than 3. So the maximum length of this variable is 10.
Anything that does not match this pattern should be excluded.
As an example:

dat <- read.table(text = " rown  varx
1   9F209
2  FL250
3  2F250
4  102250
5  102FL
6   102
7  1212FL250
8  121FL50", header = TRUE, stringsAsFactors = FALSE)

1  9F209      # keep
2  FL250      # remove, no area code
3  2F250      # keep
4  102250     # remove, no region code
5  102FL      # remove, no numeric part after region code
6  102        # remove, no region code and no numeric part
7  1212FL250  # remove, area code is more than three digits
8  121FL50    # keep

The desired output should be
1   9F209
3   2F250
8  121FL50

How do I do this in an efficient way?

Thank you in advance

Bert Gunter

2019-Nov-29 01:31 UTC

Use regular expressions.

See ?regexp  and ?grep

Using your example:
> grep("^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$", dat$varx, value = TRUE)
[1] "9F209"   "2F250"   "121FL50"

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
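
Note that grep(..., value = TRUE) returns only the matching strings. To drop the non-matching rows from the data frame itself, one option is to use grepl() with the same pattern as a logical row index. A minimal sketch, assuming the example data from the original post (keep and dat_clean are illustrative names, not from the thread):

```r
# Rebuild the example data from the original post.
dat <- read.table(text = "rown varx
1 9F209
2 FL250
3 2F250
4 102250
5 102FL
6 102
7 1212FL250
8 121FL50", header = TRUE, stringsAsFactors = FALSE)

# 1-3 digits, then 1-2 letters, then 1-5 digits, anchored at both ends.
pattern <- "^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$"

# grepl() returns one TRUE/FALSE per row, so it can index the data frame.
keep <- grepl(pattern, dat$varx)
dat_clean <- dat[keep, ]
print(dat_clean)
```

Because the index is logical, the original row numbers (1, 3, 8) are kept in the result.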



Ashta

2019-Nov-29 01:57 UTC

Thank you so much, Bert.

Is it possible to split varx into three separate variables (area code,
region, and the numeric part)?
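
One possible approach, not given in the thread, is to add capture groups to the same pattern and pull each group out with sub(). A minimal sketch, assuming the values have already been filtered to the valid format; the names parts, area, region, and number are illustrative:

```r
# Values that already passed the format check.
varx <- c("9F209", "2F250", "121FL50")

# Same pattern as before, with one capture group per component.
pattern <- "^([[:digit:]]{1,3})([[:alpha:]]{1,2})([[:digit:]]{1,5})$"

# sub() replaces the whole match with the requested group (\\1, \\2, \\3).
parts <- data.frame(
  varx   = varx,
  area   = sub(pattern, "\\1", varx),
  region = sub(pattern, "\\2", varx),
  number = sub(pattern, "\\3", varx),
  stringsAsFactors = FALSE
)
print(parts)
```

The components are left as character strings so that any leading zeros survive; as.integer() can be applied afterwards if numeric columns are wanted.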

