thr3ads.net - R help - [R] A question on regular expression [Sep 2019]


Christofer Bogaso

2019-Sep-12 17:12 UTC

[R] A question on regular expression

Thanks Bert,

This works, but if my text contains more than one match, it fails to
generate the desired result.

library(stringr)
str_extract_all(paste("ab{cd$ }ed", "ab{cad$ }ed", collapse = " "),
                ".*(\\{.*\\}).*")

This generates below -

[[1]]

[1] "ab{cd$ }ed ab{cad$ }ed"

I was expecting a vector of length 2 containing the desired patterns.

Where did I make a mistake?

Thanks,

On Thu, Sep 12, 2019 at 10:29 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> > sub(".*(\\{.*\\}).*", "\\1","ab{cd$ }ed")
> [1] "{cd$ }"
>
> Use ".+" instead of ".*" within the {} if you don't want to return empty {}'s.
>
> You might wish to use the stringr package for string matching and
> manipulation, as it provides a more user friendly and consistent interface to
> these tasks.
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
>
> On Thu, Sep 12, 2019 at 9:31 AM Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>>
>> Hi,
>>
>> I am wondering what the correct way is to select a pattern which goes as -
>>
>> {"(any character with any length)"}
>>
>> The expressions {" and "} are both included in the pattern.
>>
>> For example, the lookup of the above pattern in the text
>> "{"asaf455%"}57573blabla" will result in {"asaf455%"}
>>
>> Any help will be highly appreciated.
>>
>> Thanks,
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
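Editor's sketch of the quoted note on ".+" versus ".*" inside the braces. It uses only base R; the second input, "ab{}ed", is a made-up test string and does not come from the thread. It suggests that ".+" does skip empty braces, but also that sub() hands back a non-matching input unchanged rather than dropping it.

```r
# Two inputs: one with content inside the braces, one with empty braces.
# "ab{}ed" is a hypothetical example added for illustration.
x <- c("ab{cd$ }ed", "ab{}ed")

# ".*" inside the braces also accepts an empty pair.
with_star <- sub(".*(\\{.*\\}).*", "\\1", x)

# ".+" requires at least one character between the braces, so "ab{}ed"
# does not match and sub() returns that input unchanged.
with_plus <- sub(".*(\\{.+\\}).*", "\\1", x)

print(with_star)  # "{cd$ }" "{}"
print(with_plus)  # "{cd$ }" "ab{}ed"
```

If unmatched inputs should be dropped rather than passed through, one option is to filter with grepl() on the same pattern before calling sub().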

Bert Gunter

2019-Sep-12 18:49 UTC


[R] A question on regular expression

You can't use the same regex for str_extract_all as I used for sub (or
gsub, which is what is required here)! If you do this sort of thing a lot,
you *must* learn more about regex's.

Anyway, this will do what you want I think:

z <- paste("ab{cd$ }ed", "ab{cad$ }ed", collapse = " ")  ## just for readability
> str_extract_all(z,"\\{[^}]*\\}")[[1]]
[1] "{cd$ }"  "{cad$ }"

Cheers,
Bert
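Editor's sketch of why the sub() pattern cannot be reused for extracting every match. It relies on base R's regmatches() and gregexpr() so that it runs without extra packages; the helper name extract_all is hypothetical. For these patterns, str_extract_all(z, pattern)[[1]] is expected to behave the same way, though that is an assumption rather than something checked here.

```r
z <- "ab{cd$ }ed ab{cad$ }ed"

# Hypothetical helper: return every match of `pattern` in one string.
extract_all <- function(pattern, text) {
  regmatches(text, gregexpr(pattern, text))[[1]]
}

# The leading and trailing ".*" swallow the text around the braces,
# so the single match is the entire string.
greedy <- extract_all(".*(\\{.*\\}).*", z)

# Dropping the outer ".*" is not enough: the inner ".*" is greedy and
# runs from the first "{" to the last "}".
braces <- extract_all("\\{.*\\}", z)

# "[^}]*" cannot cross a closing brace, so each pair is matched separately.
negated <- extract_all("\\{[^}]*\\}", z)

print(greedy)   # "ab{cd$ }ed ab{cad$ }ed"
print(braces)   # "{cd$ }ed ab{cad$ }"
print(negated)  # "{cd$ }" "{cad$ }"
```

With sub(), the outer ".*" is needed because the whole string is replaced by the captured group; with an extract-all function, only the part to be returned should be in the pattern.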


Christofer Bogaso

2019-Sep-12 18:54 UTC


[R] A question on regular expression

Awesome, thanks!


Christofer Bogaso

2019-Sep-13 10:59 UTC


[R] A question on regular expression

A quick question.

Could you please explain the -- [^}]* -- part in finding the pattern?
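Editor's note, since the archived thread ends without a reply to this question: in "\\{[^}]*\\}", the bracket expression [^}] matches any single character except a closing brace (the ^ right after the opening bracket negates the set), and * repeats it zero or more times. The sketch below, in base R, shows where such runs stop. The lazy quantifier ".*?" with perl = TRUE is an alternative brought in for comparison and is not something the thread used.

```r
z <- "ab{cd$ }ed ab{cad$ }ed"

# Runs of characters that are not "}". Each run ends right before a
# closing brace ("+" is used instead of "*" to skip empty matches).
runs <- regmatches(z, gregexpr("[^}]+", z))[[1]]
print(runs)  # "ab{cd$ " "ed ab{cad$ " "ed"

# Full pattern: a literal "{", then a run of non-"}" characters,
# then the first "}" that follows.
negated <- regmatches(z, gregexpr("\\{[^}]*\\}", z))[[1]]

# Alternative (not from the thread): a lazy ".*?" with perl = TRUE
# also stops at the first closing brace.
lazy <- regmatches(z, gregexpr("\\{.*?\\}", z, perl = TRUE))[[1]]

print(negated)  # "{cd$ }" "{cad$ }"
print(lazy)     # "{cd$ }" "{cad$ }"
```

The negated class states the intent directly (nothing between the braces may be a closing brace) and does not depend on which regex engine or quantifier mode is in use.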

