thr3ads.net - R help - [R] gsub issue with consecutive pattern finds [Mar 2024]

Iris Simmons

2024-Mar-01 11:49 UTC

[R] gsub issue with consecutive pattern finds

Hi Iago,


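This is not a bug; it is expected behaviour. Matches may not overlap: once
a vowel has been consumed as the second character of one match, it cannot
also be the first character of the next. However, there is a way to get the
result you want using perl = TRUE: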

```R
gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_",
"aerioue", perl = TRUE)
```

The specific change I made is called a positive lookahead; you can read
more about it here:

https://www.regular-expressions.info/lookaround.html

It's a way to check for a piece of text without consuming it in the match.
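
To make the "without consuming" part concrete, here is a minimal sketch (my own illustration, not part of the original question) that uses gregexpr() to list where each pattern matches in "aerioue" and how long each match is. The variable names are just placeholders:

```R
x <- "aerioue"

# Two-vowel pattern: every match swallows both vowels, so the second
# vowel of a pair can never start another match.
consuming <- gregexpr("[aeiou][aeiou]", x, ignore.case = TRUE)[[1]]
as.integer(consuming)             # start positions: 1 4 6
attr(consuming, "match.length")   # 2 2 2

# Lookahead pattern: only the first vowel is part of the match; the
# following vowel is checked but left available for the next match.
lookahead <- gregexpr("[aeiou](?=[aeiou])", x,
                      perl = TRUE, ignore.case = TRUE)[[1]]
as.integer(lookahead)             # start positions: 1 4 5 6
attr(lookahead, "match.length")   # 1 1 1 1
```

The extra match at position 5 (the "o" before "u") is the one the original pattern skips, because that "o" was already used up by the match starting at position 4.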

Also, since you don't care about character case, it might be more legible
to add ignore.case = TRUE and remove the upper case characters:

```R
gsub("([aeiou])(?=[aeiou])", "\\1_", "aerioue",
perl = TRUE, ignore.case = TRUE)

## or

gsub("(?i)([aeiou])(?=[aeiou])", "\\1_",
"aerioue", perl = TRUE)
```
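
As a quick sanity check, the three spellings of the pattern can be compared directly. This is only a sketch; r1, r2, r3 and the mixed-case input "AEiou" are names and data I made up for the illustration:

```R
x <- "aerioue"

# Explicit upper and lower case in the character class
r1 <- gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_", x, perl = TRUE)

# Case handled by the ignore.case argument
r2 <- gsub("([aeiou])(?=[aeiou])", "\\1_", x,
           perl = TRUE, ignore.case = TRUE)

# Case handled by the inline (?i) flag
r3 <- gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", x, perl = TRUE)

# All three should agree
stopifnot(identical(r1, r2), identical(r2, r3))
r1   # "a_eri_o_u_e"

# Mixed-case input, to confirm the case-insensitive forms behave the same
mixed <- gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", "AEiou", perl = TRUE)
mixed   # "A_E_i_o_u"
```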

I hope this helps!


On Fri, Mar 1, 2024, 06:37 Iago Giné Vázquez <iago.gine at sjd.es> wrote:
> Hi all,
>
> I tested the following command:
>
> gsub("([aeiouAEIOU])([aeiouAEIOU])", "\\1_\\2", "aerioue")
>
> with the following output:
>
> [1] "a_eri_ou_e"
>
> So, there are two consecutive vowels ("o" and "u") where an underscore is
> not added.
>
> Could this be a bug? Is it expected (bug or not)? Is there any way to get
> what I want (an underscore between each pair of consecutive vowels)?
>
>
> Thank you!
>
> Best regards,
>
> Iago

Iago Giné Vázquez

2024-Mar-01 11:52 UTC

[R] gsub issue with consecutive pattern finds

Hi Iris,

Thank you. That is also a very nice solution.

Best,

Iago

