thr3ads.net - R help - [R] interval between specific characters in a string... [Dec 2022]


Rui Barradas

2022-Dec-03 08:48 UTC

[R] interval between specific characters in a string...

At 17:18 on 02/12/2022, Evan Cooch wrote:
> Was wondering if there is an 'efficient/elegant' way to do the following
> (without tidyverse). Take a string
>
> abaaabbaaaaabaaab
>
> It's easy enough to count the number of times the character 'b' shows up
> in the string, but...what I'm looking for is outputting the 'intervals'
> between occurrences of 'b' (starting the counter at the beginning of the
> string). So, for the preceding example, 'b' shows up in positions
>
> 2, 6, 7, 13, 17
>
> So, the interval data would be: 2, 4, 1, 6, 4
>
> My main approach has been to simply output positions (say, something
> like unlist(gregexpr('b', target_string))), and 'do the math' between
> successive positions. Can anyone suggest a more elegant approach?
>
> Thanks in advance...
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Hello,

I don't find your solution inelegant; it's even easy to write it as a
one-line function.


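# x: pattern to locate; s: character vector to search.
# For each element of s, returns the position of the first match
# followed by the gaps between successive matches.
char_interval <- function(x, s) {
   lapply(gregexpr(x, s), \(y) c(head(y, 1), diff(y)))
}

target_string <- "abaaabbaaaaabaaab"
char_interval('b', target_string)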
#> [[1]]
#> [1] 2 4 1 6 4


Hope this helps,

Rui Barradas
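
One edge case the function above does not cover: gregexpr() reports "no match" as -1, so a string without any 'b' would yield -1 rather than an empty result. The following is a minimal sketch of a guarded variant, not something posted in the thread. The name char_interval_safe and the fixed = TRUE default are assumptions made for illustration, and the sketch assumes the input contains no NA strings.

```r
# Hypothetical helper (not from the thread): same idea as char_interval(),
# but returns integer(0) for strings in which the pattern never occurs.
# Assumes s contains no NA values.
char_interval_safe <- function(x, s, fixed = TRUE) {
  lapply(gregexpr(x, s, fixed = fixed), function(pos) {
    if (pos[1] == -1L) return(integer(0))  # gregexpr() signals "no match" with -1
    pos <- as.integer(pos)                 # drop the match.length attribute
    c(pos[1], diff(pos))                   # first position, then successive gaps
  })
}

res <- char_interval_safe("b", c("abaaabbaaaaabaaab", "aaaa"))
print(res)
```

With fixed = TRUE the pattern is treated as a literal string, so characters such as '.' are matched as themselves rather than as regular-expression metacharacters.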

Bert Gunter

2022-Dec-03 15:21 UTC

[R] interval between specific characters in a string...

Perhaps it is worth pointing out that looping constructs like lapply() can
be avoided and the procedure vectorized by mimicking Martin Morgan's
solution:

## s is the string to be searched.
diff(c(0,grep('b',strsplit(s,'')[[1]])))

However, Martin's solution is simpler and likely even faster as the regex
engine is unneeded:

diff(c(0, which(strsplit(s, "")[[1]] == "b"))) ## completely vectorized

This seems much preferable to me.

-- Bert
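
For readers who want to see why prepending 0 gives the intervals the original question asked for, here is a small step-by-step sketch of the two expressions above applied to the example string. The intermediate variable names (chars, pos, intervals, intervals_grep) are illustrative assumptions and do not come from the thread.

```r
s <- "abaaabbaaaaabaaab"

# Split the string into single characters.
chars <- strsplit(s, "")[[1]]

# Positions of 'b' found by plain comparison, with no regex involved.
pos <- which(chars == "b")

# Prepending 0 makes the first difference equal to the first position,
# i.e. the counter starts at the beginning of the string.
intervals <- diff(c(0, pos))

# The grep()-based variant locates the same positions via the regex engine.
intervals_grep <- diff(c(0, grep("b", chars)))

print(pos)
print(intervals)
```

Both variants work on one string at a time because of the [[1]]; handling a vector of strings would need a loop or lapply() over the result of strsplit().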




