thr3ads.net - R help - [R] Counting enumerated items in each element of a character vector [Apr 2017]

Boris Steipe

2017-Apr-26 00:33 UTC

[R] Counting enumerated items in each element of a character vector

I should add: there's a str_count() function in the stringr package.

library(stringr)
str_count(text1, "Example")
# [1] 5 5 5 5

I guess that would be the neater solution.

B.
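
A caveat on the strsplit() approach quoted below, offered as a sketch rather than a correction: strsplit() drops a trailing empty piece, so length(x) - 1 undercounts by one when a string ends with the pattern. That does not affect text1, where every element ends in "These have been examples.", but it can matter for other input. A base-R alternative counts the matches directly; the helper name count_pattern and the sample vector are made up for this illustration.

```r
# Hypothetical helper: count regex matches per element via regmatches().
count_pattern <- function(x, pattern) {
  lengths(regmatches(x, gregexpr(pattern, x)))
}

# Illustrative input (not from the thread)
x <- c("Example 1, Example 2", "ends with Example", "no match")

# strsplit-based count: misses the match at the very end of element 2
unlist(lapply(strsplit(x, "Example"), function(p) length(p) - 1))
# [1] 2 0 0

# direct count of the matches
count_pattern(x, "Example")
# [1] 2 1 0
```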


> On Apr 25, 2017, at 8:23 PM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
> 
> How about:
> 
> unlist(lapply(strsplit(text1, "Example"), function(x) { length(x) - 1 } ))
> 
> 
> Splitting your string on the five "Examples" in each gives six elements,
> so length(x) - 1 is the number of matches. You can use any regex instead
> of "Example" if you need to tweak what you are looking for.
> 
> 
> B.
> 
> 
> 
> 
>> On Apr 25, 2017, at 8:14 PM, Dan Abner <dan.abner99 at gmail.com> wrote:
>> 
>> Hi all,
>> 
>> I am looking for a streamlined way of counting the number of enumerated
>> items in each element of a character vector. For example:
>> 
>> 
>> text1<-c("This is an example.
>> List 1
>> 1) Example 1
>> 2) Example 2
>> 10) Example 10
>> List 2
>> 1) Example 1
>> 2) Example 2
>> These have been examples.","This is another example.
>> List 1
>> 1. Example 1
>> 2. Example 2
>> 10. Example 10
>> List 2
>> 1. Example 1
>> 2. Example 2
>> These have been examples.","This is a third example. List 1 1) Example 1.
>> 2) Example 2. 10) Example 10. List 2 1) Example 1. 2) Example 2. These have
>> been examples."
>> ,"This is a fourth example. List 1 1. Example 1. 2. Example 2. 10. Example
>> 10. List 2 Example 1. 2. Example 2. These have been examples.")
>> 
>> text1
>> 
>> ==>> 
>> I would like the result to be c(5,5,5,5). Notice that sometimes there are
>> leading hard returns, other times not. Sometimes there are separate lists
>> and the same numbers are used in the enumerated items multiple times within
>> each character string. Sometimes the leading numbers for the enumerated
>> items exceed single digits. Notice that the delimiter may be ) or a period
>> (.). If the delimiter is a period and there are hard returns (example 2),
>> then I expect that will be easy enough to differentiate sentences ending
>> with a number from enumerated items. However, I imagine it would be much
>> more difficult to differentiate the two for example 4.
>> 
>> Any suggestions are appreciated.
>> 
>> Best,
>> 
>> Dan
>> 
>> 	[[alternative HTML version deleted]]
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Michael Hannon

2017-Apr-26 03:40 UTC

I like Boris's "Hadley" solution.  For the record, I've appended a
version that uses regular expressions, the only benefit of which is
that it could be generalized to find more-complicated patterns.

-- Mike

counts <- sapply(text1, function(next_string) {
    loc_example <- length(gregexpr("Example", next_string)[[1]])
    loc_example
}, USE.NAMES=FALSE)
> counts
[1] 5 5 5 5
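
One thing to watch when generalizing this, sketched here under the assumption that some strings may contain no match at all: gregexpr() reports "no match" as a single -1, so length() of its result is 1 rather than 0. Counting only the positive match positions avoids that. The name count_hits and the sample vector are hypothetical, not part of the thread.

```r
# Hypothetical helper: count regex matches per string, returning 0 when
# there is no match (gregexpr() signals that case with -1).
count_hits <- function(x, pattern) {
  vapply(gregexpr(pattern, x), function(m) sum(m > 0L), integer(1))
}

# Illustrative input (not from the thread)
x <- c("Example 1 and Example 2", "nothing here")

length(gregexpr("Example", x[2])[[1]])  # 1, although there is no match
count_hits(x, "Example")                # 2 0
```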

Ista Zahn

2017-Apr-26 03:47 UTC

stringr::str_count (and stringi::stri_count that it wraps) interpret
the pattern argument as a regular expression by default.

Best,
Ista
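
Because the pattern is a regular expression, the same kind of counting call can target the enumerators themselves rather than the word "Example". The following is only a sketch for the easy case from the original question, where each item starts on its own line. The sample vector and the name count_enumerators are assumptions made for illustration, and the inline cases (examples 3 and 4) remain ambiguous, as Dan noted.

```r
# Hypothetical helper: count lines that begin with digits followed by
# ")" or "." and then whitespace. (?m) lets ^ match at every line start.
count_enumerators <- function(x) {
  m <- gregexpr("(?m)^[0-9]+[.)]\\s", x, perl = TRUE)
  lengths(regmatches(x, m))
}

# Illustrative input (not from the thread)
x <- c("List 1\n1) Example 1\n2) Example 2\n10) Example 10",
       "List 1\n1. Example 1\n2. Example 2\nNo items after this.")

count_enumerators(x)
# [1] 3 2
```

With stringr, the equivalent call should be str_count(x, regex("^[0-9]+[.)]\\s", multiline = TRUE)).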

