2018 May 05
Discovering patterns in textual strings
...are
> AdNames (Real Time Bidding data)
>
> #1. Generally yes, but not always
>
> #2 Separators could be underscores (_) or dots (.) as in 1.2.3_ABC ......
>
> #3 Yes. So Abc 123 could be a matching string
>
> This would not be considered a match ...
> abc_something
> this.is_a long stringwithabcinthemiddle
>
> The sequence(s) are always at the beginning (or so it appears). Out
> of the 54 billion records I am able to pull (SparkR sql) 948,679 unique
> strings. It is from these unique strings that I (if possible) want to
> identify...
2018 May 07
Discovering patterns in textual strings
...rge 53 billion records. The column in question is AdNames (Real Time Bidding data)
#1. Generally yes, but not always
#2 Separators could be underscores (_) or dots (.) as in 1.2.3_ABC .....
#3 Yes. So Abc 123 could be a matching string
This would not be considered a match ...
abc_something
this.is_a long stringwithabcinthemiddle
The sequence(s) are always at the beginning (or so it appears). Out of the 54 billion records, I am able to pull (SparkR sql) 948,679 unique strings. It is from these unique strings that I (if possible) want to identify the "key" strings....
2018 May 04
Discovering patterns in textual strings
R Help Forum
Is there an R library (or a way) that I can use to extract unique character
strings, or repeating patterns, in textual strings? Say, for example, I have
the following records:
Abc_1234_kjhksh_276
Abc
Abc_1234_lakdofyo_324
Bce_876_skdhk_*&^%*&
Bce
Bce_454
And I would like to see the following results:
Abc
Abc_1234
Bce
Jeff Reichman
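
One possible reading of the request above is "keep every underscore-delimited prefix that is shared by at least two records". The base R sketch below follows that reading only; it is not a method taken from the thread. The function names token_prefixes and shared_prefixes, the min_count threshold, and the choice of "_" as the only separator are all assumptions made for illustration. Dots or spaces as separators, mentioned in the follow-ups, would need a different split.

```r
# Sample records from the original question
records <- c("Abc_1234_kjhksh_276", "Abc", "Abc_1234_lakdofyo_324",
             "Bce_876_skdhk_*&^%*&", "Bce", "Bce_454")

# Hypothetical helper: all separator-delimited prefixes of one string,
# e.g. "a_b_c" gives "a", "a_b", "a_b_c"
token_prefixes <- function(x, sep = "_") {
  tokens <- strsplit(x, sep, fixed = TRUE)[[1]]
  vapply(seq_along(tokens),
         function(i) paste(tokens[seq_len(i)], collapse = sep),
         character(1))
}

# Hypothetical helper: prefixes that occur in at least min_count
# distinct strings (assumed rule, inferred from the sample output)
shared_prefixes <- function(strings, sep = "_", min_count = 2) {
  strings <- unique(strings)
  all_prefixes <- unlist(lapply(strings, token_prefixes, sep = sep))
  counts <- table(all_prefixes)
  names(counts)[counts >= min_count]
}

key_strings <- shared_prefixes(records)
print(key_strings)  # Abc, Abc_1234 and Bce for the sample records
```

On the six sample records this keeps Abc, Abc_1234 and Bce, and drops prefixes such as Bce_876 and Bce_454 that appear in only one record. The approach works on the unique strings held in memory, so it assumes the distinct values have already been pulled out of Spark.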