thr3ads.net - R help - [R] Retain last grouping after a strsplit() [Dec 2012]

Steven Ranney

2012-Dec-11 17:46 UTC

[R] Retain last grouping after a strsplit()

All -

I have a column of SiteNames:

SiteName
OYS-PIA2-FL-1
OYS-PIA2-LA-1
OYS-PI-LA-BB-1
OYS-PIA2-LA-10
...
[truncated]

and I want to extract only the trailing digits into a new column.

I tried

substr(data$SiteName, 13, 20)

but because some SiteName values are of a different length, the final
hyphen (i.e., "-") was included:

"1"
"1"
"-1"
"10"
...

so I use

strsplit(data$SiteName, split = "-")

and get

"OYS" "PIA2" "FL" "1"
"OYS" "PIA2" "LA" "1"
"OYS" "PI" "LA" "BB" "1"
"OYS" "PIA2" "LA" "10"
...

which is great.  Unfortunately, I'm stuck.  I don't know how to
pull the final element of each strsplit() result into a new column.

Can you help?

Thanks -

SR
Steven H. Ranney

jim holtman

2012-Dec-11 18:10 UTC

try this:

> x
[1] "OYS-PIA2-FL-1"  "OYS-PIA2-LA-1"  "OYS-PI-LA-BB-1" "OYS-PIA2-LA-10"
> sub("^.*?([0-9]+)$", "\\1", x)
[1] "1"  "1"  "1"  "10"


On Tue, Dec 11, 2012 at 12:46 PM, Steven Ranney <steven.ranney at gmail.com> wrote:
> OYS-PIA2-FL-1
> OYS-PIA2-LA-1
> OYS-PI-LA-BB-1
> OYS-PIA2-LA-10


-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

David Winsemius

2012-Dec-11 18:37 UTC

On Dec 11, 2012, at 10:10 AM, jim holtman wrote:
> try this:
>
>> x
> [1] "OYS-PIA2-FL-1"  "OYS-PIA2-LA-1"  "OYS-PI-LA-BB-1" "OYS-PIA2-LA-10"
>> sub("^.*?([0-9]+)$", "\\1", x)
> [1] "1"  "1"  "1"  "10"

Steve;

jim holtman is one of the jewels of the rhelp world. I generally
assume that his answers are going to be the most succinct and
efficient ones possible and avoid adding noise, but here I thought I
would try to improve. Thinking there might be a string-splitting
approach, I first tried this (and discovered a not-so-great solution):

  x <- c("OYS-PIA2-FL-1", "OYS-PIA2-LA-1", "OYS-PI-LA-BB-1", "OYS-PIA2-LA-10")
  sapply( strsplit(x, "-") , "[", 4)
[1] "1"  "1"  "BB" "10"
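
A string-splitting approach can still work if each split vector is indexed by its own last position rather than a fixed one. A minimal sketch, using the same four values; the names `pieces` and `last_piece` are made up for illustration:

```r
x <- c("OYS-PIA2-FL-1", "OYS-PIA2-LA-1", "OYS-PI-LA-BB-1", "OYS-PIA2-LA-10")

# Split on the literal dash, then take the last element of each
# piece vector, whatever its length happens to be.
pieces <- strsplit(x, "-", fixed = TRUE)
last_piece <- vapply(pieces, function(p) p[length(p)], character(1))

last_piece
# [1] "1"  "1"  "1"  "10"
```

One caveat: strsplit() drops a trailing empty string, so a name that ends in a dash would return the piece before that dash rather than "".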

So then I asked myself if we could just "blank out" everything before
the last dash, finding what seemed to be a fairly economical solution
and one that does not require back-references:

  sub( "^.+-" , "", x)
[1] "1"  "1"  "1"  "10"

If there were no digits after the last dash these approaches give  
different results:

  x <- c("OYS-PIA2-FL-1", "OYS-PIA2-LA-1", "OYS-PI-LA-BB-1", "OYS-PIA2-LA-")

  sub( "^.+-" , "", x)
[1] "1" "1" "1" ""

  sub("^.*?([0-9]+)$", "\\1", x)
[1] "1"            "1"            "1"            "OYS-PIA2-LA-"

When a regex pattern does not match, sub and gsub return the input
string unchanged.
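
If neither "" nor the unchanged input is wanted for a non-matching name, one option is to extract the match instead of substituting, and leave NA where nothing matched. This is only a sketch of that alternative, not something proposed in the thread; `site_num` is an invented name:

```r
x <- c("OYS-PIA2-FL-1", "OYS-PIA2-LA-1", "OYS-PI-LA-BB-1", "OYS-PIA2-LA-")

# regexpr() returns -1 for elements where the pattern is absent.
m <- regexpr("[0-9]+$", x)

# Start from all-NA and fill in only the elements that matched;
# regmatches() returns just the matched substrings, in order.
site_num <- rep(NA_character_, length(x))
site_num[m != -1] <- regmatches(x, m)

site_num
# [1] "1" "1" "1" NA
```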

-- 
David.
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Alameda, CA, USA

arun

2012-Dec-11 19:31 UTC

Hi,
You could also use:
x <- c("OYS-PIA2-FL-1", "OYS-PIA2-LA-1", "OYS-PI-LA-BB-1", "OYS-PIA2-LA-10")
gsub(".*\\-(\\d+)$","\\1",x)
#[1] "1"  "1"  "1"  "10"

#or
gsub("[A-Z2-]","",x) #in this case
#[1] "1"  "1"  "1"  "10"
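
The second pattern is tied to this particular data, as the "in this case" comment hints: it deletes uppercase letters, the digit 2, and dashes wherever they occur, so other digits earlier in the name survive and a 2 in the suffix is lost. A quick check, where the second and third site names are hypothetical and not from the original data:

```r
# First value is from the thread; the other two are made up.
x <- c("OYS-PIA2-LA-10", "OYS-PIA3-LA-1", "OYS-PIA2-LA-12")

# Character-class deletion: removes A-Z, "2", and "-" everywhere.
by_class <- gsub("[A-Z2-]", "", x)

# Anchored pattern: removes everything through the last dash.
by_anchor <- sub("^.+-", "", x)

by_class
# [1] "10" "31" "1"
by_anchor
# [1] "10" "1"  "12"
```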




Gabor Grothendieck

2012-Dec-11 19:56 UTC

On Tue, Dec 11, 2012 at 12:46 PM, Steven Ranney <steven.ranney at gmail.com> wrote:
> All -
>
> I have a column of SiteNames:
>
> SiteName
> OYS-PIA2-FL-1
> OYS-PIA2-LA-1
> OYS-PI-LA-BB-1
> OYS-PIA2-LA-10
> ...
> [truncated]
>
> and I want to include only the last few digits into a new column.
>
> I tried
>
> substr(data$SiteName, 13, 20)
>
> but because some SiteName values are of a different length, the final
> hyphen (i.e., "-") was included:
>
> "1"
> "1"
> "-1"
> "10"

Replace everything up to the last dash with the empty string like this:

sub(".*-", "", data$SiteName)
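
To land the result in a new column, as the original question asked, the call can be assigned straight back into the data frame. A sketch under assumptions: the data frame below is a stand-in built from the four values in the question, and the column names `SiteNum` and `SiteNumInt` are invented for illustration.

```r
# Stand-in for the poster's data; only SiteName comes from the thread.
data <- data.frame(
  SiteName = c("OYS-PIA2-FL-1", "OYS-PIA2-LA-1",
               "OYS-PI-LA-BB-1", "OYS-PIA2-LA-10"),
  stringsAsFactors = FALSE
)

# Greedy ".*-" consumes everything through the last dash.
data$SiteNum <- sub(".*-", "", data$SiteName)

# Optional: store the suffix as an integer instead of text.
data$SiteNumInt <- as.integer(data$SiteNum)

data
```

If SiteName were a factor, sub() would coerce it to character on its own, whereas strsplit() stops with a "non-character argument" error and needs as.character() first.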

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Paul Miller

2012-Dec-12 13:55 UTC

Hi Steven,

Not sure if you want to understand regular expressions in general or just the
solution to your particular problem. If it's the former and you'd be
willing to read a book on the subject, I'd recommend "Mastering Regular
Expressions" by Jeffrey Friedl. I'm about halfway through now, and
think the book is excellent. I'm developing an understanding that I feel is
much harder to obtain solely from the documentation that is available online.

Thanks,

Paul
