Hi,

I have a data.frame as follows:

var1 var2
1 ab_c_(ok)
2 okf789(db)_c
3 jojfiod(90).gt
4 "ij"_(78)__op
5 (iojfodjfo)_ab

What I want is to create a new variable called "var3". The value of var3 is the content inside the parentheses, so var3 would be:

var3
ok
db
90
78
iojfodjfo

How to do this?

thanks,

karena
--
View this message in context: http://r.789695.n4.nabble.com/question-about-string-handling-tp2289178p2289178.html
Sent from the R help mailing list archive at Nabble.com.
On Wed, Jul 14, 2010 at 2:21 PM, karena <dr.jzhou at gmail.com> wrote:
>
> Hi,
>
> I have a data.frame as following:
> var1 var2
> 1 ab_c_(ok)
> 2 okf789(db)_c
> 3 jojfiod(90).gt
> 4 "ij"_(78)__op
> 5 (iojfodjfo)_ab
>
> what I want is to create a new variable called "var3". the value of var3 is
> the content in the Parentheses. so var3 would be:
> var3
> ok
> db
> 90
> 78
> iojfodjfo
>

Here are several alternatives. The gsub solution matches everything up to and including the ( as well as everything from the ) onward, and replaces each with nothing. The strsplit solution splits each string into three fields (everything before the (, everything within the (), and everything after the )) and then picks off the second. The strapply solution matches everything from ( to ) and returns everything between them.

The code below works whether DF$var2 is factor or character, but if you know it's character you can drop the as.character in #2 and #3.

# 1
gsub(".*[(]|[)].*", "", DF$var2)

# 2
sapply(strsplit(as.character(DF$var2), "[()]"), "[", 2)

# 3
library(gsubfn)
strapply(as.character(DF$var2), "[(](.*)[)]", simplify = TRUE)
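For anyone who wants to run #1 and #2 directly, here is a minimal sketch in base R. It rebuilds the sample data by hand (the data.frame construction and the names v_gsub, v_split and no_paren are my own, not from the posts above) and also tries one made-up value without parentheses, since that is a case where the two approaches appear to behave differently.

```r
# Rebuild the sample data from the question (DF is the name used in the reply above).
DF <- data.frame(
  var1 = 1:5,
  var2 = c("ab_c_(ok)", "okf789(db)_c", "jojfiod(90).gt",
           "\"ij\"_(78)__op", "(iojfodjfo)_ab"),
  stringsAsFactors = FALSE
)

# 1: strip everything up to "(" and everything from ")" onward
v_gsub <- gsub(".*[(]|[)].*", "", DF$var2)

# 2: split on either parenthesis and keep the second piece
v_split <- sapply(strsplit(as.character(DF$var2), "[()]"), "[", 2)

DF$var3 <- v_gsub
print(DF)

# Hypothetical extra value with no parentheses (not in the original data)
no_paren <- "plain_text"
gsub(".*[(]|[)].*", "", no_paren)            # nothing matches, input comes back unchanged
sapply(strsplit(no_paren, "[()]"), "[", 2)   # there is no second piece, so this is NA
```

If some rows might lack parentheses, the strsplit version has the advantage of flagging them as NA rather than silently returning the whole string.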
Try this:
text <- 'var1 var2
1 ab_c_(ok)
2 okf789(db)_c
3 jojfiod(90).gt
4 "ij"_(78)__op
5 (iojfodjfo)_ab'
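# quote = "" keeps the literal double quotes in row 4 as part of the value
df <- read.table(textConnection(text), header = TRUE, sep = " ", quote = "")
# keep only the second capture group, i.e. the text between the parentheses
df$var3 <- gsub("(.*\\()(.*)(\\).*)", "\\2", df$var2)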
-----
A R learner.
Another option could be:
df$var3 <- gsub(".*\\((.*)\\).*", "\\1", df$var2)
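One thing to keep in mind with this pattern: the sample data has exactly one pair of parentheses per value. As a rough sketch of what happens otherwise, the made-up string below (not from the original data) has two groups. The names last_group, first_group and all_groups are only for illustration, and the negated-character-class and regmatches variants are my additions, not something proposed in the thread.

```r
# Hypothetical input with two parenthesised groups (not from the original data)
x <- "a(x)b(y)c"

# Greedy pattern from the message above: the leading .* runs to the last "("
last_group <- gsub(".*\\((.*)\\).*", "\\1", x)

# Variant using negated character classes to stop at the first "(" and first ")"
first_group <- sub("^[^(]*\\(([^)]*)\\).*$", "\\1", x)

# All groups: find each "(...)" and then trim the surrounding parentheses
all_groups <- regmatches(x, gregexpr("\\([^)]*\\)", x))[[1]]
all_groups <- gsub("^\\(|\\)$", "", all_groups)

print(last_group)
print(first_group)
print(all_groups)
```

For data like karena's, with a single group per value, all of these should agree with the one-liner above.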
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
hey, guys, all these methods work perfectly. thank you!!