Hi all,
I am trying to create new columns based on the string content of
another column. First I want to identify the rows that contain a
particular string. If a row contains it, I want to split the string
and create two variables.
Here is a sample of my data:
F1 <- read.table(text = "ID1  ID2  text
A1 B1   NONE
A1 B1   cf_12
A1 B1   NONE
A2 B2   X2_25
A2 B3   fd_15  ", header = TRUE, stringsAsFactors = FALSE)
If the variable "text" contains "_", I want to create an indicator
variable, as shown below:
F1$Y1 <- ifelse(grepl("_", F1$text),1,0)
Then I want to split that string in two, before and after the "_",
and create two variables, as shown below:
x1 <- strsplit(as.character(F1$text), "_", fixed = TRUE)
My problem is how to combine this with the original data frame. The
desired output is shown below:
ID1  ID2  Y1  X1    X2
A1   B1   0   NONE  .
A1   B1   1   cf    12
A1   B1   0   NONE  .
A2   B2   1   X2    25
A2   B3   1   fd    15
Any help?
Thank you.
Hello,
Something like this?
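F1$Y1 <- +grepl("_", F1$text)    # 1 if text contains "_", else 0
F1 <- F1[c(1, 2, 4, 3)]          # put Y1 before text
# split text into X1 and X2; rows without "_" get NA in X2
F1 <- tidyr::separate(F1, text, into = c("X1", "X2"), sep = "_",
                      fill = "right")
F1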
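A side note on the first line: the unary plus in +grepl(...) is shorthand for coercing the logical result to integer 0/1, which the ifelse() call in the question does more verbosely. A minimal sketch on a made-up vector (the names txt, has_sep, y_plus and y_ifelse are illustrative only):

```r
# Made-up stand-in for F1$text
txt <- c("NONE", "cf_12", "X2_25")

has_sep  <- grepl("_", txt, fixed = TRUE)  # logical: FALSE TRUE TRUE
y_plus   <- +has_sep                       # unary plus coerces to integer 0/1
y_ifelse <- ifelse(has_sep, 1, 0)          # same values, stored as double

typeof(y_plus)           # "integer"
typeof(y_ifelse)         # "double"
all(y_plus == y_ifelse)  # TRUE
```

as.integer(has_sep) is an equivalent, more explicit spelling.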
Hope this helps,
Rui Barradas
At 19:55 on 22/09/20, Val wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hello,
A base R solution with strsplit, like in your code.
F1$Y1 <- +grepl("_", F1$text)
tmp <- strsplit(as.character(F1$text), "_")
tmp <- lapply(tmp, function(x) if(length(x) == 1) c(x, ".") else x)
tmp <- do.call(rbind, tmp)
colnames(tmp) <- c("X1", "X2")
F1 <- cbind(F1[-3], tmp)    # remove the original column
rm(tmp)
F1
#  ID1 ID2 Y1   X1 X2
#1  A1  B1  0 NONE  .
#2  A1  B1  1   cf 12
#3  A1  B1  0 NONE  .
#4  A2  B2  1   X2 25
#5  A2  B3  1   fd 15
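To see why the padding step in the lapply() call matters, here is a small sketch on a two-element vector (parts, unpadded and padded are illustrative names). strsplit() returns a list whose elements can have different lengths, and rbind() silently recycles the shorter ones, so without padding a row with no "_" would get its X1 value repeated in X2.

```r
parts <- strsplit(c("NONE", "cf_12"), "_", fixed = TRUE)
lengths(parts)   # 1 2: the first element has no second piece

# Without padding, rbind() recycles "NONE" to fill the second column
unpadded <- do.call(rbind, parts)
unpadded[1, ]    # "NONE" "NONE"

# With padding, every element has length 2 before binding
padded <- do.call(rbind, lapply(parts, function(p) {
  if (length(p) == 1) c(p, ".") else p
}))
padded[1, ]      # "NONE" "."
padded[2, ]      # "cf"   "12"
```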
Note that cbind dispatches on F1, an object of class "data.frame".
Therefore the method cbind.data.frame is called and the result is
also a data frame, even though tmp is a matrix.
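A quick way to check this, sketched with toy objects (df, m and res are made up for the illustration): binding a data frame and a character matrix gives a data frame, while binding two matrices gives a matrix.

```r
df <- data.frame(id = c("a", "b"), stringsAsFactors = FALSE)
m  <- matrix(c("x", "y", "1", "2"), ncol = 2,
             dimnames = list(NULL, c("X1", "X2")))

res <- cbind(df, m)     # a data frame is among the arguments
is.data.frame(res)      # TRUE
names(res)              # "id" "X1" "X2"

is.matrix(cbind(m, m))  # TRUE: no data frame involved, so a matrix results
```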
Hope this helps,
Rui Barradas
At 20:07 on 22/09/20, Rui Barradas wrote:
> [...]
Sometimes it just makes more sense to pre-process your data and get it
into the format you need. It depends on whether you are more
comfortable programming in R or in some other text-manipulation
language such as bash, sed, awk, or grep. If you know how to do this
with other tools, you could write a script and probably call it from R.
I could post a sample if you are interested.
LMH
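For what it is worth, here is a rough sketch of that idea, with awk doing the splitting and R only reading the result. It assumes a Unix-like system with awk on the PATH. The temporary file, the awk program and the names infile, awk_prog, cmd and F2 are made up for the illustration and are not from the thread.

```r
# Write the sample data to a temporary file (stand-in for a real input file)
infile <- tempfile(fileext = ".txt")
writeLines(c("ID1 ID2 text",
             "A1 B1 NONE",
             "A1 B1 cf_12",
             "A1 B1 NONE",
             "A2 B2 X2_25",
             "A2 B3 fd_15"), infile)

# awk program: replace the header, then print ID1, ID2, the 0/1 indicator,
# and the two pieces of the third field ("." when there is no second piece)
awk_prog <- paste(
  'NR == 1 { print "ID1", "ID2", "Y1", "X1", "X2"; next }',
  '{ n = split($3, p, "_"); print $1, $2, (n > 1), p[1], (n > 1 ? p[2] : ".") }',
  sep = "\n"
)

# Run awk through a pipe and read its output directly into a data frame
cmd <- paste("awk", shQuote(awk_prog), shQuote(infile))
F2  <- read.table(pipe(cmd), header = TRUE, stringsAsFactors = FALSE)
F2
```

Whether this is simpler than doing the split in R depends mostly on which tool you already know; for data that is already in a data frame, staying in R avoids the round trip through a file.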