Hello,

I have built a syntax to find out whether a given substring is included in a larger string. It works like this:

d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9

This works fine as long as "some text" contains only standard ASCII characters. However, it does not work when accents are included, as in the following:

d1$V1[regexpr("some t?xt = 9",d1$V2)>0] <- 9

I have tried to substitute "?" with several wildcards, but it did not work. Can anyone suggest how to have the syntax parse the string ignoring the accent?

Thank you in advance,
Luca
Hello,

Works with me:

d1 <- data.frame(V1 = 1:3, V2 = c("some text = 9", "some t?xt = 9", "some other text = 9"))
regexpr("some text = 9", d1$V2)
[1]  1 -1 -1
attr(,"match.length")
[1] 13 -1 -1
regexpr("some t?xt = 9", d1$V2)
[1] -1  1 -1
attr(,"match.length")
[1] -1 13 -1
d1$V1[regexpr("some text = 9",d1$V2) > 0] <- 9
d1$V1[regexpr("some t?xt = 9",d1$V2) > 0] <- 9
d1
  V1                  V2
1  9       some text = 9
2  9       some t?xt = 9
3  3 some other text = 9

What do you mean by "it did not work"? What were the contents of 'd1'?

sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252  LC_CTYPE=Portuguese_Portugal.1252
[3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Portugal.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] fortunes_1.5-0

Hope this helps,
Rui Barradas

On 06-08-2012 06:55, Luca Meyer wrote:
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
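A side note on the pattern itself: regexpr() treats its first argument as a regular expression, so characters such as "?" are operators rather than literal text. If the goal is only to test whether one string contains another, a minimal sketch could use grepl() with fixed = TRUE. The toy data frame is modelled on the example above, the accented letter is assumed to be an e with a grave accent (written as a \u escape), and the names hits and literal_q are made up for illustration.

```r
# Toy data modelled on the example above; "\u00e8" is e with a grave accent.
d1 <- data.frame(V1 = 1:3,
                 V2 = c("some text = 9", "some t\u00e8xt = 9", "some other text = 9"),
                 stringsAsFactors = FALSE)

# fixed = TRUE compares the pattern as literal text (no regex syntax).
hits <- grepl("some t\u00e8xt = 9", d1$V2, fixed = TRUE)
d1$V1[hits] <- 9

# As a regular expression, "t?" means "an optional t", so this pattern
# does not match the literal text "some t?xt = 9".
literal_q <- grepl("some t?xt = 9", "some t?xt = 9")
```

grepl() returns a logical vector directly, which avoids the "> 0" comparison needed with regexpr().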
Hi,

It works with me. I am using R 2.15 on Ubuntu 12.04.

d1 <- data.frame(V1 = 1:5, V2=c("some text = 9", "some t?xt=9","s?me t?xt=9", "s?me text=9", "some t?xt=9"))
d1
#  V1            V2
#1  1 some text = 9
#2  2   some t?xt=9
#3  3   s?me t?xt=9
#4  4   s?me text=9
#5  5   some t?xt=9

d1$V1[regexpr("some t?xt=9",d1$V2)>0]<-9
d1$V1[regexpr("s?me text=9",d1$V2)>0] <-9
d1$V1[regexpr("some t?xt=9",d1$V2)>0] <-9
d1$V1[regexpr("s?me t?xt=9",d1$V2)>0] <-9
d1$V1[regexpr("some text = 9",d1$V2)>0] <-9
d1
#  V1            V2
#1  9 some text = 9
#2  9   some t?xt=9
#3  9   s?me t?xt=9
#4  9   s?me text=9
#5  9   some t?xt=9

A.K.

----- Original Message -----
From: Luca Meyer <lucam1968 at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Monday, August 6, 2012 1:55 AM
Subject: [R] regexpr with accents
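To match regardless of the accent, as the original question asks, one option is to list the acceptable variants in a bracket expression, so that a single pattern covers the plain and the accented spellings. The following is only a sketch under stated assumptions: the data are made up, the accented letters involved are assumed to be e with a grave or acute accent (written as \u escapes so the code does not depend on the file encoding), and all variable names are hypothetical.

```r
# Made-up data: plain, e-grave, e-acute, and a non-matching string.
x <- c("some text = 9", "some t\u00e8xt = 9", "some t\u00e9xt = 9", "some other text = 9")

# [e\u00e8\u00e9] matches e, e-grave or e-acute at that position.
pattern <- "some t[e\u00e8\u00e9]xt = 9"
matched <- grepl(pattern, x)

# Alternative: normalise the data first, then match plain ASCII text.
plain <- chartr("\u00e8\u00e9", "ee", x)
matched_plain <- grepl("some text = 9", plain, fixed = TRUE)
```

The bracket expression keeps the data untouched, while the chartr() route may be easier to maintain when many patterns are applied to the same column.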
An attached text whose character set was not specified has been filtered out...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120806/5e8ac959/attachment.pl>
Hi,

Here, the strings within the quotes are read exactly as written. So you may have to use the symbol itself instead of the "friendly" or "numeric" code from the link, or you have to convert those.

d1 <- data.frame(V1 = 1:4,
    V2 = c("some text = 9", "some t&egrave;xt = 9", "some tèxt = 9", "some t&#232;xt = 9"))

d1$V1[regexpr("some t&egrave;xt = 9",d1$V2)>0] <- 9
d1$V1[regexpr("some t&#232;xt = 9",d1$V2)>0] <- 9
d1$V1[regexpr("some tèxt = 9",d1$V2)>0] <- 9

d1
  V1                   V2
1  1        some text = 9
2  9 some t&egrave;xt = 9
3  9        some tèxt = 9
4  9   some t&#232;xt = 9

A.K.

----- Original Message -----
From: Luca Meyer <lucam1968 at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Monday, August 6, 2012 8:25 AM
Subject: [R] regexpr with accents

Sorry, but my previous email did not go through properly. Instead of the ? you should really read an &egrave; or &#232; according to http://www.lookuptables.com/.

So there are extended ASCII characters I need to deal with.

I have tried

d1$V1[regexpr("some t&egrave;xt = 9",d1$V2)>0] <- 9

and

d1$V1[regexpr("some t&#232;xt = 9",d1$V2)>0] <- 9

without success...

Thanks,
Luca
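The "convert those" route mentioned above could look like the following. This is a minimal sketch, assuming that the "friendly" and "numeric" forms are the HTML entities &egrave; and &#232; for an e with a grave accent, and that only this one character needs converting. The data and the names v2, v2_clean and hits are made up for illustration.

```r
# Assumed example data: plain text, named entity, real character, numeric entity.
v2 <- c("some text = 9", "some t&egrave;xt = 9", "some t\u00e8xt = 9", "some t&#232;xt = 9")

# Replace both entity spellings with the character itself.
v2_clean <- gsub("&egrave;|&#232;", "\u00e8", v2)

# After conversion a single literal pattern finds all three accented rows.
hits <- grepl("some t\u00e8xt = 9", v2_clean, fixed = TRUE)
```

Converting once up front means every later pattern can be written with the real character instead of one pattern per encoding of it.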
Thanks Arun,

It works all right. I just found out that my problem was not with accents but with the correct spelling of "some text"...

Kind regards,
Luca

On 06 Aug 2012, at 15:01, arun wrote: