Sorry, silly question: gsub already works with regex. But even if I add
`[[:blank:]]`, I still don't get rid of all instances, and I keep
getting extra columns:
```
> df[df$VAL] = gsub("[[:blank:]Value is]", "", df$VAL, ignore.case=TRUE)
> df[df$VAL] = gsub("[[:blank:]Value is]", "", df$VAL, ignore.case=TRUE);df
  VAR           VAL value is blue Value is red empty
1   1 value is blue             b            b     b
2   2  Value is red            rd           rd    rd
3   3         empty          mpty         mpty  mpty
```

On Mon, Aug 9, 2021 at 12:40 PM Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
>
> Thank you, that is much appreciated. But on the real data, the
> substitution works only on few instances. Is there a way to introduce
> regex into this?
> Cheers
> Luigi
>
> On Mon, Aug 9, 2021 at 11:01 AM Jim Lemon <drjimlemon at gmail.com> wrote:
> >
> > Hi Luigi,
> > Ah, now I see:
> >
> > df$VAL<-gsub("Value is","",df$VAL,ignore.case=TRUE)
> > df
> >   VAR   VAL
> > 1   1  blue
> > 2   2   red
> > 3   3 empty
> >
> > Jim
> >
> > On Mon, Aug 9, 2021 at 6:43 PM Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
> > >
> > > Hello,
> > > I have a dataframe where I would like to change the string of certain
> > > rows, essentially I am looking to remove some useless text from the
> > > variables.
> > > I tried with:
> > > ```
> > > > df = data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty"))
> > > > df[df$VAL] = gsub("value is ", "", df$VAL, ignore.case = TRUE, perl = FALSE)
> > > > df
> > >   VAR           VAL value is blue Value is red empty
> > > 1   1 value is blue          blue         blue  blue
> > > 2   2  Value is red           red          red   red
> > > 3   3         empty         empty        empty empty
> > > ```
> > > which is of course wrong because I was expecting
> > > ```
> > >   VAR   VAL
> > > 1   1  blue
> > > 2   2   red
> > > 3   3 empty
> > > ```
> > > What is the correct syntax in these cases?
> > > Thank you
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Best regards,
> Luigi

--
Best regards,
Luigi
Hi Luigi,
You want to get rid of certain strings in the "VAL" column. You are
assigning to:

df[df$VAL]
Error in `[.data.frame`(df, df$VAL) : undefined columns selected

when I think you should be assigning to:

df$VAL

What do you want to remove other than "[V|v]alue is"?

Jim
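
To make the difference between assigning to `df[df$VAL]` and assigning to `df$VAL` concrete, here is a minimal sketch using the `df` from the original post. It assumes character (not factor) columns, so `stringsAsFactors = FALSE` is set explicitly; the names `wrong` and `fixed` are only illustrative copies and do not come from the thread. Reading `df[df$VAL]` fails with "undefined columns selected", but assigning to it appears to succeed and silently creates one new column per value, which is where the extra columns come from.

```r
df <- data.frame(VAR = 1:3,
                 VAL = c("value is blue", "Value is red", "empty"),
                 stringsAsFactors = FALSE)

# A character vector inside [ ] selects columns BY NAME. The values of VAL
# are not existing column names, so this assignment creates three new
# columns called "value is blue", "Value is red" and "empty".
wrong <- df
wrong[wrong$VAL] <- gsub("value is ", "", wrong$VAL, ignore.case = TRUE)
names(wrong)

# Assigning to the column itself replaces its contents in place.
fixed <- df
fixed$VAL <- gsub("value is ", "", fixed$VAL, ignore.case = TRUE)
fixed
```

`df[["VAL"]] <- ...` should behave the same way as `df$VAL <- ...` here; the point is only that the left-hand side names the column rather than indexing by its contents.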
Hello,

There are two convenient ways to access a column in a data.frame: `$`
and `[[`. Using `df` from your first email, we would do something like

    df <- data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty"))
    df$VAL
    df[["VAL"]]

The two convenient ways to update / replace a column with something new
are also very similar, something like

    df$VAL <- ...
    df[["VAL"]] <- ...

As for the regex part, I would suggest using `sub` instead of `gsub`
since you're looking to remove only the first instance of "value is".
Also, I would recommend using "^" to mark the beginning of your string,
something like

    df$VAL <- sub("^Value is ", "", df$VAL, ignore.case = TRUE)

I might be misunderstanding, but it sounds like you also want to remove
all leading whitespace. If so, you could do something like

    df$VAL <- sub("^[[:blank:]]*Value is ", "", df$VAL, ignore.case = TRUE)

where "*" means zero or more blank characters at the beginning of the
string. You can try `?regex` to read more about this.

I hope this helps!
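
As a rough sketch of why the `[[:blank:]Value is]` attempt left "b", "rd" and "mpty": everything inside the outer square brackets is a bracket expression, which matches any single one of the listed characters rather than the phrase "Value is". The vector `x` below is made up for illustration (its last element, with leading spaces, is not from the thread), and `per_char` and `cleaned` are hypothetical names.

```r
x <- c("value is blue", "Value is red", "empty", "  Value is green")

# Bracket expression: deletes every blank and every one of the letters
# v, a, l, u, e, i, s (in either case), wherever they occur.
per_char <- gsub("[[:blank:]Value is]", "", x, ignore.case = TRUE)
per_char
# "b" "rd" "mpty" "grn"

# Anchored literal phrase: deletes only a leading "Value is " (any case),
# together with any blanks in front of it.
cleaned <- sub("^[[:blank:]]*Value is ", "", x, ignore.case = TRUE)
cleaned
# "blue" "red" "empty" "green"
```

If the real data contain other prefixes, the literal part of the pattern would need to be adapted; this sketch only covers the "value is" case discussed in the thread.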