Sorry, silly question: gsub already works with regex. But even if I
add `[[:blank:]]`, I still don't get rid of all instances, and I keep
getting extra columns:
```
> df[df$VAL] = gsub("[[:blank:]Value is]", "", df$VAL, ignore.case=TRUE)
> df[df$VAL] = gsub("[[:blank:]Value is]", "", df$VAL, ignore.case=TRUE);df
  VAR           VAL value is blue Value is red empty
1   1 value is blue             b            b     b
2   2  Value is red            rd           rd    rd
3   3         empty          mpty         mpty  mpty
```

On Mon, Aug 9, 2021 at 12:40 PM Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
>
> Thank you, that is much appreciated. But on the real data, the
> substitution works only on few instances. Is there a way to introduce
> regex into this?
> Cheers
> Luigi
>
> On Mon, Aug 9, 2021 at 11:01 AM Jim Lemon <drjimlemon at gmail.com> wrote:
> >
> > Hi Luigi,
> > Ah, now I see:
> >
> > df$VAL<-gsub("Value is","",df$VAL,ignore.case=TRUE)
> > df
> >   VAR   VAL
> > 1   1  blue
> > 2   2   red
> > 3   3 empty
> >
> > Jim
> >
> > On Mon, Aug 9, 2021 at 6:43 PM Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
> > >
> > > Hello,
> > > I have a dataframe where I would like to change the string of certain
> > > rows, essentially I am looking to remove some useless text from the
> > > variables.
> > > I tried with:
> > > ```
> > > > df = data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty"))
> > > > df[df$VAL] = gsub("value is ", "", df$VAL, ignore.case = TRUE, perl = FALSE)
> > > > df
> > >   VAR           VAL value is blue Value is red empty
> > > 1   1 value is blue          blue         blue  blue
> > > 2   2  Value is red           red          red   red
> > > 3   3         empty         empty        empty empty
> > > ```
> > > which is of course wrong because I was expecting
> > > ```
> > >   VAR   VAL
> > > 1   1  blue
> > > 2   2   red
> > > 3   3 empty
> > > ```
> > > What is the correct syntax in these cases?
> > > Thank you
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Best regards,
> Luigi

--
Best regards,
Luigi

Hi Luigi,
You want to get rid of certain strings in the "VAL" column. You are
assigning to:

    df[df$VAL]
    Error in `[.data.frame`(df, df$VAL) : undefined columns selected

when I think you should be assigning to:

    df$VAL

What do you want to remove other than "[V|v]alue is"?

Jim

Hello,
There are two convenient ways to access a column in a data.frame: `$`
and `[[`. Using `df` from your first email, we would do something like

    df <- data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty"))
    df$VAL
    df[["VAL"]]

The two convenient ways to update or replace a column with something new
are also very similar, something like

    df$VAL <- ...
    df[["VAL"]] <- ...

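
To illustrate why the earlier attempts produced extra columns, here is a small sketch comparing the two assignments. The names `wrong` and `right` are only illustrative copies of `df`; nothing else is assumed beyond the data from the first email.

```r
# Copy of the data frame from the first email in the thread.
df <- data.frame(VAR = 1:3,
                 VAL = c("value is blue", "Value is red", "empty"))

# df$VAL is a character vector, so df[df$VAL] refers to columns *named*
# "value is blue", "Value is red" and "empty". They do not exist yet,
# so the assignment creates them instead of changing VAL.
wrong <- df
wrong[wrong$VAL] <- gsub("value is ", "", wrong$VAL, ignore.case = TRUE)
print(names(wrong))

# Assigning to the column itself replaces it in place.
right <- df
right$VAL <- gsub("value is ", "", right$VAL, ignore.case = TRUE)
print(right)
```

The first `print` should list five column names, matching the output posted earlier in the thread, while `right` keeps only `VAR` and `VAL`.
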
As for the regex part, I would suggest using `sub` instead of `gsub`, since
you're looking to remove only the first instance of "value is". Also, I
would recommend using "^" to mark the beginning of your string, something
like

    df$VAL <- sub("^Value is ", "", df$VAL, ignore.case = TRUE)

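
As a rough illustration of the difference the anchor and the choice of `sub` make, consider the sketch below. The vectors `x` and `y` are made-up test strings, not values from the real data.

```r
# Made-up strings: the third one contains the phrase in the middle.
x <- c("value is blue", "Value is red", "the value is green", "empty")

# Unanchored: the phrase is removed wherever it first occurs.
unanchored <- sub("value is ", "", x, ignore.case = TRUE)
# Anchored with "^": the phrase is removed only at the start of the string.
anchored <- sub("^value is ", "", x, ignore.case = TRUE)
print(unanchored)
print(anchored)

# sub() replaces the first match only; gsub() replaces every match.
y <- "value is value is blue"
first_only <- sub("value is ", "", y)
every <- gsub("value is ", "", y)
print(first_only)
print(every)
```

With the anchor, "the value is green" is left untouched, which is usually what you want when the text to strip is a prefix.
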
I might be misunderstanding, but it sounds like you also want to remove all
leading whitespace. If so, you could do something like

    df$VAL <- sub("^[[:blank:]]*Value is ", "", df$VAL, ignore.case = TRUE)

where "*" signifies that there will be zero or more blank characters at the
beginning of the string. You can try `?regex` to read more about this.
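
It may also help to see why the pattern "[[:blank:]Value is]" removed single letters. Everything inside the outer brackets is treated as one set of characters, so each matching character is deleted on its own. The sketch below contrasts that with the pattern suggested above; the vector `padded`, with leading spaces and a tab, is a made-up example.

```r
x <- c("value is blue", "Value is red", "empty")

# Inside one pair of outer brackets, the pattern is a set of single
# characters: a blank, or any of V, a, l, u, e, space, i, s.
as_set <- gsub("[[:blank:]Value is]", "", x, ignore.case = TRUE)
print(as_set)  # "b" "rd" "mpty", as in the output quoted in the thread

# With the class on its own and the literal text outside the brackets,
# the pattern matches optional leading blanks followed by the whole phrase.
padded <- c("  value is blue", "\tValue is red", "empty")
as_sequence <- sub("^[[:blank:]]*Value is ", "", padded, ignore.case = TRUE)
print(as_sequence)  # "blue" "red" "empty"
```
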
I hope this helps!