thr3ads.net - R help - [R] Replace Text but not from within a word [Feb 2017]


Harshal Athawale

2017-Feb-28 09:38 UTC

[R] Replace Text but not from within a word

I am new to R.

I have a file that contains the names of companies:

'data.frame': 494 obs. of  1 variable:
 $ V1: Factor w/ 470 levels "3-d engineering corp",..: 293 134 339 359 143 399 122 447 398 384 ...

Problem: I would like to remove "CO" (as it is the most frequent word). I would like "CO" to be removed from BOEING CO --> BOEING, but not from SAGINAW *CO*UNTY INC.

> text = c("BOEING CO","ENGMANTAYLOR CO","SAGINAW COUNTY INC")
> gsub(x = text, pattern = "CO", replacement = "")
[1] "BOEING "          "ENGMANTAYLOR "    "SAGINAW UNTY INC"

Thanks in advance.

- Sam


Marc Schwartz

2017-Feb-28 13:19 UTC


Hi,

See ?regex and ?grep for some details and examples on how to construct the
expression used for matching, as well as some of the references therein.

In this case, you want to use something along the lines of:

> gsub(" CO$", "", text)
[1] "BOEING"             "ENGMANTAYLOR"       "SAGINAW COUNTY INC"

where the "CO" is preceded by a space and followed by the "$", which is a special character that indicates the end of the string to be matched.

Regards,

Marc Schwartz
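
A variation on the pattern above, as a minimal sketch: the " CO$" pattern only removes "CO" at the very end of a name. If "CO" may also appear as a separate word elsewhere, a word-boundary pattern (\\b) is one option. The fourth name below ("CO OP BANK") is a made-up example and is not in the original data; the cleanup with trimws() is likewise an assumption about the desired result.

```r
# Example names; the last one is hypothetical, added to show "CO" at the start.
text <- c("BOEING CO", "ENGMANTAYLOR CO", "SAGINAW COUNTY INC", "CO OP BANK")

# \\b matches a word boundary, so the "CO" inside "COUNTY" is left alone.
no_co <- gsub("\\bCO\\b", "", text, perl = TRUE)

# Removing a whole word can leave stray spaces: collapse runs, then trim ends.
no_co <- trimws(gsub(" +", " ", no_co))

print(no_co)
```

If the real data mixes upper and lower case, passing ignore.case = TRUE to gsub() may also be worth considering.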



Jeff Newmiller

2017-Feb-28 14:36 UTC


For tasks like this, you will probably want to make sure to import the data as
character data rather than as a factor.  E.g.

dat <- read.csv( "myfile.csv", header=FALSE, as.is=TRUE )

You can check what you have with the str() function.
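
If the data has already been read in as a factor, a rough sketch of converting it after the fact might look like this. The data frame below is a small stand-in for the real file (which is not available here), with the column name V1 taken from the str() output in the original post.

```r
# Stand-in for the imported data: one factor column, as in the original post.
dat <- data.frame(V1 = factor(c("BOEING CO", "ENGMANTAYLOR CO", "SAGINAW COUNTY INC")))

# Convert the factor to plain character data before editing the text.
dat$V1 <- as.character(dat$V1)

# Check the result: V1 should now be reported as chr rather than Factor.
str(dat)

# The replacement suggested earlier in the thread, applied to the column.
dat$V1 <- gsub(" CO$", "", dat$V1)
```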
