Dennis Fisher
2016-Aug-02 16:46 UTC
[R] Regression expression to delete one or more spaces at end of string
R 3.3.1
OS X
Colleagues,
I have encountered an unexpected regex problem.
I have read an Excel file into R using the readxl package. Column names are:
COLNAMES <- c("Study ID", "Test and Biological Matrix",
"Subject No. ", "Collection Date",
"Collection Time", "Scheduled Time Point",
"Concentration", "Concentration Units",
"LLOQ", "ULOQ", "Comment")
As you can see, there is a trailing space in "Subject No. ". I would like to
delete that space. The following works:
sub(" $", "", COLNAMES)
However, I would like a more general approach that removes any trailing
whitespace.
I tried variations such as:
sub("[:blank:]$", "", COLNAMES)
(also, without the $ and 'space' instead of 'blank') without success;
to my surprise, characters other than the trailing space were deleted but the
trailing space remained.
Guidance on the correct syntax would be appreciated.
Dennis
Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
Marc Schwartz
2016-Aug-02 16:55 UTC
[R] Regression expression to delete one or more spaces at end of string
Dennis,
There is actually an example in ?gsub:
## trim trailing white space
str <- "Now is the time "
sub(" +$", "", str) ## spaces only
The '+' sign will match the preceding space one or more times at the end of the character string.
Note that, as per ?regex, the brackets need to be doubled in the regex so that the class sits inside an enclosing bracket expression. Also, [:blank:] matches only space and tab, whereas [:space:] matches all whitespace, including newlines. An example would be:
sub("[[:space:]]+$", "", str) ## white space, POSIX-style
which is also in ?gsub.
Regards,
Marc Schwartz
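To make the difference between the two patterns above concrete, here is a small sketch applied to the column names from the original post. The extra test string `x` (with a tab before the final space) and the variable names `trimmed_spaces`, `trimmed_ws`, `x_spaces`, and `x_ws` are illustrative assumptions, not part of the thread.

```r
# Column names from the original post; "Subject No. " has a trailing space.
COLNAMES <- c("Study ID", "Test and Biological Matrix",
              "Subject No. ", "Collection Date",
              "Collection Time", "Scheduled Time Point",
              "Concentration", "Concentration Units",
              "LLOQ", "ULOQ", "Comment")

# Spaces only: " +$" matches one or more literal spaces at the end.
trimmed_spaces <- sub(" +$", "", COLNAMES)

# Any whitespace: the POSIX class [:space:] inside a bracket expression.
trimmed_ws <- sub("[[:space:]]+$", "", COLNAMES)

# Hypothetical string ending in space, tab, space, to show where the two differ.
x <- "Subject No. \t "
x_spaces <- sub(" +$", "", x)            # strips only the final space; the tab stays
x_ws     <- sub("[[:space:]]+$", "", x)  # strips the whole whitespace run

print(trimmed_ws[3])
print(x_ws)
```

For the column names as posted, both patterns give the same result, since the only trailing whitespace is a single space. They differ only when tabs or other whitespace are involved.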
William Dunlap
2016-Aug-02 16:57 UTC
[R] Regression expression to delete one or more spaces at end of string
First, use [[:blank:]] instead of [:blank:]. The latter matches colon, b, l, a, n, and k; the former matches whitespace (space and tab).
Second, put + after [[:blank:]] to match one or more of them.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
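The behaviour described above can be reproduced with a short sketch. It uses two of the column names from the original post; the variable names `x`, `wrong`, and `right` are illustrative. It assumes R's default regex engine (no perl = TRUE), as in the original post.

```r
x <- c("Concentration", "Subject No. ")

# With single brackets, "[:blank:]" is an ordinary bracket expression listing
# the characters ':', 'b', 'l', 'a', 'n', 'k'. Anchored with "$", it removes
# a trailing "n" and leaves the trailing space untouched.
wrong <- sub("[:blank:]$", "", x)

# With doubled brackets, the POSIX class sits inside a bracket expression;
# "+" then matches one or more blanks (space or tab) at the end of the string.
right <- sub("[[:blank:]]+$", "", x)

print(wrong)
print(right)
```

This matches the original observation: with the single-bracket pattern, "Concentration" loses its final letter while "Subject No. " keeps its trailing space.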
David R Forrest
2016-Aug-02 17:00 UTC
[R] Regression expression to delete one or more spaces at end of string
Double the [[]] and add a + for one-or-more characters:
sub("[[:blank:]]+$", "", COLNAMES)
--
Dr. David Forrest
drf at vims.edu
804-684-7900w
757-968-5509h
804-413-7125c
#240 Andrews Hall
Virginia Institute of Marine Science
Route 1208, Greate Road
PO Box 1346
Gloucester Point, VA, 23062-1346
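As an aside not raised in the thread: base R (3.2.0 and later) also provides trimws(), which strips leading and/or trailing whitespace without a hand-written regex. The following is a minimal sketch, assuming a shortened, hypothetical version of the column names with one tab-terminated entry added for illustration.

```r
# Shortened, illustrative set of names; the tab-terminated entry is hypothetical.
COLNAMES <- c("Study ID", "Subject No. ", "Collection Date\t")

# which = "right" trims trailing whitespace only; leading whitespace is kept.
cleaned <- trimws(COLNAMES, which = "right")

# For these inputs, this agrees with the regex approach from the replies.
same <- identical(cleaned, sub("[[:space:]]+$", "", COLNAMES))

print(cleaned)
```

The regex form remains useful when the set of characters to strip needs to be controlled precisely, for example [[:blank:]] for spaces and tabs only.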