Try the following function to apply gsub to all character or factor
columns of a data.frame (and maintain the class of all
columns):
gsubDataFrame <- function(pattern, replacement, x, ...) {
    stopifnot(is.data.frame(x))
    for(i in seq_len(ncol(x))) {
        if (is.character(x[[i]])) {
            x[[i]] <- gsub(pattern, replacement, x[[i]], ...)
        } else if (is.factor(x[[i]])) {
            levels(x[[i]]) <- gsub(pattern, replacement, levels(x[[i]]), ...)
        } # else do nothing for numeric or other column types
    }
    x
}
E.g.,
> d <- data.frame(stringsAsFactors = FALSE,
+                 Int=1:5,
+                 Char=c("a a", "baa", "a a ", " aa", "b a a"),
+                 Fac=factor(c("x x", "yxx", "x x ", " xx", "y x x")))
> str(d)
'data.frame': 5 obs. of 3 variables:
 $ Int : int 1 2 3 4 5
 $ Char: chr "a a" "baa" "a a " " aa" ...
 $ Fac : Factor w/ 5 levels " xx","x x","x x ",..: 2 5 3 1 4
> str(gsubDataFrame(" ", "", d)) # delete spaces, use "[[:space:]]" for whitespace
'data.frame': 5 obs. of 3 variables:
 $ Int : int 1 2 3 4 5
 $ Char: chr "aa" "baa" "aa" "aa" ...
 $ Fac : Factor w/ 2 levels "xx","yxx": 1 2 1 1 2
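
If the strings may also contain tabs or other whitespace, the same
function can be called with the "[[:space:]]" pattern mentioned in the
comment above. A minimal sketch, repeating the function so that it runs
on its own; the data frame d2 is made up for illustration:

```r
gsubDataFrame <- function(pattern, replacement, x, ...) {
    stopifnot(is.data.frame(x))
    for (i in seq_len(ncol(x))) {
        if (is.character(x[[i]])) {
            x[[i]] <- gsub(pattern, replacement, x[[i]], ...)
        } else if (is.factor(x[[i]])) {
            # editing the levels keeps the column a factor;
            # levels that become identical are merged
            levels(x[[i]]) <- gsub(pattern, replacement, levels(x[[i]]), ...)
        }
    }
    x
}

# d2 is a made-up example containing tabs as well as spaces
d2 <- data.frame(stringsAsFactors = FALSE,
                 Int = 1:3,
                 Char = c("a\ta", " b b", "c "),
                 Fac = factor(c("x\tx", "xx", "y y")))

r2 <- gsubDataFrame("[[:space:]]", "", d2)
r2$Char          # "aa" "bb" "c"
levels(r2$Fac)   # "xx" "yy"
class(r2$Int)    # "integer", the column is left untouched
```

To drop only leading and trailing whitespace and keep interior spaces,
a pattern such as "^[[:space:]]+|[[:space:]]+$" should work with the
same function.
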
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Tue, Feb 21, 2017 at 11:35 PM, José Luis <josestadistico at gmail.com> wrote:
> Thanks for your answer.
>
> I'm using read.csv.
>
> Sent from my iPad
>
>> On 22/2/2017, at 3:39, William Michels <wjm1 at caa.columbia.edu> wrote:
>>
>> Hi José (and Rolf),
>>
>> It's not entirely clear what type of 'whitespace' you're referring to,
>> but if you're using read.table() or read.csv() to create your
>> dataframe in the first place, setting 'strip.white = TRUE' will remove
>> leading and trailing whitespace 'from unquoted character fields
>> (numeric fields are always stripped).'
>>
>>> ?read.table
>>> ?read.csv
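
A small sketch of the strip.white suggestion, reading made-up CSV text
from a string (the column names and values are invented for
illustration):

```r
# made-up CSV content, read from a string instead of a file
csv <- "id,name,city\n1,  Ana  ,Madrid \n2,Luis,  Sevilla"

raw      <- read.csv(text = csv, stringsAsFactors = FALSE)
stripped <- read.csv(text = csv, stringsAsFactors = FALSE,
                     strip.white = TRUE)

raw$name        # "  Ana  " "Luis"
stripped$name   # "Ana" "Luis"
stripped$city   # "Madrid" "Sevilla"
```

Note that strip.white only trims the ends of unquoted fields; spaces
inside a string are kept, so gsub is still needed for those.
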
>>
>> Cheers,
>>
>> Bill
>>
>>
>>> On 2/21/17, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>>>> On 22/02/17 12:51, José Luis Aguilar wrote:
>>>> Hi all,
>>>>
>>>> I have a dataframe with 34 columns and 1534 observations.
>>>>
>>>> In some columns I have strings with spaces; I want to remove the spaces.
>>>> Is there a function that removes whitespace from the entire dataframe?
>>>> I use gsub but I would need some function to automate this.
>>>
>>> Something like
>>>
>>> X <- as.data.frame(lapply(X,function(x){gsub(" ","",x)}))
>>>
>>> Untested, since you provide no reproducible example (despite being told
>>> by the posting guide to do so).
>>>
>>> I do not know what my idea will do to numeric columns or to factors.
>>>
>>> However it should give you at least a start.
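
For what it is worth, a quick check on a made-up data frame suggests
what the lapply/gsub one-liner does to the column classes: gsub returns
character vectors, so numeric and factor columns come back as
character. A sketch, with stringsAsFactors = FALSE passed explicitly so
the result does not depend on the R version's default:

```r
# X is a made-up example with one column of each type
X <- data.frame(stringsAsFactors = FALSE,
                Int = 1:3,
                Char = c("a a", "b", " c"),
                Fac = factor(c("x x", "y", "x x")))

Y <- as.data.frame(lapply(X, function(x) gsub(" ", "", x)),
                   stringsAsFactors = FALSE)

sapply(X, class)   # Int "integer", Char "character", Fac "factor"
sapply(Y, class)   # every column is "character"
```

Without stringsAsFactors = FALSE, versions of R before 4.0.0 would turn
all of the resulting columns into factors instead.
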
>>>
>>> cheers,
>>>
>>> Rolf Turner
>>>
>>> --
>>> Technical Editor ANZJS
>>> Department of Statistics
>>> University of Auckland
>>> Phone: +64-9-373-7599 ext. 88276
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>