Hi R-helpers,

I have a dataframe with 60 columns and I would like to convert several columns to factor, others to numeric, and yet others to dates. Rather than having 60 lines like this:

    data$Var1 <- as.factor(data$Var1)

I wonder if it's possible to write one line of code (per data type, e.g. factor) that would apply a function (e.g., as.factor) to several (non-contiguous) columns. So, I could then use 3 or 4 lines of code (for 3 or 4 data types) instead of 60.

I have tried writing an apply function, but it failed.

Thanks for any help you might be able to provide.

Mark Na
hadley wickham
2009-Jun-23 21:36 UTC
[R] Apply as.factor (or as.numeric etc) to multiple columns
Hi Mark,

Have a look at colwise (and numcolwise and catcolwise) in the plyr package.

Hadley

-- 
http://had.co.nz/

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
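For readers who have not used plyr, here is a minimal sketch of how colwise might be applied to this problem. The data frame DF and its column names are made up for illustration, and the block only does anything if plyr is installed. colwise(f) returns a new function that applies f to every column of the data frame it is given, so the sketch passes in only the columns to convert and assigns the result back.

```r
# Hypothetical example data (not from the original post).
DF <- data.frame(
  id    = 1:3,
  grp   = c("a", "b", "a"),
  score = c("1.5", "2", "3.25"),
  stringsAsFactors = FALSE
)

if (requireNamespace("plyr", quietly = TRUE)) {
  # colwise() turns a vector function into one that works column by column.
  to_factor <- plyr::colwise(as.factor)

  # Convert only the chosen columns and write them back in place.
  fac_cols <- c("grp")
  DF[fac_cols] <- to_factor(DF[fac_cols])

  str(DF)
}
```

The other columns are left untouched because only DF[fac_cols] is handed to the converter.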
Gabor Grothendieck
2009-Jun-23 21:45 UTC
[R] Apply as.factor (or as.numeric etc) to multiple columns
Try this:

    ix <- 2:5
    DF[ix] <- lapply(DF[ix], as.numeric)

    nms <- c("x", "y")
    DF[nms] <- lapply(DF[nms], as.factor)
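A worked version of the same idea, covering all three types from the original question, might look like the sketch below. The data and column names are invented for illustration. Note that apply() converts a data frame to a matrix before doing anything else, which loses per-column classes and is a likely reason the apply attempt failed; lapply() over a subset of columns avoids that.

```r
# Hypothetical example data; every column starts out as character.
DF <- data.frame(
  site  = c("a", "b", "a"),
  depth = c("1.5", "2.0", "3.25"),
  count = c("10", "20", "30"),
  visit = c("2009-06-01", "2009-06-15", "2009-06-23"),
  stringsAsFactors = FALSE
)

# One vector of column names per target type (they need not be contiguous).
fac_cols  <- c("site")
num_cols  <- c("depth", "count")
date_cols <- c("visit")

# One line per data type.
DF[fac_cols]  <- lapply(DF[fac_cols], as.factor)
DF[num_cols]  <- lapply(DF[num_cols], as.numeric)
DF[date_cols] <- lapply(DF[date_cols], as.Date, format = "%Y-%m-%d")

str(DF)

# Caveat: as.numeric() on a factor returns the internal level codes,
# so go through as.character() when the labels are the numbers you want.
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                # level codes, not the labels
values <- as.numeric(as.character(f))  # the numbers shown in the labels
```

If a column was read in as a factor rather than character, the last two lines show why the numeric conversion needs the extra as.character() step.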
baptiste auguie
2009-Jun-23 21:45 UTC
[R] Apply as.factor (or as.numeric etc) to multiple columns
Wacek helped me out on a similar topic a while back:

    ize = function (d, columns = names(d), izer = as.factor) {
        d[columns] = lapply(d[columns], izer)
        d
    }

    d = data.frame(x=1:10, y=1:10, z=1:10)

    str( ize(d, 'y') )             # y is now a factor
    str( ize(d, 1:2, `cumsum`) )   # x and y are affected

etc.

HTH,

baptiste

-- 
_____________________________
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
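Some converters need extra arguments, for example a date format. One possible extension, sketched here under the assumption that forwarding `...` to the converter is acceptable, is shown below. The name ize_dots, the `...` pass-through, and the example data are all additions for illustration and are not part of the function posted above.

```r
# Hypothetical variant of ize(): extra arguments in `...` are passed
# on to the converter function.
ize_dots <- function(d, columns = names(d), izer = as.factor, ...) {
  d[columns] <- lapply(d[columns], izer, ...)
  d
}

# Made-up data with dates in day/month/year order.
d <- data.frame(
  grp  = c("a", "b", "a"),
  when = c("23/06/2009", "24/06/2009", "25/06/2009"),
  val  = c("1.5", "2", "3.25"),
  stringsAsFactors = FALSE
)

d <- ize_dots(d, "grp")                                 # default: factor
d <- ize_dots(d, "when", as.Date, format = "%d/%m/%Y")  # format forwarded
d <- ize_dots(d, "val", as.numeric)

str(d)
```

Because the function returns the modified data frame, each call is assigned back to d; the original is not changed in place.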
Bengoechea Bartolomé Enrique (SIES 73)
2009-Jun-25 07:43 UTC
[R] Apply as.factor (or as.numeric etc) to multiple columns
Hi Mark,

I frequently need to do that when importing data. This one-liner works:

    data.frame(mapply(as, x, c("integer", "character", "factor"), SIMPLIFY=FALSE), stringsAsFactors=FALSE)

but it has two problems:

1) as() is an S4 method that does not always work
2) writing the vector of classes for 60 variables is rather tedious.

Both issues can be solved with the following two helper functions. The first function tries as(x, class); if that doesn't work, it tries as.<class>(x); if that still doesn't work, it tries <class>(x). The second function transforms a single string into a character vector of classes by mapping each letter in the string to a class name (e.g. "D" becomes "Date", "i" becomes "integer", etc.), so that writing 60 classes is fast.

    # canCoerce() and as() come from the methods package.
    doCoerce <- function(x, class) {
        if (canCoerce(x, class))
            as(x, class)
        else {
            # Fall back to as.<class>(x), then to <class>(x).
            result <- try(match.fun(paste("as", class, sep="."))(x), silent=TRUE)
            if (inherits(result, "try-error"))
                result <- match.fun(class)(x)
            result
        }
    }

    expandClasses <- function (x) {
        unknowns <- character(0)
        result <- lapply(strsplit(as.character(x), NULL, fixed = TRUE), function(y) {
            sapply(y, function(z) switch(z,
                i = "integer", n = "numeric", l = "logical", c = "character",
                x = "complex", r = "raw", f = "factor", D = "Date",
                P = "POSIXct", t = "POSIXlt", N = NA_character_,
                {
                    # Unrecognized code: remember it and return NA.
                    unknowns <<- c(unknowns, z)
                    NA_character_
                }), USE.NAMES = FALSE)
        })
        if (length(unknowns)) {
            unknowns <- unique(unknowns)
            # dQuote() stands in for the helper dqMsg() called in the
            # original post, which is not defined there.
            warning(sprintf(ngettext(length(unknowns), "code %s not recognized", "codes %s not recognized"),
                            paste(dQuote(unknowns), collapse = ", ")))
        }
        result
    }

An example:

    x <- data.frame(X="2008-01-01", Y=1.1:3.1, Z=letters[1:3])
    data.frame(mapply(doCoerce, x, expandClasses("Dif")[[1L]], SIMPLIFY=FALSE), stringsAsFactors=FALSE)

Regards,

Enrique
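When only the as.<class>() converters are needed, a simpler variant of the same idea can be sketched in base R. This is not the approach above: it skips as() and the <class>(x) fallback entirely, and the named vector classes is a hypothetical convention in which the names are column names and the values are target classes.

```r
# Same example data as in the post, built explicitly as character/numeric.
x <- data.frame(
  X = "2008-01-01",
  Y = c(1.1, 2.1, 3.1),
  Z = letters[1:3],
  stringsAsFactors = FALSE
)

# Hypothetical mapping: column name -> target class.
classes <- c(X = "Date", Y = "integer", Z = "factor")

# Look up as.<class> for each column and apply it; Map() pairs each
# selected column with its class name.
x[names(classes)] <- Map(
  function(col, cls) match.fun(paste0("as.", cls))(col),
  x[names(classes)],
  classes
)

str(x)
```

Columns not named in classes keep their current type, so the mapping only needs to list the columns that should change.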