Michael.Laviolette at dhhs.state.nh.us
2011-Oct-12 13:18 UTC
[R] Applying function to only numeric variable (plyr package?)
My data frame consists of character variables, factors, and proportions, something like

c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100*x, 1)

I want to apply the pct function to only the numeric variables so that the proportions are converted to percentages, and retain all the columns:

  c1 c2   x1   x2
1  A  Y 52.3 92.5
2  B  Y 69.2 76.2
3  C  N 23.1 36.2
4  C  N 11.6 44.6

I've been approaching it with the ddply and colwise functions from the plyr package, but in that case I need each row to be its own group and to retain all columns. Am I on the right track? If not, what's the best way to do this?

Thanks in advance,
M. L.
Christoph Molnar
2011-Oct-12 13:45 UTC
[R] Applying function to only numeric variable (plyr package?)
Hi,

if the columns in your data.frame that hold the proportions are numeric, this solution will work.

(numeric.index <- unlist(lapply(df, is.numeric)))
df[, numeric.index] <- apply(df[, numeric.index], 2, pct)

If the columns holding your numbers are not stored as numeric, coerce them to numeric first:

c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
df$y <- as.numeric(df$y)
df$x <- as.numeric(df$x)
pct <- function(x) round(100*x, 1)
(numeric.index <- unlist(lapply(df, is.numeric)))
df[, numeric.index] <- apply(df[, numeric.index], 2, pct)

Christoph
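A possible variation on the approach above, offered as a sketch rather than as Christoph's own method: apply() converts its input to a matrix before calling the function, whereas lapply() works on one column at a time and leaves the data frame structure alone. vapply() is swapped in for unlist(lapply()) only to make the logical result explicit.

```r
# Data and helper copied from the original post
c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y", "N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100 * x, 1)

# TRUE for numeric columns, FALSE for character and factor columns
numeric.index <- vapply(df, is.numeric, logical(1))

# Replace only the numeric columns, one column at a time
df[numeric.index] <- lapply(df[numeric.index], pct)
df
```

Because nothing is passed through a matrix, this form should also behave sensibly when only one column is numeric.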
Jan van der Laan
2011-Oct-12 13:50 UTC
[R] Applying function to only numeric variable (plyr package?)
plyr isn't necessary in this case. You can use the following:

cols <- sapply(df, is.numeric)
df[, cols] <- pct(df[, cols])

round (and therefore pct) accepts a data.frame and returns a data.frame with the same dimensions. If that hadn't been the case, colwise might have been of help:

library(plyr)
pct.colwise <- colwise(pct)
df[, cols] <- pct.colwise(df[, cols])

HTH,
Jan
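A quick way to check the claim that pct accepts and returns a data frame, using the data from the original post. The name res is introduced here only for illustration.

```r
# Data and helper copied from the original post
c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y", "N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100 * x, 1)

cols <- sapply(df, is.numeric)

# Multiplication and round() both have data frame methods,
# so the numeric sub-frame can be passed to pct() in one call
res <- pct(df[, cols])
is.data.frame(res)
dim(res)

# Write the converted columns back; c1 and c2 are untouched
df[, cols] <- res
df
```

This only works because every selected column is numeric; a sub-frame containing a factor or character column would make round() fail, which is why the is.numeric filter comes first.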
Dennis Murphy
2011-Oct-12 22:02 UTC
[R] Applying function to only numeric variable (plyr package?)
Hi:

One approach to this problem in plyr is to use the recently developed mutate() function rather than ddply(). mutate() is a somewhat faster version of transform(); when used as a standalone function, it doesn't take a grouping variable as an argument. For this example, one could use

mutate(df, px = pct(x), py = pct(y))
  c1 c2      x      y   px   py
1  A  Y 0.5234 0.9251 52.3 92.5
2  B  Y 0.6919 0.7616 69.2 76.2
3  C  N 0.2307 0.3624 23.1 36.2
4  C  N 0.1160 0.4462 11.6 44.6

Another option is to use numcolwise() from the plyr package, which will apply the function of interest to all numeric variables in the data frame. This is a way to generate the desired outcome for this example:

f <- numcolwise(pct)
cbind(df[, 1:2], f(df))
  c1 c2    x    y
1  A  Y 52.3 92.5
2  B  Y 69.2 76.2
3  C  N 23.1 36.2
4  C  N 11.6 44.6

In a data frame with a large number of columns, one could separate out the non-numeric variables with sapply(), as shown in a previous response, into one data frame and then cbind() it to the result of numcolwise().

HTH,
Dennis
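The split-and-cbind() idea for wider data frames might look roughly like the sketch below. It is written in base R so that it runs without plyr: pct(df2[num]) stands in for numcolwise(pct)(df2), which is an assumption that holds only because pct works on data frames directly. The names df2, num and result are hypothetical, and the columns are deliberately interleaved to show the effect on column order.

```r
pct <- function(x) round(100 * x, 1)

# Same values as the original post, but with numeric and
# non-numeric columns interleaved
df2 <- data.frame(
  c1 = c("A", "B", "C", "C"),
  x  = c(0.5234, 0.6919, 0.2307, 0.1160),
  c2 = factor(c(1, 1, 2, 2), labels = c("Y", "N")),
  y  = c(0.9251, 0.7616, 0.3624, 0.4462)
)

num <- sapply(df2, is.numeric)

# Non-numeric columns on the left, converted numeric columns on the right
result <- cbind(df2[!num], pct(df2[num]))

# cbind() groups the columns by type, so restore the original order by name
result <- result[names(df2)]
result
```

The final reordering step matters only when the numeric and non-numeric columns are mixed; in the original example the two character/factor columns already come first.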