baptiste auguie
2008-Dec-10 17:02 UTC
[R] tapply within a data.frame: a simpler alternative?
Dear list,

I have a data.frame with x, y values and a 3-level factor "group", say. I want to create a new column in this data.frame with the values of y scaled by group so that each group's maximum is 1. Perhaps the example below describes it best:

x <- seq(0, 10, len = 100)
my.df <- data.frame(x = rep(x, 3),
                    # note how the y values have a different maximum
                    # depending on the group
                    y = c(3*sin(x), 2*cos(x), cos(2*x)),
                    group = factor(rep(c("sin", "cos", "cos2"), each = 100)))

library(reshape)
df.melt <- melt(my.df, id = c("x", "group"))  # make a long format
df.melt <- df.melt[order(df.melt$group), ]    # order the data.frame by the group factor
# calculate the normalised value per group and assign it to a new column
df.melt$norm <- do.call(c, tapply(df.melt$value, df.melt$group,
                                  function(.v) {.v / max(.v)}))

library(lattice)
xyplot(norm + value ~ x, groups = group, data = df.melt, auto.key = TRUE)  # check that it worked

This procedure works, but it feels like I'm reinventing the wheel using hammer and saw. I tried to use aggregate, by, and ddply (plyr package), but I couldn't find anything straightforward.

I'll appreciate any input,

Baptiste

_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
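The ordering step in the code above matters because tapply() returns one piece per factor level, in level order, so do.call(c, ...) yields values grouped by level rather than in the original row order. A minimal sketch of that pitfall, using an invented toy data frame (toy, scale_max, wrong and sorted are illustrative names, not from the post):

```r
# Toy data with interleaved groups (hypothetical values, for illustration only)
toy <- data.frame(y     = c(1, 10, 2, 20, 4, 40),
                  group = factor(c("a", "b", "a", "b", "a", "b")))

scale_max <- function(.v) .v / max(.v)

# Without sorting: values come back grouped by level (all "a", then all "b"),
# so they no longer line up with the rows of `toy`
wrong <- unname(do.call(c, tapply(toy$y, toy$group, scale_max)))

# Sorting by group first makes row order and level order agree
sorted <- toy[order(toy$group), ]
sorted$norm <- unname(do.call(c, tapply(sorted$y, sorted$group, scale_max)))

print(wrong)
print(sorted)
```

Assigning wrong directly to toy would silently attach values to the wrong rows, which is why the sort cannot be skipped in this approach.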
Chuck Cleland
2008-Dec-10 17:20 UTC
[R] tapply within a data.frame: a simpler alternative?
On 12/10/2008 12:02 PM, baptiste auguie wrote:
> [...]
> This procedure works, but it feels like I'm reinventing the wheel using
> hammer and saw. I tried to use aggregate, by, ddply (plyr package), but
> I couldn't find anything straightforward.

with(my.df, ave(y, group, FUN = function(x){x/max(x)}))

?ave

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
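ave() returns a vector of the same length and in the same row order as its input, so the result can be assigned straight back as a column without sorting first. A small sketch with invented data (toy is a hypothetical data frame, not from the thread):

```r
# Interleaved groups, to show that row order is preserved
toy <- data.frame(y     = c(1, 10, 2, 20, 4, 40),
                  group = factor(c("a", "b", "a", "b", "a", "b")))

# FUN must be passed by name, because it follows `...` in ave(x, ..., FUN = mean)
toy$norm <- with(toy, ave(y, group, FUN = function(v) v / max(v)))

print(toy)
```

Each value is divided by the maximum of its own group, and each result stays on the row it came from.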
hadley wickham
2008-Dec-10 17:25 UTC
[R] tapply within a data.frame: a simpler alternative?
On Wed, Dec 10, 2008 at 11:02 AM, baptiste auguie <ba208 at exeter.ac.uk> wrote:
> [...]
> This procedure works, but it feels like I'm reinventing the wheel using
> hammer and saw. I tried to use aggregate, by, ddply (plyr package), but I
> couldn't find anything straightforward.

It's pretty easy with ddply (note that after melt() the data column is called "value", not "y"):

library(plyr)
df.melt <- ddply(df.melt, .(group), transform, norm = value / max(value))

Hadley

-- 
http://had.co.nz/
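For readers without plyr, the split-apply-combine idea behind the ddply() suggestion can be approximated in base R. This is only a rough sketch of the pattern, not plyr's implementation; it rebuilds the thread's my.df so the block runs on its own, and pieces and res are made-up names:

```r
x <- seq(0, 10, length.out = 100)
my.df <- data.frame(x = rep(x, 3),
                    y = c(3*sin(x), 2*cos(x), cos(2*x)),
                    group = factor(rep(c("sin", "cos", "cos2"), each = 100)))

# split: one data.frame per level of group
pieces <- split(my.df, my.df$group)

# apply: add the scaled column within each piece
pieces <- lapply(pieces, function(d) transform(d, norm = y / max(y)))

# combine: stack the pieces back together (rows come back in level order)
res <- do.call(rbind, pieces)
rownames(res) <- NULL

# every group should now peak at exactly 1
print(tapply(res$norm, res$group, max))
```

Unlike ave(), this returns the rows grouped by factor level rather than in their original order, which is harmless for plotting but worth knowing if the row order carries meaning.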
Peter Dalgaard
2008-Dec-10 17:29 UTC
[R] tapply within a data.frame: a simpler alternative?
baptiste auguie wrote:
> [...]
> This procedure works, but it feels like I'm reinventing the wheel using
> hammer and saw. I tried to use aggregate, by, ddply (plyr package), but
> I couldn't find anything straightforward.
>
> I'll appreciate any input,

You (as many before you) have overlooked the ave() function, which can replace the ordering as well as the do.call(c, tapply(....)).

Also, I fail to see what good the melt()ing is for:

> dim(my.df)
[1] 300   3
> dim(melt(my.df, id=c("x","group")))
[1] 300   4

And the extra column is just "y".

my.df <- transform(my.df, norm = ave(y, group, FUN = function(.v) {.v / max(.v)}))
xyplot(norm + y ~ x, groups = group, data = my.df, auto.key = TRUE)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
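One way to check that ave() really reproduces the sort-then-tapply result is to compute both on the thread's example data and compare them. A sketch under that assumption, with scale_max, by.ave, ord and by.tapply as made-up names:

```r
x <- seq(0, 10, length.out = 100)
my.df <- data.frame(x = rep(x, 3),
                    y = c(3*sin(x), 2*cos(x), cos(2*x)),
                    group = factor(rep(c("sin", "cos", "cos2"), each = 100)))

scale_max <- function(.v) .v / max(.v)

# ave(): no sorting needed, the result is in row order
by.ave <- ave(my.df$y, my.df$group, FUN = scale_max)

# original approach: sort by group, then concatenate the tapply() pieces
ord <- order(my.df$group)
by.tapply <- unname(do.call(c, tapply(my.df$y[ord], my.df$group[ord], scale_max)))

# after applying the same ordering to the ave() result, the two agree
print(all.equal(by.ave[ord], by.tapply))
```

The comparison needs by.ave[ord] rather than by.ave because only the tapply() version depends on the data having been sorted.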
David Freedman
2008-Dec-11 16:02 UTC
[R] tapply within a data.frame: a simpler alternative?
You might take a look at the transformBy function in the doBy package.

For example,

library(doBy)
new.df <- transformBy(~group, data = my.df, new = y/max(y))

David Freedman

baptiste auguie-2 wrote:
> I have a data.frame with x, y values and a 3-level factor "group",
> say. I want to create a new column in this data.frame with the values
> of y scaled to 1 by group.
> [...]

-----
David Freedman
Atlanta