clair.crossupton at googlemail.com
2009-Oct-22  16:28 UTC
[R] How to find moving averages within each subgroup of a data frame
Dear all, If I have the following data frame:> set.seed(21) > df1 <- data.frame(col1=c(rep('a',5), rep('b',5), rep('c',5)), col4=rnorm(1:15))col1 col4 1 a 0.793013171 2 a 0.522251264 3 a 1.746222241 4 a -1.271336123 5 a 2.197389533 6 b 0.433130777 7 b -1.570199630 8 b -0.934905667 9 b 0.063493345 10 b -0.002393336 11 c -2.276781240 12 c 0.757412225 13 c -0.548405554 14 c 0.172549478 15 c 0.562853068 How can i make a 2 point moving average within each group? i.e. col1 col4 SMA a 0.793013171 NA a 0.522251264 0.657632218 a 1.746222241 1.134236753 a -1.271336123 0.237443059 a 2.197389533 0.463026705 b 0.433130777 NA b -1.57019963 -0.568534427 b -0.934905667 -1.252552649 b 0.063493345 -0.435706161 b -0.002393336 0.030550005 c -2.27678124 NA c 0.757412225 -0.759684508 c -0.548405554 0.104503336 c 0.172549478 -0.187928038 c 0.562853068 0.367701273>From what i've read, i was thinking it would be something along thelines of:> aggregate(df1$col4, by=list(df1$col1), function(x) {filter(x, rep(1/2,2), sides=1 )} )Error in aggregate.data.frame(as.data.frame(x), ...) : 'FUN' must always return a scalar But this tells me (i think) that aggregate should only return a single value per group. So what i need, i guess, is something that takes all the values in a given group, and returns a vector of the same length. Not sure which function to use for that. Thanks in advance, C xx P.S. on a side note, is it possible to extract the values which aggregate passes to the function(x) in my example above?
clair.crossupton at googlemail.com
2009-Oct-23  09:01 UTC
[R] How to find moving averages within each subgroup of a data frame
Dear all, If I have the following data frame:> set.seed(21) > df1 <- data.frame(col1=c(rep('a',5), rep('b',5), rep('c',5)), col4=rnorm(1:15))col1 col4 1 a 0.793013171 2 a 0.522251264 3 a 1.746222241 4 a -1.271336123 5 a 2.197389533 6 b 0.433130777 7 b -1.570199630 8 b -0.934905667 9 b 0.063493345 10 b -0.002393336 11 c -2.276781240 12 c 0.757412225 13 c -0.548405554 14 c 0.172549478 15 c 0.562853068 How can i make a 2 point moving average within each group? i.e. col1 col4 SMA a 0.793013171 NA a 0.522251264 0.657632218 a 1.746222241 1.134236753 a -1.271336123 0.237443059 a 2.197389533 0.463026705 b 0.433130777 NA b -1.57019963 -0.568534427 b -0.934905667 -1.252552649 b 0.063493345 -0.435706161 b -0.002393336 0.030550005 c -2.27678124 NA c 0.757412225 -0.759684508 c -0.548405554 0.104503336 c 0.172549478 -0.187928038 c 0.562853068 0.367701273>From what i've read, i was thinking it would be something along thelines of:> aggregate(df1$col4, by=list(df1$col1), function(x) {filter(x, rep(1/2,2), sides=1 )} )Error in aggregate.data.frame(as.data.frame(x), ...) : 'FUN' must always return a scalar But this tells me (i think) that aggregate should only return a single value per group. So what i need, i guess, is something that takes all the values in a given group, and returns a vector of the same length. Not sure which function to use for that. Thanks in advance, C xx P.S. on a side note, is it possible to extract the values which aggregate passes to the function(x) in my example above?
Gabor Grothendieck
2009-Oct-23  15:21 UTC
[R] How to find moving averages within each subgroup of a data frame
Try this: df1$SMA <- ave(df1$col4, df1$col1, FUN = function(x) c(NA, (head(x, -1) + tail(x, -1))/2)) It would also be possible to convert it from long form to wide form using reshape (or read.zoo in the devel version of zoo), convert that to a zoo series and the use rollapply in the zoo package. On Thu, Oct 22, 2009 at 12:28 PM, clair.crossupton at googlemail.com <clair.crossupton at googlemail.com> wrote:> Dear all, > > If I have the following data frame: > >> set.seed(21) >> df1 <- data.frame(col1=c(rep('a',5), rep('b',5), rep('c',5)), col4=rnorm(1:15)) > > ? col1 ? ? ? ? col4 > 1 ? ? a ?0.793013171 > 2 ? ? a ?0.522251264 > 3 ? ? a ?1.746222241 > 4 ? ? a -1.271336123 > 5 ? ? a ?2.197389533 > 6 ? ? b ?0.433130777 > 7 ? ? b -1.570199630 > 8 ? ? b -0.934905667 > 9 ? ? b ?0.063493345 > 10 ? ?b -0.002393336 > 11 ? ?c -2.276781240 > 12 ? ?c ?0.757412225 > 13 ? ?c -0.548405554 > 14 ? ?c ?0.172549478 > 15 ? ?c ?0.562853068 > > How can i make a 2 point moving average within each group? i.e. > > col1 ? ?col4 ? ?SMA > a ? ? ? 0.793013171 ? ? NA > a ? ? ? 0.522251264 ? ? 0.657632218 > a ? ? ? 1.746222241 ? ? 1.134236753 > a ? ? ? -1.271336123 ? ?0.237443059 > a ? ? ? 2.197389533 ? ? 0.463026705 > b ? ? ? 0.433130777 ? ? NA > b ? ? ? -1.57019963 ? ? -0.568534427 > b ? ? ? -0.934905667 ? ?-1.252552649 > b ? ? ? 0.063493345 ? ? -0.435706161 > b ? ? ? -0.002393336 ? ?0.030550005 > c ? ? ? -2.27678124 ? ? NA > c ? ? ? 0.757412225 ? ? -0.759684508 > c ? ? ? -0.548405554 ? ?0.104503336 > c ? ? ? 0.172549478 ? ? -0.187928038 > c ? ? ? 0.562853068 ? ? 0.367701273 > > >From what i've read, i was thinking it would be something along the > lines of: > >> aggregate(df1$col4, by=list(df1$col1), function(x) {filter(x, rep(1/2,2), sides=1 )} ) > Error in aggregate.data.frame(as.data.frame(x), ...) : > ?'FUN' must always return a scalar > > But this tells me (i think) that aggregate should only return a single > value per group. So what i need, i guess, is something that takes all > the values in a given group, and returns a vector of the same length. > Not sure which function to use for that. > > Thanks in advance, > > C xx > > P.S. on a side note, is it possible to extract the values which > aggregate passes to the function(x) in my example above? > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >