Tonja Krueger
2011-Mar-22 15:05 UTC
[R] Accelerating the calculation of the moving average
Dear List,

I have a data frame with approximately 500000 rows that looks like this:

    Date        time      value
    ...
    19.07.1956  12:00:00  4.84
    19.07.1956  13:00:00  4.85
    19.07.1956  14:00:00  4.89
    19.07.1956  15:00:00  4.94
    19.07.1956  16:00:00  4.99
    19.07.1956  17:00:00  5.01
    19.07.1956  18:00:00  5.04
    19.07.1956  19:00:00  5.04
    19.07.1956  20:00:00  5.04
    19.07.1956  21:00:00  5.02
    19.07.1956  22:00:00  5.01
    19.07.1956  23:00:00  5.00
    20.07.1956  00:00:00  4.99
    20.07.1956  01:00:00  4.99
    20.07.1956  02:00:00  5.00
    20.07.1956  03:00:00  5.03
    20.07.1956  04:00:00  5.07
    20.07.1956  05:00:00  5.10
    20.07.1956  06:00:00  5.14
    20.07.1956  07:00:00  5.14
    20.07.1956  08:00:00  5.11
    20.07.1956  09:00:00  5.08
    20.07.1956  10:00:00  5.03
    20.07.1956  11:00:00  4.98
    20.07.1956  12:00:00  4.94
    20.07.1956  13:00:00  4.93
    ...

I want to calculate the moving average of the right column. I tried:

    dat$index <- 1:length(dat$Zeit)
    qs <- 43800
    erg <- c()
    for (y in min(dat$index):max(dat$index)) {
      m <- mean(dat[(dat$index >= y) & (dat$index <= y + qs + 1), 3])
      erg <- c(erg, m)
    }

It works, but it takes ages. Is there a faster way to compute the moving average?

Thank you,
Tonja Krueger
Kenn Konstabel
2011-Mar-22 15:10 UTC
[R] Accelerating the calculation of the moving average
On Tue, Mar 22, 2011 at 3:05 PM, Tonja Krueger <tonja.krueger@web.de> wrote:
> Is there a faster way to compute the moving average?

See, e.g., rollmean in package zoo.
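A minimal sketch of the rollmean suggestion, assuming the zoo package is installed. The vector `value` and the small window width `k` are made-up stand-ins for the poster's data, not taken from the thread. The original loop averages rows y to y + qs + 1, which is a left-aligned window of qs + 2 rows, so `align = "left"` is used here.

```r
library(zoo)  # assumption: zoo is installed

# Hypothetical stand-in for the value column of the poster's data frame
set.seed(1)
value <- round(5 + cumsum(rnorm(1000, sd = 0.02)), 2)

k <- 24  # illustrative window width in rows (the original loop uses qs + 2)

# Left-aligned rolling mean: element i averages value[i:(i + k - 1)].
# fill = NA pads the tail, where no complete window exists.
ma <- rollmean(value, k = k, fill = NA, align = "left")

head(ma)
```

One difference from the original loop: near the end of the series the loop averages whatever rows remain (a shrinking window), whereas this call returns NA wherever a full window is not available.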
Gabor Grothendieck
2011-Mar-22 15:19 UTC
[R] Accelerating the calculation of the moving average
On Tue, Mar 22, 2011 at 11:05 AM, Tonja Krueger <tonja.krueger at web.de> wrote:
> Is there a faster way to compute the moving average?

There are rolling mean or sum functions written in C in the caTools, xts and TTR packages (and possibly other packages as well). There are also faster ways to do it even in pure R, such as the rollmean function in zoo (although that would not be expected to be as fast as the C implementations).

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
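To illustrate why a pure-R approach can be much faster than the original loop, here is a sketch using cumulative sums in base R. This is not one of the functions named above; it is a different, commonly used technique, and the helper name `moving_average` is hypothetical. The original loop re-scans the whole data frame and regrows `erg` on every iteration, while this version makes a single pass over the data.

```r
# Left-aligned moving average of width k using cumulative sums (base R only).
# Element i of the result is mean(x[i:(i + k - 1)]);
# the last k - 1 entries are NA because no complete window exists there.
moving_average <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 1, k <= n)
  cs <- c(0, cumsum(x))  # cs[j + 1] holds the sum of x[1:j]
  sums <- cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]  # window sums
  c(sums / k, rep(NA_real_, k - 1))
}

# First eight values from the posted sample
x <- c(4.84, 4.85, 4.89, 4.94, 4.99, 5.01, 5.04, 5.04)
ma3 <- moving_average(x, 3)
ma3
```

Two caveats with this sketch: a single NA in `x` turns every later result into NA, and for very long series the running sums can accumulate small floating-point error.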
William Dunlap
2011-Mar-22 16:25 UTC
[R] Accelerating the calculation of the moving average
filter(), in the stats package, can do moving averages (with any weights).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Tonja Krueger
> Sent: Tuesday, March 22, 2011 8:06 AM
> To: r-help at r-project.org
> Subject: [R] Accelerating the calculation of the moving average
>
> Is there a faster way to compute the moving average?
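A small sketch of the filter() suggestion, using the first few values from the posted sample and an illustrative window width `k` (both chosen here for demonstration). Equal weights of 1/k give the plain moving average; unequal weights would give a weighted one. The call is written as `stats::filter` in case another attached package masks `filter`.

```r
# First eight values from the posted sample
value <- c(4.84, 4.85, 4.89, 4.94, 4.99, 5.01, 5.04, 5.04)
k <- 3  # illustrative window width

# sides = 1: trailing window, element i averages value[(i - k + 1):i];
# the first k - 1 results are NA.
ma_trailing <- stats::filter(value, rep(1 / k, k), sides = 1)

# sides = 2: centred window, element i averages value[(i - 1):(i + 1)] for k = 3;
# the first and last results are NA.
ma_centred <- stats::filter(value, rep(1 / k, k), sides = 2)

# filter() returns a "ts" object; as.numeric() drops the time-series attributes.
as.numeric(ma_trailing)
as.numeric(ma_centred)
```

Neither alignment matches the forward-looking window of the original loop directly; shifting the trailing result by k - 1 positions would be one way to line them up.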