Hi guys,

I have a crap load of data to parse and have enjoyed creating a script that takes this data and creates a number of useful graphics for our area. I am unable to figure out one summary, though, and it's all because I don't fully understand the apply family of functions. Consider the following:

#Create data
Df..<-rbind(data.frame(Id=1:1008,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65),
  Volume=runif(1008,0,19),Hour=rep(00,1008),Min5Break=rep(1:12,84),Day=rep(1,1008)),
  data.frame(Id=2009:2016,Dir=rep(c("NB","NB","SB","SB"),252),Mph=runif(1008,0,65),
  Volume=runif(1008,0,19),Hour=rep(01,1008),Min5Break=rep(1:12,84),Day=rep(2,1008)))

#Example calc
Results_<-list()

#Sum Volume by 5 minute break by Day by Hour by Direction
Results_$FiveMin.Direction<-tapply(Df..$Volume,list(Df..$Min5Break,Df..$Day,Df..$Hour,Df..$Dir),sum)

The data is a snapshot of what I'm working with, and I am trying to get to something similar to the last line, where the volumes are summed. What I want is a volume-weighted average of the speed by 5-minute break. So, for all the speeds and volumes in the 5-minute breaks of a given hour (12 per hour), I would want

sum(Volumes[1:12]*Speed[1:12]) / sum(Volumes[1:12])

The output should resemble the one above but hold these weighted values. I am assuming the sum function in the call above would be replaced by a function doing this calculation, but I am still not sure how to do that with the apply functions, so perhaps this isn't the best option.

Hope this is clear, and hope you guys (and of course ladies) can offer some guidance.

Cheers,
Josh

--
View this message in context: http://r.789695.n4.nabble.com/Oh-apply-functions-how-you-confuse-me-tp3784212p3784212.html
Sent from the R help mailing list archive at Nabble.com.
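One way to keep the tapply() layout while computing a two-column statistic is to group the row indices instead of a data column, since tapply() hands only one vector to its function. The following is a minimal sketch under assumptions, not the poster's code: the data frame `Df`, its reduced size, and the result name `WtdSpeed` are made up for illustration, and only the column names come from the post.

```r
# Illustrative data only: 4 observations for every combination of
# 3 five-minute breaks, 2 days and 2 directions (sizes are assumptions).
set.seed(42)
Df <- expand.grid(Obs = 1:4, Min5Break = 1:3, Day = 1:2,
                  Dir = c("NB", "SB"), stringsAsFactors = FALSE)
Df$Mph    <- runif(nrow(Df), 0, 65)
Df$Volume <- runif(nrow(Df), 0, 19)

# Group the row numbers, then look up both columns inside the function.
WtdSpeed <- tapply(
  seq_len(nrow(Df)),
  list(Df$Min5Break, Df$Day, Df$Dir),
  function(i) sum(Df$Volume[i] * Df$Mph[i]) / sum(Df$Volume[i])
)

# Result is an array indexed by break, day and direction,
# just like the summed-volume array from tapply(..., sum).
dim(WtdSpeed)
WtdSpeed["2", "1", "SB"]
```

Indexing by row number keeps the grouping logic in tapply() and the arithmetic in one small function, so adding Hour as a fourth grouping factor should only require adding it to the list.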
I would suggest using a 'for' loop rather than an apply function. The advantage is that you will probably understand the loop that you write, and it will run in roughly the same amount of time as a complicated call to an apply function that you don't understand.

On 01/09/2011 18:11, LCOG1 wrote:
> Hi guys,
> I have a crap load of data to parse and have enjoyed creating a script that
> takes this data and creates a number of useful graphics for our area. I am
> unable to figure out one summary though and its all cause I dont fully
> understand the apply family of functions.
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Patrick Burns
pburns at pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')
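The loop suggested in this reply could look roughly like the sketch below. It is an illustration under assumptions rather than code from the thread: the data frame `Df`, its reduced size, and the names `Breaks`, `Dirs`, `Rows` and `WtdSpeed` are hypothetical, and only two grouping columns are used to keep it short.

```r
# Illustrative data only: 4 observations per break and direction.
set.seed(42)
Df <- expand.grid(Obs = 1:4, Min5Break = 1:3,
                  Dir = c("NB", "SB"), stringsAsFactors = FALSE)
Df$Mph    <- runif(nrow(Df), 0, 65)
Df$Volume <- runif(nrow(Df), 0, 19)

Breaks <- sort(unique(Df$Min5Break))
Dirs   <- sort(unique(Df$Dir))

# Pre-allocate the result so every break/direction pair has a slot.
WtdSpeed <- matrix(NA_real_, nrow = length(Breaks), ncol = length(Dirs),
                   dimnames = list(as.character(Breaks), Dirs))

for (b in Breaks) {
  for (d in Dirs) {
    Rows <- Df$Min5Break == b & Df$Dir == d
    if (any(Rows)) {
      # Volume-weighted mean speed for this break and direction
      WtdSpeed[as.character(b), d] <-
        sum(Df$Volume[Rows] * Df$Mph[Rows]) / sum(Df$Volume[Rows])
    }
  }
}

WtdSpeed
```

Each extra grouping variable (Day, Hour) would add one more nested loop and one more dimension to the pre-allocated result.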
Is this close to what you are asking for:

require(data.table)
Dt.. <- data.table(Df..)
R <- Dt..[,
    list(
        sum = sum(Volume)
      , weight = sum(Volume * Mph) / sum(Volume)
    )
  , by = list(Min5Break, Day, Hour, Dir)
]
R

Min5Break Day Hour Dir      sum   weight
        1   1    0  NB 730.8880 32.60224
        2   1    0  NB 766.4083 35.88443
        3   1    0  SB 776.7592 32.66822
        4   1    0  SB 768.0923 33.55988
        5   1    0  NB 767.5472 36.00546
        6   1    0  NB 767.6600 30.38747
        7   1    0  SB 814.9662 31.88483
        8   1    0  SB 795.4855 30.91495
        9   1    0  NB 828.4439 31.57477
       10   1    0  NB 797.7522 29.49832
       11   1    0  SB 826.5165 32.74487
       12   1    0  SB 824.0942 36.28309
        1   2    1  NB 830.0683 29.59320
        2   2    1  NB 838.8179 34.59878
        3   2    1  SB 877.3518 30.77636
        4   2    1  SB 838.9765 30.90577
        5   2    1  NB 736.6560 30.54381
        6   2    1  NB 772.3622 31.40094
        7   2    1  SB 819.2347 29.22674
        8   2    1  SB 840.9048 32.59222
        9   2    1  NB 818.8383 37.55142
       10   2    1  NB 783.8896 32.54565
       11   2    1  SB 699.0401 30.76466
       12   2    1  SB 773.5594 35.87076

On Thu, Sep 1, 2011 at 1:11 PM, LCOG1 <jroll at lcog.org> wrote:
> Hi guys,
> I have a crap load of data to parse and have enjoyed creating a script that
> takes this data and creates a number of useful graphics for our area. I am
> unable to figure out one summary though and its all cause I dont fully
> understand the apply family of functions.
--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
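For readers without data.table installed, a similar long-format summary can be sketched in base R with split() and lapply(). This is an assumption-laden illustration, not part of the reply: the data frame `Df`, its reduced size, and the names `Groups` and `Res` are hypothetical. It uses weighted.mean(), which for non-negative weights computes the same value as sum(Volume * Mph) / sum(Volume).

```r
# Illustrative data only: 4 observations per break and direction.
set.seed(42)
Df <- expand.grid(Obs = 1:4, Min5Break = 1:3,
                  Dir = c("NB", "SB"), stringsAsFactors = FALSE)
Df$Mph    <- runif(nrow(Df), 0, 65)
Df$Volume <- runif(nrow(Df), 0, 19)

# One data frame per break/direction combination that actually occurs
Groups <- split(Df, list(Df$Min5Break, Df$Dir), drop = TRUE)

# One summary row per group, then stack the rows
Res <- do.call(rbind, lapply(Groups, function(g) {
  data.frame(
    Min5Break = g$Min5Break[1],
    Dir       = g$Dir[1],
    sum       = sum(g$Volume),
    weight    = weighted.mean(g$Mph, g$Volume)
  )
}))
rownames(Res) <- NULL

Res
```

The result has one row per group, like the data.table output, rather than the multi-dimensional array that tapply() returns.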
Dang, Jim, this looks to do the trick, though I had never heard of a data.table. Interesting; I will explore more. Thank you very much.

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com]
Sent: Thursday, September 01, 2011 11:20 AM
To: ROLL Josh F
Cc: r-help at r-project.org
Subject: Re: [R] Oh apply functions, how you confuse me

Is this close to what you are asking for:

require(data.table)
Dt.. <- data.table(Df..)
R <- Dt..[,
    list(
        sum = sum(Volume)
      , weight = sum(Volume * Mph) / sum(Volume)
    )
  , by = list(Min5Break, Day, Hour, Dir)
]