Hi all,

I have relative abundance data from >100 sites. This is from acoustic monitoring; usually the data cover 2-3 nights, but in some cases they may run longer (months or years) for each location. The output from my data-management database is provided by species by night for each location, so the data frame looks like the one below. What I need to do is sum the Survey_Time by SPEC_CODE for each location name, divide the summed AI values for each SPEC_CODE by the summed Survey_Time to adjust for unit effort, then multiply by 10 so the result represents relative abundance per 10 survey hours. How best to do this? Using plyr or reshape?

Location_name     SPEC_CODE  Start_Day  Survey_Time  AI  Std_AI
079-f2p1-Acetuna  Buzz       2/14/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Buzz       2/14/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Eumspp     2/14/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Frag       2/14/2012  12.1         18  14.87603
079-f2p1-Acetuna  Molspp     2/14/2012  12.1          5  4.132231
079-f2p1-Acetuna  Molspp     2/14/2012  12.1          5  4.132231
079-f2p1-Acetuna  Phyllo     2/14/2012  12.1          2  1.652893
079-f2p1-Acetuna  Ptedav     2/14/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Ptegym     2/14/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Ptepar     2/14/2012  12.1          2  1.652893
079-f2p1-Acetuna  Rhotum     2/14/2012  12.1          6  4.958678
079-f2p1-Acetuna  Sacbil     2/14/2012  12.1          6  4.958678
079-f2p1-Acetuna  Saclep     2/14/2012  12.1         11  9.090909
079-f2p1-Acetuna  Buzz       2/15/2012  12.1          2  1.652893
079-f2p1-Acetuna  Buzz       2/15/2012  12.1          2  1.652893
079-f2p1-Acetuna  Molmol     2/15/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Molspp     2/15/2012  12.1          7  5.785124
079-f2p1-Acetuna  Molspp     2/15/2012  12.1          7  5.785124
079-f2p1-Acetuna  Nocalb     2/15/2012  12.1          6  4.958678
079-f2p1-Acetuna  Phyllo     2/15/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Ptedav     2/15/2012  12.1          1  0.8264463
079-f2p1-Acetuna  Ptegym     2/15/2012  12.1          4  3.305785
079-f2p1-Acetuna  Ptepar     2/15/2012  12.1          4  3.305785
079-f2p1-Acetuna  Pteper     2/15/2012  12.1          3  2.479339
079-f2p1-Acetuna  Rhotum     2/15/2012  12.1          7  5.785124
079-f2p1-Acetuna  Sacbil     2/15/2012  12.1          2  1.652893
079-f2p1-Acetuna  Saclep     2/15/2012  12.1          6  4.958678
079-f2p1-Acetuna  Buzz       2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Buzz       2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Eumspp     2/16/2012  12.2          4  3.278688
079-f2p1-Acetuna  Molspp     2/16/2012  12.2          2  1.639344
079-f2p1-Acetuna  Molspp     2/16/2012  12.2          2  1.639344
079-f2p1-Acetuna  Myokea     2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Nocalb     2/16/2012  12.2          4  3.278688
079-f2p1-Acetuna  Phyllo     2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Ptedav     2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Ptegym     2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Ptepar     2/16/2012  12.2          3  2.459016
079-f2p1-Acetuna  Pteper     2/16/2012  12.2          1  0.8196721
079-f2p1-Acetuna  Rhotum     2/16/2012  12.2         17  13.93443
079-f2p1-Acetuna  Sacbil     2/16/2012  12.2          3  2.459016
079-f2p1-Acetuna  Saclep     2/16/2012  12.2         15  12.29508

Thanks for any suggestions. Excel will be a mess to try to do that.

Bruce
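[Editor's note: a minimal base-R sketch of the computation being asked for, using a few hypothetical toy rows rather than the full data set above: sum AI and Survey_Time within each location-by-species group, then take the ratio and scale by 10.]

```r
# Toy rows mirroring the columns in the post (values illustrative only)
dat <- data.frame(
  Location_name = rep("079-f2p1-Acetuna", 4),
  SPEC_CODE     = c("Buzz", "Buzz", "Frag", "Frag"),
  Survey_Time   = c(12.1, 12.2, 12.1, 12.2),
  AI            = c(1, 1, 18, 2)
)

# One grouping key per location x species combination
key <- interaction(dat$Location_name, dat$SPEC_CODE, drop = TRUE)

# Summed activity and summed effort per group, then rate per 10 survey hours
ai.sum   <- tapply(dat$AI, key, sum)
time.sum <- tapply(dat$Survey_Time, key, sum)
ra <- 10 * ai.sum / time.sum
```

The same pattern scales to any number of locations and species; the plyr and reshape answers below do the grouping with less bookkeeping.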
Hi,

dat1 <- structure(list(
  Location_name = rep("079-f2p1-Acetuna", 42),
  SPEC_CODE = c("Buzz", "Buzz", "Eumspp", "Frag", "Molspp", "Molspp", "Phyllo",
                "Ptedav", "Ptegym", "Ptepar", "Rhotum", "Sacbil", "Saclep",
                "Buzz", "Buzz", "Molmol", "Molspp", "Molspp", "Nocalb", "Phyllo",
                "Ptedav", "Ptegym", "Ptepar", "Pteper", "Rhotum", "Sacbil", "Saclep",
                "Buzz", "Buzz", "Eumspp", "Molspp", "Molspp", "Myokea", "Nocalb",
                "Phyllo", "Ptedav", "Ptegym", "Ptepar", "Pteper", "Rhotum",
                "Sacbil", "Saclep"),
  Start_Day = rep(c("2/14/2012", "2/15/2012", "2/16/2012"), c(13, 14, 15)),
  Survey_Time = rep(c(12.1, 12.2), c(27, 15)),
  AI = c(1L, 1L, 1L, 18L, 5L, 5L, 2L, 1L, 1L, 2L, 6L, 6L, 11L,
         2L, 2L, 1L, 7L, 7L, 6L, 1L, 1L, 4L, 4L, 3L, 7L, 2L, 6L,
         1L, 1L, 4L, 2L, 2L, 1L, 4L, 1L, 1L, 1L, 3L, 1L, 17L, 3L, 15L),
  Std_AI = c(0.8264463, 0.8264463, 0.8264463, 14.87603, 4.132231, 4.132231,
             1.652893, 0.8264463, 0.8264463, 1.652893, 4.958678, 4.958678,
             9.090909, 1.652893, 1.652893, 0.8264463, 5.785124, 5.785124,
             4.958678, 0.8264463, 0.8264463, 3.305785, 3.305785, 2.479339,
             5.785124, 1.652893, 4.958678, 0.8196721, 0.8196721, 3.278688,
             1.639344, 1.639344, 0.8196721, 3.278688, 0.8196721, 0.8196721,
             0.8196721, 2.459016, 0.8196721, 13.93443, 2.459016, 12.29508)),
  class = "data.frame", row.names = c(NA, -42L))

library(plyr)
res <- mutate(ddply(dat1, .(SPEC_CODE, Location_name), function(x) colSums(x[, -c(1:3, 6)])),
              SumAIbySumST = Survey_Time/AI, Std.SumAIbySumST = 10 * SumAIbySumST)
res
#    SPEC_CODE    Location_name Survey_Time AI SumAIbySumST Std.SumAIbySumST
#1        Buzz 079-f2p1-Acetuna        72.8  8    9.1000000        91.000000
#2      Eumspp 079-f2p1-Acetuna        24.3  5    4.8600000        48.600000
#3        Frag 079-f2p1-Acetuna        12.1 18    0.6722222         6.722222
#4      Molmol 079-f2p1-Acetuna        12.1  1   12.1000000       121.000000
#5      Molspp 079-f2p1-Acetuna        72.8 28    2.6000000        26.000000
#6      Myokea 079-f2p1-Acetuna        12.2  1   12.2000000       122.000000
#7      Nocalb 079-f2p1-Acetuna        24.3 10    2.4300000        24.300000
#8      Phyllo 079-f2p1-Acetuna        36.4  4    9.1000000        91.000000
#9      Ptedav 079-f2p1-Acetuna        36.4  3   12.1333333       121.333333
#10     Ptegym 079-f2p1-Acetuna        36.4  6    6.0666667        60.666667
#11     Ptepar 079-f2p1-Acetuna        36.4  9    4.0444444        40.444444
#12     Pteper 079-f2p1-Acetuna        24.3  4    6.0750000        60.750000
#13     Rhotum 079-f2p1-Acetuna        36.4 30    1.2133333        12.133333
#14     Sacbil 079-f2p1-Acetuna        36.4 11    3.3090909        33.090909
#15     Saclep 079-f2p1-Acetuna        36.4 32    1.1375000        11.375000

A.K.

----- Original Message -----
From: Bruce Miller <batsncats at gmail.com>
To: r-help at r-project.org
Sent: Saturday, April 20, 2013 9:55 AM
Subject: [R] Reshape or Plyr?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi Bruce

It sounds like the aggregate function is what you need. Here's an example using the ChickWeight dataset in R:

chick.agg <- aggregate(ChickWeight[, c('weight', 'Time')],
                       by = ChickWeight[c('Chick', 'Diet')], sum)
weight.time <- chick.agg$weight / chick.agg$Time
chick.agg.df <- data.frame(chick.agg, weight.time)

As you can see above, once you have the aggregated data frame, you can just create another vector (weight.time), which is chick weight divided by time, and add it to a new data frame using the data.frame function. Take a look at the ChickWeight dataset so you'll understand what I've done.

cheers,
Danny
Hi Bruce,

From your second email, I think you solved the problem. But there is still some confusion due to the wording.

res1 <- mutate(ddply(dat1, .(SPEC_CODE), function(x) colSums(x[, -c(1:3, 6)])),
               RA = 10 * (AI/Survey_Time))
res1
#   SPEC_CODE Survey_Time AI         RA
#1       Buzz        72.8  8  1.0989011
#2     Eumspp        24.3  5  2.0576132
#3       Frag        12.1 18 14.8760331
#4     Molmol        12.1  1  0.8264463
#5     Molspp        72.8 28  3.8461538
#6     Myokea        12.2  1  0.8196721
#7     Nocalb        24.3 10  4.1152263
#8     Phyllo        36.4  4  1.0989011
#9     Ptedav        36.4  3  0.8241758
#10    Ptegym        36.4  6  1.6483516
#11    Ptepar        36.4  9  2.4725275
#12    Pteper        24.3  4  1.6460905
#13    Rhotum        36.4 30  8.2417582
#14    Sacbil        36.4 11  3.0219780
#15    Saclep        36.4 32  8.7912088

#or
dat1$RA <- 10 * with(dat1, ave(AI, SPEC_CODE, FUN = sum)) /
                with(dat1, ave(Survey_Time, SPEC_CODE, FUN = sum))
head(dat1)
#     Location_name SPEC_CODE Start_Day Survey_Time AI     Std_AI        RA
#1 079-f2p1-Acetuna      Buzz 2/14/2012        12.1  1  0.8264463  1.098901
#2 079-f2p1-Acetuna      Buzz 2/14/2012        12.1  1  0.8264463  1.098901
#3 079-f2p1-Acetuna    Eumspp 2/14/2012        12.1  1  0.8264463  2.057613
#4 079-f2p1-Acetuna      Frag 2/14/2012        12.1 18 14.8760300 14.876033
#5 079-f2p1-Acetuna    Molspp 2/14/2012        12.1  5  4.1322310  3.846154
#6 079-f2p1-Acetuna    Molspp 2/14/2012        12.1  5  4.1322310  3.846154

________________________________
From: Bruce Miller <batsncats at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Sunday, April 21, 2013 4:19 PM
Subject: Re: [R] Reshape or Plyr?

Hi Arun,

This looks good; however, it is not quite correct yet. Looks like the mutate option may be a good place to start. Sorry I did not clearly explain this in my original query. For adjustment by unit effort I need the "base AI per hour" summed for the location. Once I have that, it can be standardized to 10 hours.

An example for a single species from your output:

#9     Ptedav 079-f2p1-Acetuna        36.4  3   12.1333333       121.333333

is not the correct way to determine AI per hour. The total survey hours for the 3 nights is 36.4 and the AI value is 3, so the AI per hour for Ptedav would be 3/36.4 = 0.08241, which standardized to 10 hours of survey time would be 0.824175824.

Now I need to determine how to re-arrange your code to reflect the summed AI / summed survey time:

res <- mutate(ddply(dat1, .(SPEC_CODE, Location_name), function(x) colSums(x[, -c(1:3, 6)])),
              SumAIbySumST = Survey_Time/AI, Std.SumAIbySumST = 10 * SumAIbySumST)

Thanks again for your input and help.

Bruce
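[Editor's note: the ave() one-liner in the message above works because ave(x, g, FUN = sum) returns each group's total aligned row-by-row with the original vector, so two such vectors can be divided elementwise to get a per-group rate repeated down the data frame. A minimal illustration on toy values:]

```r
x <- c(1, 2, 3, 4)
g <- c("a", "a", "b", "b")

# Each element is replaced by the sum of its group:
# group "a" sums to 3, group "b" sums to 7
ave(x, g, FUN = sum)
# → 3 3 7 7
```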
Hi Bruce,

Maybe this helps.

library(plyr)
res <- mutate(ddply(dat1, .(SPEC_CODE, Location_name), function(x) colSums(x[, -c(1:3, 6)])),
              RA = 10 * (AI/Survey_Time))
res
#    SPEC_CODE    Location_name Survey_Time AI         RA
#1        Buzz 079-f2p1-Acetuna        72.8  8  1.0989011
#2      Eumspp 079-f2p1-Acetuna        24.3  5  2.0576132
#3        Frag 079-f2p1-Acetuna        12.1 18 14.8760331
#4      Molmol 079-f2p1-Acetuna        12.1  1  0.8264463
#5      Molspp 079-f2p1-Acetuna        72.8 28  3.8461538
#6      Myokea 079-f2p1-Acetuna        12.2  1  0.8196721
#7      Nocalb 079-f2p1-Acetuna        24.3 10  4.1152263
#8      Phyllo 079-f2p1-Acetuna        36.4  4  1.0989011
#9      Ptedav 079-f2p1-Acetuna        36.4  3  0.8241758
#10     Ptegym 079-f2p1-Acetuna        36.4  6  1.6483516
#11     Ptepar 079-f2p1-Acetuna        36.4  9  2.4725275
#12     Pteper 079-f2p1-Acetuna        24.3  4  1.6460905
#13     Rhotum 079-f2p1-Acetuna        36.4 30  8.2417582
#14     Sacbil 079-f2p1-Acetuna        36.4 11  3.0219780
#15     Saclep 079-f2p1-Acetuna        36.4 32  8.7912088

write.csv(res, file = "BruceFile.csv", row.names = FALSE)

A.K.

----- Original Message -----
From: Bruce Miller <batsncats at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Sunday, April 21, 2013 6:18 PM
Subject: Re: [R] Reshape or Plyr?

Hi Arun,

Your first example looks like the correct one, but it needs the Location_name, as in your 2nd example. I do want to group by "Location_name". The full data set will include up to 100 Location_names, so having this along with SPEC_CODE, Survey_Time, AI and RA is great. Then I need to add the line to write the results to a CSV or tab-delimited file. No need for the Start_Day as in your 2nd example.

--
Bruce W. Miller, Ph.D.
Conservation Ecologist
Neotropical Bat Projects

office details
11384 Alpine Road
Stanwood, Mi. 49346
Phone (231) 679-6059
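[Editor's note: a quick spot check of the corrected ratio, using the Ptedav numbers discussed in the thread. The rate must be summed AI over summed hours, not the reverse.]

```r
hours <- 36.4  # total survey hours over the three nights
ai    <- 3     # summed AI for Ptedav

rate.per.hour <- ai / hours          # ~0.0824 per survey hour
ra            <- 10 * rate.per.hour  # ~0.8242 per 10 survey hours

# The inverted ratio from the first attempt is not a rate per unit effort:
10 * hours / ai                      # ~121.33
```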
Hi Bruce,

From your Excel sheets, it looks like for step 1 you were taking only the "Survey_Time" of the first row of each unique Start_Day for each Location_name. The step 2 results look the same as "res2". If this is the case:

dat1 <- read.csv("Sample_all_3_locations.csv", header = TRUE,
                 stringsAsFactors = FALSE, sep = ",")
library(plyr)
res1 <- ddply(aggregate(Survey_Time ~ Location_name + Start_Day, data = dat1,
                        function(x) head(x, 1)),
              .(Location_name), summarize, Survey_Time = sum(Survey_Time))
res1
#       Location_name Survey_Time
#1 1_f3p1_La_Campana         46.4
#2 2_f4p1_La_Campana         69.5
#3   79_f2p1_Acetuna         36.4

res2 <- ddply(dat1, .(Location_name, SPEC_CODE), summarize, AI = sum(AI))
Final <- mutate(join(res1, res2), RA = 10 * (AI/Survey_Time))
head(Final)
#       Location_name Survey_Time SPEC_CODE  AI         RA
#1 1_f3p1_La_Campana         46.4      Buzz  20  4.3103448
#2 1_f3p1_La_Campana         46.4    Cencen  20  4.3103448
#3 1_f3p1_La_Campana         46.4    Eptfur 215 46.3362069
#4 1_f3p1_La_Campana         46.4      Frag   3  0.6465517
#5 1_f3p1_La_Campana         46.4    Molspp  22  4.7413793
#6 1_f3p1_La_Campana         46.4    Myoele   1  0.2155172
dim(Final)
#[1] 44  5

A.K.

----- Original Message -----
From: Bruce Miller <batsncats at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, April 22, 2013 11:32 AM
Subject: Re: More

Arun,

Attached are two Excel sheets showing what I did manually and am hoping I can automate in R for the larger data sets. These use the same 3 locations and data I sent earlier. Step 1 sums the total survey time for each location, then step 2 sums the AI values for the SPEC_CODE by location. Then the calculation of summed AI value / summed survey time can be completed for each location and SPEC_CODE. Hopefully this clarifies it more.

Bruce
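[Editor's note: the same two-step logic in base R, as a sketch on hypothetical toy rows standing in for the CSV. The key point is that Survey_Time repeats across species rows for the same night, so per-location effort must count each night once before joining with the per-species AI totals.]

```r
# Toy data: two species recorded on the same two nights at one site
dat1 <- data.frame(
  Location_name = rep("79_f2p1_Acetuna", 4),
  SPEC_CODE     = c("Buzz", "Frag", "Buzz", "Frag"),
  Start_Day     = c("2/14/2012", "2/14/2012", "2/15/2012", "2/15/2012"),
  Survey_Time   = c(12.1, 12.1, 12.2, 12.2),
  AI            = c(1, 18, 2, 7)
)

# Step 1: per-location effort, counting each night's hours once
nights <- unique(dat1[, c("Location_name", "Start_Day", "Survey_Time")])
effort <- aggregate(Survey_Time ~ Location_name, data = nights, FUN = sum)

# Step 2: per-location, per-species activity totals
activity <- aggregate(AI ~ Location_name + SPEC_CODE, data = dat1, FUN = sum)

# Join and standardize to 10 survey hours
out <- merge(effort, activity, by = "Location_name")
out$RA <- 10 * out$AI / out$Survey_Time
```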
Hi Bruce,

I work with a lot of similar data and have to do these types of things quite often. I find it helps to keep to vectorized code as much as possible. That is, do as many of the calculations as possible outside of the aggregation code. Here's one way:

library(reshape2)

# stick to a variable naming convention and you'll avoid a lot of simple code errors
names(d) <- gsub('_', '.', tolower(names(d)), fixed = TRUE)

dm <- melt(d, measure.var = c('ai', 'survey.time'))
results <- dcast(dm, location.name + spec.code ~ variable, fun.aggregate = sum)
results$ra <- results$ai / results$survey.time * 10

The output:

      location.name spec.code  ai survey.time         ra
1  079-f2p1-Acetuna      Buzz   8        72.8  1.0989011
2  079-f2p1-Acetuna    Eumspp   5        24.3  2.0576132
3  079-f2p1-Acetuna      Frag  18        12.1 14.8760331
4  079-f2p1-Acetuna    Molmol   1        12.1  0.8264463
5  079-f2p1-Acetuna    Molspp  28        72.8  3.8461538
6  079-f2p1-Acetuna    Myokea   1        12.2  0.8196721
7  079-f2p1-Acetuna    Nocalb  10        24.3  4.1152263
8  079-f2p1-Acetuna    Phyllo   4        36.4  1.0989011
9  079-f2p1-Acetuna    Ptedav   3        36.4  0.8241758
10 079-f2p1-Acetuna    Ptegym   6        36.4  1.6483516
11 079-f2p1-Acetuna    Ptepar   9        36.4  2.4725275
12 079-f2p1-Acetuna    Pteper   4        24.3  1.6460905
13 079-f2p1-Acetuna    Rhotum  30        36.4  8.2417582
14 079-f2p1-Acetuna    Sacbil  11        36.4  3.0219780
15 079-f2p1-Acetuna    Saclep  32        36.4  8.7912088

For a simple aggregation like this, reshape is simple and fast. I tend to use plyr when things get more complicated.

Jason Law
Statistician
City of Portland
Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
503-823-1038
jason.law at portlandoregon.gov
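[Editor's note: a quick cross-check of the melt/dcast approach with base xtabs, as a sketch on toy rows (column names follow Jason's lowercased convention). With a variable on the left-hand side of its formula, xtabs sums that variable over the cross-classifying factors.]

```r
# Toy rows in the lowercased naming convention used above (values illustrative)
d <- data.frame(
  location.name = rep("079-f2p1-Acetuna", 3),
  spec.code     = c("Buzz", "Buzz", "Frag"),
  survey.time   = c(12.1, 12.2, 12.1),
  ai            = c(1, 1, 18)
)

# xtabs with a left-hand side sums that variable per cell
ai.sum   <- xtabs(ai ~ location.name + spec.code, data = d)
time.sum <- xtabs(survey.time ~ location.name + spec.code, data = d)

# Elementwise ratio gives the rate per 10 survey hours per cell
ra <- 10 * ai.sum / time.sum
```

Note that cells for species never recorded at a location have zero effort in time.sum, so their ra is NaN rather than zero; the long-format aggregation above avoids producing those cells at all.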
Hi Bruce,

This could also be done by:

dat1 <- read.csv("Sample_all_3_locations.csv", header = TRUE,
                 stringsAsFactors = FALSE, sep = ",")
library(plyr)
res1 <- ddply(unique(dat1[, c(1, 3:4)]), .(Location_name), summarize,
              Survey_Time = sum(Survey_Time))  # changed here
res1
#       Location_name Survey_Time
#1 1_f3p1_La_Campana         46.4
#2 2_f4p1_La_Campana         69.5
#3   79_f2p1_Acetuna         36.4

# The rest as mentioned before.
res2 <- ddply(dat1, .(Location_name, SPEC_CODE), summarize, AI = sum(AI))
Final <- mutate(join(res1, res2), RA = 10 * (AI/Survey_Time))

A.K.

----- Original Message -----
From: Bruce Miller <batsncats at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, April 22, 2013 12:55 PM
Subject: Re: More

Hi Arun,

My last message overlapped yours. YES... that seems correct. Now if I can get this to work with the entire data set, all will be great. Thanks again for your valuable time and help.

Bruce