I am trying to calculate monthly means, by year, of phosphates and nitrates from a multi-year data set. Can anybody help me out with the most effective way to do this?

My data looks like this:

    Collection_Date       Test.Name   Value
    2000-01-24 17:00:00   Phosphate   0.108
    2000-01-24 17:00:00   Nitrate     0.037
    2001-11-12 08:45:00   Phosphate   0.45
    ...

Thanks and sorry for the blatantly "newbie" question.
On Tue, 2011-03-15 at 07:24 -0700, Carl Nim wrote:
> I am trying to calculate monthly means by year of phosphates and
> nitrates from a multi year data set.

Let's say you have a data.frame, mydata, with the above data. Then you could write a function that returns the mean for one year, month and substance:

    mymean <- function(year, month, substance) {
      # The three conditions must be combined with "&"; separating them
      # with commas would pass them to other arguments of subset().
      mysub <- subset(mydata,
                      format(as.POSIXlt(Collection_Date), "%Y") == year &
                      format(as.POSIXlt(Collection_Date), "%b") == month &
                      Test.Name == substance)
      mean(mysub$Value)
    }

Then you need to apply this function to every combination of year, month and substance in your data.frame. You can do this by

    M <- expand.grid(2000:2010, month.abb, c("Phosphate", "Nitrate"))
    # apply() hands each row of M to the anonymous function as a character vector
    meanValues <- apply(M, 1, function(myRowEntry)
      mymean(myRowEntry[1], myRowEntry[2], myRowEntry[3]))

Combinations for which there is no data give NaN. Note that "%b" produces month abbreviations in the current locale, so the comparison with month.abb assumes an English locale.

In the end you can put the result together with M, i.e.

    M <- cbind(M, meanValues)

--
Daniel Kaschek
Physikalisches Institut, Freiburg
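As a rough, self-contained sketch of the approach in the reply above (one mean per combination of year, month and substance), the following may help. It is not the poster's exact code: it uses mapply() instead of apply(), compares numeric months ("%m") to avoid depending on the locale, and runs on toy data in which only the first three rows come from the question; the remaining rows and the name `result` are made up for illustration.

```r
# Toy data: the first three rows are from the question, the last two are invented.
mydata <- data.frame(
  Collection_Date = c("2000-01-24 17:00:00", "2000-01-24 17:00:00",
                      "2001-11-12 08:45:00", "2000-01-30 09:00:00",
                      "2001-11-20 10:30:00"),
  Test.Name = c("Phosphate", "Nitrate", "Phosphate", "Phosphate", "Phosphate"),
  Value = c(0.108, 0.037, 0.45, 0.092, 0.55),
  stringsAsFactors = FALSE
)

# Extract year and month once, as integers.
dates <- as.POSIXct(mydata$Collection_Date, tz = "UTC")
mydata$year  <- as.integer(format(dates, "%Y"))
mydata$month <- as.integer(format(dates, "%m"))

# Mean of Value for one (year, month, substance) combination.
mymean <- function(year, month, substance) {
  rows <- mydata$year == year & mydata$month == month &
          mydata$Test.Name == substance
  mean(mydata$Value[rows])   # NaN when no rows match
}

# Every combination of year, month and substance.
M <- expand.grid(year = 2000:2001, month = 1:12,
                 substance = c("Phosphate", "Nitrate"),
                 stringsAsFactors = FALSE)
M$meanValue <- mapply(mymean, M$year, M$month, M$substance)

# Keep only the combinations that actually have data.
result <- M[!is.nan(M$meanValue), ]
print(result)
```

Because expand.grid() enumerates every combination, most rows have no data; dropping the NaN rows at the end leaves only the months that were actually sampled.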
You could use the by() function after a little data manipulation. The first line creates a field holding just the date portion of your datetime field. Then you can use by() with the indices you want in order to calculate the means.

    library(chron)
    # strsplit() yields two pieces (date, time) per row, so after unlist() the
    # dates sit at the odd positions 1, 3, ..., 2*nrow(mSamp) - 1.
    mSamp$cDT <- chron(unlist(strsplit(as.character(mSamp$Collection_Date), split=" "))[seq(1, 2*nrow(mSamp), 2)],
                       format="y-m-d", out.format="m/d/y")
    with(mSamp, by(Value, list(years(cDT), months(cDT), Test.Name), mean, na.rm=TRUE))

Including Test.Name in the index list keeps the phosphate and nitrate means separate.

I like using the chron package for my dates and times.

Adrian
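The same grouping idea (derive a year and a month from the date, then take the mean within each group) can also be sketched in base R alone. This is a swapped-in variant, not the code from the reply above: it uses as.Date() and aggregate() in place of chron and by(). The data frame reuses the three rows from the question plus two invented rows, and the name `monthly` is hypothetical.

```r
# Toy data: the first three rows are from the question, the last two are invented.
mSamp <- data.frame(
  Collection_Date = c("2000-01-24 17:00:00", "2000-01-24 17:00:00",
                      "2001-11-12 08:45:00", "2000-01-30 09:00:00",
                      "2001-11-20 10:30:00"),
  Test.Name = c("Phosphate", "Nitrate", "Phosphate", "Phosphate", "Phosphate"),
  Value = c(0.108, 0.037, 0.45, 0.092, 0.55),
  stringsAsFactors = FALSE
)

# Keep only the date portion (the first 10 characters) and parse it.
cDT <- as.Date(substr(mSamp$Collection_Date, 1, 10))
mSamp$year  <- format(cDT, "%Y")
mSamp$month <- format(cDT, "%m")

# One mean per year, month and substance; only groups with data are returned.
monthly <- aggregate(Value ~ year + month + Test.Name, data = mSamp, FUN = mean)
print(monthly)
```

aggregate() returns an ordinary data frame, which can be convenient if the means are to be merged with other data or plotted afterwards.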