Hi,

I have a dataset with monthly observations (January to December) over a period of years (2000 to 2018). I am trying to take the average of the values from January to July of each year.

The data look like this:

Year  Month  Value
2000  1      25
2000  2      28
2000  3      22
....  ....   ....
2000  12     26
2001  1      27
....  ....   ....
2018  11     30
2018  12     29

Can someone help me in this regard?

Many thanks in advance.

Regards,
Md
Hi Md,
One way is to form a subset of your data, then calculate the means by year:

# assume your data is named mddat, with columns Year, Month and Value
# Month < 7 keeps January to June; use Month <= 7 to include July
mddat2<-mddat[mddat$Month < 7,]
jan2jun<-by(mddat2$Value,mddat2$Year,mean)

Jim
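The subset-then-average idea can be tried on made-up data before touching the real file. The sketch below assumes the columns are named Year, Month and Value as in the original post; the toy values are hypothetical, and tapply() is used here in place of by() only because it returns a plain named vector.

```r
# Hypothetical toy data with the same columns as in the post.
mddat <- data.frame(
  Year  = rep(c(2000L, 2001L), each = 12),
  Month = rep(1:12, times = 2),
  Value = 1:24
)

# Keep January through July (use Month < 7 to stop at June instead).
jan2jul <- mddat[mddat$Month <= 7, ]

# One mean per year, returned as a vector named by year.
jan2jul_mean <- tapply(jan2jul$Value, jan2jul$Year, mean)
print(jan2jul_mean)
```

With these toy values the result is 4 for 2000 and 16 for 2001.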
On 2020-08-03 21:11 +1000, Jim Lemon wrote:
> One way is to form a subset of your
> data, then calculate the means by
> year:

Hi Md,

you can also define the period in a new column, and use aggregate like this:

Md <- structure(list(
  Year = c(2000L, 2000L, 2000L, 2000L, 2001L, 2018L, 2018L),
  Month = c(1L, 2L, 3L, 12L, 1L, 11L, 12L),
  Value = c(25L, 28L, 22L, 26L, 27L, 30L, 29L)),
  class = "data.frame",
  row.names = c(NA, -7L))

Md[Md$Month %in% 1:6,"Period"] <- "first six months of the year"
Md[Md$Month %in% 7:12,"Period"] <- "last six months of the year"

# the formula is passed unnamed: the first argument of the formula
# method is called "x" from R 4.2.0 on and "formula" before that
aggregate(
  Value~Year+Period,
  data=Md,
  FUN=mean)

Rasmus
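If only the January-to-July average is wanted, the extra Period column can be skipped. The sketch below is one possible variant, not the method from the message above: it reuses the seven sample rows from the thread and relies on the subset argument of the formula method of aggregate().

```r
# The seven sample rows posted in the thread.
Md <- data.frame(
  Year  = c(2000L, 2000L, 2000L, 2000L, 2001L, 2018L, 2018L),
  Month = c(1L, 2L, 3L, 12L, 1L, 11L, 12L),
  Value = c(25L, 28L, 22L, 26L, 27L, 30L, 29L)
)

# Average Value by Year, using only rows from January to July.
# The formula is passed unnamed so the call does not depend on
# the name of the first argument.
res <- aggregate(Value ~ Year, data = Md, FUN = mean, subset = Month <= 7)
print(res)
```

Years with no rows in the selected months (2018 in this sample) do not appear in the result.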
Your problem is in the subset operation. You have asked for a value of Month greater than or equal to 7 and less than or equal to 6. You probably got an error message that told you that the data were of length zero or something similar. No number can be greater than or equal to 7 AND less than or equal to 6, so this test is never TRUE and the subset is empty:

mddat$Month >= 7 & mddat$Month <= 6

(Column names are case sensitive: mddat$month does not exist, and the same test on it returns logical(0).)

What you want is:

mddat2<-mddat[mddat$Year == 1975 & mddat$Month >= 7 |
 mddat$Year == 1976 & mddat$Month <= 6,]
mean(mddat2$Value)
[1] 88.91667

Apart from that, your email client is inserting EOL characters that cause an error when pasted into R:

Error: unexpected input in "?"

Probably due to MS Outlook; this has been happening quite a bit lately.

Jim

On Mon, Aug 3, 2020 at 11:30 PM Md. Moyazzem Hossain <hossainmm at juniv.edu> wrote:
>
> Dear Jim,
>
> Thank you very much. It is working now.
>
> However, I am also trying to find the average of the value from July 1975 to June 1976, recorded as the value for the year 1975, but got an error message. I am attaching the data file here. Please check the attachment.
>
> mddat=read.csv("F:/mddat.csv", header=TRUE)
> mddat2<-mddat[mddat$Month >=7 & mddat$Month <= 6,]
> jan2jun<-by(mddat2$Value,mddat2$Year,mean)
> jan2jun
>
> Please help me again and many thanks in advance.
>
> Md
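The difference between the impossible test and the July-to-June window can be checked on a small made-up data frame. This is only a sketch: the values are hypothetical (the attached mddat.csv is not available here), and the parentheses are added for readability, since & already binds more tightly than | in R.

```r
# Hypothetical two-year data set with the column names used in the thread.
mddat <- data.frame(
  Year  = rep(c(1975L, 1976L), each = 12),
  Month = rep(1:12, times = 2),
  Value = 1:24
)

# No month is both >= 7 and <= 6, so this selects nothing.
impossible <- mddat$Month >= 7 & mddat$Month <= 6
any(impossible)        # FALSE

# A misspelled column name gives NULL, hence a zero-length test.
length(mddat$month >= 7)   # 0

# July 1975 to June 1976: two AND conditions joined by OR.
window <- (mddat$Year == 1975 & mddat$Month >= 7) |
          (mddat$Year == 1976 & mddat$Month <= 6)
jul75_jun76 <- mean(mddat$Value[window])
print(jul75_jun76)     # 12.5 for these toy values
```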
Hi Md,
I think the errors are that you forgot to initialize "m", calculated the mean outside the inner loop, and used a second loop that is not needed. Each July-to-June window is fixed by its starting year, so one loop over the years, with the mean calculated inside it, is enough:

years<-1975:2017
m<-rep(0,length(years))
for(k in seq_along(years)) {
 i<-years[k]
 mddat2<-mddat[mddat$Year == i & mddat$Month >= 7 |
  mddat$Year == (i+1) & mddat$Month <= 6,]
 m[k]<-mean(mddat2$Value)
}
# label each mean with the year in which its window starts
names(m)<-years
m

Jim

On Wed, Aug 5, 2020 at 6:04 AM Md. Moyazzem Hossain <hossainmm at juniv.edu> wrote:
>
> Dear Jim,
>
> Thank you very much. You are right. It is good now. However, I want to continue it up to the year 2017.
>
> I use the following code but got the error
>
> for(i in 1975:2017){
> for(j in 1:44){
> mddat2[j]<-mddat[mddat$Year == i & mddat$Month >= 7 |
> mddat$Year == (i+1) & mddat$Month <= 6,]
> }
> m[j]=mean(mddat2$Value)
>
> }
> m
>
> Please help me in this regard. Many thanks in advance.
>
> Regards,
> Md
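The same July-to-June means can also be computed without a loop, by labelling every row with the year in which its window starts and averaging by that label. The sketch below uses made-up data; the column name StartYear and the rule of dropping windows with fewer than 12 months are assumptions added for illustration, not something stated in the thread.

```r
# Hypothetical three-year data set with the column names used in the thread.
mddat <- data.frame(
  Year  = rep(1975:1977, each = 12),
  Month = rep(1:12, times = 3),
  Value = 1:36
)

# July-December belong to the window starting in the same year,
# January-June to the window that started the year before.
mddat$StartYear <- ifelse(mddat$Month >= 7, mddat$Year, mddat$Year - 1L)

# Mean and number of months for each window.
window_mean <- tapply(mddat$Value, mddat$StartYear, mean)
window_n    <- tapply(mddat$Value, mddat$StartYear, length)

# Keep only windows that contain all 12 months.
complete <- window_mean[window_n == 12]
print(complete)
```

For these toy values the complete windows are 1975 (mean 12.5) and 1976 (mean 24.5); the partial windows at either end of the data are dropped.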
Dear Jim,

Thanks a lot for your support.

Take care.
Md

--
Best Regards,
Md. Moyazzem Hossain
Associate Professor
Department of Statistics
Jahangirnagar University
Savar, Dhaka-1342
Bangladesh
Website: http://www.juniv.edu/teachers/hossainmm
Research: Google Scholar <https://scholar.google.com/citations?user=-U03XCgAAAAJ&hl=en&oi=ao>; ResearchGate <https://www.researchgate.net/profile/Md_Hossain107>; ORCID iD <https://orcid.org/0000-0003-3593-6936>