On 2020-08-03 21:11 +1000, Jim Lemon wrote:
> On Mon, Aug 3, 2020 at 8:52 PM Md. Moyazzem Hossain <hossainmm at juniv.edu> wrote:
> >
> > Hi,
> >
> > I have a dataset of monthly observations (from January to
> > December) over a period of years (2000 to 2018). Now, I am
> > trying to take the average of the values from January to
> > July of each year.
> >
> > The data looks like
> >
> > Year    Month   Value
> > 2000    1       25
> > 2000    2       28
> > 2000    3       22
> > ....    ......  .....
> > 2000    12      26
> > 2001    1       27
> > .......  ........
> > 2018    11      30
> > 2018    12      29
> >
> > Can someone help me in this regard?
> >
> > Many thanks in advance.
>
> Hi Md,
> One way is to form a subset of your data, then calculate the
> means by year:
>
> # assume your data is named mddat, with columns Year, Month, Value
> # Month < 7 keeps January to June; use Month <= 7 to include July
> mddat2<-mddat[mddat$Month < 7,]
> jan2jun<-by(mddat2$Value,mddat2$Year,mean)
>
> Jim

Hi Md,

you can also define the period in a new column, and use aggregate like this:

    Md <- structure(list(
      Year = c(2000L, 2000L, 2000L,
               2000L, 2001L, 2018L, 2018L),
      Month = c(1L, 2L, 3L, 12L, 1L,
                11L, 12L),
      Value = c(25L, 28L, 22L, 26L,
                27L, 30L, 29L)),
      class = "data.frame",
      row.names = c(NA, -7L))

    Md[Md$Month %in% 1:6, "Period"] <- "first six months of the year"
    Md[Md$Month %in% 7:12, "Period"] <- "last six months of the year"

    aggregate(
      Value ~ Year + Period,
      data = Md,
      FUN = mean)

Rasmus
Hello,

And here is another way, with aggregate.

Make up test data.

set.seed(2020)
df1 <- expand.grid(Year = 2000:2018, Month = 1:12)
df1 <- df1[order(df1$Year),]
df1$Value <- sample(20:30, nrow(df1), TRUE)
head(df1)

#Use subset to keep only the relevant months
aggregate(Value ~ Year, data = subset(df1, Month <= 7), FUN = mean)

Hope this helps,

Rui Barradas

-- 
This e-mail has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
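To make the difference between the month filters used in this thread concrete, here is a minimal sketch on made-up data. The `demo` data frame and its values are assumptions for illustration only, not the poster's data. `Month < 7` averages January to June, while `Month <= 7` also includes July, which is what the original question asked for.

```r
# Made-up data: Value equals Month, so the means are easy to check by hand.
demo <- expand.grid(Month = 1:12, Year = 2000:2002)
demo$Value <- demo$Month

# January to June: mean of 1..6
jan_jun <- aggregate(Value ~ Year, data = subset(demo, Month < 7), FUN = mean)

# January to July: mean of 1..7
jan_jul <- aggregate(Value ~ Year, data = subset(demo, Month <= 7), FUN = mean)

jan_jun$Value  # 3.5 for every year
jan_jul$Value  # 4 for every year
```

Both calls return one row per year; only the set of months feeding each mean changes.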
Hello,

Please keep cc-ing the list; R-help is threaded, and questions and answers might be of help to others in the future.

As for the question, see if the following code does what you want.
First, create a logical index i of the months from July to March (7 to 12 and 1 to 3) and use that index to subset the original data.frame. Then, a cumsum trick gives a vector M defining the data grouping: a new group starts wherever the row numbers of the subset jump by more than 1. Group and compute the Value means with aggregate. Finally, since each group spans a year border, create a more meaningful Years column and put everything together.

df1 <- read.csv("mddat.csv")

i <- with(df1, (Month >= 7 & Month <= 12) | (Month >= 1 & Month <= 3))
df2 <- df1[i, ]
M <- cumsum(c(FALSE, diff(as.integer(row.names(df2))) > 1))

agg <- aggregate(Value ~ M, df2, mean)
Years <- sapply(split(df2$Year, M), function(x){paste(x[1], x[length(x)], sep = "-")})
final <- cbind.data.frame(Years, Value = agg[["Value"]])

head(final)
#      Years    Value
#0 1975-1975 87.00000
#1 1975-1976 89.44444
#2 1976-1977 85.77778
#3 1977-1978 81.55556
#4 1978-1979 71.55556
#5 1979-1980 75.77778

Hope this helps,

Rui Barradas

At 20:44 on 04/08/20, Md. Moyazzem Hossain wrote:
> Dear Rui,
>
> Thanks a lot for your help.
>
> It is working. Now I am also trying to find the average of the values for
> *July 1975 to March 1976* and record it as the value of the year 1975.
> Moreover, I want to continue this up to the year 2017. You may check the
> attached file for data (mddat.csv).
>
> I used the following call but got an error:
>
> aggregate(Value ~ Year, data = subset(df1, Month >= 7 & Month <= 3), FUN = mean)
>
> Please help me again. Thanks in advance.
>
> Best Regards,
> Md
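A note on why the subset() call in the follow-up fails, plus a possible alternative to the cumsum grouping. No month is both >= 7 and <= 3, so that subset is empty and aggregate() has nothing to group. The sketch below is not the code from the thread: it runs on made-up data in the same Year/Month/Value layout, and the names `df_demo`, `keep`, `Season` and `season_means` are hypothetical. It labels each July-to-March window by the year in which it starts, then checks that this gives the same groups as the cumsum vector M.

```r
# Made-up monthly data, sorted by Year then Month, with default row names.
set.seed(1)
df_demo <- expand.grid(Month = 1:12, Year = 1975:1978)
df_demo$Value <- sample(60:100, nrow(df_demo), replace = TRUE)

# No month satisfies both conditions at once, so nothing is selected.
nrow(subset(df_demo, Month >= 7 & Month <= 3))

# Use OR instead: keep July-December and January-March.
keep <- subset(df_demo, Month >= 7 | Month <= 3)

# Hypothetical helper column: January-March belong to the window that
# started in July of the previous year.
keep$Season <- ifelse(keep$Month >= 7, keep$Year, keep$Year - 1)

# Mean per July-to-March window, labelled by its starting year.
season_means <- aggregate(Value ~ Season, data = keep, FUN = mean)
season_means

# The cumsum grouping from the thread, for comparison. It relies on the rows
# being in time order with the original row numbers preserved by subset().
M <- cumsum(c(FALSE, diff(as.integer(row.names(keep))) > 1))
all(M + min(keep$Season) == keep$Season)
```

With this labelling, the July 1975 to March 1976 window is recorded under 1975, as asked in the follow-up. The first and last windows are incomplete here (January to March only, and July to December only), so they may need to be dropped before use.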
Dear Rui,

Thank you for your nice help.

Take care and be safe.

Md

-- 
Best Regards,
Md. Moyazzem Hossain
Associate Professor
Department of Statistics
Jahangirnagar University
Savar, Dhaka-1342
Bangladesh
Website: http://www.juniv.edu/teachers/hossainmm
Research:
Google Scholar <https://scholar.google.com/citations?user=-U03XCgAAAAJ&hl=en&oi=ao>
ResearchGate <https://www.researchgate.net/profile/Md_Hossain107>
ORCID iD <https://orcid.org/0000-0003-3593-6936>