Dear list users,

I have three years of wind data recorded every 10 minutes. For simplicity, let me use only the maximum wind speed.
I need to reduce the frequency to 30 minutes: the value at minute 00 should be the mean of the data at minutes 40, 50 and 00, and the value at minute 30 should be the mean of the data at minutes 10, 20 and 30 of each hour.

The simple code below works, but the column "interval" groups the data forward, not backward:

    init_day <- as.POSIXct("2018-02-01-00-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
    fin_day <- as.POSIXct("2018-02-01-02-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
    mydf <- data.frame(data_POSIX=seq(init_day, fin_day, by="10 mins"))
    mydf$vmax <- round(rnorm(13, 35, 10))
    mydf$interval <- cut(mydf$data_POSIX, breaks="30 min")
    means <- aggregate(vmax ~ interval, mydf, mean)

               data_POSIX vmax            interval
    1 2018-02-01 00:00:00   27 2018-02-01 00:00:00
    2 2018-02-01 00:10:00   41 2018-02-01 00:00:00
    3 2018-02-01 00:20:00   46 2018-02-01 00:00:00
    4 2018-02-01 00:30:00   39 2018-02-01 00:30:00
    5 2018-02-01 00:40:00   34 2018-02-01 00:30:00
    6 2018-02-01 00:50:00   32 2018-02-01 00:30:00
    ...

What I need to work with instead is:

               data_POSIX vmax            interval
    1 2018-02-01 00:00:00   27 2018-02-01 00:00:00
    2 2018-02-01 00:10:00   41 2018-02-01 00:30:00
    3 2018-02-01 00:20:00   46 2018-02-01 00:30:00
    4 2018-02-01 00:30:00   39 2018-02-01 00:30:00
    5 2018-02-01 00:40:00   34 2018-02-01 00:00:00
    6 2018-02-01 00:50:00   32 2018-02-01 00:00:00
    ...

Is there a way to modify this code to group the data correctly? (I would prefer using only the base package.)

Thank you for your help
Stefano

Stefano Sofia PhD
Civil Protection - Marche Region
Meteo Section
Snow Section
Via del Colle Ameno 5
60126 Torrette di Ancona, Ancona
Uff: 071 806 7743
E-mail: stefano.sofia at regione.marche.it
Hi Stefano,

I read in your date-time as two separate fields for convenience. You can split your single field at the space to get the same result.

    ssdf<-read.table(text="date_POSIX time_POSIX vmax
    2018-02-01 00:00:00 27
    2018-02-01 00:10:00 41
    2018-02-01 00:20:00 46
    2018-02-01 00:30:00 39
    2018-02-01 00:40:00 34
    2018-02-01 00:50:00 32",
    header=TRUE,stringsAsFactors=FALSE)
    # get the time of day as seconds from the time field
    ssdf$seconds<-as.numeric(strptime(ssdf$time_POSIX,"%H:%M:%S"))
    # subtract whatever current date strptime guesses for the date
    ssdf$seconds<-ssdf$seconds-min(ssdf$seconds)
    # create an AM/PM variable ("am" = minutes 10, 20, 30; "pm" = minutes 40, 50, 00)
    ssdf$ampm<-ifelse(ssdf$seconds > 0 & ssdf$seconds <= 1800,"am","pm")
    means<-aggregate(vmax~ampm,ssdf,mean)

Jim

On Thu, Dec 3, 2020 at 4:55 AM Stefano Sofia <stefano.sofia at regione.marche.it> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
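A minimal sketch of the same half-hour classification applied directly to a POSIXct column, assuming a data frame with columns data_POSIX and vmax (the column name `half` is made up for illustration). It reads the minute of the hour through as.POSIXlt() rather than relying on the date that strptime() fills in. As with the six-row example above, the result pools all hours together, so a longer series would also need the hour, or the full window end time, as a grouping key.

```r
# Six 10-minute observations, same values as the example above
ssdf <- data.frame(
  data_POSIX = as.POSIXct("2018-02-01 00:00", tz = "Etc/GMT-1") + 600 * 0:5,
  vmax = c(27, 41, 46, 39, 34, 32)
)

# Minute of the hour (0..59) for each observation
minute <- as.POSIXlt(ssdf$data_POSIX)$min

# Minutes 10, 20, 30 feed the ":30" mean; minutes 40, 50, 00 feed the ":00" mean
ssdf$half <- ifelse(minute > 0 & minute <= 30, "30", "00")

means <- aggregate(vmax ~ half, data = ssdf, FUN = mean)
print(means)
```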
Hi Stefano

I think either of these does what you need...

1: This gets the interval column as you want it, but uses the lubridate package:

    library(lubridate)
    mydf$interval = ceiling_date(mydf$data_POSIX, unit="30 minutes")

2: The alternative in base R is a bit more long-winded: convert the date to numeric (in seconds), divide by 1800 (seconds in 30 min), take the ceiling, multiply by 1800 again, and convert back.

    mydf$interval = as.POSIXct(ceiling(as.numeric(mydf$data_POSIX)/1800)*1800, origin="1970-01-01", tz="Etc/GMT-1")

Cheers

> On 2 Dec 2020, at 17:53, Stefano Sofia <stefano.sofia at regione.marche.it> wrote:
> [...]
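The base R option can be checked end to end with a small sketch. It assumes the data frame from the original post, with vmax replaced by 1..13 so the grouping is easy to read off. Each observation is labelled with the end of its 30-minute window, so the value at 00:00 forms a group of its own at the start of the series.

```r
# Rebuild the example series: 13 observations from 00:00 to 02:00
init_day <- as.POSIXct("2018-02-01 00:00", tz = "Etc/GMT-1")
fin_day  <- as.POSIXct("2018-02-01 02:00", tz = "Etc/GMT-1")
mydf <- data.frame(data_POSIX = seq(init_day, fin_day, by = "10 mins"))
mydf$vmax <- seq_len(nrow(mydf))   # 1..13 instead of rnorm(), easier to check

# Round each time stamp UP to the next multiple of 1800 s (30 minutes)
secs <- as.numeric(mydf$data_POSIX)
mydf$interval <- as.POSIXct(ceiling(secs / 1800) * 1800,
                            origin = "1970-01-01", tz = "Etc/GMT-1")

# Mean per backward-looking 30-minute window
means <- aggregate(vmax ~ interval, data = mydf, FUN = mean)
print(means)
```

With this input the windows ending at 00:30, 01:00, 01:30 and 02:00 each contain three values. Rounding on epoch seconds lines up with the clock here because the time zone offset is a whole number of hours.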
To be honest, I would do this one of two ways.

(1) Use decimate() from the signal package (see ?decimate), decimating by a factor of three.
(2) Convert the variable to an (n/3) x 3 matrix using matrix(), then use rowMeans() or apply().

On Thu, 3 Dec 2020 at 06:55, Stefano Sofia <stefano.sofia at regione.marche.it> wrote:
> [...]
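Option (2) might look like the sketch below. It assumes a perfectly regular 10-minute series with no gaps, starting at minute 00, so the first value is dropped to make each row of the matrix hold minutes 10, 20, 30 or 40, 50, 00. The variable names are illustrative only.

```r
# 13 observations at 00:00, 00:10, ..., 02:00 (values chosen so the means are easy to check)
vmax <- 1:13

# Drop the 00:00 value so the first window starts at 00:10
x <- vmax[-1]

# The reshape is only valid when every window is complete
stopifnot(length(x) %% 3 == 0)

# One row per 30-minute window, filled row by row
m <- matrix(x, ncol = 3, byrow = TRUE)

# Mean of each window: rows are (2,3,4), (5,6,7), (8,9,10), (11,12,13)
half_hour_means <- rowMeans(m)
print(half_hour_means)
```

Any missing record shifts every later row, so with real data a time-based grouping is probably the safer choice.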
Instead of using breaks="30 mins", construct the breaks explicitly with seq() so you can control the start point. E.g.,

    > init_day <- as.POSIXct("2018-02-01-00-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
    > fin_day <- as.POSIXct("2018-02-01-02-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
    > mydf <- data.frame(data_POSIX=seq(init_day, fin_day, by="10 mins"))
    > mydf$vmax <- seq_len(nrow(mydf)) # instead of rnorm so we can check the result more easily
    > # the following line is not very general
    > breaks <- seq(init_day-as.difftime(20,units="mins"), fin_day, by=as.difftime(30,units="mins"))
    > mydf$interval <- cut(mydf$data_POSIX, breaks=breaks)
    > aggregate(vmax ~ interval, mydf, FUN=function(x)paste(x,collapse=",")) # paste() so we can check results
                 interval   vmax
    1 2018-01-31 23:40:00      1
    2 2018-02-01 00:10:00  2,3,4
    3 2018-02-01 00:40:00  5,6,7
    4 2018-02-01 01:10:00 8,9,10

On Wed, Dec 2, 2020 at 9:55 AM Stefano Sofia <stefano.sofia at regione.marche.it> wrote:
> [...]
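One possible variation on the explicit-breaks idea, offered as a sketch rather than as the method used above: put the breaks on the half hours, close the intervals on the right with right=TRUE, and label each interval by its end time. In the transcript above the last break is 01:40, so the values at 01:40, 01:50 and 02:00 fall outside the breaks and drop out of the result; here the breaks run through the last observation. The label format is an arbitrary choice.

```r
# Same example series as above
init_day <- as.POSIXct("2018-02-01 00:00", tz = "Etc/GMT-1")
fin_day  <- as.POSIXct("2018-02-01 02:00", tz = "Etc/GMT-1")
mydf <- data.frame(data_POSIX = seq(init_day, fin_day, by = "10 mins"))
mydf$vmax <- seq_len(nrow(mydf))

# Breaks every 30 minutes, starting one window before the first observation
breaks <- seq(init_day - 1800, fin_day, by = 1800)

# right = TRUE makes each interval (start, end]; label it by its end time
mydf$interval <- cut(mydf$data_POSIX, breaks = breaks, right = TRUE,
                     labels = format(breaks[-1], "%Y-%m-%d %H:%M"))

means <- aggregate(vmax ~ interval, data = mydf, FUN = mean)
print(means)
```

This assumes init_day itself lies on a half-hour boundary; otherwise the first break would need rounding to one.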