[The data setting in the last email might be faulty]

Dear useRs,

I have the following dataset, which represents rainfall data at a 5-minute interval from 1 May 2021 to 30 September 2021.

> dput(YY)
structure(list(CHANNEL = c(30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L), YEAR = c(2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L), TIMESTAMP = c("2021/05/02 10:00:00 PM",
"2021/05/02 10:55:00 PM", "2021/05/04 05:40:00 PM", "2021/05/04 06:50:00 PM",
"2021/05/05 03:05:00 AM", "2021/05/08 05:15:00 AM", "2021/05/08 05:20:00 AM",
"2021/05/08 05:30:00 AM", "2021/05/08 05:50:00 AM", "2021/05/08 06:05:00 AM",
"2021/05/08 07:15:00 AM", "2021/05/08 08:00:00 AM", "2021/05/08 08:05:00 AM",
"2021/05/08 08:15:00 AM", "2021/05/08 08:35:00 AM", "2021/05/08 08:50:00 AM",
"2021/05/08 09:05:00 AM", "2021/05/08 09:30:00 AM", "2021/05/08 09:45:00 AM",
"2021/05/08 09:55:00 AM", "2021/05/08 10:10:00 AM", "2021/05/08 10:20:00 AM",
"2021/05/08 10:40:00 AM", "2021/05/08 10:55:00 AM", "2021/05/08 11:15:00 AM",
"2021/05/08 11:25:00 AM", "2021/05/08 11:35:00 AM", "2021/05/08 11:45:00 AM",
"2021/05/08 11:50:00 AM", "2021/05/08 12:00:00 PM", "2021/05/08 12:05:00 PM",
"2021/05/08 12:15:00 PM", "2021/05/08 12:20:00 PM", "2021/05/08 12:30:00 PM",
"2021/05/08 12:35:00 PM", "2021/05/08 12:50:00 PM", "2021/05/08 01:35:00 PM",
"2021/05/08 01:50:00 PM", "2021/05/08 02:20:00 PM", "2021/05/08 02:30:00 PM",
"2021/05/08 02:35:00 PM", "2021/05/08 03:00:00 PM", "2021/05/08 03:35:00 PM",
"2021/05/08 03:45:00 PM", "2021/05/08 04:30:00 PM", "2021/05/08 04:40:00 PM",
"2021/05/08 04:55:00 PM", "2021/05/08 05:05:00 PM", "2021/05/08 05:20:00 PM",
"2021/05/08 07:25:00 PM", "2021/05/08 09:00:00 PM", "2021/05/08 09:25:00 PM",
"2021/05/08 09:50:00 PM", "2021/05/08 10:15:00 PM", "2021/05/08 10:40:00 PM",
"2021/05/08 11:35:00 PM", "2021/05/09 12:40:00 AM", "2021/05/09 01:10:00 AM",
"2021/05/09 02:10:00 AM", "2021/05/09 06:00:00 AM", "2021/05/09 02:40:00 PM",
"2021/05/09 02:45:00 PM", "2021/05/09 02:50:00 PM", "2021/05/09 02:55:00 PM",
"2021/05/09 03:00:00 PM", "2021/05/09 03:05:00 PM", "2021/05/09 03:10:00 PM",
"2021/05/09 03:15:00 PM", "2021/05/09 03:20:00 PM", "2021/05/09 03:25:00 PM",
"2021/05/09 03:30:00 PM", "2021/05/09 03:35:00 PM", "2021/05/09 03:40:00 PM",
"2021/05/09 03:45:00 PM", "2021/05/09 03:50:00 PM", "2021/05/09 03:55:00 PM",
"2021/05/09 04:00:00 PM", "2021/05/09 04:05:00 PM", "2021/05/09 04:10:00 PM",
"2021/05/09 04:15:00 PM", "2021/05/09 04:25:00 PM", "2021/05/09 04:30:00 PM",
"2021/05/09 04:35:00 PM", "2021/05/09 04:40:00 PM", "2021/05/09 04:45:00 PM",
"2021/05/09 04:50:00 PM", "2021/05/09 05:00:00 PM", "2021/05/09 05:05:00 PM",
"2021/05/09 05:10:00 PM", "2021/05/09 05:20:00 PM", "2021/05/09 05:25:00 PM",
"2021/05/09 05:35:00 PM", "2021/05/09 05:45:00 PM", "2021/05/09 05:50:00 PM",
"2021/05/09 06:00:00 PM", "2021/05/09 06:10:00 PM", "2021/05/09 06:20:00 PM",
"2021/05/09 06:30:00 PM", "2021/05/09 06:40:00 PM", "2021/05/09 06:50:00 PM"
), RAINFALL = c(0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2
)), row.names = c(276L, 286L, 599L, 773L, 829L, 951L, 955L, 971L,
996L, 1014L, 1123L, 1242L, 1260L, 1301L, 1378L, 1422L, 1456L,
1487L, 1504L, 1515L, 1539L, 1557L, 1597L, 1629L, 1679L, 1708L,
1728L, 1757L, 1775L, 1803L, 1818L, 1846L, 1859L, 1882L, 1892L,
1917L, 1983L, 2007L, 2050L, 2066L, 2077L, 2124L, 2190L, 2207L,
2288L, 2309L, 2334L, 2351L, 2374L, 2518L, 2588L, 2600L, 2616L,
2627L, 2639L, 2655L, 2674L, 2684L, 2725L, 2967L, 3826L, 3830L,
3832L, 3838L, 3842L, 3845L, 3846L, 3851L, 3854L, 3856L, 3861L,
3865L, 3868L, 3871L, 3873L, 3877L, 3880L, 3881L, 3885L, 3888L,
3890L, 3893L, 3897L, 3899L, 3900L, 3902L, 3906L, 3907L, 3910L,
3914L, 3915L, 3917L, 3920L, 3922L, 3923L, 3926L, 3928L, 3931L,
3932L, 3933L), class = "data.frame")

You can clearly see that some intervals are missing from this dataset. For example, the data values for the 1st of May are missing. Similarly, between

30 2021 2021/05/02 10:00:00 PM 0.2

and

30 2021 2021/05/02 10:55:00 PM 0.2

the values of rainfall depth for the following time stamps are missing because they were zero:

30 2021 2021/05/02 10:05:00 PM 0.0
30 2021 2021/05/02 10:10:00 PM 0.0
30 2021 2021/05/02 10:15:00 PM 0.0
30 2021 2021/05/02 10:20:00 PM 0.0
30 2021 2021/05/02 10:25:00 PM 0.0
30 2021 2021/05/02 10:30:00 PM 0.0
30 2021 2021/05/02 10:35:00 PM 0.0
30 2021 2021/05/02 10:40:00 PM 0.0
30 2021 2021/05/02 10:45:00 PM 0.0
30 2021 2021/05/02 10:50:00 PM 0.0

So, what I want is a uniform list running from 2021/05/01 to 2021/09/30 at 5-minute intervals, with zero values for the intervals that are missing from the original data. I hope my question is clear.

Thank you very much in advance,

Eliza
There are quite a variety of approaches implemented in various contributed packages, but here is a base R approach based on merge:

Sys.setenv( TZ = "UTC" ) # or other non-DST zone unless you need it
YY$TIMESTAMP <- as.POSIXct( YY$TIMESTAMP, format = "%Y/%m/%d %I:%M:%S %p" )
tlims <- as.POSIXct( c( "2021-05-02", "2021-05-10" ) )
tdiff <- as.difftime( 5, units="mins" )
aa <- seq( tlims[1], tlims[2], by = tdiff )
AA <- expand.grid( CHANNEL = 30, TIMESTAMP = aa )
yy <- merge( AA, YY[ , c( "CHANNEL", "TIMESTAMP", "RAINFALL" ) ], by = c( "CHANNEL", "TIMESTAMP" ), all.x = TRUE )
yy$RAINFALL[ is.na( yy$RAINFALL ) ] <- 0
yy

On February 28, 2022 7:52:47 PM PST, Eliza Botto <eliza_botto at outlook.com> wrote:
>So, what I want is a uniform list starting from 2021/05/01 to 2021/09/30 at every 5-minute intervals with "zero" values for the missing intervals in the original data list. I hope my question is clear.

--
Sent from my phone. Please excuse my brevity.
Hi Eliza,
It sure was:

# parse as POSIXct in GMT so the values match ISOdate(), which returns GMT
YY$datetime<-as.POSIXct(YY$TIMESTAMP,format="%Y/%m/%d %I:%M:%S %p",tz="GMT")
# 5-minute grid for May only; extend the end date for the full five months
dt5min<-seq(ISOdate(2021,5,1,0,5),ISOdate(2021,5,31,12,55),by="5 min")
newdt<-data.frame(datetime=dt5min)
newyy<-merge(newdt,YY,by="datetime",all=TRUE)
# grid rows with no observation come back as NA; set them to zero
newyy$RAINFALL[is.na(newyy$RAINFALL)]<-0
plot(newyy$datetime,newyy$RAINFALL)

Jim

On Tue, Mar 1, 2022 at 2:57 PM Eliza Botto <eliza_botto at outlook.com> wrote:
> So, what I want is a uniform list starting from 2021/05/01 to 2021/09/30 at every 5-minute intervals with "zero" values for the missing intervals in the original data list. I hope my question is clear.
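The merge approach above can be stretched to the whole period that was asked for. What follows is only a minimal sketch under stated assumptions: the three-row YY is a hypothetical excerpt standing in for the real data frame, the grid is assumed to run from 1 May 00:00 through 30 September 23:55, and the AM/PM parsing with %p assumes an English-style locale.

```r
## Hypothetical three-row excerpt standing in for the full YY data frame.
YY <- data.frame(
  TIMESTAMP = c("2021/05/02 10:00:00 PM", "2021/05/02 10:55:00 PM",
                "2021/05/04 05:40:00 PM"),
  RAINFALL  = c(0.2, 0.2, 0.4)
)

## %I together with %p reads the 12-hour clock; GMT matches ISOdate() below.
YY$datetime <- as.POSIXct(YY$TIMESTAMP, format = "%Y/%m/%d %I:%M:%S %p",
                          tz = "GMT")

## Assumed grid: every 5-minute stamp from 1 May 00:00 to 30 September 23:55.
dt5min <- seq(ISOdate(2021, 5, 1, 0, 0), ISOdate(2021, 9, 30, 23, 55),
              by = "5 min")

## all.x = TRUE keeps every grid row; stamps without an observation get NA.
newyy <- merge(data.frame(datetime = dt5min),
               YY[, c("datetime", "RAINFALL")],
               by = "datetime", all.x = TRUE)
newyy$RAINFALL[is.na(newyy$RAINFALL)] <- 0
```

With 153 days at 288 intervals per day, the result should have 44064 rows. Using all.x = TRUE rather than all = TRUE means an observation that falls off the 5-minute grid is dropped instead of adding an irregular row, which may or may not be what is wanted.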
Richard M. Heiberger
2022-Mar-01 05:18 UTC
[R] [External] setting zeros for missing interval in data
I believe this is what you are looking for. The idea is to set up the full range of interest and then subset into it with the values that you have observed. I illustrate here with three days; you will need the full five months.

## %I with %p reads the 12-hour clock, so "10:00:00 PM" becomes 22:00
Start <- as.POSIXct("2021/05/02 10:00:00 PM", format="%Y/%m/%d %I:%M:%S %p", tz="GMT")
## 288 5-minute intervals per day, 3 days, 5*60=300 seconds per 5 minutes
CompleteTimeSet <- Start + (0:(288*3))*5*60

## YY is the data frame given by dput(YY) in the original post

CompleteData <- data.frame(TIMESTAMP=CompleteTimeSet, RAINFALL=0)
YYTIME <- as.POSIXct(YY$TIMESTAMP, format="%Y/%m/%d %I:%M:%S %p", tz="GMT")
CompleteData[CompleteData$TIMESTAMP %in% YYTIME[1:4], "RAINFALL"] <- YY$RAINFALL[1:4]
CompleteData

## output
> CompleteData[1:14,]
             TIMESTAMP RAINFALL
1  2021-05-02 22:00:00      0.2
2  2021-05-02 22:05:00      0.0
3  2021-05-02 22:10:00      0.0
4  2021-05-02 22:15:00      0.0
5  2021-05-02 22:20:00      0.0
6  2021-05-02 22:25:00      0.0
7  2021-05-02 22:30:00      0.0
8  2021-05-02 22:35:00      0.0
9  2021-05-02 22:40:00      0.0
10 2021-05-02 22:45:00      0.0
11 2021-05-02 22:50:00      0.0
12 2021-05-02 22:55:00      0.2
13 2021-05-02 23:00:00      0.0
14 2021-05-02 23:05:00      0.0

> On Feb 28, 2022, at 22:52, Eliza Botto <eliza_botto at outlook.com> wrote:
>
> [dput(YY) output snipped]
I wish the question below had been explained better. It is nice that an example was given, albeit a misleading one for me. The data shown is not flawed and contains nothing that marks a value as missing, which is what the question first sounded like. What apparently is missing is entire rows for specific dates and times.

The column called CHANNEL seems irrelevant as it is always 30. RAINFALL is always 0.2 or 0.4. The YEAR is always 2021. So the ONLY interesting thing here seems to be TIMESTAMP. But I am NOT convinced they are missing because the times are all over the place: 10 PM, 5:40 PM, 5:20 AM and so on. There are multiple rows for the same day. Yes, there is no info for May 1 and May 6 and 7. I have no idea why, but how and why are we supposed to guess that it means no rain rather than some other reason?

Towards the end, what I think is the real message is shown. The suggestion is that there should be data for every five-minute period interpolated here. Fair enough.

Can I point out that the data offered to us has the TIMESTAMP field as character, rather than some form of date/time that R can work with directly? Converting it, or extracting some info into temporary columns, might be useful here.

You could then create some kind of data that loops over times starting with your start time, say midnight on the 1st, and for every 5-minute interval makes a timestamp that looks like what you need, and COMPARE it to what is in the data shown. For any that are not present, you can create a similar row with a zero in the RAINFALL field. There are oodles of ways to do that, some more straightforward than others. Or, you may just make the sequence of all of them and later, in some kind of merge, only keep the ones from the original data if there is a duplicate. Again, many ways, even in base R.
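As a rough illustration of the conversion step, here is a minimal sketch in base R. The three-row data frame YY_demo and the new column name WHEN are made up for the example, and tz = "UTC" is an assumption chosen only to avoid daylight-saving surprises; the real data may need a different time zone. Note also that %p (the AM/PM marker) depends on the locale, so this may need adjusting on a non-English setup.

```r
# A three-row stand-in for the posted data (hypothetical; the real YY has 100 rows).
YY_demo <- data.frame(
  CHANNEL   = 30L,
  YEAR      = 2021L,
  TIMESTAMP = c("2021/05/02 10:00:00 PM",
                "2021/05/02 10:55:00 PM",
                "2021/05/04 05:40:00 PM"),
  RAINFALL  = c(0.2, 0.2, 0.4)
)

# %I is the hour on a 12-hour clock and %p is the AM/PM marker.
# tz = "UTC" is an assumption, not something stated in the original post.
YY_demo$WHEN <- as.POSIXct(YY_demo$TIMESTAMP,
                           format = "%Y/%m/%d %I:%M:%S %p",
                           tz = "UTC")

# WHEN is now a real date-time column that can be sorted, compared and merged on.
str(YY_demo$WHEN)
```

Any timestamp that fails to parse comes back as NA, so checking anyNA(YY_demo$WHEN) after the conversion is a cheap sanity test.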
If my analysis is right, and clearly it may not be, a much better way to ask this question might be to say you have timestamped data about rainfall where the readings for every 5 minute interval with no rainfall have been omitted. How do you create records for all 5-minute intervals that are not present and merge that info with the records shown?

As a hint, you can make a sequence like below, with your own adjustments for starting and ending dates.

> seq(from=as.POSIXct("2020-05-01 00:00"),to=as.POSIXct("2020-05-01 01:00"),by="5 min")
 [1] "2020-05-01 00:00:00 EDT" "2020-05-01 00:05:00 EDT" "2020-05-01 00:10:00 EDT"
 [4] "2020-05-01 00:15:00 EDT" "2020-05-01 00:20:00 EDT" "2020-05-01 00:25:00 EDT"
 [7] "2020-05-01 00:30:00 EDT" "2020-05-01 00:35:00 EDT" "2020-05-01 00:40:00 EDT"
[10] "2020-05-01 00:45:00 EDT" "2020-05-01 00:50:00 EDT" "2020-05-01 00:55:00 EDT"
[13] "2020-05-01 01:00:00 EDT"

Of course you may want to know WHY you need the missing data interpolated. Some graphics programs, if properly supplied with actual dates, not character strings, may simply skip missing records and leave room between others. The missing ones might be treated as zero, depending what you are doing.
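To make the merge idea concrete, here is one possible sketch in base R, and only one of the many ways to do it. It assumes the timestamps have already been converted to POSIXct; the names obs, grid, filled and WHEN are invented for the example, the time zone is assumed to be UTC, and the one-hour window is just small enough to check by eye.

```r
# Two observed readings, already as POSIXct (hypothetical stand-in for the real data).
obs <- data.frame(
  WHEN     = as.POSIXct(c("2021-05-02 22:00:00", "2021-05-02 22:55:00"), tz = "UTC"),
  RAINFALL = c(0.2, 0.2)
)

# Every 5-minute slot in the window of interest.
grid <- data.frame(
  WHEN = seq(from = as.POSIXct("2021-05-02 22:00:00", tz = "UTC"),
             to   = as.POSIXct("2021-05-02 23:00:00", tz = "UTC"),
             by   = "5 min")
)

# Left join: keep every slot in the grid, pull in RAINFALL where a reading exists.
filled <- merge(grid, obs, by = "WHEN", all.x = TRUE)

# Slots with no reading come back as NA; treat them as zero rainfall.
filled$RAINFALL[is.na(filled$RAINFALL)] <- 0

# The constant columns can simply be re-created.
filled$CHANNEL <- 30L
filled$YEAR    <- 2021L

filled
```

For the whole period, the grid would run from 2021-05-01 00:00:00 to 2021-09-30 23:55:00. With no daylight-saving shifts (as under the UTC assumption) that is 153 days of 288 slots each, so 44064 rows, which is a handy number to check nrow(filled) against.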
-----Original Message----- From: Eliza Botto <eliza_botto at outlook.com> To: R-help at r-project.org <R-help at r-project.org> Sent: Mon, Feb 28, 2022 10:52 pm Subject: [R] setting zeros for missing interval in data [The data setting in the last email might be faulty] Dear useRs, I have the following dataset which represents rainfall data at a 5-minute interval from 1 May 2021 to 30 September 2021.> dput(YY)structure(list(CHANNEL = c(30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L), YEAR = c(2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L), TIMESTAMP = c("2021/05/02 10:00:00 PM", "2021/05/02 10:55:00 PM", "2021/05/04 05:40:00 PM", "2021/05/04 06:50:00 PM", "2021/05/05 03:05:00 AM", "2021/05/08 05:15:00 AM", "2021/05/08 05:20:00 AM", "2021/05/08 05:30:00 AM", "2021/05/08 05:50:00 AM", "2021/05/08 06:05:00 AM", "2021/05/08 07:15:00 AM", "2021/05/08 08:00:00 AM", "2021/05/08 08:05:00 
AM", "2021/05/08 08:15:00 AM", "2021/05/08 08:35:00 AM", "2021/05/08 08:50:00 AM", "2021/05/08 09:05:00 AM", "2021/05/08 09:30:00 AM", "2021/05/08 09:45:00 AM", "2021/05/08 09:55:00 AM", "2021/05/08 10:10:00 AM", "2021/05/08 10:20:00 AM", "2021/05/08 10:40:00 AM", "2021/05/08 10:55:00 AM", "2021/05/08 11:15:00 AM", "2021/05/08 11:25:00 AM", "2021/05/08 11:35:00 AM", "2021/05/08 11:45:00 AM", "2021/05/08 11:50:00 AM", "2021/05/08 12:00:00 PM", "2021/05/08 12:05:00 PM", "2021/05/08 12:15:00 PM", "2021/05/08 12:20:00 PM", "2021/05/08 12:30:00 PM", "2021/05/08 12:35:00 PM", "2021/05/08 12:50:00 PM", "2021/05/08 01:35:00 PM", "2021/05/08 01:50:00 PM", "2021/05/08 02:20:00 PM", "2021/05/08 02:30:00 PM", "2021/05/08 02:35:00 PM", "2021/05/08 03:00:00 PM", "2021/05/08 03:35:00 PM", "2021/05/08 03:45:00 PM", "2021/05/08 04:30:00 PM", "2021/05/08 04:40:00 PM", "2021/05/08 04:55:00 PM", "2021/05/08 05:05:00 PM", "2021/05/08 05:20:00 PM", "2021/05/08 07:25:00 PM", "2021/05/08 09:00:00 PM", "2021/05/08 09:25:00 PM", "2021/05/08 09:50:00 PM", "2021/05/08 10:15:00 PM", "2021/05/08 10:40:00 PM", "2021/05/08 11:35:00 PM", "2021/05/09 12:40:00 AM", "2021/05/09 01:10:00 AM", "2021/05/09 02:10:00 AM", "2021/05/09 06:00:00 AM", "2021/05/09 02:40:00 PM", "2021/05/09 02:45:00 PM", "2021/05/09 02:50:00 PM", "2021/05/09 02:55:00 PM", "2021/05/09 03:00:00 PM", "2021/05/09 03:05:00 PM", "2021/05/09 03:10:00 PM", "2021/05/09 03:15:00 PM", "2021/05/09 03:20:00 PM", "2021/05/09 03:25:00 PM", "2021/05/09 03:30:00 PM", "2021/05/09 03:35:00 PM", "2021/05/09 03:40:00 PM", "2021/05/09 03:45:00 PM", "2021/05/09 03:50:00 PM", "2021/05/09 03:55:00 PM", "2021/05/09 04:00:00 PM", "2021/05/09 04:05:00 PM", "2021/05/09 04:10:00 PM", "2021/05/09 04:15:00 PM", "2021/05/09 04:25:00 PM", "2021/05/09 04:30:00 PM", "2021/05/09 04:35:00 PM", "2021/05/09 04:40:00 PM", "2021/05/09 04:45:00 PM", "2021/05/09 04:50:00 PM", "2021/05/09 05:00:00 PM", "2021/05/09 05:05:00 PM", "2021/05/09 05:10:00 PM", "2021/05/09 
05:20:00 PM", "2021/05/09 05:25:00 PM", "2021/05/09 05:35:00 PM", "2021/05/09 05:45:00 PM", "2021/05/09 05:50:00 PM", "2021/05/09 06:00:00 PM", "2021/05/09 06:10:00 PM", "2021/05/09 06:20:00 PM", "2021/05/09 06:30:00 PM", "2021/05/09 06:40:00 PM", "2021/05/09 06:50:00 PM" ), RAINFALL = c(0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2 )), row.names = c(276L, 286L, 599L, 773L, 829L, 951L, 955L, 971L, 996L, 1014L, 1123L, 1242L, 1260L, 1301L, 1378L, 1422L, 1456L, 1487L, 1504L, 1515L, 1539L, 1557L, 1597L, 1629L, 1679L, 1708L, 1728L, 1757L, 1775L, 1803L, 1818L, 1846L, 1859L, 1882L, 1892L, 1917L, 1983L, 2007L, 2050L, 2066L, 2077L, 2124L, 2190L, 2207L, 2288L, 2309L, 2334L, 2351L, 2374L, 2518L, 2588L, 2600L, 2616L, 2627L, 2639L, 2655L, 2674L, 2684L, 2725L, 2967L, 3826L, 3830L, 3832L, 3838L, 3842L, 3845L, 3846L, 3851L, 3854L, 3856L, 3861L, 3865L, 3868L, 3871L, 3873L, 3877L, 3880L, 3881L, 3885L, 3888L, 3890L, 3893L, 3897L, 3899L, 3900L, 3902L, 3906L, 3907L, 3910L, 3914L, 3915L, 3917L, 3920L, 3922L, 3923L, 3926L, 3928L, 3931L, 3932L, 3933L), class = "data.frame") You could clearly see that there are some intervals which are missing from this dataset. For example, the data values for 1st of May are missing. Similarly, between 30 2021 2021/05/02 10:00:00 PM? ? ? 0.2 and 30 2021 2021/05/02 10:55:00 PM? ? ? 0.2 the values of rainfall depth for following "time stamps" are missing because they were "zero" 30 2021 2021/05/02 10:05:00 PM? ? ? 0.0 30 2021 2021/05/02 10:10:00 PM? ? ? 0.0 30 2021 2021/05/02 10:15:00 PM? ? ? 
0.0 30 2021 2021/05/02 10:20:00 PM? ? ? 0.0 30 2021 2021/05/02 10:25:00 PM? ? ? 0.0 30 2021 2021/05/02 10:30:00 PM? ? ? 0.0 30 2021 2021/05/02 10:35:00 PM? ? ? 0.0 30 2021 2021/05/02 10:40:00 PM? ? ? 0.0 30 2021 2021/05/02 10:45:00 PM? ? ? 0.0 30 2021 2021/05/02 10:50:00 PM? ? ? 0.0 So, what I want is a uniform list starting from 2021/05/01 to 2021/09/30 at every 5-minute intervals with "zero" values for the missing intervals in the original data list. I hope my question is clear. Thank You very much in advance, Eliza [https://ipmcdn.avast.com/images/icons/icon-envelope-tick-green-avg-v1.png]<http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>? ? Virus-free. www.avg.com<http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> ??? [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.