Hi, I have a irregularly spaced time series dataset, which reads in from a .csv. I need to convert this to a regularly spaced time series by filling in missing rows of data with NAs. So my data, called NtuMot, looks like this (I've removed some of the additional rows for simplicity).... ELEID date_time height slope 1 2009-06-24 00:00:00 150 4.0 1 2009-06-24 01:00:00 175 4.0 1 2009-06-24 02:00:00 180 2.3 1 2009-06-24 03:00:00 200 1.0 1 2009-06-24 06:00:00 201 1.0 1 2009-06-24 07:00:00 202 0.0 1 2009-06-24 08:00:00 202 0.0 1 2009-06-24 09:00:00 202 0.0 1 2009-06-24 10:00:00 202 0.0 I need to end up with this: ELEID date_time height slope 1 2009-06-24 00:00:00 150 4.0 1 2009-06-24 01:00:00 175 4.0 1 2009-06-24 02:00:00 180 2.3 1 2009-06-24 03:00:00 200 1.0 1 2009-06-24 04:00:00 NA NA 1 2009-06-24 05:00:00 NA NA 1 2009-06-24 06:00:00 201 1.0 1 2009-06-24 07:00:00 202 0.0 1 2009-06-24 08:00:00 202 0.0 1 2009-06-24 09:00:00 202 0.0 1 2009-06-24 10:00:00 202 0.0 Any ideas much appreciated! Thanks, Louise -- View this message in context: http://r.789695.n4.nabble.com/Fill-in-missing-times-in-a-timeseries-with-NA-tp3015391p3015391.html Sent from the R help mailing list archive at Nabble.com.
Gabor Grothendieck
2010-Oct-27 14:57 UTC
[R] Fill in missing times in a timeseries with NA
On Wed, Oct 27, 2010 at 8:57 AM, lglew <l.glew at soton.ac.uk> wrote:> > Hi, > > I have a irregularly spaced time series dataset, which reads in from a .csv. > I need to convert this to a regularly spaced time series by filling in > missing rows of data with NAs. > > So my data, called NtuMot, looks like this (I've removed some of the > additional rows for simplicity).... > ELEID ? ?date_time ? ? ? ? ? ? ? ? ? ? ? height ? ? ? ? ? ? ? ? ?slope > 1 ? ? ? ?2009-06-24 00:00:00 ? ? ? ? ?150 ? ? ? ? ? ? ? ? ? ? ?4.0 > 1 ? ? ? ?2009-06-24 01:00:00 ? ? ? ? ?175 ? ? ? ? ? ? ? ? ? ? ?4.0 > 1 ? ? ? ?2009-06-24 02:00:00 ? ? ? ? ?180 ? ? ? ? ? ? ? ? ? ? ?2.3 > 1 ? ? ? ?2009-06-24 03:00:00 ? ? ? ? ?200 ? ? ? ? ? ? ? ? ? ? ?1.0 > 1 ? ? ? ?2009-06-24 06:00:00 ? ? ? ? ?201 ? ? ? ? ? ? ? ? ? ? ?1.0 > 1 ? ? ? ?2009-06-24 07:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > 1 ? ? ? ?2009-06-24 08:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > 1 ? ? ? ?2009-06-24 09:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > 1 ? ? ? ?2009-06-24 10:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > > > I need to end up with this: > ELEID ? ?date_time ? ? ? ? ? ? ? ? ? ? ? height ? ? ? ? ? ? ? ? ?slope > > 1 ? ? ? ?2009-06-24 00:00:00 ? ? ? ? 150 ? ? ? ? ? ? ? ? ? ? ?4.0 > 1 ? ? ? ?2009-06-24 01:00:00 ? ? ? ? 175 ? ? ? ? ? ? ? ? ? ? ?4.0 > 1 ? ? ? ?2009-06-24 02:00:00 ? ? ? ? 180 ? ? ? ? ? ? ? ? ? ? ?2.3 > 1 ? ? ? ?2009-06-24 03:00:00 ? ? ? ? 200 ? ? ? ? ? ? ? ? ? ? ?1.0 > 1 ? ? ? ?2009-06-24 04:00:00 ? ? ? ? ?NA ? ? ? ? ? ? ? ? ? ? ? NA > 1 ? ? ? ?2009-06-24 05:00:00 ? ? ? ? ?NA ? ? ? ? ? ? ? ? ? ? ? NA > 1 ? ? ? ?2009-06-24 06:00:00 ? ? ? ? ?201 ? ? ? ? ? ? ? ? ? ? ?1.0 > 1 ? ? ? ?2009-06-24 07:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > 1 ? ? ? ?2009-06-24 08:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > 1 ? ? ? ?2009-06-24 09:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > 1 ? ? ? ?2009-06-24 10:00:00 ? ? ? ? ?202 ? ? ? ? ? ? ? ? ? ? ?0.0 > > Any ideas much appreciated! >This will do it producing a new data frame: grid.df <- data.frame(date_time = seq(DF[1, 2], DF[nrow(DF), 2], by = "hour")) merge(DF, grid.df) however, if you are dealing with irregular series you might find it more convenient to use the zoo package: library(zoo) z <- zoo(DF[-2], DF[, 2]) g <- seq(start(z), end(z), by = "hour") m <- merge(z, zoo(, g)) m In the above we used this for data frame DF: DF <- structure(list(ELEID = c(1, 1, 1, 1, 1, 1, 1, 1, 1), date_time = structure(c(1245816000, 1245819600, 1245823200, 1245826800, 1245837600, 1245841200, 1245844800, 1245848400, 1245852000), class = c("POSIXt", "POSIXct"), tzone = ""), height = c(150, 175, 180, 200, 201, 202, 202, 202, 202), slope = c(4, 4, 2.3, 1, 1, 0, 0, 0, 0)), .Names = c("ELEID", "date_time", "height", "slope"), row.names = c("2009-06-24 00:00:00", "2009-06-24 01:00:00", "2009-06-24 02:00:00", "2009-06-24 03:00:00", "2009-06-24 06:00:00", "2009-06-24 07:00:00", "2009-06-24 08:00:00", "2009-06-24 09:00:00", "2009-06-24 10:00:00"), class = "data.frame") -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com