I posted a similar question, but feel it needs a bit more elaboration. I have a data frame (read from a csv file) that may have missing rows. Each day has 7 time intervals associated with it, with a range from 17:00 hrs to 18:00 hrs in 10 minute bins. What I am looking for is a script that will run through the data frame and insert "NA"in the Volume column for any dates that are missing a time interval. For example: Date Time Camera Volume 57 2009-10-09 5:00:00 PM MANBRIN_RIVER_NB 210 58 2009-10-09 5:10:00 PM MANBRIN_RIVER_NB 207 59 2009-10-09 5:20:00 PM MANBRIN_RIVER_NB 250 60 2009-10-09 5:30:00 PM MANBRIN_RIVER_NB 193 61 2009-10-09 5:40:00 PM MANBRIN_RIVER_NB 205 62 2009-10-09 6:00:00 PM MANBRIN_RIVER_NB 185 Note that between row 61 and row 62, there is a missing time interval (5:50 PM). I want the data frame to look like this: Date Time Camera Volume 57 2009-10-09 5:00:00 PM MANBRIN_RIVER_NB 210 58 2009-10-09 5:10:00 PM MANBRIN_RIVER_NB 207 59 2009-10-09 5:20:00 PM MANBRIN_RIVER_NB 250 60 2009-10-09 5:30:00 PM MANBRIN_RIVER_NB 193 61 2009-10-09 5:40:00 PM MANBRIN_RIVER_NB 205 *62 2009-10-09 5:50:00 PM MANBRIN_RIVER_NB NA* 62 2009-10-09 6:00:00 PM MANBRIN_RIVER_NB 185 Thanks in advance, Kindra [[alternative HTML version deleted]]
On Mar 4, 2010, at 4:45 PM, Kindra Martinenko wrote:> I posted a similar question, but feel it needs a bit more elaboration. > > I have a data frame (read from a csv file) that may have missing > rows. Each > day has 7 time intervals associated with it, with a range from 17:00 > hrs to > 18:00 hrs in 10 minute bins. > > What I am looking for is a script that will run through the data > frame and > insert "NA"in the Volume column for any dates that are missing a time > interval. For example: > > Date Time Camera > Volume > 57 2009-10-09 5:00:00 PM MANBRIN_RIVER_NB 210 > 58 2009-10-09 5:10:00 PM MANBRIN_RIVER_NB 207 > 59 2009-10-09 5:20:00 PM MANBRIN_RIVER_NB 250 > 60 2009-10-09 5:30:00 PM MANBRIN_RIVER_NB 193 > 61 2009-10-09 5:40:00 PM MANBRIN_RIVER_NB 205 > 62 2009-10-09 6:00:00 PM MANBRIN_RIVER_NB 185Here is one method of generating a series of time points at 10 minute intervals: > as.POSIXlt("5:00:00 PM", format="%I:%M:%s %p" ) + (1:20)*60*10 [1] "2010-03-04 05:10:00 EST" "2010-03-04 05:20:00 EST" "2010-03-04 05:30:00 EST" [4] "2010-03-04 05:40:00 EST" "2010-03-04 05:50:00 EST" "2010-03-04 06:00:00 EST" [7] "2010-03-04 06:10:00 EST" "2010-03-04 06:20:00 EST" "2010-03-04 06:30:00 EST" [10] "2010-03-04 06:40:00 EST" "2010-03-04 06:50:00 EST" "2010-03-04 07:00:00 EST" [13] "2010-03-04 07:10:00 EST" "2010-03-04 07:20:00 EST" "2010-03-04 07:30:00 EST" [16] "2010-03-04 07:40:00 EST" "2010-03-04 07:50:00 EST" "2010-03-04 08:00:00 EST" [19] "2010-03-04 08:10:00 EST" "2010-03-04 08:20:00 EST" Applying that to the solution you referenced should finish the job. -- David.> > Note that between row 61 and row 62, there is a missing time > interval (5:50 > PM). I want the data frame to look like this: > > Date Time Camera > Volume > 57 2009-10-09 5:00:00 PM MANBRIN_RIVER_NB 210 > 58 2009-10-09 5:10:00 PM MANBRIN_RIVER_NB 207 > 59 2009-10-09 5:20:00 PM MANBRIN_RIVER_NB 250 > 60 2009-10-09 5:30:00 PM MANBRIN_RIVER_NB 193 > 61 2009-10-09 5:40:00 PM MANBRIN_RIVER_NB 205 > *62 2009-10-09 5:50:00 PM MANBRIN_RIVER_NB NA* > 62 2009-10-09 6:00:00 PM MANBRIN_RIVER_NB 185 > > > Thanks in advance, > Kindra > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius, MD Heritage Laboratories West Hartford, CT
Gabor Grothendieck
2010-Mar-05 00:55 UTC
[R] missing date and time intervals in data frame
Try this: Lines <- "Date,Time,Camera,Volume 57,2009-10-09,5:00:00 PM,MANBRIN_RIVER_NB,210 58,2009-10-09,5:10:00 PM,MANBRIN_RIVER_NB,207 59,2009-10-09,5:20:00 PM,MANBRIN_RIVER_NB,250 60,2009-10-09,5:30:00 PM,MANBRIN_RIVER_NB,193 61,2009-10-09,5:40:00 PM,MANBRIN_RIVER_NB,205 62,2009-10-09,6:00:00 PM,MANBRIN_RIVER_NB,185" # read data DF <- read.csv(textConnection(Lines), as.is = TRUE) library(zoo) library(chron) # convert to zoo z <- zoo(DF$Volume, as.chron(paste(DF$Date, DF$Time), "%Y-%m-%d %H:%M:%S %p")) # convert to ts which makes it regularly spaced and back to zoo zz <- as.zoo(as.ts(z)) # fixup times as converting to ts loses the time class but not the time values time(zz) <- as.chron(time(zz) At this point zz is a regularly spaced zoo object. coredata(zz) are its values and time(zz) are its times.> zz(10/09/09 05:00:00) (10/09/09 05:10:00) (10/09/09 05:20:00) (10/09/09 05:30:00) 210 207 250 193 (10/09/09 05:40:00) (10/09/09 05:50:00) (10/09/09 06:00:00) 205 NA 185 There are further examples in ?merge.zoo and vignette("zoo") Also see R News 4/1 on dates and times. On Thu, Mar 4, 2010 at 4:45 PM, Kindra Martinenko <kmartinenko at gmail.com> wrote:> I posted a similar question, but feel it needs a bit more elaboration. > > I have a data frame (read from a csv file) that may have missing rows. ?Each > day has 7 time intervals associated with it, with a range from 17:00 hrs to > 18:00 hrs in 10 minute bins. > > What I am looking for is a script that will run through the data frame and > insert "NA"in the Volume column for any dates that are missing a time > interval. ?For example: > > ? ? ? ? ?Date ? ? ? ? ? ?Time ? ? ? ? ? ? Camera > ?Volume > 57 ?2009-10-09 5:00:00 PM MANBRIN_RIVER_NB ? ?210 > 58 ?2009-10-09 5:10:00 PM MANBRIN_RIVER_NB ? ?207 > 59 ?2009-10-09 5:20:00 PM MANBRIN_RIVER_NB ? ?250 > 60 ?2009-10-09 5:30:00 PM MANBRIN_RIVER_NB ? ?193 > 61 ?2009-10-09 5:40:00 PM MANBRIN_RIVER_NB ? ?205 > 62 ?2009-10-09 6:00:00 PM MANBRIN_RIVER_NB ? ?185 > > Note that between row 61 and row 62, there is a missing time interval (5:50 > PM). ?I want the data frame to look like this: > > ? ? Date ? ? ? ? ? ? ? ? Time ? ? ? ? ? Camera > ?Volume > 57 ?2009-10-09 5:00:00 PM MANBRIN_RIVER_NB ? ?210 > 58 ?2009-10-09 5:10:00 PM MANBRIN_RIVER_NB ? ?207 > 59 ?2009-10-09 5:20:00 PM MANBRIN_RIVER_NB ? ?250 > 60 ?2009-10-09 5:30:00 PM MANBRIN_RIVER_NB ? ?193 > 61 ?2009-10-09 5:40:00 PM MANBRIN_RIVER_NB ? ?205 > *62 ?2009-10-09 5:50:00 PM MANBRIN_RIVER_NB ?NA* > 62 ?2009-10-09 6:00:00 PM MANBRIN_RIVER_NB ? ?185 > > > Thanks in advance, > Kindra > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >