Hello,

I have a time series that has some missing samples. I was thinking of completing them using either zero-order hold or linear interpolation. I am looking for an efficient way (other than a loop...) of identifying the missing time slots and filling them.

Can you think of any methods that might help here? (Obviously which(diff(time) > min(diff(time))) will give the locations, but what then...?)

Thanks,
Eran.
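One loop-free way to sketch the idea in the question, using only base R: build the full regular grid, compare it with the observed times, and let approx() do the filling. The data and the names t_obs, v_obs, t_full are made up for illustration, and the sketch assumes the smallest observed spacing is the true sampling interval.

```r
# Illustrative data: sample times in seconds, with slots missing at t = 3, 6, 7
t_obs <- c(0, 1, 2, 4, 5, 8, 9)
v_obs <- c(10, 11, 12, 14, 15, 18, 19)

# Assumption: the smallest observed spacing is the true sampling interval
dt <- min(diff(t_obs))

# Full regular grid, and the slots on it that are absent from the data.
# setdiff() compares exactly, so non-integer spacings may need rounding first.
t_full    <- seq(min(t_obs), max(t_obs), by = dt)
t_missing <- setdiff(t_full, t_obs)

# Zero-order hold: f = 0 makes approx() carry the left-hand value forward
zoh <- approx(t_obs, v_obs, xout = t_full, method = "constant", f = 0)$y

# Linear interpolation between the neighbouring observations
lin <- approx(t_obs, v_obs, xout = t_full, method = "linear")$y

result <- data.frame(time = t_full, zoh = zoh, linear = lin,
                     filled = t_full %in% t_missing)
print(result)
```

Both fills are vectorised, so no explicit loop over the gaps is needed. If the true sampling interval is known, passing it directly as dt is safer than inferring it from the data.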
How are your time samples missing? If they are recorded as NA, the na.locf() function will fill them with the previous value (zero-order hold). Averaging that with the reverse fill (fromLast = TRUE) gives the midpoint of the two neighbouring observations, which equals linear interpolation when only a single sample is missing:

    library(xts)
    x = c(1:5, NA, 6:10)
    x = xts(x, Sys.Date() + 0:10)
    na.locf(x)                                    # zero-order hold
    (na.locf(x) + na.locf(x, fromLast = TRUE))/2  # midpoint of the neighbours

If the row is "missing" and you really want to put in data, the following may work, though most time series analysis techniques are usually able to deal with irregularly spaced data, at least in my work:

    library(xts)
    x = c(1:10)
    x = xts(x, c(Sys.Date() + 0:4, Sys.Date() + 6:10))
    # regular grid at the smallest observed spacing
    tx = seq.Date(from = first(time(x)), to = last(time(x)), by = min(diff(time(x))))
    xNew = xts(rep(NA, length(tx)), tx)
    xNew[time(x)] <- x

Then fill xNew as before.

Hope this helps,

Michael Weylandt

On Mon, Sep 12, 2011 at 4:42 AM, Eran Eidinger <eran@taykey.com> wrote:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
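A possible alternative to the index assignment above, sketched under the same assumption that the smallest observed step is the true sampling interval: merge the series with an empty xts object on the regular grid, so the absent rows come back as NA. The fixed dates and the names days, grid and x_full are illustrative only; na.approx() comes from zoo, which xts loads.

```r
library(xts)

# Fixed dates so the result is reproducible; the row for 2011-09-06 is absent
days <- as.Date("2011-09-01") + c(0:4, 6:10)
x <- xts(1:10, order.by = days)

# Regular daily grid spanning the observed range
# (assumes the smallest observed step is the true sampling interval)
grid <- seq(start(x), end(x), by = min(diff(index(x))))

# Merging with a data-less xts on that grid inserts NA rows for the gaps
x_full <- merge(x, xts(order.by = grid))

# Zero-order hold: carry the last observation forward
x_zoh <- na.locf(x_full)

# Linear interpolation in time
x_lin <- na.approx(x_full)

print(cbind(x_full, x_zoh, x_lin))
```

The merge keeps the original values untouched and only adds rows, so the gaps can afterwards be filled by whichever na.* function suits the data.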
Hi Eran,

You have already gotten some suggestions from Michael, but I think that Rich is correct to question the rational. Any mechanism you choose to replace the missing values will impose its structure on the data. Veritate ab absurdo:

    ## data
    x <- sin(seq(1, 17, .1)) + seq(-.5, .5, length.out = 161)
    y <- x <- ts(x, start = Sys.Date(), end = Sys.Date() + 160)
    ## add missing values to y (keep only the first and last points)
    y[-c(1, length(x))] <- NA
    ## interpolate linearly between the two remaining end points;
    ## n.na missing values means n.na + 1 equal steps
    n.na <- sum(is.na(y))
    y[-length(y)] <- cumsum(c(y[1], rep(diff(y[!is.na(y)])/(n.na + 1), n.na)))
    ## plot the interpolated series next to the original
    dev.new(height = 5, width = 10)
    par(mfrow = c(1, 2))
    plot(y, ylim = range(x))
    plot(x, ylim = range(x))

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
On Sat, Sep 17, 2011 at 1:43 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> Hi Eran,
>
> You have already gotten some suggestions from Michael, but I think
> that Rich is correct to question the rational. Any mechanism you
> choose to replace the missing values will impose its structure on the
> data. Veritate ab absurdo:

or verum? Who remembers their Latin declensions?
On Sep 17, 2011, at 5:04 PM, Joshua Wiley wrote:
> On Sat, Sep 17, 2011 at 1:43 PM, Joshua Wiley
> <jwiley.psych at gmail.com> wrote:
>> Hi Eran,
>>
>> You have already gotten some suggestions from Michael, but I think
>> that Rich is correct to question the rational.

I hope we should question the rationale, and doubt that Rich has forsaken the rational.

>> Any mechanism you
>> choose to replace the missing values will impose its structure on the
>> data. Veritate ab absurdo:
>
> or verum? who remembers their Latin declensions?

Veritas is third declension.

-- 
David Winsemius, MD
West Hartford, CT
The zoo package has na.approx, na.fill, na.locf, na.spline and na.StructTS, and the stinepack package has na.stinterp. Each of these fills in NAs in zoo series and certain other objects. See the help files for many examples.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
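A small sketch of how three of the zoo functions named above behave on a gap longer than one sample. The series z is made up for illustration. The last line shows, for comparison, that averaging a forward and a backward na.locf() gives a flat midpoint across a multi-sample gap and not a ramp.

```r
library(zoo)

# Illustrative daily series with a three-sample gap
z <- zoo(c(1, 2, NA, NA, NA, 6, 7), as.Date("2011-09-01") + 0:6)

hold   <- na.locf(z)     # zero-order hold: last observation carried forward
linear <- na.approx(z)   # linear interpolation in time
spline <- na.spline(z)   # cubic spline through the observed points

# Average of forward and backward fills: constant across the whole gap
midpoint <- (na.locf(z) + na.locf(z, fromLast = TRUE)) / 2

print(cbind(z, hold, linear, spline, midpoint))
```

Which fill is appropriate depends on the data: the hold suits step-like signals, while the linear and spline fills assume the signal varies smoothly between samples.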