Matthew Keller
2009-Jun-17 21:54 UTC
[R] how to interpolate time series data with missingness
Hi all,

I have a vector, most of which is missing. The data is always increasing, but may do so in jumps. I would like to interpolate the NAs with 'best guesses', using something like filter(), which doesn't work due to the NAs. Here is an example:

> x <- c(2,3,NA,NA,NA,3.2,3.5,NA,NA,6,NA)
> x
 [1] 2.0 3.0  NA  NA  NA 3.2 3.5  NA  NA 6.0  NA

I would like a function that would take the NAs and fill in the average of the values around the NAs. E.g., make a new vector x.new that looks like:

> x.new
 [1] 2.0 3.0 3.1 3.1 3.1 3.2 3.5 4.75 4.75 6 6

Or, alternatively, that could figure out a more likely value than just the average. There must be something simple I'm overlooking, like some kind of loess y-hat or something? Any help would be appreciated,

Matt

--
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com
Gabor Grothendieck
2009-Jun-17 22:07 UTC
[R] how to interpolate time series data with missingness
The zoo package has a number of na.* routines:

> library(zoo)
> x <- c(2,3,NA,NA,NA,3.2,3.5,NA,NA,6,NA)
> na.approx(x)
 [1] 2.000000 3.000000 3.050000 3.100000 3.150000 3.200000 3.500000 4.333333
 [9] 5.166667 6.000000
> na.locf(x)
 [1] 2.0 3.0 3.0 3.0 3.0 3.2 3.5 3.5 3.5 6.0 6.0
> na.spline(x)
 [1] 2.000000 3.000000 3.366531 3.352065 3.211566 3.200000 3.500000 4.045127
 [9] 4.857627 6.000000 7.534746
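
Two things are worth noticing in the output above: na.approx() returns only 10 values for the 11-element input (the trailing NA has no right-hand neighbour to interpolate towards), and na.spline() dips from 3.37 down to 3.21 across the first gap, which does not fit data that are "always increasing". As a rough sketch only, and not something proposed in the thread, base R's approx() can produce both the neighbour-average fill asked for in the question and a linear fill that keeps the full length. The names idx, ok, x.avg and x.lin are made up for illustration, and carrying the last observed value into the trailing NA (rule = 2) is an assumption about what is wanted there.

```r
x <- c(2, 3, NA, NA, NA, 3.2, 3.5, NA, NA, 6, NA)

idx <- seq_along(x)   # positions 1..11 stand in for the time index
ok  <- !is.na(x)      # TRUE where a value was actually observed

# Neighbour-average fill: method = "constant" with f = 0.5 gives each gap
# the mean of the observed values on either side of it.
# rule = 2 repeats the nearest observed value beyond the ends (assumption).
x.avg <- approx(idx[ok], x[ok], xout = idx,
                method = "constant", f = 0.5, rule = 2)$y

# Linear fill: same values as na.approx(x) inside the gaps, but the
# trailing NA is filled with the last observation instead of being dropped.
x.lin <- approx(idx[ok], x[ok], xout = idx, rule = 2)$y

print(x.avg)
print(x.lin)
```

Because both fills only ever take values between two neighbouring observations, the result stays non-decreasing whenever the observed values are. A spline gives no such guarantee.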