Jay Rice
2012-Oct-12  00:26 UTC
[R] error msg using na.approx "x and index must have the same length"
Below I have written out some simplified data from my dataset. My goal is to interpolate Price based on timestamp. Therefore the closer a Price is in time to another price, the more like that price it will be. I want the interpolations for each St and not across St (St is a factor with levels A, B, and C). Unfortunately, I get error messages from code I wrote.

In the end only IDs 10 and 14 will receive interpolated values because all other NAs occur at the beginning of a level. My code is given below the dataset.

ID is int
St is factor with 3 levels
timestamp is POSIXlt
Price is num

Data.frame name is portfolio

ID  St  timestamp                 Price
 1  A   2012-01-01 12:50:24.760      NA
 2  A   2012-01-01 12:51:25.860   72.09
 3  A   2012-01-01 12:52:21.613   72.09
 4  A   2012-01-01 12:52:42.010   75.30
 5  A   2012-01-01 12:52:42.113   75.30
 6  B   2012-01-01 12:56:20.893      NA
 7  B   2012-01-01 12:56:46.023   67.70
 8  B   2012-01-01 12:57:19.300   76.06
 9  B   2012-01-01 12:58:20.750   77.85
10  B   2012-01-01 12:58:20.797      NA
11  B   2012-01-01 12:59:19.527   79.57
12  C   2012-01-01 13:00:21.847   81.53
13  C   2012-01-01 13:00:21.860   81.53
14  C   2012-01-01 13:00:21.873      NA
15  C   2012-01-01 13:00:43.493   84.69
16  D   2012-01-01 12:01:21.520   24.63
17  D   2012-01-01 12:02:18.880   21.13

I tried the following using na.approx from zoo package

    interpolatedPrice<-unlist(tapply(portfolio$Price, portfolio$St, na.approx,
                                     portfolio$timestamp, na.rm=FALSE))

but keep getting error

    "Error in na.approx.default(X[[1L]], ...) :
      x and index must have the same length"

I checked the length of every variable in the formula and they all have the same length so I am not sure why I get the error message.

Jay
R. Michael Weylandt
2012-Oct-14  15:46 UTC
[R] error msg using na.approx "x and index must have the same length"
On Fri, Oct 12, 2012 at 1:26 AM, Jay Rice <jsrice18 at gmail.com> wrote:
> I tried the following using na.approx from zoo package
>
> interpolatedPrice<-unlist(tapply(portfolio$Price, portfolio$St, na.approx,
> portfolio$timestamp, na.rm=FALSE))

Your problem is that tapply() splits portfolio$Price by St but does not split timestamp. Each call to na.approx() therefore receives one group's prices together with the full timestamp vector, so the number of timestamps doesn't align with the Price series. I think you want something more like this:

    lapply(split(portfolio, portfolio$St), function(x) {
        tt <- as.POSIXct(x$timestamp)
        zoo(na.approx(x$Price, tt, na.rm = FALSE), tt)
    })

which is admittedly opaque. I think an easier data management strategy for you might be to put your data in a list of zoo/xts series and use lapply generously.
E.g.,

    pp <- lapply(split(portfolio, portfolio$St),
                 function(x) zoo(x$Price, as.POSIXct(x$timestamp)))

and then do your calculations on each series with lapply().

Cheers,
Michael
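For readers who want to reproduce the thread, here is a minimal, dependency-free sketch of per-group interpolation on the posted data. It is not the code from either post: it swaps in base R's approx() for zoo's na.approx(), stores the timestamps as POSIXct rather than POSIXlt, and the names interp_group and interpolated are made up for illustration.

```r
# Rebuild the posted data. Timestamps are stored as POSIXct here
# (an assumption; the original post says POSIXlt).
portfolio <- data.frame(
  ID = 1:17,
  St = factor(rep(c("A", "B", "C", "D"), times = c(5, 6, 4, 2))),
  timestamp = as.POSIXct(c(
    "2012-01-01 12:50:24.760", "2012-01-01 12:51:25.860",
    "2012-01-01 12:52:21.613", "2012-01-01 12:52:42.010",
    "2012-01-01 12:52:42.113", "2012-01-01 12:56:20.893",
    "2012-01-01 12:56:46.023", "2012-01-01 12:57:19.300",
    "2012-01-01 12:58:20.750", "2012-01-01 12:58:20.797",
    "2012-01-01 12:59:19.527", "2012-01-01 13:00:21.847",
    "2012-01-01 13:00:21.860", "2012-01-01 13:00:21.873",
    "2012-01-01 13:00:43.493", "2012-01-01 12:01:21.520",
    "2012-01-01 12:02:18.880"), tz = "UTC"),
  Price = c(NA, 72.09, 72.09, 75.30, 75.30,
            NA, 67.70, 76.06, 77.85, NA, 79.57,
            81.53, 81.53, NA, 84.69,
            24.63, 21.13)
)

# Why the original tapply() call fails: every group is shorter than
# the full timestamp vector that was passed along unsplit.
table(portfolio$St)          # group sizes
length(portfolio$timestamp)  # 17

# Hypothetical helper: linearly interpolate Price against time within
# one group. rule = 1 leaves points outside the observed range
# (such as a leading NA) as NA.
interp_group <- function(d) {
  ok <- !is.na(d$Price)
  approx(as.numeric(d$timestamp[ok]), d$Price[ok],
         xout = as.numeric(d$timestamp), rule = 1)$y
}

# Split the whole data frame so Price and timestamp stay aligned,
# interpolate per group, then put the pieces back in row order.
portfolio$interpolated <- unsplit(
  lapply(split(portfolio, portfolio$St), interp_group),
  portfolio$St
)
```

The point of the sketch is the same as in the reply above: split the data frame, not just Price, so that each group's prices travel with that group's timestamps. With the posted data only IDs 10 and 14 should change, and IDs 1 and 6 should stay NA.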