Richard A. Bilonick
2003-Apr-21 14:14 UTC
[R] Anyone Familiar with Using arima function with exogenous variables?
I've posted this before but have not been able to locate what I'm doing wrong. I cannot determine how the forecast is made from the estimated coefficients of a simple AR(2) model when there is an exogenous variable. Does anyone know what the problem is? The help file for arima doesn't show the model with any exogenous variables, and I haven't been able to locate any documents covering this.

I put together a simple example of an AR(2) model (no exogenous variables) and another example of an AR(2) with one exogenous variable. In the first case it's easy to see how the forecasts are computed. When there is an exogenous variable, it's not clear (at least to me) how the forecast is computed. I thought I understood how the model is written but apparently not.

Using the LakeHuron data, fit a simple AR(2) model:

> data(LakeHuron)
> ar.lh <- arima(LakeHuron, order = c(2,0,0))
> ar.lh

Call:
arima(x = LakeHuron, order = c(2, 0, 0))

Coefficients:
         ar1      ar2  intercept
      1.0436  -0.2495   579.0473
s.e.  0.0983   0.1008     0.3319

sigma^2 estimated as 0.4788:  log likelihood = -103.63,  aic = 215.27

Make a 1-step ahead forecast:

> predict(ar.lh,1)[[1]]
Time Series:
Start = 1973
End = 1973
Frequency = 1
[1] 579.7896

Compute the forecast manually:

> sum(ar.lh$coef*c(c(579.96,579.89)-ar.lh$coef[3],1))
[1] 579.7896

This just says that the forecast for the next period (after the end of the data) is

579.0473 + 1.0436*(579.96 - 579.0473) - 0.2495*(579.89 - 579.0473).

In other words: the forecast is the intercept plus the AR coefficients times the (previous ts values minus the intercept).

Now add an exogenous variable (in this case, year - 1920):

> ar.lh <- arima(LakeHuron, order = c(2,0,0), xreg = time(LakeHuron)-1920)
> ar.lh

Call:
arima(x = LakeHuron, order = c(2, 0, 0), xreg = time(LakeHuron) - 1920)

Coefficients:
         ar1      ar2  intercept  time(LakeHuron) - 1920
      1.0048  -0.2913   579.0993                 -0.0216
s.e.  0.0976   0.1004     0.2370                  0.0081

sigma^2 estimated as 0.4566:  log likelihood = -101.2,  aic = 212.4

The prediction is:

> predict(ar.lh,1,newxreg=53)[[1]]
Time Series:
Start = 1973
End = 1973
Frequency = 1
[1] 579.3972

Now try to manually forecast when the next time period is 53 (i.e., 1973 - 1920):

> sum(ar.lh$coef*c(c(579.96,579.89)-ar.lh$coef[3],1,53))
[1] 578.5907

What am I doing wrong? I've tried this with numerous examples, and whenever there is an exogenous variable I cannot get the manual forecast to agree with predict. Is it not correct to just add (-0.0216 times 53) to the rest? I need to know how to write the model correctly. Obviously there is something I am overlooking. R's arima and predict functions work correctly (at least they agree with SAS, for example), so I must be doing something wrong.

I would really appreciate some insight here.

Rick B.
Richard A. Bilonick
2003-Apr-21 16:32 UTC
[R] Anyone Familiar with Using arima function with exogenous variables?
Spencer Graves wrote:
> Have you tried reading predict.Arima?
>
> Do you have any references that compute a simple numerical example? I
> believe there is one in Box, Jenkins, Reinsel, but I don't have time
> to research it.
>
> Hope this helps.
> spencer graves

There does not appear to be any relevant information in predict.Arima concerning xreg.

I tried using arima to estimate the sales data (Series M in Box and Jenkins) using the leading indicator. I think I estimated the same model correctly. The AR and MA coefficients roughly agreed, but the intercept and the coefficient for the leading indicator were very different. The intercept was 10 times too large (approximately) and the coefficient for the leading indicator was about 1/10 of that shown in B&J.

So far I haven't located any simple examples to try.

Thanks.

Rick B.
Richard A. Bilonick
2003-Apr-22 16:37 UTC
[R] Anyone Familiar with Using arima function with exogenous variables?
Paul Gilbert wrote:
>> So if I have 200 observations and I want to estimate for time t =
>> 201, I would use y[200] and x[200] and I would have my forecast. But
>                              ^^^^^^
> Don't you mean x[201] ?

Yes, I meant x[201].

I found what I was looking for (although it's not in the R documentation). When Y is differenced and fitted by an AR(1) with one exogenous variable and no intercept, the model is written as:

(1 - B) Y[t] = w X[t] + e[t]/(1 - phi B)

Solving for Y[t]:

(1 - phi B)(1 - B) Y[t] = w (1 - phi B) X[t] + e[t]

(1 - phi B - B + phi B^2) Y[t] = w (X[t] - phi X[t-1]) + e[t]

Y[t] = (1 + phi) Y[t-1] - phi Y[t-2] + w (X[t] - phi X[t-1]) + e[t]

So the 1-step ahead forecast, with primes denoting the estimated values, is:

Y[t]' = (1 + phi') Y[t-1] - phi' Y[t-2] + w' (X[t] - phi' X[t-1])

Rick B.
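The same idea can be checked numerically on the LakeHuron example from earlier in the thread. The sketch below assumes that arima treats xreg as a regression with AR(2) errors, so the AR coefficients are applied to each past value minus its regression mean (intercept + slope * x), not minus the intercept alone. The names b, mu, manual and auto are illustrative and do not come from the original posts.

```r
# Fit the AR(2) model with the exogenous regressor used earlier in the thread.
fit <- arima(LakeHuron, order = c(2, 0, 0), xreg = time(LakeHuron) - 1920)

b <- unname(coef(fit))                    # order: ar1, ar2, intercept, xreg slope
y <- as.numeric(LakeHuron)                # lake levels, 1875 to 1972
x <- as.numeric(time(LakeHuron)) - 1920   # regressor values, ending at 52
n <- length(y)

# Regression mean at a given regressor value (assumed model form).
mu <- function(xv) b[3] + b[4] * xv

# 1-step forecast for 1973 (x = 53): regression mean at the new x, plus
# the AR terms applied to the last two deviations from the regression mean.
manual <- mu(53) +
  b[1] * (y[n]     - mu(x[n])) +
  b[2] * (y[n - 1] - mu(x[n - 1]))

auto <- as.numeric(predict(fit, n.ahead = 1, newxreg = 53)$pred)
print(c(manual = manual, predict = auto))
```

If that assumption holds, the two printed values should agree. The manual attempt in the first post differs because it subtracts only the intercept from the past values, leaving the regressor's contribution at 1972 and 1971 inside the AR terms.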