Hello, I'd like to simulate data according to an SVAR model in order to demonstrate how other techniques (such as arima) yield biased estimates. I am interested in a 2 variable SVAR with 2 lags (in the notation of the vars vignette, K = 2, P = 2, where B = I_K). I'm using the {vars} package outlined here: http://cran.r-project.org/web/packages/vars/vignettes/vars.pdf I thought that the following would generate the data and demonstrate the accuracy of an SVAR compared to an arima, but the results are not what I expected. I think the problem is the way that I'm generating the data, but I don't understand what I could be doing wrong. Any guidance would be greatly appreciated. Problems: In the code below, the array "means" should show that SVAR parameters are unbiased estiamtors, so the second column of "means" should be near 0, and it's not any closer than the first column (the arima results). So my results are no less biased than the arima results. The first pair of plots should show the same: SVAR results are unbiased. But they don't cluster around the red dots (true parameter values), so they aren't. Finally, the second pair of plots should show the parameters are consistent: that MSE decreases as sample size increases. They don't really show that. Perhaps they would if the smallest sample were smaller than 200, but the SVAR tends not to converge with fewer observations. Further, the MSE tends to be at least as high with the SVAR compared to arima, so it's not any more accurate. Program: ##### Model ##### # Y(t) = a0 + a1*Y(t-1) + a2*Y(t-2) + a3*X(t-1) + a4*X(t-2) + e(t) # X(t) = b0 + b1*X(t-1) + b2*X(t-2) + b3*Y(t-1) + b4*Y(t-2) + b5*Y(t) + d(t) # e(t) & d(t) ~ N(0,s) # So Y has a contemporaneous impact on X # X only has an impact on future Ys # So this is the setup of a SVAR ##### Choosing parameters ##### # Currently, all parameters are just random numbers less than 1 so that # it's a stationary series # The standard deviations of the error terms are also random ~ U(0,2) a0 <- runif(1,-1,1) a1 <- runif(1,-0.9,0.9) a2 <- runif(1,-abs(a1),abs(a1)) a3 <- runif(1,-0.9,0.9) a4 <- runif(1,-abs(a3),abs(a3)) s.e <- runif(1,0,2) b0 <- runif(1,-1,1) b1 <- runif(1,-0.9,0.9) b2 <- runif(1,-abs(b1),abs(b1)) b3 <- runif(1,-0.9,0.9) b4 <- runif(1,-abs(b3),abs(b3)) b5 <- runif(1,-0.9,0.9) s.d <- runif(1,0,2) ################### A Formal Test: Loop ##################### Z <- 100 error <- array(0,dim=c(Z,2,6)) N <- runif(Z,200,1200) # Let the sample size vary so that we can check for consistency ##### Generating the data ##### # Start with 2 initial values and then create a dataset with 1200 observations X <- rep(0,1202) Y <- rep(0,1202) Y[1] <- rnorm(1, a0 + a1*a0 + a2*a0 + a3*b0 + a4*b0, s.e) X[1] <- rnorm(1, b0 + b1*b0 + b2*b0 + b3*a0 + b4*a0 + b5*Y[1], s.d) Y[2] <- rnorm(1, a0 + a1*Y[1] + a2*a0 + a3*X[1] + a4*b0, s.e) X[2] <- rnorm(1, b0 + b1*X[1] + b2*b0 + b3*Y[1] + b4*a0 + b5*Y[2], s.d) for(t in 3:1202){ Y[t] <- rnorm(1, a0 + a1*Y[t-1] + a2*Y[t-2] + a3*X[t-1] + a4*X[t-2], s.e) X[t] <- rnorm(1, b0 + b1*X[t-1] + b2*X[t-2] + b3*Y[t-1] + b4*Y[t-2] + b5*Y[t], s.d) } L1.Y <- rep(NA,1202) L2.Y <- rep(NA,1202) for(t in 2:1202){ L1.Y[t] <- Y[t-1] L2.Y[t] <- L1.Y[t-1] } for(z in 1:Z){ n <- N[z] x <- X[3:(n+2)] y <- Y[3:(n+2)] L1.y <- c(NA,L1.Y[4:(n+2)]) L2.y <- c(NA,NA,L2.Y[5:(n+2)]) ##### Modeling x with inclusion of y ##### m2 <- arima(x,order=c(2,0,0),xreg=cbind(L1.y,L2.y,y)) error[z,1,1] <- m2$coef[3] - b0 error[z,1,2:3] <- m2$coef[1:2] - c(b1,b2) error[z,1,4:6] <- m2$coef[4:6] - c(b3,b4,b5) ##### SVAR of x and y ##### m3 <- VAR(cbind(y,x),p=2) A <- matrix(c(1,NA,0,1),ncol=2) m4 <- SVAR(m3,Amat=A) error[z,2,1] <- m4$var$varresult$x$coeff[5] - b0 error[z,2,2] <- m4$var$varresult$x$coeff[2] - b1 error[z,2,3] <- m4$var$varresult$x$coeff[4] - b2 error[z,2,4] <- m4$var$varresult$x$coeff[1] - b3 error[z,2,5] <- m4$var$varresult$x$coeff[3] - b4 error[z,2,6] <- m4$A[2,1] - b5 } mse <- error^2 means <- array(0,dim=c(6,2)) for(i in 1:2){ means[,i] <- apply(error[,i,],2,mean) } means par(mfrow=c(2,1)) for(i in 1:2){ plot(1+runif(Z,-0.4,0.4),error[,i,1],xlim=c(0.5,6.5), xaxt='n',ylim=c(-1,1),ylab='',xlab='',main=paste("Model ",i)) for(j in 2:6){ points(j+runif(Z,-0.4,0.4),error[,i,j]) } points(seq(1,6,1),c(b0,b1,b2,b3,b4,b5),col=2) } par(mfrow=c(1,1)) par(mfrow=c(2,1)) for(i in 1:2){ plot(N,mse[,i,1],ylim=c(0,max(mse[,i,2:6])),main=paste("Model ",i)) for(j in 2:6){ points(N,mse[,i,j],col=j) }} par(mfrow=c(1,1))