Dimitri Liakhovitski
2011-Aug-12 15:43 UTC
[R] Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)
Hello! I have a data frame mysample (sorry for a long way of creating it below - but I need it in this form, and it works). I regress Y onto X1 through X11 - first without weights, then with weights: regtest1<-lm(Y~., data=mysample[-13])) regtest2<-lm(Y~., data=mysample[-13]),weights=mysample$weight) summary(regtest1) summary(regtest2) Then I calculate Durbin-Watson for both regressions using 2 different packages: library(car) library(lmtest) durbinWatsonTest(regtest1)[2] dwtest(regtest1)$stat durbinWatsonTest(regtest2)[2] dwtest(regtest2)$stat When there are no weights, the Durbin-Watson statistic is the same. But when there are weights, 2 packages give Durbin-Watson different statistics. Anyone knows why? Also, it's interesting that both of them are also different from what SPSS spits out... Thank you! Dimitri ############################################ ### Run the whole code below to create mysample: intercor<-0.3 # intercorrelation among all predictors k<-10 # number of predictors sigma<-matrix(intercor,nrow=k,ncol=k) # matrix of intercorrelations among predictors diag(sigma)<-1 require(mvtnorm) set.seed(123) mypop<-as.data.frame(rmvnorm(n=100000, mean=rep(0,k), sigma=sigma, method="chol")) names(mypop)<-paste("x",1:k,sep="") set.seed(123) mypop$x11<-sample(c(0,1),100000,replace=T) set.seed(123) betas<-round(abs(rnorm(k+1)),2) # desired betas Y<-as.matrix(mypop) %*% betas mypop<-cbind(mypop, Y) rSQR<-.5 VARofY<- mean(apply(as.data.frame(mypop$Y),2,function(x){x^2})) - mean(mypop$Y)^2 mypop$Y<-mypop$Y + rnorm(100000, mean=0, sd=sqrt(VARofY/rSQR-VARofY)) n<-200 set.seed(123) cases.for.sample<-sample(100000,n,replace=F) mysample<-mypop[cases.for.sample,] mysample<-cbind(mysample[k+2],mysample[1:(k+1)]) #dim(sample) weight<-rep(1:10,20);weight<-weight[order(weight)] mysample$weight<-weight -- Dimitri Liakhovitski marketfusionanalytics.com
Achim Zeileis
2011-Aug-12 18:42 UTC
[R] Which Durbin-Watson is correct? (weights involved) - using durbinWatsonTest and dwtest (packages car and lmtest)
On Fri, 12 Aug 2011, Dimitri Liakhovitski wrote:> Hello! > > I have a data frame mysample (sorry for a long way of creating it > below - but I need it in this form, and it works). I regress Y onto X1 > through X11 - first without weights, then with weights: > > regtest1<-lm(Y~., data=mysample[-13])) > regtest2<-lm(Y~., data=mysample[-13]),weights=mysample$weight) > summary(regtest1) > summary(regtest2) > > Then I calculate Durbin-Watson for both regressions using 2 different packages: > > library(car) > library(lmtest) > > durbinWatsonTest(regtest1)[2] > dwtest(regtest1)$stat > > durbinWatsonTest(regtest2)[2] > dwtest(regtest2)$stat > > When there are no weights, the Durbin-Watson statistic is the same. > But when there are weights, 2 packages give Durbin-Watson different > statistics. Anyone knows why?The result of dwtest() is wrong. Internally, dwtest() extracts the model matrix and response (but no weights) and does all processing based on these. Thus, it computes the DW statistic for regtest1 not regtest2. I've just added a patch to my source code which catches this problem and throws a meaningful error message. It will be part of the next release (0.9-29) in due course. Of course, this doesn't help you with computing the DW statistic for the weighted regression but hopefully it reduces the confusion about the different behaviors... Z> Also, it's interesting that both of them are also different from what > SPSS spits out... > > Thank you! > Dimitri > > > ############################################ > ### Run the whole code below to create mysample: > > intercor<-0.3 # intercorrelation among all predictors > k<-10 # number of predictors > sigma<-matrix(intercor,nrow=k,ncol=k) # matrix of intercorrelations > among predictors > diag(sigma)<-1 > > require(mvtnorm) > set.seed(123) > mypop<-as.data.frame(rmvnorm(n=100000, mean=rep(0,k), sigma=sigma, > method="chol")) > names(mypop)<-paste("x",1:k,sep="") > set.seed(123) > mypop$x11<-sample(c(0,1),100000,replace=T) > > set.seed(123) > betas<-round(abs(rnorm(k+1)),2) # desired betas > Y<-as.matrix(mypop) %*% betas > mypop<-cbind(mypop, Y) > rSQR<-.5 > VARofY<- mean(apply(as.data.frame(mypop$Y),2,function(x){x^2})) - > mean(mypop$Y)^2 > mypop$Y<-mypop$Y + rnorm(100000, mean=0, sd=sqrt(VARofY/rSQR-VARofY)) > > n<-200 > set.seed(123) > cases.for.sample<-sample(100000,n,replace=F) > mysample<-mypop[cases.for.sample,] > mysample<-cbind(mysample[k+2],mysample[1:(k+1)]) #dim(sample) > weight<-rep(1:10,20);weight<-weight[order(weight)] > mysample$weight<-weight > > > > -- > Dimitri Liakhovitski > marketfusionanalytics.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >