thr3ads.net - R help - [R] data order affects glmmPQL [Jan 2006]

If this information is useful, please help other people find it:
Share via:

Jack Tanner

2006-Jan-04 22:25 UTC

[R] data order affects glmmPQL

Is it to be expected that the way a data frame is sorted should affect the 
model fit by glmmPQL?

Example:

library(MASS)
library(nlme)

fit.model <- function(il, model.family) {
  cs <- Initialize(corSymm(form=~1|id), data=il)
  glmmPQL(score~test+coder, random=~1|id, # 
control=lmeControl(msMaxIter=100),
          family=model.family, data=il, correlation=cs)
}

score <- 
c(1,8,1,3,4,4,2,5,3,6,0,3,1,5,0,5,1,11,1,2,4,5,2,4,1,6,1,2,8,16,5,16,3,15,3,12,4,9,2,4,1,8,2,6,4,11,2,9,3,17,2,6)
id <- rep(1:13, rep(4, 13))
test <- gl(2, 1, length(score), labels=c("pre", "post"))
coder <- gl(2, 2, length(score), labels=c("two",
"three"))

foo <- data.frame(id, score, test, coder) # Define data frame
print(summary(fit.model(foo, poisson)))

bar <- foo[order(id, score, test, coder),] # Reorder data frame
print(summary(fit.model(bar, poisson)))

The two summaries are clearly different. Is this to be expected? Is there a 
canonical way one should order a data frame before passing it to glmmPQL?

Spencer Graves

2006-Jan-11 03:23 UTC

head link

[R] data order affects glmmPQL

The correlation between the predictions from your two model fits is 
0.95.  This suggests to me that the differences between the two sets of 
answers have little practical importance, and anyone who disagrees may 
be trying to read more from the results than can actually be supported 
by the data.  It should be fairly easy to select the apparent "best" 
from among several such answers being the one that had a higher 
log(likelihood).  This pushes me to prefer "fit.bar" with a 
log(likelihood) of -32.31 to "fit.foo" with -33.05.

	  I agree that the differences are somewhat disturbing, but you are 
dealing with the output from an iterative solution of a notoriously 
difficult problem, and the standard wisdom is that it is wise to try 
several sets of starting values.  By modifying the order of the 
observations in the data.frame, you have effectively done that.

	  You could probably reduce these differences by creating a local copy 
of "glmmPQL" and modifying the code so a user could provide starting 
values and so you could restart the iteration for either fit with the 
results of the other, possibly iterating to tighter tolerances.

	  hope this helps.
	  spencer graves
p.s.  To got this correlation, I modified the latter portion of your 
script as follows:

fit.foo <- fit.model(foo, poisson)
o.bar <- order(id, score, test, coder)
bar <- foo[o.bar,] # Reorder data frame
fit.bar <- fit.model(bar, poisson)
predict.foo <- predict(fit.foo)
predict.bar <- predict(fit.bar)

cor(predict.foo[o.bar], predict.bar)
[1] 0.9544287

Jack Tanner wrote:
> Is it to be expected that the way a data frame is sorted should affect the 
> model fit by glmmPQL?
> 
> Example:
> 
> library(MASS)
> library(nlme)
> 
> fit.model <- function(il, model.family) {
>   cs <- Initialize(corSymm(form=~1|id), data=il)
>   glmmPQL(score~test+coder, random=~1|id, # 
> control=lmeControl(msMaxIter=100),
>           family=model.family, data=il, correlation=cs)
> }
> 
> score <- 
>
c(1,8,1,3,4,4,2,5,3,6,0,3,1,5,0,5,1,11,1,2,4,5,2,4,1,6,1,2,8,16,5,16,3,15,3,12,4,9,2,4,1,8,2,6,4,11,2,9,3,17,2,6)
> id <- rep(1:13, rep(4, 13))
> test <- gl(2, 1, length(score), labels=c("pre",
"post"))
> coder <- gl(2, 2, length(score), labels=c("two",
"three"))
> 
> foo <- data.frame(id, score, test, coder) # Define data frame
> print(summary(fit.model(foo, poisson)))
> 
> bar <- foo[order(id, score, test, coder),] # Reorder data frame
> print(summary(fit.model(bar, poisson)))
> 
> The two summaries are clearly different. Is this to be expected? Is there a
> canonical way one should order a data frame before passing it to glmmPQL?
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

R help - Jan 2006 - data order affects glmmPQL

[R] data order affects glmmPQL

[R] data order affects glmmPQL