From the gee() help page:
Data are assumed to be sorted so that observations
on a cluster are contiguous rows for all entities in the formula.
-thomas
On Sun, 7 Sep 2008, Qinglin Wu wrote:
> Dear List:
>
> I found an error when I called the 'gee' function. I cannot solve
and explain it. There are no errors when I used the 'geeglm' function.
Both functions fit the gee model. The project supervisor recommends me to use
the 'gee' function. But I cannot explain to him why this error happens.
Would you help me solve this problem? I appreciate your help.
>
> In this project I will use the 'gee' or 'geeglm' and
'glmer' to fit the simulated multivariate count responses. I generated
the data like this:
>
> Set ??0 = ??1 = 1, ??0 = 3, and n = 50.
> For each 1 ?? i ?? n,
> Simulate xi from N (1, 1).
> Simulate zi0 and zit from
> zi0 follows i.d. Poisson (??0) ,
> zit | xi follows i.d. Poisson (??it) , 1 ?? t ?? 3,
> log (??it) = log(E (zit | xi)) = ??0t + xi??1t = 1+xi.
> Let yit = zi0 + zit, 1 ?? t ?? 3.
>
> So my data frame, let me call it 'simdata', the first 10 rows look
like this:
> id y.1 y.2 y.3 x
> 1 3 5 6 -0.06588626
> 2 6 7 6 -0.08265981
> 3 6 8 13 0.58307719
> 4 22 21 28 2.21099940
> 5 5 12 8 1.06299869
> 6 8 21 24 1.47615784
> 7 11 8 9 0.83748390
> 8 16 15 16 1.67011313
> 9 9 7 7 -0.14181264
> 10 31 37 40 2.56751453
>
>
> This is the longitudinal data. I will change its shape to analyze it.
> The changed 'newdata' looks like this:
> id x time y
> 1 -0.06588626 1 3
> 2 -0.08265981 1 6
> 3 0.58307719 1 6
> 4 2.21099940 1 22
> 5 1.06299869 1 5
> 6 1.47615784 1 8
> 7 0.83748390 1 11
> 8 1.67011313 1 16
> 9 -0.14181264 1 9
> 10 2.56751453 1 31
> ...........................
> 1 -0.06588626 2 5
> 2 -0.08265981 2 7
> 3 0.58307719 2 8
> 4 2.21099940 2 21
> 5 1.06299869 2 12
> 6 1.47615784 2 21
> 7 0.83748390 2 8
> 8 1.67011313 2 15
> 9 -0.14181264 2 7
> 10 2.56751453 2 37
> ...........................
> 1 -0.06588626 3 6
> 2 -0.08265981 3 6
> 3 0.58307719 3 13
> 4 2.21099940 3 28
> 5 1.06299869 3 8
> 6 1.47615784 3 24
> 7 0.83748390 3 9
> 8 1.67011313 3 16
> 9 -0.14181264 3 7
> 10 2.56751453 3 40
> ...........................
>
>
> My data 'y' comes from x. So their correlations are not
independent. What does the argument 'corstr' mean it defined in the
function. I tried all choices. But the error was still there. Here was the
function I used in my programming:
>
> mfit1 <-
gee(y~x,data=newdata,family=poisson(link="log"),id=id,corstr="exchangeable")
>
> GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
> gee S-function, version 4.13 modified 98/01/27 (1998)
>
> Model:
> Link: Logarithm
> Variance to Mean Relation: Poisson
> Correlation Structure: Exchangeable
>
> Call:
> gee(formula = y ~ x, id = id, data = newdata, family = poisson(link =
"log"),
> corstr = "exchangeable")
>
> Number of observations : 150
>
> Maximum cluster size : 1
>
> Coefficients:
> (Intercept) x
> 1.5849653 0.7937203
>
> Estimated Scale Parameter: 1.162505
> Number of Iterations: 1
>
> Working Correlation[1:4,1:4]
> Error in print(x$working.correlation[1:4, 1:4], digits = digits) :
> subscript out of bounds
>
>
>
> Is this kind of data not fit for the function 'gee'? Because when
I tested this two functions by using the R data 'warpbreaks' they
worked perfect although some returned objects were different. I used the
following to do it:
> 1. (summary(gee(breaks ~ tension, id=wool, data=warpbreaks,
corstr="exchangeable"))
> 2. summary(geeglm(breaks ~ tension, id=wool, data=warpbreaks,
corstr="exchangeable"))).
>
> The first one is from the example of ?gee file.
>
> I will attach the part of my programming as .R file. You can excute in R
software. Thanks a lot.
>
>
> I appreciate your help.
>
>
> Best regards.
>
> Sincerely,
> Cynthia Wu
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle