thr3ads.net - R help - [R] Analysis of Data with Observation Weights Revisited [Nov 2002]

If this information is useful, please help other people find it:
Share via:

Bliese, Paul D MAJ WRAIR-Wash DC

2002-Nov-21 19:08 UTC

[R] Analysis of Data with Observation Weights Revisited

In lm and glm, the weights command should only be used when the variances of
the observations are inversely proportional to the weights.

Recently, however, a question came up regarding how one could estimate lm
and glm models with weights where the weights refer to number of
observations (counts).

Because lm and glm do not handle this case, SE values will be wrong if one
uses the weight command with counts.  Notice that the SE values in the
second model are larger than those in the first model, though paramater
estimates are identical.
> x <- rep(c(1,2), c(6,4))
> y <- rep(c(1,2,3,4),c(2,3,3,2))
> cbind(x,y)      x y
 [1,] 1 1
 [2,] 1 1
 [3,] 1 2
 [4,] 1 2
 [5,] 1 2
 [6,] 1 3
 [7,] 2 3
 [8,] 2 3
 [9,] 2 4
[10,] 2 4> summary(m <- lm(y~x))$coef             Estimate Std. Error   t value    Pr(>|t|)
(Intercept) 0.1666667  0.6627489 0.2514778 0.807784044
x           1.6666667  0.4468252 3.7300192 0.005787586> 
> x1 <- rep(c(1,2), c(3,2))
> y1 <- rep(c(1,2,3,4), c(1,1,2,1))
> w <- c(2,3,1,2,2)
> cbind(x1,y1,w)     x1 y1 w
[1,]  1  1 2
[2,]  1  2 3
[3,]  1  3 1
[4,]  2  3 2
[5,]  2  4 2> 
> summary(m1 <- lm(y1~x1, weights=w))$coef             Estimate Std. Error   t value  Pr(>|t|)
(Intercept) 0.1666667  1.0822644 0.1539981 0.8873876
x1          1.6666667  0.7296625 2.2841610 0.1065266


The thread left off with Thomas Lumley recommending that one use gee to
estimate the SE values while sticking with the parameter estimates from the
weighted lm or glm model.  Specifically something like,

mod<-gee(y1~x1,id=1:5, weights=w) #were each observation has its own id.

The problem with this solution is that gee does NOT allow weights (probably
something that would be fairly simple to modify); thus, the gee option
doesn't really work.

For those who are interested, one work around is to use the Thomas Lumley's
infjack.glm function in the glmerrs.R file at
http://faculty.washington.edu/tlumley/glmerrs.R
and the code is all in the WEAVE package at
http://faculty.washington.edu/tlumley/weave.html

In the example, notice that the SE estimates are much closer to the SE
values in the original model without weights.
> model<-glm(y1~x1,weights=w)
> varmat<-infjack.glm(model, groups=1:5)
> se<-sqrt(diag(varmat))
> se(Intercept)          x1 
  0.7827224   0.4969040 


MAJ Paul Bliese, Ph.D.
Walter Reed Army Institute of Research
Phone: (301) 319-9873
Fax: (301) 319-9484
paul.bliese at na.amedd.army.mil


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

R help - Nov 2002 - Analysis of Data with Observation Weights Revisited

[R] Analysis of Data with Observation Weights Revisited