nzcoops
2010-Jul-15 03:47 UTC
[R] Longitudinal negative binomial regression - robust sandwich estimator standard errors
Hi All,

I have a longitudinal dataset in which each row is a 'visit' to a clinic, with numerous data fields and a count variable for the number of 'events' that occurred since the previous visit. There are ~50k rows and ~2k unique subjects, so ~25 rows/visits per subject on average; some subjects have 50, some have 3 or 4.

In Stata there is an adjustment for the fact that you have multiple rows per subject: you can produce robust standard errors using sandwich estimators to account for the correlation structure within each subject's set of visits. That option is reasonably straightforward, but I'm trying to find something similar in R.

http://www.stata.com/help.cgi?nbreg
http://www.stata.com/help.cgi?vce_option

I'll admit I'm not all that familiar with the inner workings of these functions, but I am learning about them. glm.nb gives the same coefficients as nbreg in Stata, which is reassuring, but I haven't yet found the equivalent of the adjustment that vce is doing.

I've tried the cluster function, and the Zelig package with

zelig(my model, data = mydata, model = "negbin", by = "id")

but I continually get the following error:

Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
  contrasts can be applied only to factors with 2 or more levels

I'm not actually sure that is the correct command: when I tried it with a 3-level factor instead of id, it just ran the model three times, once for each level of the factor, which is not what I'm after or what I expected from it.

Any thoughts or direction on this are appreciated.

Matt
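[Editor's note: a minimal sketch of one possible approach, not taken from this thread. It assumes the sandwich and lmtest packages, and hypothetical variable names events, x1, x2, and id in a data frame mydata. The idea is to fit the model with MASS::glm.nb and then compute cluster-robust standard errors with sandwich::vcovCL, clustering on the subject identifier, which is roughly analogous to Stata's nbreg with vce(cluster id).]

  library(MASS)      # glm.nb()
  library(sandwich)  # vcovCL() for the clustered sandwich covariance matrix
  library(lmtest)    # coeftest()/coefci() to combine coefficients with a custom vcov

  ## Fit the negative binomial GLM (point estimates should match Stata's nbreg)
  fit <- glm.nb(events ~ x1 + x2, data = mydata)

  ## Cluster-robust covariance, clustering on subject id
  ## (note: theta is treated as fixed when sandwiching a glm.nb fit)
  vc <- vcovCL(fit, cluster = ~ id)

  ## Coefficient table and confidence intervals using the cluster-robust vcov
  coeftest(fit, vcov. = vc)
  coefci(fit, vcov. = vc)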