Marco Pomati
2013-May-29 15:29 UTC
[R] Goodness-of-fit tests for Complex Survey Logistic Regression
a) I've recently come across the global goodness-of-fit tests for complex survey logistic regression. Have they been implemented in R?

Paper: http://med.stanford.edu/medicine/ArcherLemeshowHosmer.pdf
Implementation in Stata: http://www.isr.umich.edu/src/smp/asda/Additional%20Analysis%20Example%20Demonstrating%20Use%20of%20Stata%20svy%20logistic%20and%20estat%20gof%20commands.pdf

I'm asking because I've fitted a logistic model that has been used many times before (on random samples) to a stratified clustered sample. Global fit indices were generally reported, so unfortunately I need to carry out a couple of comparable tests...

b) Besides the use of regTermTest (survey package) on independent variables, do you think that emulating the chi-square goodness-of-fit test on the final model in the following way makes sense?

    library(survey)
    data(api)
    dclus2 <- svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
    a <- svyglm(sch.wide~1, design=dclus2, family=quasibinomial())  # Null model
    c <- svyglm(sch.wide~ell*meals, design=dclus2, family=quasibinomial())
    anova(a, c)

Many thanks for your help
Marco
Thomas Lumley
2013-May-29 21:50 UTC
[R] Goodness-of-fit tests for Complex Survey Logistic Regression
On Thu, May 30, 2013 at 3:29 AM, Marco Pomati <ptxmp@bristol.ac.uk> wrote:

> a) I've recently come across the global Goodness-of-fit tests for complex
> survey logistic regression. Has it been implemented in R?

It's quite hard to say definitively that something hasn't been implemented in R, but I haven't implemented it, and I'd be a bit surprised if someone else had without asking me about it.

It looks from the paper as though the 'mean residual test' could be implemented fairly easily. If I have read it correctly, then

    r <- residuals(model, type="response")
    f <- fitted(model)
    g <- cut(f, c(-Inf, quantile(f, (1:9)/10), Inf))
    # now create a new design object with r and g added as variables
    decilemodel <- svyglm(r~g, design=newdesign)
    regTermTest(decilemodel, ~g)

is the F-adjusted mean residual test from that paper.

> I'm asking because I've fitted a logistic model that's been used many times
> before (on random samples) to a stratified clustered one. Global fit
> indices were generally reported so unfortunately I need to carry out a
> couple of comparable tests...
> b) Besides the use of regTermTest (survey package) on independent
> variables, do you think that emulating the Chi-Square Goodness Of Fit Test
> on the final model in the following way makes sense?
>
> library(survey)
> data(api)
> dclus2<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
> a<-svyglm(sch.wide~1, design=dclus2,family=quasibinomial()) #Null model
> c<-svyglm(sch.wide~ell*meals, design=dclus2,family=quasibinomial())
> anova(a,c)

That's not really a goodness-of-fit test; it's a test of whether the model is better than nothing. It should agree with regTermTest(method="LRT"), because that's how it is implemented when the models are 'syntactically' nested, i.e., when you can tell just from the model formula that they are nested.
(Examples of models that are nested but not syntactically nested would be a linear term and a spline.)

I'm not generally a fan of global goodness-of-fit tests, but this is straightforward enough that I might add it to the survey package (though that's not going to happen for a month or so).

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
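
The mean residual test outlined in the reply can be sketched end to end on the api data from the original question. This is only a sketch of that outline, not a validated implementation of the test in the paper. It assumes the model variables have no missing values (so the residuals line up row for row with apiclus2) and that the fitted probabilities have distinct deciles. The names gof_resid, gof_decile, gofdata and newdesign are made up for illustration.

```r
library(survey)
data(api)

# Two-stage cluster design and final model from the original question
dclus2 <- svydesign(id = ~dnum + snum, weights = ~pw, data = apiclus2)
model  <- svyglm(sch.wide ~ ell * meals, design = dclus2,
                 family = quasibinomial())

# Response residuals (observed minus fitted probability) and fitted values
r <- as.numeric(residuals(model, type = "response"))
f <- as.numeric(fitted(model))

# Assumption: no rows were dropped for missingness, so lengths match the data
stopifnot(length(r) == nrow(apiclus2))

# Group observations into deciles of the fitted probability
g <- cut(f, c(-Inf, quantile(f, (1:9) / 10), Inf))

# New design object with the residuals and decile groups added as variables
# (gof_resid and gof_decile are illustrative names, not part of the package)
gofdata   <- transform(apiclus2, gof_resid = r, gof_decile = g)
newdesign <- svydesign(id = ~dnum + snum, weights = ~pw, data = gofdata)

# Regress residuals on decile group; the Wald F test of the group term
# checks whether the mean residual differs across deciles
decilemodel <- svyglm(gof_resid ~ gof_decile, design = newdesign)
gof <- regTermTest(decilemodel, ~gof_decile)
print(gof)
```

A small p-value would suggest that the mean residual varies across deciles of predicted risk, i.e. evidence of lack of fit. If the fitted values had many ties, quantile() could return duplicate breaks and cut() would fail, so the grouping step would need adjusting in that case.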