Christofer Bogaso
2012-Mar-20 10:03 UTC
[R] Fitting loglinear model with glm() and loglm()
Dear all, I have small difficulty in comprehending the loglinear model with R. Assume, we have following data dat <- array(c(911, 44, 538, 456, 3, 2, 43, 279), c(2, 2, 2)) Now I fit a loglinear model with this and get the fitted values: library(MASS) Model_1 <- loglm(~1 + 2 + 3, dat) fitted(Model_1) I could do this same task using glm() function as well because loglinear model is just 1 kind of glm ### Create dummy variables manually Dummy_Variable_Matrix <- rbind(c(1, 1, 1), c(0, 1, 1), c(1, 0, 1), c(0, 0, 1), c(1, 1, 0), c(0, 1, 0), c(1, 0, 0), c(0, 0, 0)) ### Fit glm model_2 <- glm(as.vector(dat) ~ Dummy_Variable_Matrix[,1] + Dummy_Variable_Matrix[,2] + Dummy_Variable_Matrix[,3], poisson(link = log)); fitted(model_2) ### However................ fitted(model_2) == as.vector(fitted(Model_1)) ### do not match However it is true that the difference is very small, still I am wondering whether should I just ingore that small difference? Or I have done something fundamentally wrong? Thanks for your help!
Dear Christofer, loglm uses an iterative proportional scaling (IPS) algorithm for fitting a log-linear model to a contingency table. glm uses an iteratively reweighted least squares algorithm. The result from IPS is exact. Regards S?ren -----Oprindelig meddelelse----- Fra: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] P? vegne af Christofer Bogaso Sendt: 20. marts 2012 11:04 Til: r-help at r-project.org Emne: [R] Fitting loglinear model with glm() and loglm() Dear all, I have small difficulty in comprehending the loglinear model with R. Assume, we have following data dat <- array(c(911, 44, 538, 456, 3, 2, 43, 279), c(2, 2, 2)) Now I fit a loglinear model with this and get the fitted values: library(MASS) Model_1 <- loglm(~1 + 2 + 3, dat) fitted(Model_1) I could do this same task using glm() function as well because loglinear model is just 1 kind of glm ### Create dummy variables manually Dummy_Variable_Matrix <- rbind(c(1, 1, 1), c(0, 1, 1), c(1, 0, 1), c(0, 0, 1), c(1, 1, 0), c(0, 1, 0), c(1, 0, 0), c(0, 0, 0)) ### Fit glm model_2 <- glm(as.vector(dat) ~ Dummy_Variable_Matrix[,1] + Dummy_Variable_Matrix[,2] + Dummy_Variable_Matrix[,3], poisson(link = log)); fitted(model_2) ### However................ fitted(model_2) == as.vector(fitted(Model_1)) ### do not match However it is true that the difference is very small, still I am wondering whether should I just ingore that small difference? Or I have done something fundamentally wrong? Thanks for your help! ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Tue, 20 Mar 2012, Christofer Bogaso wrote:> Dear all, I have small difficulty in comprehending the loglinear model > with R. Assume, we have following data > > dat <- array(c(911, 44, 538, 456, 3, 2, 43, 279), c(2, 2, 2)) > > Now I fit a loglinear model with this and get the fitted values: > > library(MASS) > Model_1 <- loglm(~1 + 2 + 3, dat) > fitted(Model_1) > > I could do this same task using glm() function as well because > loglinear model is just 1 kind of glm > > ### Create dummy variables manually > Dummy_Variable_Matrix <- rbind(c(1, 1, 1), > c(0, 1, 1), > c(1, 0, 1), > c(0, 0, 1), > > c(1, 1, 0), > c(0, 1, 0), > c(1, 0, 0), > c(0, 0, 0)) > > ### Fit glm > > model_2 <- glm(as.vector(dat) ~ > Dummy_Variable_Matrix[,1] + > Dummy_Variable_Matrix[,2] + > Dummy_Variable_Matrix[,3], > poisson(link = log)); > fitted(model_2) > > ### However................ > > fitted(model_2) == as.vector(fitted(Model_1)) ### do not match > > > However it is true that the difference is very small, still I am > wondering whether should I just ingore that small difference? Or I > have done something fundamentally wrong?The fitted values are not the same (==) but equal up to some tolerance appropriate for floating point numbers (see all.equal). The reason is that different numeric algorithms are employed for maximizing the log-likelihood. loglm() internally uses loglin() which uses iterative proportional fitting. glm() internally uses glm.fit() which performs iterative weighted least squares. BTW: Setting up frequencies and factors for glm() modeling based on a table can be done more easily by coercing the "array" to a "table" and then to a "data.frame": tab <- as.table(dat) m1 <- loglm(~ 1 + 2 + 3, data = tab) dframe <- as.data.frame(tab) m2 <- glm(Freq ~ Var1 + Var2 + Var3, data = dframe, family = poisson) all.equal(as.vector(fitted(m1)), as.vector(fitted(m2))) ## TRUE Also, the LR and Pearson statistics from print(m1) can be reproduced via sum(residuals(m2, type = "deviance")^2) sum(residuals(m2, type = "pearson")^2) Hope that helps, Z> Thanks for your help! > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >