Björn Stollenwerk
2005-Nov-28 11:18 UTC
[R] glm: quasi models with logit link function and binary data
# Hello R Users,
#
# I would like to fit a glm model with the quasi family and a
# logit link function, but this does not seem to work
# with binary data.
#
# Please don't suggest using the quasibinomial family. It
# works, but when applied to the real data, its
# variance function does not seem to be
# appropriate.
#
# I can't see from the
# theory why this does not work.
# Is this a bug, or are there theoretical reasons?
# One problem might be that logit(0)=-Inf and logit(1)=Inf.
# But I can't see how this disturbs the calculation of the quasi-likelihood.
#
# Thank you very much,
# best,
#
# Björn

set.seed(0)
y <- sample(c(0,1), size=100, replace=TRUE)

# the following models work:
glm(y ~ 1)
glm(y ~ 1, family=binomial(link=logit))
glm(y ~ 1, family=quasibinomial(link=logit))

# the next model doesn't work:
glm(y ~ 1, family=quasi(link=logit))
Sundar Dorai-Raj
2005-Nov-28 12:20 UTC
[R] glm: quasi models with logit link function and binary data
Björn Stollenwerk wrote:
> # Hello R Users,
> #
> # I would like to fit a glm model with the quasi family and a
> # logit link function, but this does not seem to work
> # with binary data.
> #
> # Please don't suggest using the quasibinomial family. It
> # works, but when applied to the real data, its
> # variance function does not seem to be
> # appropriate.
> #
> # I can't see from the
> # theory why this does not work.
> # Is this a bug, or are there theoretical reasons?
> # One problem might be that logit(0)=-Inf and logit(1)=Inf.
> # But I can't see how this disturbs the calculation of the quasi-likelihood.
> #
> # Thank you very much,
> # best,
> #
> # Björn
>
> set.seed(0)
> y <- sample(c(0,1), size=100, replace=TRUE)
>
> # the following models work:
> glm(y ~ 1)
> glm(y ~ 1, family=binomial(link=logit))
> glm(y ~ 1, family=quasibinomial(link=logit))
>
> # the next model doesn't work:
> glm(y ~ 1, family=quasi(link=logit))
>

This is an issue with the starting values provided to glm. Take a look at the difference between:

quasibinomial()$initialize

and

quasi("logit")$initialize

and at where this is used in glm.fit, and you should see why the error occurs. To avoid this, you can supply your own starting values from a call to glm:

mustart <- predict(glm(y ~ 1, binomial), type = "response")
glm(y ~ 1, quasi("logit"), mustart = mustart)

or just use:

glm(y ~ 1, quasi("logit"), mustart = rep(0.5, length(y)))

HTH,

--sundar
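A minimal sketch of the point about starting values, not taken from the thread. It prints the two initialize expressions and shows what the logit link does to starting means of exactly 0 or 1. The printed expressions depend on the R version, so they may not match what was current in 2005. The toy vector y and the name eta_start are made up for illustration.

```r
# Compare how the two families choose starting values for mu.
# What is printed here varies between R versions.
print(quasibinomial()$initialize)
print(quasi(link = "logit")$initialize)

# Assumption: the constant-variance quasi family starts from mustart <- y.
# With binary y, the initial linear predictor is then infinite.
y <- c(0, 1, 1, 0)              # toy binary response, illustration only
eta_start <- qlogis(y)          # logit(0) = -Inf, logit(1) = Inf
print(eta_start)

# Starting means strictly inside (0, 1) keep the linear predictor finite,
# which is what mustart = rep(0.5, length(y)) achieves.
mustart <- rep(0.5, length(y))
print(qlogis(mustart))
```

Any vector of starting means strictly between 0 and 1 should serve the same purpose; 0.5 is simply the value used in the reply above.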
Hong Ooi
2005-Nov-29 00:23 UTC
[R] glm: quasi models with logit link function and binary data
This would be because quasi(link=logit) doesn't actually fit a logistic regression. The default variance function for quasi is constant, not the binomial variance. To emulate a logistic regression, use var="mu(1-mu)" in addition to link=logit.

> y <- runif(100)
> glm(y ~ 1, family=binomial)

Call:  glm(formula = y ~ 1, family = binomial)

Coefficients:
(Intercept)
   -0.01208

Degrees of Freedom: 99 Total (i.e. Null);  99 Residual
Null Deviance:     37.15
Residual Deviance: 37.15        AIC: 140.6
Warning message:
non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)

> glm(y ~ 1, family=quasi(var="mu(1-mu)", link=logit))

Call:  glm(formula = y ~ 1, family = quasi(var = "mu(1-mu)", link = logit))

Coefficients:
(Intercept)
   -0.01208

Degrees of Freedom: 99 Total (i.e. Null);  99 Residual
Null Deviance:     37.15
Residual Deviance: 37.15        AIC: NA

--
Hong Ooi
Senior Research Analyst, IAG Limited
388 George St, Sydney NSW 2000
(02) 9292 1566

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
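A short sketch of the comparison described in this message, under a few assumptions: the seed is arbitrary, the names fit_binom and fit_quasi are made up, and the variance argument is spelled out in full rather than abbreviated to var. It is meant only to show that the two fits share the same point estimate on continuous responses in (0, 1).

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
y <- runif(100)                  # continuous responses strictly inside (0, 1)

# binomial() warns about non-integer successes for such data; the warning
# is expected here and is suppressed to keep the output short.
fit_binom <- suppressWarnings(glm(y ~ 1, family = binomial))

# Quasi family with logit link and binomial-type variance V(mu) = mu(1 - mu).
fit_quasi <- glm(y ~ 1, family = quasi(link = "logit", variance = "mu(1-mu)"))

# Both fits solve the same estimating equation, so the intercepts should
# agree up to the convergence tolerance.
print(coef(fit_binom))
print(coef(fit_quasi))

# For an intercept-only model the fitted mean is the sample mean of y.
print(c(fitted = unname(plogis(coef(fit_quasi))), sample_mean = mean(y)))
```

The quasi fit reports no AIC because a quasi-likelihood does not define a full likelihood, which matches the "AIC: NA" in the output above.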
Hong Ooi
2005-Nov-29 00:40 UTC
[R] glm: quasi models with logit link function and binary data
Hm, I should have checked what would happen with binary data and not just continuous. Using glm with quasi(var="mu(1-mu)", link=logit) indeed fails with NAs/NaNs when y is binary.

--
Hong Ooi
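One way to combine the two suggestions in this thread, sketched under assumptions: it uses the binomial-type variance together with explicit starting means, so the fit does not depend on how the family's own initialize expression treats 0/1 responses. Whether the call still fails without mustart depends on the R version; later versions may bound the starting values internally. The names fit_qb and fit_q are made up for illustration.

```r
set.seed(0)
y <- sample(c(0, 1), size = 100, replace = TRUE)

# Reference fit with the built-in quasibinomial family.
fit_qb <- glm(y ~ 1, family = quasibinomial(link = "logit"))

# Quasi family with V(mu) = mu(1 - mu), started from means strictly
# inside (0, 1) so that logit(mustart) is finite.
fit_q <- glm(y ~ 1,
             family  = quasi(link = "logit", variance = "mu(1-mu)"),
             mustart = rep(0.5, length(y)))

# The intercepts should agree, and both correspond to logit(mean(y)).
print(coef(fit_qb))
print(coef(fit_q))
print(qlogis(mean(y)))
```

A different variance function can be passed through the same variance argument if mu(1 - mu) is not appropriate for the real data, with the same caveat about starting values.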