Hello dear r-gurus!

I have a question about the logit model. I think I have misunderstood something, and I'm trying to find a bug in my code or, even better, in my head. Any help is appreciated.

The question in short: why don't I get the same coefficients from the logit regression when using a link function as when using an explicit transformation of the dependent variable? Some details below.

I'm not very familiar with the concept. As far as I have understood, it's all about transforming the dependent variable when one has frequency data (grouped data, instead of raw binaries):

ln(^p(i)/(1-^p(i))) = c + b_1(X_1) +...+ b_k(X_k) + e(i).

where ^p(i) is the (estimated) frequency of the incident (happened/all = n(i)/N), i is the index of the observation, c and b_. are the coefficients (objects of the estimation), X_. are the explanatory variables and e is the residual. So a linear regression.

And some testing:

> y <- runif(100)
> X <- rnorm(100)
> glm(y~ X, family=binomial(link=logit))

Call:  glm(formula = y ~ X, family = binomial(link = logit))

Coefficients:
(Intercept)            X
   -0.00956      0.10760

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      43.83
Residual Deviance: 43.49        AIC: 142.3
Warning message:
non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)

### OR

> glm(cbind(y, 1-y)~ X, family=binomial(link=logit))   ### ?glm

Call:  glm(formula = cbind(y, 1 - y) ~ X, family = binomial(link = logit))

Coefficients:
(Intercept)            X
   -0.00956      0.10760

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      43.83
Residual Deviance: 43.49        AIC: 142.3
Warning message:
non-integer counts in a binomial glm! in: eval(expr, envir, enclos)

### BUT

> glm(y.logit.transformation(y)~ X)

Call:  glm(formula = y.logit.transformation(y) ~ X)

Coefficients:
(Intercept)            X
     0.1233       0.1023

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      465.6
Residual Deviance: 464.4        AIC: 443.3

### OR

> lm(y.logit.transformation(y)~ X)

Call:
lm(formula = y.logit.transformation(y) ~ X)

Coefficients:
(Intercept)            X
     0.1233       0.1023

It's close (AIC and residual deviance differ because of the transformation), but I think the relationship should be exact? Or is it just numerical inaccuracy? Or is there some reason hidden from me? Is it legitimate to use frequency regression when using R for the logit model (alternatives?).

I would also like to know what exactly this warning message means:
non-integer counts in a binomial glm! in: eval(expr, envir, enclos)

For the transformation of the dependent variable:

"y.logit.transformation" <- function(y)
{
    y.trans <- log(y/(1-y))
    y.trans
}

version

platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    5.0
year     2002
month    04
day      29
language R

OS is Windows2000.

Thank you for any help.

deadlocked,

Jussi Mäkinen
Analyst
State Treasury, Finland
phone: +358-9-7725 616
mobile: +358-50-5958 710
www.statetreasury.fi
mailto:jussi.makinen at valtiokonttori.fi

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
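An editorial aside on the question above, written as a sketch rather than a definitive answer: the two fits solve different estimating equations, so their coefficients need not agree. The notation below reuses c, b and X from the post for a single explanatory variable; the symbols \eta_i and \mu_i are introduced here only for illustration.

```latex
% Linear predictor and inverse logit
\eta_i = c + b\,X_i, \qquad \mu_i = \frac{1}{1 + e^{-\eta_i}}

% glm(y ~ X, family = binomial(link = logit)) with unit weights maximizes
\ell(c, b) = \sum_i \bigl[\, y_i \log \mu_i + (1 - y_i)\log(1 - \mu_i) \,\bigr]
% whose score equations are
\sum_i (y_i - \mu_i) = 0, \qquad \sum_i (y_i - \mu_i)\,X_i = 0

% lm(log(y/(1-y)) ~ X) minimizes
S(c, b) = \sum_i \Bigl[\, \log\frac{y_i}{1 - y_i} - \eta_i \Bigr]^2
% whose normal equations are
\sum_i \Bigl(\log\frac{y_i}{1 - y_i} - \eta_i\Bigr) = 0, \qquad
\sum_i \Bigl(\log\frac{y_i}{1 - y_i} - \eta_i\Bigr) X_i = 0
```

The first pair matches residuals on the probability scale and the second on the logit scale. Because the logit is nonlinear, the mean of logit(y) is in general not the logit of the mean of y, so the two solutions are typically close but not identical, and the gap is not a rounding effect.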
Mäkinen Jussi wrote:

> [...]
> And some testing:
>
> > y <- runif(100)

Should you use a binomial (0,1) response variable?

Best regards!
I have got a few answers pointing out that the logit model is usually for a binary response (dependent) variable. This was part of my (obviously badly written) question: is it possible to regress frequency data (i.e. a non-binary response) with glm(y~x, family=binomial(link=logit))?

glm-help says:

<snip>...For `binomial' models the response can also be specified as a `factor' (when the first level denotes failure and all others success) or as a two-column matrix with the columns giving the numbers of successes and failures....<snip>

which led me to think that it can handle frequency data (grouped data) as well. But shouldn't that give the same result as transforming the response and running regular OLS?

Jussi
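A minimal sketch of how grouped (frequency) data could be passed to glm, assuming simulated data: the names n_groups, trials, successes, failures and the true coefficients are all hypothetical and not from the thread. It fits the counts as a two-column response as the quoted help text describes, fits the same model from proportions plus weights, and compares both with OLS on an empirical logit (the 0.5 added to each count is an assumed continuity correction to avoid log(0)). Supplying integer counts this way also avoids the "non-integer counts" warning, which is raised when the implied numbers of successes are not whole numbers.

```r
# Simulated grouped data (all names and values here are illustrative assumptions)
set.seed(1)
n_groups  <- 50
x         <- rnorm(n_groups)
trials    <- rep(200, n_groups)                  # trials per group
p_true    <- plogis(-0.5 + 0.8 * x)              # assumed true success probabilities
successes <- rbinom(n_groups, size = trials, prob = p_true)
failures  <- trials - successes

# (a) Binomial GLM on integer counts: two-column response (successes, failures)
fit_counts <- glm(cbind(successes, failures) ~ x,
                  family = binomial(link = "logit"))

# (b) The same model written as proportions with the trial counts as weights
prop     <- successes / trials
fit_prop <- glm(prop ~ x, family = binomial(link = "logit"), weights = trials)

# (c) OLS on the empirical logit; 0.5 is added to each count to avoid log(0)
emp_logit <- log((successes + 0.5) / (failures + 0.5))
fit_ols   <- lm(emp_logit ~ x)

# Compare the three sets of coefficients side by side
print(rbind(glm_counts = coef(fit_counts),
            glm_prop   = coef(fit_prop),
            ols_logit  = coef(fit_ols)))
```

Fits (a) and (b) are the same model and should agree to numerical precision. Fit (c) is a different estimator, so its coefficients would be expected to land near the others when the groups are large, without matching them exactly.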