Hi All,

I am using the "glm" function to build a logistic regression model. I noticed that glm computes many other statistics which are not required for our analysis. As our dataset is very big and we have to run logistic regression on several samples, the run time increases drastically if all those statistics are computed. Is there any way to skip that computation in glm? I am just a beginner in R and hence I am not able to modify the glm function. Can anybody give me an alternative way to fit logistic regression which computes only the estimates (coefficients) of the variables?

Waiting for your favourable response.

Regards,
Jagat
On Jun 17, 2009, at 1:45 AM, jagat at cmi.ac.in wrote:

> Can anybody give me an alternative way to fit logistic regression
> which computes only the estimates (coefficients) of the variables?

If all you need are the coefficients, you may observe greater efficiency by using glm.fit() directly instead of glm(), where you have pre-constructed the model design matrix and response vector. For example, using the 'infert' dataset:

MM <- model.matrix(~ spontaneous + induced, data = infert)

> coef(glm.fit(MM, infert$case, family = binomial()))
(Intercept) spontaneous     induced
 -1.7078601   1.1972050   0.4181294

That gives you the same output as:

> coef(glm(case ~ spontaneous + induced, data = infert, family = binomial()))
(Intercept) spontaneous     induced
 -1.7078601   1.1972050   0.4181294

In this simple example the time savings is negligible, but with much larger datasets you may observe enough savings to make it worthwhile to consider. See ?glm.fit and ?model.matrix for more information.

Note that glm.fit() does not return an object of class 'glm', which restricts the use of other functions with glm methods (e.g. summary(), anova(), predict(), ...). That may or may not matter to you, so there are tradeoffs...

I have not compared Frank's lrm() function in the Design package relative to any time savings in comparison to using glm() on large datasets, but that may also be something to look into.

HTH,

Marc Schwartz
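[Editor's note: a rough sketch of the kind of timing comparison described above. It is not from the original post; the simulated data, sizes, and variable names are arbitrary choices for illustration.]

set.seed(42)                        # arbitrary seed, for reproducibility
n  <- 1e5                           # illustrative sample size
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-1 + 0.5 * x1 + 0.25 * x2))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

## Full glm() interface: formula handling, model frame, rich fit object
system.time(fit1 <- glm(y ~ x1 + x2, data = dat, family = binomial()))

## glm.fit() on a pre-built design matrix: bare-bones IRLS fit
X <- model.matrix(~ x1 + x2, data = dat)
system.time(fit2 <- glm.fit(X, dat$y, family = binomial()))

coef(fit1)
coef(fit2)    # same coefficients, taken from the plain list glm.fit() returns

How much you save will depend on the data size and on how much of glm()'s overhead comes from building the model frame rather than from the fit itself.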
You don't say why you think that computing these other statistics is responsible for the run time.

If you just want to fit logistic regressions faster, glm.fit() is likely to be helpful.

     -thomas
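[Editor's note: one way to check where the time actually goes, before changing the fitting code, is to profile a representative fit. A minimal sketch follows; the simulated data and the output file name are stand-ins, not from the thread. Profile on your own large data instead.]

set.seed(1)
n <- 5e5                                  # illustrative size only
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))

Rprof("glm-profile.out")                  # start the sampling profiler
invisible(glm(y ~ x, family = binomial()))
Rprof(NULL)                               # stop profiling

head(summaryRprof("glm-profile.out")$by.self)   # where the time is spent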
On Wed, 17 Jun 2009, jagat at cmi.ac.in wrote:

> Can anybody give me an alternative way to fit logistic regression
> which computes only the estimates (coefficients) of the variables?
Thomas Lumley                    Assoc. Professor, Biostatistics
tlumley at u.washington.edu      University of Washington, Seattle