thr3ads.net - R help - [R] What PRECISELY is the dfbetas() or lm.influence()$coef ? [Jun 2003]


Katki, Hormuzd (NIH/NCI)

2003-Jun-12 17:24 UTC

[R] What PRECISELY is the dfbetas() or lm.influence()$coef ?

Hello.  I want to get the proper influence function for the glm
coefficients in R.  This is supposed to be inv(information)*(y-yhat)*x.  So
I am wondering what is the exact mathematical formula for the output that
the functions:

dfbeta()  OR   lm.influence()$coefficients 

return for a glm model.  I am confused because:

1. Their columns don't sum to zero as influences should.  
2. They return different "influences", so the 2 functions are doing
something different.
3. I think they divide each element by the standard error of the
corresponding coefficient, but that's not enough to resolve the
discrepancies.

The documentation doesn't provide any details.  Any help would be greatly
appreciated.
Thank you,
Hormuzd Katki

Biostatistics Branch, Division of Cancer Epidemiology and Genetics
National Cancer Institute
6120 Executive Blvd. Room 8044 MSC 7244
Rockville, MD 20852-4910
301-594-7818 (voice)
301-402-0081 (fax)
katkih at mail.nih.gov

John Fox

2003-Jun-12 20:58 UTC


[R] What PRECISELY is the dfbetas() or lm.influence()$coef ?

Dear Hormuzd,

At 01:24 PM 6/12/2003 -0400, Katki, Hormuzd (NIH/NCI) wrote:
>Hello.  I want to get the proper influence function for the glm
>coefficients in R.  This is supposed to be inv(information)*(y-yhat)*x.  So
>I am wondering what is the exact mathematical formula for the output that
>the functions:
>
>dfbeta()  OR   lm.influence()$coefficients
>
>return for a glm model.  I am confused because:
>
>1. Their columns don't sum to zero as influences should.

Even in a linear model, where the computation is exact, the columns do not
sum to zero if influence is defined as the change in the coefficients upon
deleting each observation in turn (i.e., as dfbeta).
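
The point about linear models can be checked numerically. The following is a minimal sketch in R on simulated data (the data frame `d`, the model, and all variable names are made up for illustration, not taken from the thread). It assumes the usual closed form for the deletion change, (X'X)^{-1} x_i e_i / (1 - h_i), and compares magnitudes only, so the check does not depend on which sign convention dfbeta() uses.

```r
# Simulated data; everything here is illustrative, not from the thread.
set.seed(1)
n <- 30
d <- data.frame(x = rnorm(n))
d$y <- 1 + 2 * d$x + rnorm(n)
fit <- lm(y ~ x, data = d)

# Brute force: refit without observation i and record the coefficient change.
loo <- t(sapply(seq_len(n), function(i) {
  coef(fit) - coef(lm(y ~ x, data = d[-i, ]))
}))

# Closed form for a linear model: row i is (X'X)^{-1} x_i * e_i / (1 - h_i).
X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)
A <- X %*% solve(crossprod(X))      # row i is x_i' (X'X)^{-1}
closed <- A * (e / (1 - h))

# Compare magnitudes so the result does not depend on the sign convention.
max(abs(abs(dfbeta(fit)) - abs(loo)))
max(abs(abs(dfbeta(fit)) - abs(closed)))

# Without the 1 / (1 - h_i) factor the columns sum to zero, because X'e = 0;
# with it, they generally do not.
colSums(A * e)
colSums(dfbeta(fit))
```

The unscaled quantity `A * e` is the linear-model analogue of inv(information)*(y-yhat)*x, which is why its columns sum to zero while the deletion-based dfbeta columns need not.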

>2. They return different "influences", so the 2 functions are doing
>something different.

That's odd. I believe that dfbeta() for a GLM simply uses influence.glm, 
which has the same $coefficients component as lm.influence. As such, for a 
GLM, both are based on the last step of the IRLS fit -- i.e., a 
linearization of the model.
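
A short sketch in R of what this implies in practice, again on simulated data with hypothetical names (`d`, `approx_change`, `exact_change`). It checks that dfbeta() and lm.influence()$coefficients agree for a glm fit, and then compares those values with the changes obtained by actually refitting the model without each observation. Since the built-in values come from a linearization, the two are expected to be close but not identical; how close will depend on the data and on the R version.

```r
# Simulated Poisson data; all names are illustrative assumptions.
set.seed(2)
n <- 40
d <- data.frame(x = rnorm(n))
d$y <- rpois(n, exp(0.5 + 0.3 * d$x))
fit <- glm(y ~ x, family = poisson, data = d)

# Values reported by R, based on the final weighted least-squares step.
approx_change <- dfbeta(fit)

# Both routes should report the same numbers for a glm.
all.equal(approx_change, lm.influence(fit)$coefficients,
          check.attributes = FALSE)

# Exact deletion: refit the glm n times, leaving out one observation each time.
exact_change <- t(sapply(seq_len(n), function(i) {
  coef(fit) - coef(glm(y ~ x, family = poisson, data = d[-i, ]))
}))

# Size of the approximation error (magnitudes only, to sidestep sign conventions).
range(abs(approx_change) - abs(exact_change))
```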

>3. I think they divide each element by the standard error of the
>corresponding coefficient, but that's not enough to resolve any
>discrepancies

Perhaps you meant that dfbetas() [not dfbeta()] returns different values 
from lm.influence()$coef (as in your subject line)? dfbetas standardizes 
the coefficient changes by coefficient standard errors, using a deleted 
estimate of the dispersion parameter.
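
The standardization can be reproduced by hand. This is a minimal sketch in R for the linear-model case only, on simulated data with hypothetical names; it assumes that the deleted scale estimate is the `sigma` component returned by lm.influence() and that the unscaled coefficient variances are the diagonal of (X'X)^{-1}.

```r
# Simulated data; names are illustrative assumptions.
set.seed(3)
d <- data.frame(x = rnorm(25))
d$y <- 1 + 2 * d$x + rnorm(25)
fit <- lm(y ~ x, data = d)

infl <- lm.influence(fit)
xtx_inv <- summary(fit)$cov.unscaled   # (X'X)^{-1}

# Row i is divided by s_(i), column j by sqrt of the j-th diagonal of (X'X)^{-1},
# i.e. by the standard error of coefficient j computed with the deleted scale.
by_hand <- dfbeta(fit) / outer(infl$sigma, sqrt(diag(xtx_inv)))

all.equal(by_hand, dfbetas(fit), check.attributes = FALSE)
```

Because each row uses its own deleted scale estimate s_(i), dfbetas() is not simply dfbeta() divided by the reported standard errors from summary(fit).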

>The documentation doesn't provide any details.  Any help would be greatly
>appreciated.

I hope that this helps,
  John


-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
