thr3ads.net - R help - [R] GLM Question [Dec 2009]

Knut Krueger

2009-Dec-03 18:08 UTC

[R] GLM Question

Hi to all

I think this is more of a general question about GLMs.

In all my earlier GLMs the result was better when I omitted the
non-significant factors, but this is the first time that the result is worse
than before. What could be the reason for that?

glm(data1~data2+data3+data4+data5+data6,family="gaussian")
The result:

Coefficients:
                 Estimate  Std. Error t value Pr(>|t|)
(Intercept)    3.3670852  0.8978306   3.750 0.000445 ***
data2          0.0002623  0.0001168   2.245 0.029024 *
data3         -0.9742336  0.5032712  -1.936 0.058337 .
data4          0.0628245  0.1503066   0.418 0.677686
data5         -0.0438871  0.0740210  -0.593 0.555818
data6$        -0.0012216  0.0187702  -0.065 0.948357



If I test only data2 (or use lm(), of course):
glm(data1~data2,family="gaussian")

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.473e+00  2.787e-01   8.876 2.86e-12 ***
data2         7.289e-05  7.485e-05   0.974    0.334



Kind regards Knut

Peter Flom

2009-Dec-03 18:27 UTC

[R] GLM Question

Knut Krueger <rh at krueger-family.de> wrote:
>I think this is more of a general question about GLMs.
>
>In all my earlier GLMs the result was better when I omitted the
>non-significant factors, but this is the first time that the result is worse
>than before. What could be the reason for that?
>
>glm(data1~data2+data3+data4+data5+data6,family="gaussian")
>The result:
>
>Coefficients:
>                 Estimate  Std. Error t value Pr(>|t|)
>(Intercept)    3.3670852  0.8978306   3.750 0.000445 ***
>data2          0.0002623  0.0001168   2.245 0.029024 *
>data3         -0.9742336  0.5032712  -1.936 0.058337 .
>data4          0.0628245  0.1503066   0.418 0.677686
>data5         -0.0438871  0.0740210  -0.593 0.555818
>data6$        -0.0012216  0.0187702  -0.065 0.948357
>
>
>
>If I test only data2 (or use lm(), of course):
>glm(data1~data2,family="gaussian")
>
>Coefficients:
>                Estimate Std. Error t value Pr(>|t|)
>(Intercept)   2.473e+00  2.787e-01   8.876 2.86e-12 ***
>data2         7.289e-05  7.485e-05   0.974    0.334
>
What do you mean by "better"?
Do you mean that data2 was significant in one model and not in the other?  How
is this "better"?

The two models ask different questions, so they get different answers.

The first, more complex model asks (re data2) what its relationship to data1
is, controlling for the other variables.  The second model asks what that
relationship is without controlling for anything.
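
The difference between the controlled and the uncontrolled question can be reproduced with simulated data. The sketch below is only an illustration: the variable names mirror the thread, but the sample size, the coefficients and the correlation between data2 and data3 are all invented assumptions, not a reconstruction of the original data set. Here data2 and data3 are strongly correlated and act on data1 in opposite directions, so the effect of data2 is visible once data3 is in the model and largely hidden when it is left out.

```r
# Simulated data only: every number below is an assumption made for illustration.
set.seed(42)
n <- 200

data3 <- rnorm(n)
# data2 is strongly correlated with data3
data2 <- data3 + rnorm(n, sd = 0.5)
# data1 rises with data2 and falls with data3; the -1.25 is chosen so that
# the two effects roughly cancel when data3 is ignored
data1 <- 2 + 1.0 * data2 - 1.25 * data3 + rnorm(n)

# Controlled question: effect of data2 with data3 held fixed
full <- glm(data1 ~ data2 + data3, family = gaussian)

# Uncontrolled question: overall association between data1 and data2
reduced <- glm(data1 ~ data2, family = gaussian)

# Coefficient tables (estimate, standard error, t value, p value)
print(coef(summary(full)))
print(coef(summary(reduced)))
```

Comparing the data2 rows of the two tables shows how much the estimate and its t value can change depending on which other variables are in the model.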

Hope this helps

Peter

Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter:   @peterflom

Knut Krueger

2009-Dec-04 10:39 UTC

[R] GLM Question

Peter Flom wrote:
> What do you mean by "better"?

Dear Peter,

Thank you for your kind response as well. You are right, we are in
constant debate about whether it makes sense to remove variables (no matter
whether significant or not) from a complete dataset which in itself has a
certain meaning and may not hold for other studies with differing
variables.

However, in biology it is common practice to remove non-significant
factors (and sometimes also variables) from data sets (the so-called forward
and backward elimination process), usually when they are consistently
non-significant at every position in the factor list.
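
For readers who have not seen such an elimination procedure, here is a minimal sketch in R on simulated data. The data frame d, the variables x1 to x3 and all coefficients are hypothetical. Note also that step() selects terms by AIC rather than by p-values, so it is a related technique and not necessarily the exact procedure used in any particular study; drop1() is shown alongside it because it reports a test for deleting each term on its own.

```r
# Simulated data only: names and numbers are invented for illustration.
set.seed(7)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# Only x1 has a real effect on y in this simulation
d$y <- 1 + 2 * d$x1 + rnorm(n)

# Start from the model containing all candidate terms
full <- glm(y ~ x1 + x2 + x3, family = gaussian, data = d)

# Effect of deleting each term individually, with an F test per term
print(drop1(full, test = "F"))

# Backward elimination by AIC (not by significance thresholds)
selected <- step(full, direction = "backward", trace = 0)
print(formula(selected))
```

Because the coefficient of a retained variable can change when others are dropped, it is worth comparing the summaries of the full and the selected model rather than looking only at the final one.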

Some authors suggest removing only data which may have similar meanings
and may therefore be understood as pseudoreplications.

Best regards Knut
