Hi to all,

I think this is more of a general question about GLMs.

In all my prior GLMs the result was better when I omitted the non-significant factors, but this is the first time that the result is worse than before. What could be the reason for that?

    glm(data1~data2+data3+data4+data5+data6,family="gaussian")

The result:

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  3.3670852  0.8978306   3.750 0.000445 ***
    data2        0.0002623  0.0001168   2.245 0.029024 *
    data3       -0.9742336  0.5032712  -1.936 0.058337 .
    data4        0.0628245  0.1503066   0.418 0.677686
    data5       -0.0438871  0.0740210  -0.593 0.555818
    data6$      -0.0012216  0.0187702  -0.065 0.948357

If I test only data2 (or with lm(), of course):

    glm(data1~data2,family="gaussian")

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 2.473e+00  2.787e-01   8.876 2.86e-12 ***
    data2       7.289e-05  7.485e-05   0.974    0.334

Kind regards
Knut
Knut Krueger <rh at krueger-family.de> wrote:

> I think this is more of a general question about GLMs.
>
> In all my prior GLMs the result was better when I omitted the
> non-significant factors, but this is the first time that the result
> is worse than before. What could be the reason for that?
>
> glm(data1~data2+data3+data4+data5+data6,family="gaussian")
> The result:
>
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)
> (Intercept)  3.3670852  0.8978306   3.750 0.000445 ***
> data2        0.0002623  0.0001168   2.245 0.029024 *
> data3       -0.9742336  0.5032712  -1.936 0.058337 .
> data4        0.0628245  0.1503066   0.418 0.677686
> data5       -0.0438871  0.0740210  -0.593 0.555818
> data6$      -0.0012216  0.0187702  -0.065 0.948357
>
> If I test only data2 (or with lm(), of course):
> glm(data1~data2,family="gaussian")
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 2.473e+00  2.787e-01   8.876 2.86e-12 ***
> data2       7.289e-05  7.485e-05   0.974    0.334

What do you mean by "better"? Do you mean that data2 was significant in one model and not in the other? How is this "better"?

The two models ask different questions, so they get different answers. The first, more complex model asks (regarding data2) what its relationship to data1 is, controlling for the other variables. The second model asks what that relationship is without controlling for anything.

Hope this helps

Peter

Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
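
The difference between the controlled and the uncontrolled question can be reproduced with simulated data. The following is only a minimal sketch: the variables x2, x3 and y, the sample size and the effect sizes are all invented for illustration and have nothing to do with Knut's data. It assumes a predictor x2 whose effect on y is largely masked by a correlated predictor x3 acting in the opposite direction.

```r
# Simulated data (hypothetical, not the data from this thread)
set.seed(1)
n  <- 2000
x2 <- rnorm(n)
x3 <- 0.8 * x2 + rnorm(n, sd = 0.6)            # x3 is correlated with x2
y  <- 1 + 0.5 * x2 - 0.6 * x3 + rnorm(n)       # x3 works against x2

# Model 1: effect of x2 on y, controlling for x3
full <- glm(y ~ x2 + x3, family = gaussian)

# Model 2: effect of x2 on y, not controlling for anything
reduced <- glm(y ~ x2, family = gaussian)

# Compare the estimates, standard errors and p-values for x2
print(coef(summary(full)))
print(coef(summary(reduced)))
```

In this setup the x2 estimate in the full model should be close to the simulated value of 0.5, while in the reduced model it should be close to zero, because x2 then also carries the opposing effect of the omitted x3. Neither model is "wrong"; they answer different questions.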
Peter Flom wrote:

> What do you mean by "better"?

Dear Peter,

Thank you for your kind response as well. You are right; we are in constant debate about whether it makes sense to remove variables (whether significant or not) from a complete dataset which in itself has a certain meaning and may not hold for other studies with differing variables.

However, in biology it is common practice to remove non-significant factors (and sometimes also variables) from data sets (the so-called forward and backward elimination process), usually when they are consistently non-significant at all the particular positions in the factor list. Some authors suggest removing only data which may have similar meanings and may therefore be understood as pseudoreplications.

Best regards
Knut
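
For reference, backward elimination as described above might look like the following in R. This is only a sketch on simulated data: the data frame d and the variables x2, x3, x4 and y are hypothetical. Note also that step() selects by AIC rather than by p-values, so it is a related procedure and not necessarily the exact one used in the biological literature; drop1() with an F test shows the per-term tests that a p-value-based elimination would look at.

```r
# Simulated data (hypothetical): x2 and x3 affect y, x4 is pure noise
set.seed(42)
n <- 500
d <- data.frame(x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- 2 + 1.0 * d$x2 - 1.0 * d$x3 + rnorm(n)

# Start from the full model
full <- glm(y ~ x2 + x3 + x4, family = gaussian, data = d)

# F test for dropping each single term from the full model
print(drop1(full, test = "F"))

# AIC-based backward elimination; trace = 0 suppresses the step log
reduced <- step(full, direction = "backward", trace = 0)

# Terms that survived the elimination
kept <- attr(terms(reduced), "term.labels")
print(kept)
```

Whether the noise variable x4 is actually removed depends on the simulated sample; an AIC-based search keeps a pure noise term in a noticeable fraction of runs, which is one reason such procedures are debated.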