District a is the baseline, and we observe that the difference between District a and b is not significant, so we could choose to combine these 2 values. How do I write code to combine these 2 values?

> m1=glm(Claims~District+Group+Age+log(Holders),family=poisson,data=mydata)
> summary(m1)

Call:
glm(formula = Claims ~ District + Group + Age + log(Holders),
    family = poisson, data = mydata)

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-2.553115  -0.471819   0.002411   0.455274   1.800739

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.777752   0.689162  -4.031 5.56e-05 ***
Districtb     0.119942   0.079861   1.502 0.133125
Districtc     0.228371   0.144503   1.580 0.114019
Districtd     0.571661   0.248792   2.298 0.021576 *
Group>2l      0.794721   0.180354   4.406 1.05e-05 ***
Group1-1.5l  -0.003496   0.127947  -0.027 0.978202
Group1.5-2l   0.379190   0.055856   6.789 1.13e-11 ***
Age>35       -1.074971   0.389480  -2.760 0.005780 **
Age25-29     -0.332131   0.129512  -2.564 0.010333 *
Age30-35     -0.539815   0.160138  -3.371 0.000749 ***
log(Holders)  1.201696   0.144135   8.337  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 4236.68  on 63  degrees of freedom
Residual deviance:   49.45  on 53  degrees of freedom
AIC: 388.77
You don't.

And even if you do get someone to tell you how, you may still not legitimately lower your degrees of freedom. Friends don't let friends use stepwise approaches to regression analysis.

-- 
David Winsemius

On Feb 25, 2009, at 10:33 PM, choonhong ang wrote:

> The district a is the baseline and we observe the difference between
> District a & b is not significant, we can choose to combine these 2
> values.
> How to write code to combine these 2 value ?
>
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi: John Fox's recode function in his car package provides a convenient way of doing what you need. I don't know what your factor is specifically, but the code below is mostly taken from the help for "recode" and shows how to take a factor and recode it into a new factor. You can apply that to your particular problem.

    library(car)                  # recode() lives in the car package

    x <- gl(3,3,length=9)         # factor with levels 1, 2, 3
    print(x)
    str(x)

    # merge levels 1 and 2 into 'A'; level 3 becomes 'B'
    temp <- recode(x,"1:2 = 'A'; 3 = 'B'")
    print(temp)
    str(temp)

On Wed, Feb 25, 2009 at 10:33 PM, choonhong ang wrote:

> The district a is the baseline and we observe the difference between
> District a & b is not significant, we can choose to combine these 2
> values.
> How to write code to combine these 2 value ?
>
> [...]
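For comparison, the same kind of recoding can be sketched in base R without car. This is only a sketch of the mechanics: the small `mydata` data frame and the column name `District2` are made up for illustration, since the poster's actual data were not shown, and it says nothing about whether merging levels after looking at the fit is statistically legitimate.

```r
# Hypothetical stand-in for the poster's data; only the District factor matters here.
mydata <- data.frame(District = factor(rep(c("a", "b", "c", "d"), each = 2)))

# Copy the factor, then assign a named list to levels():
# each list name is a new level, each element lists the old levels it absorbs.
mydata$District2 <- mydata$District
levels(mydata$District2) <- list(ab = c("a", "b"), c = "c", d = "d")

# Cross-tabulate old against new levels to check the mapping.
print(table(mydata$District, mydata$District2))
```

With the poster's real data, the merged factor could then presumably be swapped into the model with something like update(m1, . ~ . - District + District2).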
Oops, then I guess I should not have sent the recode suggestion.

choonhong: I only sent it as an example of how to recode your factor. I didn't mean to imply (nor did I even give it much thought) that what you're doing is statistically or philosophically correct.

I'm a friend, but I think what David is implying is that you are deciding on hypotheses after looking at the results, which is kind of cheating and means that you can't rely on any statistical tests you do going forward, because they will be biased. I'm sure he has a good point, but I don't want to get into this since I really don't know what you're doing and it's a very complex topic. I think Frank's book has a lot to say about this type of thing.

On Thu, Feb 26, 2009 at 12:14 AM, David Winsemius wrote:

> You don't.
>
> And even if you do get someone to tell you how, you may still not
> legitimately lower your degrees of freedom. Friends don't let friends
> use stepwise approaches to regression analysis.
>
> --
> David Winsemius
>
> On Feb 25, 2009, at 10:33 PM, choonhong ang wrote:
>
>> The district a is the baseline and we observe the difference between
>> District a & b is not significant, we can choose to combine these 2
>> values.
>> How to write code to combine these 2 value ?
>>
>> [...]