Hi all:

Here's a question about the result of a loglinear analysis.
There are 2 factors: area and nation. The raw data is in the attachment.

I fit the saturated loglinear model with the command:
glm_sat<-glm(fre~area*nation, family=poisson, data=data_Analysis)

After that, I extract the coefficients:
result_sat<-summary(glm_sat)
result_coe<-result_sat$coefficients

I find that all the coefficients are 1 or very near to 1.

How does this happen? Why are all the coefficients 1 or very near to 1?

Thanks!

My best

-------------- next part --------------
area nation fre
1 1 0
1 2 0
1 3 85
1 4 2
1 5 0
1 6 0
1 7 0
1 8 0
1 9 1
1 10 0
1 11 0
2 1 0
2 2 0
2 3 253
2 4 2
2 5 0
2 6 10
2 7 0
2 8 0
2 9 1
2 10 2
2 11 0
3 1 5
3 2 0
3 3 1242
3 4 0
3 5 2
3 6 0
3 7 1
3 8 0
3 9 10
3 10 5
3 11 3
4 1 0
4 2 0
4 3 290
4 4 1
4 5 0
4 6 0
4 7 0
4 8 2
4 9 0
4 10 0
4 11 62
5 1 0
5 2 0
5 3 382
5 4 0
5 5 0
5 6 0
5 7 3
5 8 7
5 9 0
5 10 0
5 11 0
6 1 0
6 2 0
6 3 119
6 4 39
6 5 0
6 6 0
6 7 0
6 8 0
6 9 383
6 10 1
6 11 0
7 1 12
7 2 12
7 3 376
7 4 5
7 5 8
7 6 0
7 7 10
7 8 9
7 9 0
7 10 52
7 11 0
On Jan 18, 2011, at 8:45 PM, Lao Meng wrote:

> Hi all:
> Here's a question about result of loglinear analysis.
> There're 2 factors:area and nation.The raw data is in the attachment.
>
> I fit the saturated model of loglinear with the command:
> glm_sat<-glm(fre~area*nation, family=poisson, data=data_Analysis)
>
> After that,I extract the coefficients:
> result_sat<-summary(glm_sat)
> result_coe<-result_sat$coefficients
>
> I find that all the coeffients are 1 or very near to 1.

I didn't get that result with that code. Did you perhaps attach some other data structure that you have failed to inform us about?

> result_coe
               Estimate Std. Error   z value      Pr(>|z|)
(Intercept)  5.40707901 0.06797962  79.53971  0.000000e+00
area        -0.12665730 0.01498742  -8.45091  2.890419e-17
nation      -0.42998381 0.01615198 -26.62112 3.867012e-156
area:nation  0.04823113 0.00317437  15.19392  3.879947e-52

> How does this happen?Why all the coeffients are 1 or very near to 1?
>
> Thanks!
>
> My best
> <area_nation.txt>______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
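One possible reading of the four-row table above, offered as a sketch rather than a confirmed diagnosis: if area and nation are read in as plain integers, glm() treats them as numeric covariates and estimates a single slope for each, whereas converting them to factors gives one parameter per cell. The data frames d_num and d_fac below are rebuilt from the posted counts with expand.grid(), which assumes the attachment is laid out with nation varying fastest within area; the object names are illustrative and do not come from the thread.

```r
# Counts from the posted attachment: one row of the layout per area (1..7),
# one column per nation (1..11).
fre <- c(
   0,  0,   85,  2, 0,  0,  0, 0,   1,  0,  0,
   0,  0,  253,  2, 0, 10,  0, 0,   1,  2,  0,
   5,  0, 1242,  0, 2,  0,  1, 0,  10,  5,  3,
   0,  0,  290,  1, 0,  0,  0, 2,   0,  0, 62,
   0,  0,  382,  0, 0,  0,  3, 7,   0,  0,  0,
   0,  0,  119, 39, 0,  0,  0, 0, 383,  1,  0,
  12, 12,  376,  5, 8,  0, 10, 9,   0, 52,  0
)

# expand.grid() varies its first argument fastest, matching the layout above.
d_num <- expand.grid(nation = 1:11, area = 1:7)
d_num$fre <- fre

# Integer codes: area and nation act as numeric covariates (4 coefficients).
fit_num <- glm(fre ~ area * nation, family = poisson, data = d_num)
length(coef(fit_num))

# Factors: one parameter per cell of the 7 x 11 table (77 coefficients).
d_fac <- transform(d_num, area = factor(area), nation = factor(nation))
fit_fac <- suppressWarnings(
  glm(fre ~ area * nation, family = poisson, data = d_fac)
)
length(coef(fit_fac))
```

Checking str(data_Analysis) before fitting shows which of the two codings is in use.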
---------- Forwarded message ----------
From: Lao Meng <laomeng.3@gmail.com>
Date: 2011/1/19
Subject: Re: [R] question about result of loglinear analysis
To: David Winsemius <dwinsemius@comcast.net>

My command and result are:

> result_sat<-summary(glm_sat)
> result_coe<-result_sat$coefficients
> result_coe
                     Estimate  Std. Error              z value  Pr(>|z|)
(Intercept)    -26.3025850551      312167  -0.0000842581467488         1
area2           -0.0000000391      441470  -0.0000000000000887         1
area3           27.9120229675      312167   0.0000894138474346         1
area4           -0.0000000396      441470  -0.0000000000000897         1
area5           -0.0000000376      441470  -0.0000000000000852         1
area6           -0.0000000399      441470  -0.0000000000000903         1
area7           28.7874917049      312167   0.0000922183388256         1
nation2         -0.0000000349      441470  -0.0000000000000790         1
nation3         30.7452363116      312167   0.0000984898110791         1
nation4         26.9957322356      312167   0.0000864785861745         1
nation5         -0.0000000231      441470  -0.0000000000000523         1
nation6         -0.0000000349      441470  -0.0000000000000790         1
nation7         -0.0000000286      441470  -0.0000000000000647         1
nation8         -0.0000000367      441470  -0.0000000000000832         1
nation9         26.3025850551      312167   0.0000842581467484         1
nation10        -0.0000000492      441470  -0.0000000000001115         1
nation11        -0.0000000372      441470  -0.0000000000000842         1
area2:nation2    0.0000000364      624333   0.0000000000000584         1
area3:nation2  -27.9120229706      540689  -0.0000516231080772         1
area4:nation2    0.0000000366      624333   0.0000000000000587         1
area5:nation2    0.0000000332      624333   0.0000000000000531         1
area6:nation2    0.0000000368      624333   0.0000000000000590         1
area7:nation2    0.0000000349      441470   0.0000000000000790         1
area2:nation3    1.0907382714      441470   0.0000024706943226         1
area3:nation3  -25.2301959615      312167  -0.0000808228373513         1
area4:nation3    1.2272297061      441470   0.0000027798689650         1
area5:nation3    1.5027693897      441470   0.0000034040098332         1
area6:nation3    0.3364722765      441470   0.0000007621628078         1
area7:nation3  -27.3005538180      312167  -0.0000874550567977         1
area2:nation4    0.0000000391      441470   0.0000000000000887         1
area3:nation4  -54.9077552406      441470  -0.0001243747310970         1
area4:nation4   -0.6931471409      441470  -0.0000015700876663         1
area5:nation4  -26.9957322354      540689  -0.0000499284341769         1
area6:nation4    2.9704145054      441470   0.0000067284576415         1
area7:nation4  -27.8712009730      312167  -0.0000892830775653         1
area2:nation5    0.0000000218      624333   0.0000000000000349         1
area3:nation5   -0.9162907088      441470  -0.0000020755430662         1
area4:nation5    0.0000000248      624333   0.0000000000000397         1
area5:nation5    0.0000000228      624333   0.0000000000000365         1
area6:nation5    0.0000000250      624333   0.0000000000000401         1
area7:nation5   -0.4054650850      441470  -0.0000009184424089         1
area2:nation6   28.6051702221      540689   0.0000529050794386         1
area3:nation6  -27.9120229740      540689  -0.0000516231080491         1
area4:nation6    0.0000000368      624333   0.0000000000000589         1
area5:nation6    0.0000000347      624333   0.0000000000000556         1
area6:nation6    0.0000000370      624333   0.0000000000000593         1
area7:nation6  -28.7874917077      540689  -0.0000532422818772         1
area2:nation7    0.0000000300      624333   0.0000000000000481         1
area3:nation7   -1.6094378839      441470  -0.0000036456308179         1
area4:nation7    0.0000000302      624333   0.0000000000000484         1
area5:nation7   27.4011974099      540689   0.0000506783395313         1
area6:nation7    0.0000000304      624333   0.0000000000000488         1
area7:nation7   -0.1823215282      441470  -0.0000004129870365         1
area2:nation8    0.0000000413      624333   0.0000000000000662         1
area3:nation8  -27.9120229714      540689  -0.0000516231080553         1
area4:nation8   26.9957323120      540689   0.0000499284343439         1
area5:nation8   28.2484952785      540689   0.0000522454114350         1
area6:nation8    0.0000000387      624333   0.0000000000000620         1
area7:nation8   -0.2876820357      441470  -0.0000006516452129         1
area2:nation9    0.0000000391      441470   0.0000000000000887         1
area3:nation9  -25.6094378745      312167  -0.0000820377073224         1
area4:nation9  -26.3025850534      540689  -0.0000486464628969         1
area5:nation9  -26.3025850556      540689  -0.0000486464628975         1
area6:nation9    5.9480350291      441470   0.0000134732380515         1
area7:nation9  -55.0900767979      441470  -0.0001247877181979         1
area2:nation10  26.9957323240      540689   0.0000499284343203         1
area3:nation10   0.0000000492      441470   0.0000000000001115         1
area4:nation10   0.0000000497      624333   0.0000000000000796         1
area5:nation10   0.0000000506      624333   0.0000000000000810         1
area6:nation10  26.3025851442      540689   0.0000486464630491         1
area7:nation10   1.4663371180      441470   0.0000033214849874         1
area2:nation11   0.0000000385      624333   0.0000000000000616         1
area3:nation11  -0.5108255866      441470  -0.0000011571005711         1
area4:nation11  30.4297195169      540689   0.0000562795716938         1
area5:nation11   0.0000000357      624333   0.0000000000000572         1
area6:nation11   0.0000000391      624333   0.0000000000000626         1
area7:nation11 -28.7874917056      540689  -0.0000532422818467         1
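In the 77-row output above, the values that equal 1 are the entries of the Pr(>|z|) column, not the estimates. A likely mechanism, sketched here on a made-up 2 x 2 table rather than the posted data: a saturated Poisson model reproduces every count exactly, and for an empty cell the fitted log-count would have to be log(0). The fitting iterations push the corresponding estimate toward minus infinity until the convergence check stops them, leaving a large negative estimate, an enormous standard error, and a Wald p-value close to 1. The data frame toy and its counts are hypothetical.

```r
# Hypothetical 2 x 2 table with one empty cell (not the poster's data).
toy <- data.frame(
  a = factor(c(1, 1, 2, 2)),
  b = factor(c(1, 2, 1, 2)),
  n = c(0, 5, 7, 3)
)

# Saturated Poisson model: one parameter per cell, so the fit tries to
# reproduce every count exactly, including the zero.
fit <- suppressWarnings(glm(n ~ a * b, family = poisson, data = toy))

# The intercept is the fitted log-count of the empty reference cell
# (a = 1, b = 1). Its estimate is large and negative, its standard error
# is huge, and the Wald test is therefore uninformative.
coefs <- summary(fit)$coefficients
print(coefs)

# A saturated model leaves no residual degrees of freedom.
df.residual(fit)
```

The same pattern would be expected for each of the empty cells in the 7 x 11 table, which is consistent with the repeated estimates near -26 and standard errors in the hundreds of thousands shown above.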
Hi:

Well, you fit a saturated model. How many degrees of freedom do you have left for error? The fact that the standard errors are so huge relative to the estimates is a clue.

Taking a look at your data, it's pretty clear that nation 3 is an outstanding outlier on its own. It is clearly - nay, blatantly - different from the other nations in the sample. Look at

boxplot(fre ~ nation, data = data_Analysis)
boxplot(sqrt(fre) ~ nation, data = data_Analysis)

the latter to deal with the huge outlier near 1200 in the original data. Even on the square root scale, nation 3 sticks out like a sore thumb.

43/77 of your responses have zero frequency, so you should probably be looking into zero-inflated Poisson models and some of their relatives. Here is one citation to get you started:
http://www.jstatsoft.org/v27/i08/paper
Package VGAM also has functionality to fit these types of models. Using package sos, I typed

# Install package sos first if you don't have it:
library(sos)
findFn('zero Poisson')

which found 255 matches; you should find several packages that pertain to zero-inflated/zero-altered Poisson models.

In the absence of the scientific background behind the data, the dominance of nation 3 may well mask more subtle effects among the other nations, so you might want to consider analyses with and without nation 3.

HTH,
Dennis
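The degrees-of-freedom question, the 43/77 zero count, and the dominance of nation 3 can all be checked without fitting anything. The sketch below rebuilds the posted counts (assuming nation varies fastest within area) and counts model parameters via model.matrix(); the additive model fre ~ area + nation is included only as a comparison point and is not something proposed in the thread. Object names are illustrative.

```r
# Counts from the posted attachment: one row of the layout per area (1..7),
# one column per nation (1..11).
fre <- c(
   0,  0,   85,  2, 0,  0,  0, 0,   1,  0,  0,
   0,  0,  253,  2, 0, 10,  0, 0,   1,  2,  0,
   5,  0, 1242,  0, 2,  0,  1, 0,  10,  5,  3,
   0,  0,  290,  1, 0,  0,  0, 2,   0,  0, 62,
   0,  0,  382,  0, 0,  0,  3, 7,   0,  0,  0,
   0,  0,  119, 39, 0,  0,  0, 0, 383,  1,  0,
  12, 12,  376,  5, 8,  0, 10, 9,   0, 52,  0
)
d <- expand.grid(nation = factor(1:11), area = factor(1:7))
d$fre <- fre

# Number and share of empty cells.
n_zero <- sum(d$fre == 0)
n_zero / nrow(d)

# Residual degrees of freedom = number of cells - number of parameters.
p_sat <- ncol(model.matrix(~ area * nation, data = d))  # saturated model
p_add <- ncol(model.matrix(~ area + nation, data = d))  # additive comparison
df_sat <- nrow(d) - p_sat
df_add <- nrow(d) - p_add
c(saturated = df_sat, additive = df_add)

# Total count per nation, to see how much nation 3 dominates.
totals <- tapply(d$fre, d$nation, sum)
totals
```

With as many parameters as cells, the saturated model has nothing left over to estimate error from, which is the point behind the first question in the reply above.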