I thought of testing the difference in deviance between the null model
and the fitted model, assuming it is distributed as chi-squared. However,
Faraway writes that if the outcome is binary, the deviance
distribution is far from chi-squared.
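I've done a permutation test:
N<-5000; # Towards the upper limit, as there are only choose(17,5) = 6,188
         # combinations of the T/F data I have.
dev<-rep(0,N);
for (i in 1:N) {
  # refit with the pass/fail labels shuffled against w
  l1<-glm(sample(p)~w,family=binomial);
  dev[i]<-l1$deviance;
}
# share of shuffled fits with a smaller residual deviance than the real fit
print(mean(dev<l$deviance))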
and the outcome is 0.005, which is close to the t-test result.
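Because there are only choose(17, 5) = 6,188 distinct ways to place the five
failures, the permutation distribution could also be enumerated exactly
instead of sampled. A minimal sketch, assuming the p and w vectors from the
original post; the names obs_dev, fail_sets, perm_dev and p_exact are
introduced here for illustration only, and the use of <= with a small
tolerance (rather than the strict < above) is my own choice so that the
observed arrangement counts itself:

```r
# Data as given in the original post.
p <- c(F,T,F,F,F,T,T,T,T,T,T,T,F,T,T,T,T)
w <- c(53,67,59,59,53,89,72,56,65,63,62,58,59,72,61,68,63)

obs_dev <- glm(p ~ w, family = binomial)$deviance

# Every way of choosing which 5 of the 17 runs are the failures.
fail_sets <- combn(length(p), sum(!p))   # 5 x 6188 matrix of indices

perm_dev <- apply(fail_sets, 2, function(idx) {
  p_perm <- rep(TRUE, length(p))
  p_perm[idx] <- FALSE
  # Some arrangements separate the groups perfectly; glm then warns about
  # fitted probabilities of 0 or 1, which is silenced here.
  suppressWarnings(glm(p_perm ~ w, family = binomial)$deviance)
})

# Exact permutation p-value: share of arrangements whose residual deviance
# is at least as small as the observed one (tolerance is an assumption).
p_exact <- mean(perm_dev <= obs_dev + 1e-8)
print(p_exact)
```

This takes a few seconds, since it fits 6,188 small models, but it removes
the Monte Carlo noise from the estimate.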
I've repeated the same with calculating the statistics on the z-value
in summary(l1) each time instead of the deviance, and got a comparable
result.
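For reference, the z-value variant might look roughly like the sketch below.
It assumes the p and w vectors from the original post; the seed, the smaller
N, the two-sided comparison on |z|, and the names z_obs, z_perm and p_z are
all assumptions made for illustration, not necessarily what I ran:

```r
# Data as given in the original post.
p <- c(F,T,F,F,F,T,T,T,T,T,T,T,F,T,T,T,T)
w <- c(53,67,59,59,53,89,72,56,65,63,62,58,59,72,61,68,63)

set.seed(1)   # assumption: fixed seed only so the run is repeatable
N <- 2000     # assumption: fewer draws than above, to keep the sketch quick

l <- glm(p ~ w, family = binomial)
z_obs <- coef(summary(l))["w", "z value"]   # Wald z for the slope

# Wald z for the slope after shuffling the pass/fail labels.
z_perm <- replicate(N, {
  fit <- suppressWarnings(glm(sample(p) ~ w, family = binomial))
  coef(summary(fit))["w", "z value"]
})

# Two-sided permutation p-value based on |z|.
p_z <- mean(abs(z_perm) >= abs(z_obs))
print(p_z)
```

One caveat: when a shuffled data set is (nearly) separated, the standard
error of the slope blows up and the Wald z shrinks toward zero, so z is a
less well-behaved permutation statistic than the deviance.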
I think it means that David is right: the Pr(>|z|) in the glm output does
not mean much. I still don't know what it does mean.
Regarding your suggestion of using car's Anova:
> Anova(l)
Anova Table (Type II tests)

Response: p
  LR Chisq Df Pr(>Chisq)
w   9.4008  1   0.002169 **
which is identical to:
pchisq(l$null.deviance-l$deviance,1,lower.tail=FALSE)
which seems too low, probably because of the binary response.
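To put the two asymptotic tests side by side, here is a small sketch that
computes the likelihood ratio p-value and the Wald p-value from the same fit.
It assumes the p and w vectors from the original post; l0, lr_stat, lr_p and
wald_p are names introduced here for illustration:

```r
# Data as given in the original post.
p <- c(F,T,F,F,F,T,T,T,T,T,T,T,F,T,T,T,T)
w <- c(53,67,59,59,53,89,72,56,65,63,62,58,59,72,61,68,63)

l  <- glm(p ~ w, family = binomial)
l0 <- update(l, . ~ 1)              # intercept-only (null) model

# Likelihood ratio test: drop in deviance from the null to the fitted model.
print(anova(l0, l, test = "Chisq"))
lr_stat <- l$null.deviance - l$deviance
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Wald test: the Pr(>|z|) column that summary(l) prints for w.
wald_p <- coef(summary(l))["w", "Pr(>|z|)"]

print(c(LR = lr_p, Wald = wald_p))
```

Both p-values rely on large-sample approximations, so with 17 observations
a gap between them is not by itself surprising.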
Would you think the permutation method is appropriate to use in this
case? And could it be extended to a case with several covariates?
On Tue, Apr 21, 2009 at 10:34 PM, <markleeds at verizon.net> wrote:
> hi: i would wait for one of the guRus to say something but my take ( take it
> with a grain of salt ) is that the results
> are not so contradictory. the test of the significance of the coefficient in
> the GLM is 0.06. and the test that the
> means are different gives a p-value of 0.004. a couple of reasons why
> this might not be so contradictory:
>
> A) the test gives greater significance in the t-test case but it's not
> really testing the same thing. the t-test is only testing that
> the means are different. the glm is testing that the log odds of the means
> of the two events ( pass and fail ) are linearly related to
> a covariate.
>
> b) your t-test is a little weird because it's only got a sample of five in
> one of the 2 samples and I'm not clear on whether it's assuming equal
> variances and then pooling ( I think there's a var.equal = TRUE option for
> t.test but I don't know the default value ).
> definitely that's not a large sample size regardless of the pooling issue.
>
> c) when you test the significance in a glm you need to compare the deviance
> of the model to the deviance of the nested null model.
> John Fox's book describes this but I don't think it's the same as looking
> at the significance in the table output of glm. that's
> a wald test and not the same as the deviance comparison ( essentially a
> likelihood ratio test i think ). with small sample sizes, i think the
> differences between these various tests can be large. check out john fox's
> text for a nice description of testing in the generalized linear model
> framework. you can use Anova from his car package to do this.
>
> hopefully someone else will say something though because i'd be curious to
> see where i'm wrong/right or something new.
> good luck.
>
> On Apr 21, 2009, ehud cohen <ehudco.list at gmail.com> wrote:
>
> Hi,
>
> We have an experiment with pass/fail outcome, and a continuous
> parameter which may contribute to the outcome.
>
> First, we've analyzed it by:
>
> p=c(F,T,F,F,F,T,T,T,T,T,T,T,F,T,T,T,T);
> w=c(53,67,59,59,53,89,72,56,65,63,62,58,59,72,61,68,63);
> l<-glm(p~w,family=binomial)
> summary(l)
>
> Which turned out to be non significant.
>
> Then, we thought of comparing the parameters of the two groups (passed
> vs. failed)
>
> t.test(w[which(p)],w[which(!p)],alternative="two.sided")
>
> which turned highly significant.
>
> I'd appreciate some insight...
>
> Thanks, Ehud.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>