thr3ads.net - R help - [R] logistic regression or not? [Dec 2010]


array chip

2010-Dec-21 00:40 UTC

[R] logistic regression or not?

Hi, I have a dataset where the response for each person, on one of 2
treatments, is a proportion (the percentage of a certain number of markers
that are positive). I also have the number of positive and negative markers
for each person. What is the best way to analyze this kind of data?

I can think of analyzing this data using glm() with the attached dataset:

test<-read.table('test.txt',sep='\t')
fit<-glm(cbind(positive,total-positive)~treatment,test,family=binomial)
summary(fit)
anova(fit, test='Chisq')

First, is this still called logistic regression or something else? I thought
that with logistic regression, the response variable is a binary factor?

Second, summary(fit) and anova(fit, test='Chisq') gave me different p
values. Why is that? Which one should I use?

Third, is there an equivalent model where I can use the variable "percentage"
instead of "positive" & "total"?

Finally, what is the best way to analyze this kind of dataset, where it's
almost the same as ANOVA except that the response variable is a proportion
(or success and failure)?

Thanks

John



      
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20101220/b65c978f/attachment.txt>

Ben Bolker

2010-Dec-21 13:08 UTC


[R] logistic regression or not?

array chip <arrayprofile <at> yahoo.com> writes:

[snip]
> I can think of analyzing this data using glm() with the attached dataset:
> 
> test<-read.table('test.txt',sep='\t')
> fit<-glm(cbind(positive,total-positive)~treatment,test,family=binomial)
> summary(fit)
> anova(fit, test='Chisq')
>
> First, is this still called logistic regression or something else? I thought
> with logistic regression, the response variable is a binary factor?

  Sometimes I've seen it called "binomial regression", or just
"a binomial generalized linear model".
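
A minimal sketch of why the two names describe the same model, assuming
simulated data in place of the scrubbed test.txt (the values, group labels
"A"/"B", and object names `long`, `fit_grouped`, `fit_binary` are made up for
illustration). Expanding each person's counts into one 0/1 row per marker and
fitting an ordinary binary logistic regression should give the same
coefficients as the grouped binomial fit:

```r
# Simulated stand-in for test.txt (assumption: columns named as in the post)
set.seed(1)
test <- data.frame(
  treatment = factor(rep(c("A", "B"), each = 10)),
  total     = rep(20L, 20)
)
test$positive <- rbinom(20, size = test$total,
                        prob = ifelse(test$treatment == "A", 0.3, 0.5))

# Grouped (successes, failures) response, as in the original post
fit_grouped <- glm(cbind(positive, total - positive) ~ treatment,
                   data = test, family = binomial)

# Same data expanded to one row per marker with a binary response
long <- data.frame(
  treatment = rep(test$treatment, times = test$total),
  y = unlist(Map(function(p, n) rep(c(1L, 0L), c(p, n - p)),
                 test$positive, test$total))
)
fit_binary <- glm(y ~ treatment, data = long, family = binomial)

# Compare the estimates from the two fits
cbind(grouped = coef(fit_grouped), binary = coef(fit_binary))
```

The residual deviance and degrees of freedom differ between the two fits,
because the grouped fit treats each person as one observation and the
expanded fit treats each marker as one.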

> Second, then summary(fit) and anova(fit, test='Chisq') gave me different p
> values, why is that? which one should I use?

  summary(fit) gives you p-values from a Wald test.
  anova() gives you tests based on the Likelihood Ratio Test.
  In general the LRT is more accurate.
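
A short sketch of where the two p-values come from, again on simulated data
(the data and the names `wald_p`, `lrt_p`, `fit0`, `lrt_by_hand` are
assumptions for illustration, not part of the original post). The Wald test
divides the estimate by its standard error; the likelihood ratio test
compares the deviance of the model with and without the treatment term:

```r
# Simulated stand-in for test.txt
set.seed(2)
test <- data.frame(
  treatment = factor(rep(c("A", "B"), each = 8)),
  total     = rep(15L, 16)
)
test$positive <- rbinom(16, size = test$total,
                        prob = ifelse(test$treatment == "A", 0.2, 0.45))

fit <- glm(cbind(positive, total - positive) ~ treatment,
           data = test, family = binomial)

# Wald test: z = estimate / standard error, as reported by summary()
wald_p <- coef(summary(fit))["treatmentB", "Pr(>|z|)"]

# Likelihood ratio test: drop in deviance, as reported by anova()
lrt_p <- anova(fit, test = "Chisq")[2, "Pr(>Chi)"]

# The same LRT computed by hand against the intercept-only model
fit0 <- update(fit, . ~ 1)
lrt_by_hand <- pchisq(deviance(fit0) - deviance(fit),
                      df = df.residual(fit0) - df.residual(fit),
                      lower.tail = FALSE)

c(wald = wald_p, lrt = lrt_p, lrt_by_hand = lrt_by_hand)
```

The two tests are asymptotically equivalent, so they usually agree closely
in large samples and can differ noticeably in small ones.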

> Third, is there an equivalent model where I can use variable "percentage"
> instead of "positive" & "total"?

  glm(percentage~treatment,weights=total,data=test,family=binomial)

is equivalent to the model you fitted above.
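
A quick check of that equivalence on simulated data (the data and the names
`proportion`, `fit_counts`, `fit_prop` are assumptions for illustration).
Note that the binomial family expects the response on the 0 to 1 scale, so a
variable stored as a percentage from 0 to 100 would need dividing by 100
first:

```r
# Simulated stand-in for test.txt
set.seed(3)
test <- data.frame(
  treatment = factor(rep(c("A", "B"), each = 10)),
  total     = rep(25L, 20)
)
test$positive <- rbinom(20, size = test$total,
                        prob = ifelse(test$treatment == "A", 0.35, 0.55))

# Response as a proportion in [0, 1], with the denominators as weights
test$proportion <- test$positive / test$total

fit_counts <- glm(cbind(positive, total - positive) ~ treatment,
                  data = test, family = binomial)
fit_prop   <- glm(proportion ~ treatment, weights = total,
                  data = test, family = binomial)

# Compare estimates and residual deviance from the two specifications
cbind(counts = coef(fit_counts), proportion = coef(fit_prop))
c(counts = deviance(fit_counts), proportion = deviance(fit_prop))
```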

> Finally, what is the best way to analyze this kind of dataset
> where it's almost the same as ANOVA except that the response variable
> is a proportion (or success and failure)?

  Don't quite know what you mean here.  How is the situation "almost
the same as ANOVA" different from the situation you described above?
Do you mean when there are multiple factors? or ???

S Ellison

2010-Dec-21 13:22 UTC


[R] logistic regression or not?

A possible caveat here.

Traditionally, logistic regression was performed on the
logit-transformed proportions, with the standard errors based on the
residuals for the resulting linear fit. This accommodates overdispersion
naturally, but without telling you that you have any.
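
A rough sketch of that older approach on simulated data. The post does not
say exactly how the transformation was done, so the empirical logit with 0.5
added to each count (to keep proportions of 0 or 1 finite) is an assumption
here, as are the data and the names `elogit` and `fit_lm`:

```r
# Simulated stand-in for test.txt
set.seed(4)
test <- data.frame(
  treatment = factor(rep(c("A", "B"), each = 10)),
  total     = rep(20L, 20)
)
test$positive <- rbinom(20, size = test$total,
                        prob = ifelse(test$treatment == "A", 0.3, 0.5))

# Empirical logit; the 0.5 adjustment is an assumption of this sketch
test$elogit <- log((test$positive + 0.5) /
                   (test$total - test$positive + 0.5))

# Ordinary linear fit: standard errors come from the residual variance,
# so any extra-binomial variation is absorbed without being reported
fit_lm <- lm(elogit ~ treatment, data = test)
coef(summary(fit_lm))
```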

glm with a binomial family does not allow for overdispersion unless
you use the quasibinomial family. If you have overdispersion, standard
errors from glm will be unrealistically small. Make sure your model fits
in glm before you believe the standard errors, or use the quasibinomial
family.
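
One common way to check this, sketched on simulated data: estimate the
dispersion as the Pearson chi-squared statistic divided by the residual
degrees of freedom, then refit with quasibinomial and compare standard
errors. The data below are deliberately generated with extra person-to-person
variation (a beta-binomial mechanism chosen for illustration); the data and
the names `dispersion`, `fit_bin`, `fit_quasi` are assumptions, not from the
thread:

```r
# Simulated overdispersed data: each person has their own success probability
set.seed(5)
n <- 20
test <- data.frame(
  treatment = factor(rep(c("A", "B"), each = n / 2)),
  total     = rep(30L, n)
)
mu  <- ifelse(test$treatment == "A", 0.3, 0.5)
p_i <- rbeta(n, mu * 5, (1 - mu) * 5)
test$positive <- rbinom(n, size = test$total, prob = p_i)

fit_bin <- glm(cbind(positive, total - positive) ~ treatment,
               data = test, family = binomial)

# Pearson dispersion estimate; values well above 1 suggest overdispersion
# (or a model that does not fit)
dispersion <- sum(residuals(fit_bin, type = "pearson")^2) /
  df.residual(fit_bin)

# Same mean model, with standard errors scaled by sqrt(dispersion)
fit_quasi <- glm(cbind(positive, total - positive) ~ treatment,
                 data = test, family = quasibinomial)

se_bin   <- coef(summary(fit_bin))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]
cbind(binomial = se_bin, quasibinomial = se_quasi)
dispersion
```

The coefficient estimates are identical in the two fits; only the standard
errors and the resulting tests change.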

Steve Ellison
LGC


S Ellison

2010-Dec-21 16:48 UTC


[R] logistic regression or not?

> ...and before you believe in overdispersion, make sure you have a credible
> explanation for it. All too often, what you really have is a model that
> doesn't fit your data properly.

Well put.

A possible fortune?

S Ellison
