Hello!

When I analyze proportion data, I usually apply logistic regression using a glm with the binomial family. For example:

    m <- glm(cbind(not_realized, realized) ~ v1 + v2, family = "binomial")

However, sometimes I don't have the number of cases (realized, not realized), but only the proportion, and thus cannot fit the binomial model. I just found out that the package car contains a function logit() which performs the logit transformation. Would it be possible to transform the proportion data with this function and analyze the transformed data with a glm with family = "gaussian"?

Thank you very much

--
View this message in context: http://r.789695.n4.nabble.com/function-logit-vs-logistic-regression-tp4646498.html
Sent from the R help mailing list archive at Nabble.com.
On Oct 17, 2012, at 11:58 AM, swertie wrote:
> [...]
> However, sometimes I don't have the number of cases (realized, not
> realized), but only the proportion and thus cannot compute the
> binomial model.
> [...]

If you had the total number and the row proportions, shouldn't you be able to calculate the original numbers? If you think not, then you need to be clearer about exactly what data you do and do not have.

--
David Winsemius, MD
Alameda, CA, USA
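A minimal sketch of the point about recovering the counts, assuming the totals are available alongside the proportions. The data frame d and its columns (v1, v2, n, p) are simulated here purely for illustration, and "realized" is treated as the success category. Either the counts can be reconstructed, or the proportion can be passed directly with the totals as weights; the two fits should agree.

```r
# Hypothetical data: proportion p of "realized" cases out of n trials.
set.seed(1)
d <- data.frame(v1 = rnorm(20), v2 = rnorm(20), n = rep(c(10, 25), 10))
d$realized <- rbinom(20, size = d$n, prob = plogis(0.5 * d$v1 - 0.3 * d$v2))
d$p <- d$realized / d$n

# Option 1: reconstruct the counts from proportion and total.
succ <- round(d$p * d$n)
m1 <- glm(cbind(succ, d$n - succ) ~ v1 + v2, data = d, family = binomial)

# Option 2: give the proportion as response and the totals as weights.
m2 <- glm(p ~ v1 + v2, data = d, weights = n, family = binomial)

# Both parameterizations describe the same binomial model.
all.equal(coef(m1), coef(m2), tolerance = 1e-6)
```

If the totals are truly unknown, neither option applies and the binomial model cannot be fitted this way.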
On 18/10/12 07:58, swertie wrote:
> [...]
> Would it be possible to transform the proportion data with this
> function and analyze the transformed data with a glm with
> family="gaussian"?

Of course it's possible, but I doubt me an it maketh a great deal of sense.

(1) You don't need the car package to get a logit() function. You can roll your own in a couple of lines.

(2) I believe that the conventional wisdom is that the arcsin(sqrt(x)) function should be used to transform proportion data to something which vaguely resembles Gaussian data. This transformation has the effect of "stabilizing the variance". (Others on the list may correct me on this point.)

(3) Whatever you try is not going to work very well if you have proportion values that are close to 0 or to 1.

(4) Whatever you try is going to be a pretty shaganappi approximation. The fact is that the variance of proportions does vary with the number of cases. A variance stabilizing transformation mitigates this effect but does not eliminate it.

See fortune(111).

cheers,

Rolf Turner
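As a rough illustration of points (1) to (3), here is a sketch of a hand-rolled logit and the arcsine square-root transformation. The function names inv_logit and asin_sqrt and the example vector p are made up for this sketch; base R's qlogis() and plogis() already provide the logit and its inverse.

```r
# Logit and its inverse, written out by hand.
logit <- function(p) log(p / (1 - p))
inv_logit <- function(x) 1 / (1 + exp(-x))

# Arcsine square-root (angular) transformation.
asin_sqrt <- function(p) asin(sqrt(p))

# Hypothetical proportions, all strictly inside (0, 1).
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)
logit(p)
asin_sqrt(p)

# The hand-rolled version agrees with base R's qlogis().
all.equal(logit(p), qlogis(p))

# Point (3): the logit is infinite at the boundaries 0 and 1.
logit(c(0, 1))
```

The transformed values could then be passed to lm() (equivalently, glm() with family = gaussian), subject to the caveats in points (3) and (4).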
On Wed, 17 Oct 2012, swertie wrote:
> [...]
> However, sometimes I don't have the number of cases (realized, not
> realized), but only the proportion and thus cannot compute the binomial
> model. [...] Would it be possible to transform the proportion data with
> this function and analyze the transformed data with a glm with
> family="gaussian"?

In situations like this, beta regression can be useful. It models the mean and optionally also the precision (related to the variance) of a beta-distributed response on the open (0, 1) interval. See http://www.jstatsoft.org/v34/i02/ for an introduction to the betareg package in R and http://www.jstatsoft.org/v48/i11/ for various extended features.

Best,
Z
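A minimal sketch of what such a fit might look like, assuming the betareg package is installed. The data are simulated for illustration only, and the predictor names v1 and v2 are borrowed from the original question; the response must lie strictly between 0 and 1.

```r
# Runs only if betareg is available: install.packages("betareg")
if (requireNamespace("betareg", quietly = TRUE)) {
  set.seed(1)
  # Hypothetical data: beta-distributed response strictly inside (0, 1).
  d <- data.frame(v1 = rnorm(50), v2 = rnorm(50))
  mu <- plogis(0.5 * d$v1 - 0.3 * d$v2)
  d$p <- rbeta(50, shape1 = mu * 10, shape2 = (1 - mu) * 10)

  # Mean model with the default logit link and constant precision.
  m <- betareg::betareg(p ~ v1 + v2, data = d)
  print(summary(m))

  # Precision modeled as well: regressors for it follow the "|".
  m_prec <- betareg::betareg(p ~ v1 + v2 | v1, data = d)
}
```

Observed proportions of exactly 0 or 1 fall outside the open interval and would need separate handling before a fit like this.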
Thank you very much for the replies and the nice explanation about variance stabilization. I had heard about the arcsine transformation, but some recent papers are very critical of it (e.g., Warton & Hui, 2011), so I would rather try another way. I will have a look at beta regression.

Best,
V.