Robert Merkel
2002-Jul-16 06:04 UTC
[R] ANOVA-like tests of geometrically-distributed data
I have a statistical problem which has given me no end of grief recently, and am posting here in the hope that somebody can give me a straight answer. I'm an IT postgrad, not a statistician, so people may have to speak really slowly and clearly for me to get it :)

I am collecting simulation data, and the results are geometrically distributed (or approximately so). From what I can gather from my stats books, provided the sample size is large enough (>30 or so) I can use t and z-tests to compare means under different experimental conditions as the CLT says that the sample means will be approximately normally distributed.

However, also as I understand it, the ANOVA explicitly assumes that the population is normally distributed, which is an assumption that in my case is not satisfied.

I have also been told that something called a "generalized linear model" can be used to perform ANOVA-like statistics on geometrically-distributed data, but not how.

There is R documentation on a function "glm" and also "anova.glm" which discuss stuff that looks vaguely like what I want to do, but I can't really make sense of it.

Can these functions do what I'm trying to do? If so, what's the procedure?

Any help will be *much* appreciated.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Bill.Venables@cmis.csiro.au
2002-Jul-16 07:38 UTC
[R] ANOVA-like tests of geometrically-distributed data
Robert Merkel asks:

> -----Original Message-----
> From: Robert Merkel [mailto:rmerkel at venus.it.swin.edu.au]
> Sent: Tuesday, July 16, 2002 4:05 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] ANOVA-like tests of geometrically-distributed data
>
> I have a statistical problem which has given me no end of grief recently,
> and am posting here in the hope that somebody can give me a straight
> answer. I'm an IT postgrad, not a statistician, so people may have to
> speak really slowly and clearly for me to get it :)

[WNV] Indeed. I'm a bit surprised you seem to think that all the help you need is a bit of quick advice by email. This stuff can be rather tricky, like a lot of statistics.

> I am collecting simulation data, and the results are geometrically
> distributed (or approximately so). From what I can gather from my stats
> books, provided the sample size is large enough (>30 or so) I can use t
> and z-tests to compare means under different experimental conditions as
> the CLT says that the sample means will be approximately normally
> distributed.
>
> However, also as I understand it, the ANOVA explicitly assumes that the
> population is normally distributed, which is an assumption that in my case
> is not satisfied.
>
> I have also been told that something called a "generalized linear model"
> can be used to perform ANOVA-like statistics on geometrically-distributed
> data, but not how.
>
> There is R documentation on a function "glm" and also "anova.glm" which
> discuss stuff that looks vaguely like what I want to do, but I can't
> really make sense of it.

[WNV] The geometric distribution is a special case of the negative binomial, which can indeed be fitted using glm, but you need the negative.binomial( ) function from MASS (or some equivalent) to provide the family. You will need to sort out what special value of theta corresponds to the geometric distribution. It could be 1, but I'm really not sure.

This information may be useful to whoever can advise you more fully. Have fun.

> Can these functions do what I'm trying to do?

[WNV] Easy, yes.

> If so, what's the procedure?

[WNV] Much harder to give in detail here.

> Any help will be *much* appreciated.

[WNV] Even if the news is not good?
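A note on the suggestion above, offered as a minimal sketch rather than a definitive recipe. In the MASS parametrization the negative binomial has variance mu + mu^2/theta, and with theta = 1 its probability function reduces to p(1-p)^y with p = 1/(1+mu), which is the geometric distribution on 0, 1, 2, ... So theta = 1 does appear to be the value wanted. The data frame `sim`, the factor `condition`, the response `y` and the success probabilities below are all hypothetical, made up purely for illustration; real data would replace the simulated frame.

```r
library(MASS)  # provides the negative.binomial() family

set.seed(1)

# Hypothetical simulation output: three conditions, 50 runs each.
# rgeom() counts failures before the first success (support 0, 1, 2, ...);
# if your counts start at 1 (trials up to and including the success),
# subtract 1 before fitting.
sim <- data.frame(
  condition = factor(rep(c("A", "B", "C"), each = 50)),
  y = c(rgeom(50, prob = 0.20),
        rgeom(50, prob = 0.25),
        rgeom(50, prob = 0.10))
)

# Negative binomial GLM with theta fixed at 1, i.e. the geometric case
# (log link by default).
fit <- glm(y ~ condition, family = negative.binomial(theta = 1), data = sim)

# Analysis of deviance, the GLM analogue of a one-way ANOVA table.
# dispersion = 1 treats the data as exactly geometric; leave it out to
# have R estimate the dispersion from the residuals instead.
aod <- anova(fit, test = "Chisq", dispersion = 1)
print(aod)

# Coefficients are on the log scale, as contrasts against condition "A".
print(summary(fit, dispersion = 1))
```

If the data are only approximately geometric, comparing the results with and without `dispersion = 1` is one rough way to see how much the conclusions depend on that assumption.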
Hi Robert,

if you are not a statistician and you have a statistical problem, please consult a statistician (or someone familiar with statistics)! I think that the different professions (statisticians, medical doctors, ...) are very useful to each other for improving everything (including joint work and papers).

In my opinion, the availability of a PC with free statistical software (e.g. R) is not sufficient to understand statistics or to solve statistical problems correctly. Your words "I have also been told that something called a "generalized linear model"" are self-explanatory, I believe.

regards,
vito

----- Original Message -----
From: "Robert Merkel" <rmerkel at venus.it.swin.edu.au>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, July 16, 2002 8:04 AM
Subject: [R] ANOVA-like tests of geometrically-distributed data

> I have a statistical problem which has given me no end of grief recently,
> and am posting here in the hope that somebody can give me a straight
> answer. I'm an IT postgrad, not a statistician, so people may have to
> speak really slowly and clearly for me to get it :)
>
> I am collecting simulation data, and the results are geometrically
> distributed (or approximately so). From what I can gather from my stats
> books, provided the sample size is large enough (>30 or so) I can use t
> and z-tests to compare means under different experimental conditions as
> the CLT says that the sample means will be approximately normally
> distributed.
>
> However, also as I understand it, the ANOVA explicitly assumes that the
> population is normally distributed, which is an assumption that in my case
> is not satisfied.
>
> I have also been told that something called a "generalized linear model"
> can be used to perform ANOVA-like statistics on geometrically-distributed
> data, but not how.
>
> There is R documentation on a function "glm" and also "anova.glm" which
> discuss stuff that looks vaguely like what I want to do, but I can't
> really make sense of it.
>
> Can these functions do what I'm trying to do? If so, what's the
> procedure?
>
> Any help will be *much* appreciated.