Sorry, bit of a newbie question, and I promise I have searched the forum already, but I'm getting a bit desperate!

I have over-dispersed, zero-inflated data, with variance greater than the mean, suggesting a Zero-Inflated Negative Binomial (ZINB) model, which I attempted in R with the pscl package suggested on http://www.ats.ucla.edu/stat/R/dae/zinbreg.htm

However, my data are non-integer, with some pesky decimals (e.g. 33.12), and zinb / pscl doesn't like that. That is not surprising, as ZINB is for count data, which are normally whole numbers.

Does anyone know of a different ZINB package that will allow non-integers, or an equivalent test/model to ZINB for non-integer data? Or should I try something else, like a quasi-Poisson GLM?

Apologies for the newbie question! Any help much appreciated! Thanks!

--
View this message in context: http://www.nabble.com/Zinb-for-Non-interger-data-tp24550044p24550044.html
Sent from the R help mailing list archive at Nabble.com.
On 18-Jul-09 17:26:36, JPS2009 wrote:
> [...]
> However my data is non-integer with some pesky decimals (i.e. 33.12)
> and zinb / pscl doesn't like that - not surprising as zinb is for
> count data, normally whole integers etc.
>
> Does anyone know of a different zinb package that will allow
> non-integers or and equivalent test/ model to zinb for non-integer
> data? Or should I try something else like a quasi-Poisson GLM?

The presence of decimals suggests that those data values are records of quantities which ought to be modelled as continuous variables.

For instance, in answer to a survey question "How much did you spend on alcoholic drinks yesterday?", the answer would be either a positive sum of money (with decimals) or zero, depending on whether the person spent anything at all on alcohol. So:

  With probability p, the amount spent was positive and, conditional on being positive, has a distribution which can be modelled by a particular continuous distribution (maybe log-normal?).

  With probability (1-p), the amount spent was zero.

So a correct approach first requires you to face the question of how to model the positive part of the distribution. Once you have settled that question, you can then see whether that particular class of problem is covered by some package in R, or whether you need to develop an approach yourself.
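As a rough illustration of the two-part idea above, here is a minimal sketch, assuming the positive part really is log-normal and that there are no covariates. It is written in Python purely for illustration (the thread itself concerns R), and the function names fit_two_part_lognormal and expected_value, as well as the example data, are hypothetical. Under those assumptions the maximum-likelihood estimates have closed forms: p is the fraction of positive values, and mu and sigma are the mean and standard deviation of log(y) over the positive values.

```python
import math


def fit_two_part_lognormal(values):
    """Fit a zero / log-normal two-part model by maximum likelihood.

    Assumes no covariates. Returns (p, mu, sigma), where p is the
    estimated probability of a positive value, and mu and sigma are the
    mean and standard deviation of log(y) among the positive values.
    """
    if not values:
        raise ValueError("values must not be empty")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")

    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("need at least one positive value")

    # Part 1: Bernoulli model for "zero" versus "positive".
    p = len(positives) / len(values)

    # Part 2: log-normal model for the positive values only.
    logs = [math.log(v) for v in positives]
    mu = sum(logs) / len(logs)
    # MLE of the variance divides by n, not n - 1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return p, mu, sigma


def expected_value(p, mu, sigma):
    """Overall mean implied by the fitted model: p * E[Y | Y > 0]."""
    return p * math.exp(mu + sigma ** 2 / 2)


if __name__ == "__main__":
    # Hypothetical data: many zeros plus some positive decimals.
    data = [0, 0, 0, 33.12, 12.5, 0, 4.75, 0, 20.0, 8.0]
    p, mu, sigma = fit_two_part_lognormal(data)
    print("P(positive) =", p)
    print("mu =", mu, "sigma =", sigma)
    print("implied mean =", expected_value(p, mu, sigma))
```

With covariates, the same split would presumably apply: one regression for zero versus positive and a second regression for the positive values, each fitted on its own part of the data.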
In any case, if I am barking up the right tree above, neither the negative binomial nor the Poisson would, in principle, be correct for such data since, as you observe, these are intended for count data, not for data which are essentially continuous.

Hoping this helps,
Ted.

--
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Date: 19-Jul-09  Time: 12:25:39
JPS2009 wrote:
> [...]
> Does anyone know of a different zinb package that will allow non-integers
> or and equivalent test/ model to zinb for non-integer data? Or should I
> try something else like a quasi-Poisson GLM?

Is it really non-integer, or is it a density (in which case you could use NB + offset)?

The quasi-Poisson will not help you with the zero inflation. I'm afraid you will have to do some hard programming, combining a 0-1 binomial part with a continuous distribution for the second part of the data (the positive values), and I guess the easiest way to do this is with MCMC. Perhaps the Gamma distribution can be used? You would have to adjust all the likelihood equations, as the Gamma doesn't allow for zeros. But perhaps another continuous distribution is more appropriate; it depends on your data.

Alain Zuur
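For reference, the adjusted likelihood for such a binomial-plus-continuous model can be written down explicitly. The sketch below assumes a Gamma density with shape alpha and rate beta for the positive values, no covariates, and p for the probability of a positive value; the symbols alpha, beta, n_0 and n_+ are notation introduced here for illustration, not taken from the thread.

```latex
% y_1, ..., y_n: observations; n_0 = number of zeros, n_+ = number of positive values
f(y;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, y^{\alpha-1} e^{-\beta y}, \qquad y > 0

L(p,\alpha,\beta) = \prod_{i:\,y_i = 0} (1-p) \prod_{i:\,y_i > 0} p\, f(y_i;\alpha,\beta)

\log L(p,\alpha,\beta) = n_0 \log(1-p) + n_+ \log p + \sum_{i:\,y_i > 0} \log f(y_i;\alpha,\beta)
```

Under these assumptions the log-likelihood splits into a term involving only p and a term involving only (alpha, beta). When the two parts share no parameters, they can therefore be estimated separately: p from the zero/positive indicator, and the Gamma parameters from the positive values alone.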