Emmanuel Charpentier
2009-Apr-18 17:28 UTC
[R] Modelling an "incomplete Poisson" distribution ?
Dear list, I have the following problem : I want to model a series of observations of a given hospital activity on various days under various conditions. among my "outcomes" (dependent variables) is the number of patients for which a certain procedure is done. The problem is that, when no relevant patient is hospitalized on said day, there is no observation (for which the "number of patients" item would be 0). My goal is to model this number of patients as a function of the "various conditions" described by my independant variables, mosty of them observed but uncontrolled, some of them unobservable (random effects). I am tempted to model them along the lines of : glm(NoP~X+Y+..., data=MyData, family=poisson(link=log)) or (accounting for some random effects) : lmer(NoP~X+Y....+(X|Center)), data=Mydata, family=poisson(link=log)) While the preliminary analysis suggest that (the right part of) a Poisson distribution might be reasonable for all real observations, the lack of observations with count==0 bothers me. Is there a way to cajole glm (and lmer, by the way) into modelling these data to an "incomplete Poisson" model, i. e. with unobserved "0" values ? Sincerely, Emmanuel Charpentier
Emmanuel Charpentier
2009-Apr-18 19:12 UTC
[R] Modelling an "incomplete Poisson" distribution ?
I forgot to add that yes, I've done my homework, and that it seems to me that answers pointing to zero-inflated Poisson (and negative binomial) are irrelevant ; I do not have a mixture of distributions but only part of one distribution, or, if you'll have it, a "zero-deflated Poisson". An answer by Brian Ripley (http://finzi.psych.upenn.edu/R/Rhelp02/archive/11029.html) to a similar question leaves me a bit dismayed : if it is easy to compute the probability function of this zero-deflated RV (off the top of my head, Pr(X=x)=e^-lambda.lambda^x/(x!.(1-e^-lambda))), and if I think that I'm (still) able to use optim to ML-estimate lambda, using this to (correctly) model my problem set and test it amounts to re-writing some (large) part of glm. Furthermore, I'd be a bit embarrassed to test it cleanly (i. e. justifiably) : out of the top of my head, only the likelihood ration test seems readily applicable to my problem. Testing contrasts in my covariates ... hum ! So if someone has written something to that effect, I'd be awfully glad to use it. A not-so-cursory look at the existing packages did not ring any bells to my (admittedly untrained) ears... Of course, I could also bootstrap the damn thing and study the distribution of my contrasts. I'd still been hard pressed to formally test hypotheses on factors... Any ideas ? Emmanuel Charpentier Le samedi 18 avril 2009 ? 19:28 +0200, Emmanuel Charpentier a ?crit :> Dear list, > > I have the following problem : I want to model a series of observations > of a given hospital activity on various days under various conditions. > among my "outcomes" (dependent variables) is the number of patients for > which a certain procedure is done. The problem is that, when no relevant > patient is hospitalized on said day, there is no observation (for which > the "number of patients" item would be 0). > > My goal is to model this number of patients as a function of the > "various conditions" described by my independant variables, mosty of > them observed but uncontrolled, some of them unobservable (random > effects). I am tempted to model them along the lines of : > > glm(NoP~X+Y+..., data=MyData, family=poisson(link=log)) > > or (accounting for some random effects) : > > lmer(NoP~X+Y....+(X|Center)), data=Mydata, family=poisson(link=log)) > > While the preliminary analysis suggest that (the right part of) a > Poisson distribution might be reasonable for all real observations, the > lack of observations with count==0 bothers me. > > Is there a way to cajole glm (and lmer, by the way) into modelling these > data to an "incomplete Poisson" model, i. e. with unobserved "0" > values ? > > Sincerely, > > Emmanuel Charpentier > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >