thr3ads.net - R help - [R] clusters in zero-inflated negative binomial models [May 2012]


Lies Durnez

2012-May-16 14:13 UTC

[R] clusters in zero-inflated negative binomial models

Dear all,

I want to build a model in R based on animal collection data that look like the
following:

Nr	Village	District	Site	Survey	Species	Count
1	AX	A	F	Dry	B	0
2	AY	A	V	Wet	A	5
3	BX	B	F	Wet	B	1
4	BY	B	V	Dry	B	0

Each data point shows one collection unit in a certain Village, District, Site,
and Survey for a certain Species. 'Count' is the number of animals
collected in that collection unit. It is possible that zero animals are
collected in that unit because of very low densities, but also because of
climatic conditions (wind, rain, etc.), so we would expect an excess of zeroes. I
have tested that the data are overdispersed (variance much bigger than the mean),
so a zero-inflated negative binomial model seems the most suitable model in this
case. To be sure, I will compare the zero-inflated model to the standard
negative binomial model using the Vuong test. The models will be fitted for each
species separately. For these models I can use glm.nb() in the package MASS and
zeroinfl() in the package pscl, looking something like this (after selecting
the subset B <- subset(data, Species == "B")):
library(MASS)   # glm.nb()
library(pscl)   # zeroinfl(), vuong()
NB <- glm.nb(Count ~ District + Site + Survey, data = B)
ZINB <- zeroinfl(Count ~ District + Site + Survey, dist = "negbin", data = B)
vuong(NB, ZINB)   # the pscl function is lowercase vuong()
I have tried this and it works very elegantly.
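
As a rough illustration of the overdispersion and excess-zero checks described
above, here is a minimal base-R sketch. The data are simulated purely for
illustration (they are not the real collection data), and the names
dispersion_ratio, observed_zero_fraction and poisson_zero_fraction are
hypothetical helpers, not part of any package.

```r
# Simulated stand-in for one species' counts (illustrative only)
set.seed(1)
n <- 200
# 1 = nothing could be collected (e.g. bad weather), so the count is forced to zero
is_structural_zero <- rbinom(n, size = 1, prob = 0.3)
nb_counts <- rnbinom(n, mu = 3, size = 0.8)
Count <- ifelse(is_structural_zero == 1, 0L, nb_counts)

# Variance-to-mean ratio: values well above 1 suggest overdispersion
# relative to a Poisson model
dispersion_ratio <- var(Count) / mean(Count)

# Compare the observed share of zeroes with the share a Poisson
# distribution with the same mean would produce
observed_zero_fraction <- mean(Count == 0)
poisson_zero_fraction <- exp(-mean(Count))

print(c(dispersion = dispersion_ratio,
        zeros_observed = observed_zero_fraction,
        zeros_poisson = poisson_zero_fraction))
```

These are only descriptive checks; they do not replace the formal model
comparison with the Vuong test.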

However, the animal collections were only done in 4 districts, and in each
district 3 villages were chosen (a total of 12 villages). This should be
included in the design. The package survey allows this for the standard negative
binomial model, but it seems to me that it is not possible for the zero-inflated
NB. So, my question is two-fold:
1. Is a zero-inflated NB model possible in the survey package? If yes, how?
2. If not, how can I build a zero-inflated NB model that takes into account the
clustering of the observations (animal counts) in villages and the clustering of
the villages in districts?

Thank you very much for the help.

Ben Bolker

2012-May-16 21:26 UTC

[R] clusters in zero-inflated negative binomial models

Lies Durnez <ldurnez <at> itg.be> writes:
> I want to build a model in R based on animal collection data, that look
> like the following
>
> Nr	Village	District	Site	Survey	Species	Count
> 1	AX	A	F	Dry	B	0
> 2	AY	A	V	Wet	A	5
> 3	BX	B	F	Wet	B	1
> 4	BY	B	V	Dry	B	0
> Each data point shows one collection unit in a certain Village,
> District, Site, and Survey for a certain Species. 'Count' is the
> number of animals collected in that collection unit. It is possible
> that zero animals are collected in that unit because of very low
> densities, but also because of climatic conditions (wind, rain,
> etc), so we would expect an excess in zeroes. I have tested that the
> data are overdispersed (variance much bigger than mean), so a
> zero-inflated negative binomial model seems the most suitable model
> in this case.
 [snip snip snip]
> However, the animal collections were only done in 4 districts, and
> in each district 3 villages were chosen (a total of 12
> villages). This should be included in the design. The package survey
> allows this for the standard negative binomial model, but it seems
> to me that it is not possible for the zero-inflated NB. So, my
> question is two-fold: 1. Is a zero-inflated NB possible in the
> survey package. If yes, how?  2. If no, how can I build a
> zero-inflated NB model that takes into account the clustering of the
> observations (animal counts) in villages and the clustering of the
> villages in districts.
  Treating villages and districts as random effects (clusters)
basically puts you in the domain of generalized linear mixed models.
You can use the glmmADMB package to fit zero-inflated, mixed negative
binomial models.  You can also use the MCMCglmm package to fit
lognormal-Poisson models, which are another kind of model for
overdispersed count data (it depends how strongly you require that the
actual model be NB as opposed to just a reasonable model for
overdispersed count data).
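
A minimal sketch of the glmmADMB route, under stated assumptions: the data
frame B below is simulated for illustration only, the column names follow the
original post, and the model is fitted only if glmmADMB happens to be
installed (it is not distributed via CRAN). The formula with random intercepts
for both District and Village is one possible specification, not necessarily
the best one for the real data.

```r
# Simulated stand-in for the subset of one species (illustrative only)
set.seed(42)
B <- expand.grid(rep = 1:10,
                 Village = c("X", "Y", "Z"),
                 District = c("A", "B", "C", "D"),
                 Site = c("F", "V"),
                 Survey = c("Dry", "Wet"))
# Give villages unique labels (AX, AY, ...) and make the grouping variable a factor
B$Village <- factor(paste0(B$District, B$Village))

# Village-level random intercepts on the log scale, plus 30% structural zeroes
village_effect <- rnorm(nlevels(B$Village), sd = 0.5)
mu <- exp(0.5 + village_effect[as.integer(B$Village)])
B$Count <- ifelse(rbinom(nrow(B), 1, 0.3) == 1,
                  0L,
                  rnbinom(nrow(B), mu = mu, size = 1))

# Fit a zero-inflated NB mixed model only if glmmADMB is available
if (requireNamespace("glmmADMB", quietly = TRUE)) {
  zinb_mixed <- glmmADMB::glmmadmb(
    Count ~ Site + Survey + (1 | District) + (1 | Village),
    data = B,
    family = "nbinom",       # negative binomial counts
    zeroInflation = TRUE)    # constant zero-inflation probability
  print(summary(zinb_mixed))
}
```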

Four districts is not very many for estimating an among-district variance
(which is basically what you are doing when you fit a clustered/mixed
model), so I might suggest using district as a fixed effect,
but then using district:village (i.e. the interaction between district
and village, or village alone if the villages are uniquely labeled)
as the random effect.
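
The point about unique labels can be sketched in base R. The labels below are
hypothetical: they assume villages were coded X, Y, Z within each district,
which is not the case in the posted data (AX, AY, BX, ... are already unique).
The formula at the end uses lme4-style random-effect syntax as an assumption
about how the model would be written; it is only constructed here, not fitted.

```r
# Hypothetical coding: the same three village labels reused in every district
d <- expand.grid(VillageInDistrict = c("X", "Y", "Z"),
                 District = c("A", "B", "C", "D"))

# With reused labels there appear to be only 3 villages, so villages from
# different districts would wrongly share one random effect
n_nonunique <- nlevels(factor(d$VillageInDistrict))

# The district:village interaction recovers the 12 distinct villages
d$Village <- interaction(d$District, d$VillageInDistrict, drop = TRUE)
n_unique <- nlevels(d$Village)

print(c(reused_labels = n_nonunique, unique_villages = n_unique))

# District as a fixed effect, uniquely labeled village as the grouping factor
model_formula <- Count ~ District + Site + Survey + (1 | Village)
```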

  glmm.wikidot.com/faq may be useful.

  I would suggest that you send follow-ups to the
r-sig-mixed-models <at> r-project.org mailing list.
