thr3ads.net - R help - [R] weights vs. offset (negative binomial regression) [Oct 2023]

유준택

2023-Oct-28 07:30 UTC

[R] weights vs. offset (negative binomial regression)

Colleagues,



I have a dataset that includes five variables.

- Catch: the catch number counted in some species (ind.)

- Effort: fishing effort (the number of fishing vessels)

- xx1, xx2, xx3: some environmental factors

Following an overdispersion test on the "Catch" variable, I modeled it with a
negative binomial distribution using a GLM. The "Effort" variable showed a
gradually decreasing trend during the study period. I was able to get the
results I wanted when I used "Effort" as the weights in the negative binomial
regression, as follows:



library(qcc)

library(MASS)  # provides glm.nb(), used below

Catch=c(25,2,7,6,75,5,1,4,66,15,9,25,40,8,7,4,36,11,1,14,141,9,74,38,126,3)

Effort=c(258,258,258,258,258,258,258,254,252,252,252,252,252,252,252,252,252,252,252,248,246,246,246,246,246,246)

xx1=c(0.8,0.5,1.2,0.5,1.1,1.1,1.0,0.6,0.9,0.5,1.2,0.6,1.2,0.7,1.0,0.6,1.6,0.7,0.8,0.6,1.7,0.9,1.1,0.5,1.4,0.5)

xx2=c(1.7,1.6,2.7,2.6,1.5,1.5,2.8,2.5,1.7,1.9,2.2,2.4,1.6,1.4,3.0,2.4,1.4,1.5,2.2,2.3,1.7,1.7,1.9,1.9,1.4,1.4)

xx3=c(188,40,2,10,210,102,117,14,141,28,48,15,220,115,10,14,320,20,3,10,400,150,145,160,460,66)

#

edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

#

qcc.overdispersion.test(edata$Catch, type="poisson")

#

summary(glm.nb(Catch~xx1+xx2+xx3, weights=Effort, data=edata))

summary(glm.nb(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata))



I am not sure whether applying weights to the negative binomial regression
in this way is correct. I also wonder whether there is a better way of
doing this. Can anyone help?

Ben Bolker

2023-Oct-28 18:21 UTC

Using an offset of log(Effort), as in your second model, is the more
standard way to approach this problem; it corresponds to assuming that
catch is strictly proportional to effort. Adding log(Effort) as a
covariate (as illustrated below) tests whether a power-law model (catch
proportional to Effort^(b+1), b != 0) is a better description of the data.
(In this case it is not, although the confidence intervals on b are very
wide, indicating that we have very little information; this is not
surprising, since the proportional range of effort is very small
(246-258) in this data set.)
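
To make the "b+1" exponent concrete, here is a minimal sketch on simulated data (not the data from this thread). The data frame `sim`, the true exponent of 1.3, and the use of a Poisson family instead of negative binomial are all assumptions chosen only to keep the illustration simple. With a log link, offset(log(Effort)) fixes the coefficient of log(Effort) at 1, so a covariate log(Effort) added on top of the offset estimates b, the departure from strict proportionality.

```r
## Simulated illustration only: 'sim', the sample size, and the true
## exponent (1.3) are hypothetical values, not taken from the thread.
set.seed(1)
n <- 200
sim <- data.frame(Effort = runif(n, 50, 500), x = rnorm(n))
sim$Catch <- rpois(n, exp(-3 + 0.5 * sim$x + 1.3 * log(sim$Effort)))

## Strict proportionality: coefficient of log(Effort) fixed at 1
m_offset <- glm(Catch ~ x + offset(log(Effort)), family = poisson, data = sim)

## Offset plus covariate: the log(Effort) coefficient is b
m_power <- update(m_offset, . ~ . + log(Effort))

## No offset: the log(Effort) coefficient is the full exponent, b + 1
m_free <- glm(Catch ~ x + log(Effort), family = poisson, data = sim)

b <- coef(m_power)[["log(Effort)"]]
exponent_free <- coef(m_free)[["log(Effort)"]]

## The two parameterizations describe the same model
c(b_plus_1 = b + 1, exponent_free = exponent_free)
```

The last two fits are the same model written two ways, which is why a confidence interval for b that includes 0 is consistent with strict proportionality.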

   In general you should *not* check overdispersion of the raw data
(i.e., the *marginal distribution* of the data); you should instead check
overdispersion of a fitted (e.g. Poisson) model, as below.
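
As a rough base-R sketch of what checking a fitted model can look like, the Pearson chi-square statistic of a Poisson fit can be divided by its residual degrees of freedom; values well above 1 suggest overdispersion conditional on the covariates. This uses the data from the original post and is only an approximation of the idea, not necessarily the exact computation that performance::check_overdispersion() performs.

```r
## Data copied from the original post
Catch  <- c(25,2,7,6,75,5,1,4,66,15,9,25,40,8,7,4,36,11,1,14,141,9,74,38,126,3)
Effort <- c(258,258,258,258,258,258,258,254,252,252,252,252,252,252,252,252,
            252,252,252,248,246,246,246,246,246,246)
xx1 <- c(0.8,0.5,1.2,0.5,1.1,1.1,1.0,0.6,0.9,0.5,1.2,0.6,1.2,0.7,1.0,0.6,1.6,
         0.7,0.8,0.6,1.7,0.9,1.1,0.5,1.4,0.5)
xx2 <- c(1.7,1.6,2.7,2.6,1.5,1.5,2.8,2.5,1.7,1.9,2.2,2.4,1.6,1.4,3.0,2.4,1.4,
         1.5,2.2,2.3,1.7,1.7,1.9,1.9,1.4,1.4)
xx3 <- c(188,40,2,10,210,102,117,14,141,28,48,15,220,115,10,14,320,20,3,10,
         400,150,145,160,460,66)
edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

## Poisson fit with the same linear predictor and offset as the NB model
g0 <- glm(Catch ~ xx1 + xx2 + xx3 + offset(log(Effort)),
          family = poisson, data = edata)

## Pearson chi-square and the dispersion ratio (about 1 under a Poisson model)
pearson_chisq <- sum(residuals(g0, type = "pearson")^2)
dispersion <- pearson_chisq / df.residual(g0)

## Approximate chi-square test of "no overdispersion"
p_value <- pchisq(pearson_chisq, df = df.residual(g0), lower.tail = FALSE)
c(dispersion = dispersion, p_value = p_value)
```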

   cheers
    Ben Bolker


edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

## graphical exploration

library(ggplot2); theme_set(theme_bw())
library(tidyr)
## reshape so that each environmental covariate gets its own facet
edata_long <- edata |>
     pivot_longer(names_to = "var", cols = -c("Catch", "Effort"))
ggplot(edata_long, aes(value, Catch)) +
     geom_point(alpha = 0.2, aes(size = Effort)) +
     facet_wrap(~var, scales = "free_x") +
     geom_smooth(method = "glm",
                 method.args = list(family = "quasipoisson"))
#

library(MASS)
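## g1: negative binomial, catch strictly proportional to effort (offset)
g1 <- glm.nb(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata)
## g2: adds log(Effort) as a covariate (power-law alternative)
g2 <- update(g1, . ~ . + log(Effort))
## g0: Poisson fit, used only to check overdispersion of the fitted model
g0 <- glm(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata,
           family = poisson)
performance::check_overdispersion(g0)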
summary(g1)
summary(g2)
options(digits = 3)
confint(g2)
summary(g1)
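
On the original question of weights versus offset, the following sketch (an addition, not part of the reply above) compares standard errors across the unweighted, weighted, and offset fits on the posted data. It relies on the fact that `weights` in glm() and glm.nb() are prior weights that multiply each observation's log-likelihood contribution, so weights = Effort treats every row roughly as if it had been observed about 250 times. The helper `se()` is a hypothetical convenience function defined here.

```r
library(MASS)  # glm.nb()

## Data copied from the original post
Catch  <- c(25,2,7,6,75,5,1,4,66,15,9,25,40,8,7,4,36,11,1,14,141,9,74,38,126,3)
Effort <- c(258,258,258,258,258,258,258,254,252,252,252,252,252,252,252,252,
            252,252,252,248,246,246,246,246,246,246)
xx1 <- c(0.8,0.5,1.2,0.5,1.1,1.1,1.0,0.6,0.9,0.5,1.2,0.6,1.2,0.7,1.0,0.6,1.6,
         0.7,0.8,0.6,1.7,0.9,1.1,0.5,1.4,0.5)
xx2 <- c(1.7,1.6,2.7,2.6,1.5,1.5,2.8,2.5,1.7,1.9,2.2,2.4,1.6,1.4,3.0,2.4,1.4,
         1.5,2.2,2.3,1.7,1.7,1.9,1.9,1.4,1.4)
xx3 <- c(188,40,2,10,210,102,117,14,141,28,48,15,220,115,10,14,320,20,3,10,
         400,150,145,160,460,66)
edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

## Three ways of (not) accounting for effort
m_unweighted <- glm.nb(Catch ~ xx1 + xx2 + xx3, data = edata)
m_weighted   <- glm.nb(Catch ~ xx1 + xx2 + xx3, weights = Effort, data = edata)
m_offset     <- glm.nb(Catch ~ xx1 + xx2 + xx3 + offset(log(Effort)),
                       data = edata)

## Hypothetical helper: coefficient standard errors from the fitted model
se <- function(m) sqrt(diag(vcov(m)))

## Standard errors side by side; the weighted fit's are far smaller
cbind(unweighted = se(m_unweighted),
      weighted   = se(m_weighted),
      offset     = se(m_offset))

## Shrinkage factor of the standard errors due to the weights
se(m_unweighted) / se(m_weighted)
```

If the weighted standard errors come out much smaller, that would explain how the weighted fit can produce "significant" results without any additional information in the data, which is a reason to prefer the offset formulation here.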



