Williamson, Michael
2021-Apr-20 09:18 UTC
[R] mgcv: bam warning messages and non-convergence
I have a large dataset (118225 observations, 16 columns), so I've been using
bam rather than gam for my analyses.
The response variable is count data, but it's overdispersed, so I thought I'd
use a negative binomial model. I have 5 explanatory variables, all of which are
biologically important: two are numerical and three are categorical. I've only
applied a smoother to the first numerical explanatory variable because, in
prior analyses, TL had an edf of 1.01 and was therefore effectively linear. I
have also included two categorical random effects in the model.
m3 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
            s(code, bs = 're') + s(monthyear, bs = 're'),
          family = nb(), data = node_dat, method = "REML")
th <- m3$family$getTheta(TRUE)  # extract the estimated theta
m3 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
            s(code, bs = 're') + s(monthyear, bs = 're'),
          family = nb(th), data = node_dat, method = "REML")
summary(m3)
However, I'm getting the following warnings and I can't find out what they mean:
There were 32 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In pmax(1, y)/mu :
longer object length is not a multiple of shorter object length
2: In y * log(pmax(1, y)/mu) :
longer object length is not a multiple of shorter object length
Is this an issue? The model converges, and I've checked for overdispersion again
and get this value:
> E3 <- resid(m3, type = "pearson")
> sum(E3^2)/m3$df.res
[1] 0.7436045
Does this suggest there is now some underdispersion? Also, the model summary
gives:
> summary(m3)
Family: Negative Binomial(0.055)
Link function: log
I've read that the 0.055 is also a measure of dispersion, so which one is
correct?
I was confused by all this, and since I have a lot of zeros in my data (about
96%), I thought I'd also try a zero-inflated Poisson model; however, it does not
converge.
m4 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
            s(code, bs = 're') + s(monthyear, bs = 're'),
          family = ziP(), data = node_dat, method = "REML")
Warning message:
In bgam.fit(G, mf, chunk.size, gp, scale, gamma, method = method, :
algorithm did not converge
Is there any reason why it does not converge? Maybe a zero-inflated negative
binomial would be better, but I'm not sure how to fit one.
I know there's a lot here, but any help would be appreciated.
Many thanks,
Mike
Michael Williamson
London NERC DTP Candidate
Email: michael.williamson at kcl.ac.uk
Phone: +447764836592 Skype: mikejwilliamson Twitter: @mjw_marine
Website: www.thenetlab.uk
Most recent paper:
Williamson, M. J. et al. (2021). Analysing detection gaps in acoustic telemetry
data to infer differential movement patterns in fish. Ecology and Evolution, 11,
2717-2730. https://doi.org/10.1002/ece3.7226