Williamson, Michael
2021-Apr-20 09:18 UTC
[R] mgcv: bam warning messages and non-convergence
I have a large dataset of 118225 observations and 16 columns, so I've been using bam, rather than gam, for my analyses. The response variable is count data, but it is overdispersed, so I thought I'd use a negative binomial model. I have 5 explanatory variables, all of which are biologically important. Two are numerical and 3 are categorical. I've only applied a smoother to the first numerical explanatory variable because, in some prior analyses, I found that TL had an edf of 1.01 and was therefore effectively linear. I have also included two categorical random effects in the model.

    m3 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
                s(code, bs = 're') + s(monthyear, bs = 're'),
              family = nb(), data = node_dat, method = "REML")
    th <- m3$family$getTheta(TRUE)  # extracts theta
    m3 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
                s(code, bs = 're') + s(monthyear, bs = 're'),
              family = nb(th), data = node_dat, method = "REML")
    summary(m3)

However, I'm getting these warnings and I can't find out what they mean:

    There were 32 warnings (use warnings() to see them)
    > warnings()
    Warning messages:
    1: In pmax(1, y)/mu :
      longer object length is not a multiple of shorter object length
    2: In y * log(pmax(1, y)/mu) :
      longer object length is not a multiple of shorter object length

Is this an issue? The model converges, and I've checked overdispersion again and get this value:

    > E3 <- resid(m3, type = "pearson")
    > sum(E3^2)/m3$df.res
    [1] 0.7436045

So this suggests there is now some underdispersion? Also, the model summary gives:

    > summary(m3)
    Family: Negative Binomial(0.055)
    Link function: log

I've read that the 0.055 is also a measure of dispersion, so which one is correct?

I was confused about all this, and I have a lot of zeros in my data (about 96%), so I thought I'd also try a zero-inflated Poisson; however, it does not converge.
    m4 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
                s(code, bs = 're') + s(monthyear, bs = 're'),
              family = ziP(), data = node_dat, method = "REML")

    Warning message:
    In bgam.fit(G, mf, chunk.size, gp, scale, gamma, method = method, :
      algorithm did not converge

Is there any reason why it does not converge? Maybe a zero-inflated negative binomial would be better, but I'm not sure how to fit one.

I know there's a lot here, but any help would be appreciated.

Many thanks,
Mike

Michael Williamson
London NERC DTP Candidate
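To make the two "dispersion" quantities in the question concrete, here is a minimal sketch on simulated data. It does not use node_dat: the names sim_dat, x, y, and theta_true are hypothetical, and gam is used instead of bam only to keep the example small. In mgcv's nb family the variance is mu + mu^2/theta, so the number printed in "Negative Binomial(...)" is theta, while the Pearson ratio is a separate check of how well that variance function fits. The last lines compare the observed share of zeros with the share the fitted negative binomial would expect, which is one rough way to judge whether zero inflation needs modelling at all.

```r
library(mgcv)

set.seed(1)

# Simulated negative binomial counts with a smooth effect of x.
# All names and values here are assumptions made for illustration.
n <- 2000
x <- runif(n)
mu <- exp(-1 + sin(2 * pi * x))
theta_true <- 0.5
y <- rnbinom(n, size = theta_true, mu = mu)
sim_dat <- data.frame(y = y, x = x)

# Negative binomial GAM with theta estimated as part of the fit
fit <- gam(y ~ s(x), family = nb(), data = sim_dat, method = "REML")

# Estimated theta on its natural scale (the value shown in summary())
theta_hat <- fit$family$getTheta(TRUE)

# Pearson dispersion ratio: values near 1 suggest the variance
# function mu + mu^2/theta describes the data reasonably well
pearson <- residuals(fit, type = "pearson")
disp <- sum(pearson^2) / fit$df.residual

# Observed proportion of zeros versus the proportion implied by the fit
obs_zero <- mean(sim_dat$y == 0)
exp_zero <- mean(dnbinom(0, size = theta_hat, mu = fitted(fit)))

print(c(theta = theta_hat, dispersion = disp,
        observed_zeros = obs_zero, expected_zeros = exp_zero))
```

If the observed and expected zero proportions are close, the negative binomial may already account for the zeros; a large gap would be a reason to look at zero-inflated alternatives. This is a diagnostic sketch under the assumptions above, not a statement about what the real data will show.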