Hello, I use GAMM function (MGCV package) in R software to study relationship between pollen and pollinosis. My models include autoregressive terms. Here is an exemple: CupAR7_2<-gamm(nb_rca_2~s(Trend,bs="ps",k=49) +Semaine +s(TX03,bs="ps",k=8) +s(UN03,bs="ps",k=4) ......+pollen01 ,correlation=corARMA(p=7,q=0,form=~Trend|Annees), niter=40,control=lmeControl(maxIter=100000,msMaxIter=100000) ,family=quasipoisson, data=donnees, na=na.omit) Each model need 1/2 hour to 1 hour to get a result and I frequently get the following error messages: Erreur dans lme.formula(fixed = fixed, random = random, data = data, correlation = correlation, : nlminb problem, convergence error code = 1 message = singular convergence (7) Erreur dans na.fail.default(list(Xr.4 = c(0.00374377951791214, 0.00438373736920182, : missing values in object Does my computer lack power,(if so,what is required?) or is R limited in speed of execution or is my syntax not ok. Here are the features of my computer: processor: 3.10 GHz i5-2400 CPU RAM: 4 Go (3.87 Go) OS: 64 bits Windows 7 professionnel (Pack 1) Another question: how can we add a smoothing parameter in the model GAMM to help convergence ("sp =" is not accepted)? Thanks for your help Best regards Sylvie Martin SEPIA-Santé Bureau d'études en épidémiologie, biostatistiques, environnement 31 rue de Pontivy 56 150 BAUD FRANCE tél : 02 97 28 80 38 fax: 02 97 28 81 10 [[alternative HTML version deleted]]
Sylvie: I do not know whether your system lacks sufficient power, but its user -- you -- may. The models you are trying to fit are complex and nonlinear, and possibly depending on your data -- do you have a lot of missingness? How large are your data sets? Can they support your models -- you may well be overfitting. Getting a faster system will not affect this -- you'll just get your error messages faster. ... and I would have thought the missing values error is self-explanatory. Sounds to me like you should spend some time with a local statistical expert. Cheers, Bert On Fri, Oct 25, 2013 at 6:19 AM, Sylvie Martin <smartin_sepia at orange.fr> wrote:> > > Hello, > > I use GAMM function (MGCV package) in R software to study relationship between pollen and pollinosis. My models include autoregressive terms. Here is an exemple: > > CupAR7_2<-gamm(nb_rca_2~s(Trend,bs="ps",k=49) +Semaine > +s(TX03,bs="ps",k=8) +s(UN03,bs="ps",k=4) ......+pollen01 > ,correlation=corARMA(p=7,q=0,form=~Trend|Annees), > niter=40,control=lmeControl(maxIter=100000,msMaxIter=100000) > ,family=quasipoisson, data=donnees, na=na.omit) > > Each model need 1/2 hour to 1 hour to get a result and I frequently get the following error messages: > > Erreur dans lme.formula(fixed = fixed, random = random, data = data, correlation = correlation, : > nlminb problem, convergence error code = 1 > message = singular convergence (7) > > Erreur dans na.fail.default(list(Xr.4 = c(0.00374377951791214, 0.00438373736920182, : > missing values in object > > > Does my computer lack power,(if so,what is required?) or is R limited in speed of execution or is my syntax not ok. Here are the features of my computer: > > processor: 3.10 GHz i5-2400 CPU > RAM: 4 Go (3.87 Go) > OS: 64 bits > Windows 7 professionnel (Pack 1) > > Another question: how can we add a smoothing parameter in the model GAMM to help convergence ("sp =" is not accepted)? > > Thanks for your help > Best regards > > Sylvie Martin > SEPIA-Sant? > > Bureau d'?tudes en ?pid?miologie, > biostatistiques, environnement > > 31 rue de Pontivy > > 56 150 BAUD > > FRANCE > t?l : 02 97 28 80 38 > fax: 02 97 28 81 10 > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
... and, I should have added, your maxiter parameters in the call are set way too high. Lowering them to something more sensible will also get your error messages faster. -- Bert On Fri, Oct 25, 2013 at 6:19 AM, Sylvie Martin <smartin_sepia at orange.fr> wrote:> > > Hello, > > I use GAMM function (MGCV package) in R software to study relationship between pollen and pollinosis. My models include autoregressive terms. Here is an exemple: > > CupAR7_2<-gamm(nb_rca_2~s(Trend,bs="ps",k=49) +Semaine > +s(TX03,bs="ps",k=8) +s(UN03,bs="ps",k=4) ......+pollen01 > ,correlation=corARMA(p=7,q=0,form=~Trend|Annees), > niter=40,control=lmeControl(maxIter=100000,msMaxIter=100000) > ,family=quasipoisson, data=donnees, na=na.omit) > > Each model need 1/2 hour to 1 hour to get a result and I frequently get the following error messages: > > Erreur dans lme.formula(fixed = fixed, random = random, data = data, correlation = correlation, : > nlminb problem, convergence error code = 1 > message = singular convergence (7) > > Erreur dans na.fail.default(list(Xr.4 = c(0.00374377951791214, 0.00438373736920182, : > missing values in object > > > Does my computer lack power,(if so,what is required?) or is R limited in speed of execution or is my syntax not ok. Here are the features of my computer: > > processor: 3.10 GHz i5-2400 CPU > RAM: 4 Go (3.87 Go) > OS: 64 bits > Windows 7 professionnel (Pack 1) > > Another question: how can we add a smoothing parameter in the model GAMM to help convergence ("sp =" is not accepted)? > > Thanks for your help > Best regards > > Sylvie Martin > SEPIA-Sant? > > Bureau d'?tudes en ?pid?miologie, > biostatistiques, environnement > > 31 rue de Pontivy > > 56 150 BAUD > > FRANCE > t?l : 02 97 28 80 38 > fax: 02 97 28 81 10 > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
1) The model is far too complex. I have trouble imagining a system where the trend potentially uses 49 degrees of freedom and yet the unexplained variance within years follows an AR(7). It looks like you have time points within years over many years; is that why you are trying to use such a complex trend, to capture the repeated seasonal patterns? If so, break that down and fit a spline using `bs = "cc"` (or one of the other cyclic splines in mgcv) and a separate spline for the trend. If you want to allow the seasonal pattern to vary with time, then you can do that with multivariate splines, eg, this way you can compare a model 1 m1 <- gamm(yi ~ ti(DayOfYear, bs = "cc") + ti(Time, bs = "cr"), ....) with model 2 m2 <- gamm(yi ~ ti(DayOfYear, bs = "cc") + ti(Time, bs = "cr") + ti(DayOfYear, Time, bs = c("cc","cr")), ....) Of course, you'll need to fix the the ARAM structure at know coefficient (currently you are estimating it) or you aren't really just comparing the "fixed" part of the models. 2) You aren't heeding Simon's warnings about fiddling with lmeControl. You don't want EM iterations, so you need at least to have `lmeControl(niterEM = 0)`, alongside other options A general word of advice with using gamm(), start simply and make sure you have lots of RAM. These models are very complex so if you start with simple models you will be less likely to run into convergence issues and you get results more quickly. Get lots of RAM because models with `correlation` arguments involve inverting potentially big covariance matrices and that can quite quickly chew through RAM. I've had pretty good success fitting models with non-linear trends etc to environmental time series using gamm(), but the only time things went a bit awry was when too complex a model was required (where I had data covering hundreds of thousands of years from an ocean core and the trend really was very wiggly, and then the correlation structure and the trend fought for control as there was an identifiability issue). HTH Gavin On 25 October 2013 07:19, Sylvie Martin <smartin_sepia at orange.fr> wrote:> > > Hello, > > I use GAMM function (MGCV package) in R software to study relationship between pollen and pollinosis. My models include autoregressive terms. Here is an exemple: > > CupAR7_2<-gamm(nb_rca_2~s(Trend,bs="ps",k=49) +Semaine > +s(TX03,bs="ps",k=8) +s(UN03,bs="ps",k=4) ......+pollen01 > ,correlation=corARMA(p=7,q=0,form=~Trend|Annees), > niter=40,control=lmeControl(maxIter=100000,msMaxIter=100000) > ,family=quasipoisson, data=donnees, na=na.omit) > > Each model need 1/2 hour to 1 hour to get a result and I frequently get the following error messages: > > Erreur dans lme.formula(fixed = fixed, random = random, data = data, correlation = correlation, : > nlminb problem, convergence error code = 1 > message = singular convergence (7) > > Erreur dans na.fail.default(list(Xr.4 = c(0.00374377951791214, 0.00438373736920182, : > missing values in object > > > Does my computer lack power,(if so,what is required?) or is R limited in speed of execution or is my syntax not ok. Here are the features of my computer: > > processor: 3.10 GHz i5-2400 CPU > RAM: 4 Go (3.87 Go) > OS: 64 bits > Windows 7 professionnel (Pack 1) > > Another question: how can we add a smoothing parameter in the model GAMM to help convergence ("sp =" is not accepted)? > > Thanks for your help > Best regards > > Sylvie Martin > SEPIA-Sant? > > Bureau d'?tudes en ?pid?miologie, > biostatistiques, environnement > > 31 rue de Pontivy > > 56 150 BAUD > > FRANCE > t?l : 02 97 28 80 38 > fax: 02 97 28 81 10 > [[alternative HTML version deleted]] > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Gavin Simpson, PhD