Robbie Weterings
2013-Jul-02 12:01 UTC
[R] Non-linear modelling with several variables including a categorical variable
Hello everyone,

I am trying to model data from a predator-prey interaction experiment (n = 26). Predation rate is my response variable and I have 4 explanatory variables: predator density (1, 2, 3, 4, 5), predator size, prey density (5, 10, 15, 20, 25, 30) and prey type (3 categories). I started with several linear models (glm) and found, as expected, that prey and predator density were non-linearly related to predation rate. If I log-transform these variables I get really nice curves and an adjusted R2 of 0.82, but that is not really the right approach for modelling non-linear relationships. Therefore I switched to non-linear least squares regression (nls). I have several predator-prey models based on the existing ecological literature, e.g.:

model1 <- nls(rates ~ (a * prey)/(1 + b * prey), start = list(a = 0.27, b = 0.13), trace = TRUE)  ### Holling's type II functional response

model2 <- nls(rates ~ (a * prey)/(1 + (b * prey) + c * (pred - 1)), start = list(a = 0.22451, b = -0.18938, c = 1.06941), trace = TRUE, subset = I1)  ### Beddington-DeAngelis functional response

These models work perfectly, but now I want to add prey type as well. In the linear models prey type was the most important variable, so I don't want to leave it out. I understand that you can't add categorical variables in nls, so I thought I would try a generalized additive model (gam).

The problem with the gam models is that the smoothers (both spline and loess) don't work on both variables, because there are only a very restricted number of values for prey density and predator density. I can manage to get a model with a single variable smoothed using loess, but for two variables it simply does not work. The spline function does not work at all because I have so few values (5) for my variables (see model4).

model3 <- gam(rates ~ lo(pred, span = 0.9) + prey)  ## this one actually works, but does not include a smoother for prey

model4 <- gam(rates ~ s(pred) + prey)  ## this one gives problems: "A term has fewer unique covariate combinations than specified maximum degrees of freedom"

My question is: are there any other ways to model data with 2 non-linearly related variables in which I can also include a categorical variable? I would prefer to use nls (model2) with, for example, a different intercept for each category, but I'm not sure how to set this up, if it is possible at all. The dataset is too small to split into the three categories; moreover, one of the categories contains only 5 data points.

Any help would be really appreciated.

With kind regards,
--
Robbie Weterings
Project Manager, Cat Drop Thailand
<http://www.catdropfoundation.org>
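The two functional-response fits in the message above can be sketched on simulated data as follows. This is only an illustration: the data frame `d`, the "true" parameter values and the noise level are assumptions, since the original 26 observations are not part of the thread, and the `subset = I1` argument is left out because `I1` is not defined anywhere in the post.

```r
set.seed(1)

# Simulated stand-in for the experiment: every prey density crossed with
# every predator density (an assumed design, not the original data).
d <- expand.grid(prey = c(5, 10, 15, 20, 25, 30), pred = 1:5)

# Generate rates from a Beddington-DeAngelis curve with arbitrary "true"
# parameters (a = 0.5, b = 0.05, c = 0.3) plus Gaussian noise.
d$rates <- with(d, 0.5 * prey / (1 + 0.05 * prey + 0.3 * (pred - 1))) +
  rnorm(nrow(d), sd = 0.1)

# Holling type II: prey density only.
model1 <- nls(rates ~ (a * prey) / (1 + b * prey),
              data = d, start = list(a = 0.3, b = 0.05))

# Beddington-DeAngelis: adds predator interference through c.
model2 <- nls(rates ~ (a * prey) / (1 + b * prey + c * (pred - 1)),
              data = d, start = list(a = 0.3, b = 0.05, c = 0.1))

print(coef(model2))
print(AIC(model1, model2))  # lower AIC indicates the better-supported model
```

Passing the variables through `data = d` rather than relying on objects in the workspace keeps each fit tied to one data frame, which makes later comparisons between models easier to trust.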
Prof J C Nash (U30A)
2013-Jul-03 13:51 UTC
[R] Non-linear modelling with several variables including a categorical variable
If preytype is an independent variable, then models based on it should be OK. If preytype comes into the parameters you are trying to estimate, then the easiest way is often to generate all the possible combinations (integers --> fairly modest number of these) and run all the least squares minimizations. Crude but effective. nlxb from nlmrt or nlsLM from minpack.lm may be more robust in doing this, but less efficient if nls works OK.

JN
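One way to let prey type enter the estimated parameters, without fitting each category separately, is sketched below. Note that this is not the enumeration of combinations described in the reply above but a different technique: base R's `nls` accepts a parameter indexed by a factor, so each prey type gets its own value of `a` while `b` and `c` are shared. The data frame `d`, the factor levels and the "true" values in `a_true` are hypothetical and simulated purely for illustration.

```r
set.seed(2)

# Simulated design: prey density x predator density x prey type (assumed).
d <- expand.grid(prey = c(5, 10, 15, 20, 25, 30), pred = 1:5,
                 type = factor(c("A", "B", "C")))

# Hypothetical attack rates, one per prey type; b and c are shared.
a_true <- c(0.4, 0.6, 0.8)
d$rates <- with(d, a_true[as.integer(type)] * prey /
                     (1 + 0.05 * prey + 0.3 * (pred - 1))) +
  rnorm(nrow(d), sd = 0.1)

# a[type] tells nls to estimate one 'a' per factor level; the start value
# for 'a' must therefore be a vector with one entry per level.
fit <- nls(rates ~ a[type] * prey / (1 + b * prey + c * (pred - 1)),
           data = d,
           start = list(a = rep(0.5, 3), b = 0.05, c = 0.1))

print(coef(fit))
```

With only 26 observations and one category of 5 points, sharing `b` and `c` across types keeps the parameter count at five; indexing all three parameters by type would likely ask too much of a dataset that small.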
Robbie Weterings
2013-Jul-05 01:30 UTC
[R] Non-linear modelling with several variables including a categorical variable
Dear Prof. Nash,

I fitted the model with the nlxb function and, as you mentioned, it runs noticeably slower than nls. However, had I used this function earlier, I would have saved a lot of time searching for start values. The output looks a little untidy, but I think it is very useful in combination with nls.

Thanks
Robbie

On Thu, Jul 4, 2013 at 10:09 PM, Prof J C Nash (U30A) <nashjc@uottawa.ca> wrote:
> I was actually thinking of something a little different, but realize that
> for what you are trying, the dummy variable approach is likely a much
> better choice. Good luck.
>
> JN
>
> PS. If you try nlxb from nlmrt, let me know how it works for you. The
> intent is to be much more aggressive in finding solutions, but when nls
> works, nls should be faster. I've examples where nls fails from about 3/4
> of random starting points, but nlxb gets there 95% of the time. My goal was
> to provide something that would be useful when users did not know good
> starting values. Also for small residual problems that nls is explicitly
> noted as being unsuited for.
>
> On 13-07-04 06:40 AM, Robbie Weterings wrote:
>> Thanks for the advice, Prof Nash, but I'm not sure I understood it
>> correctly. I managed to make a new model based on what I think you meant.
>> What I did: I created 3 variables (cat1, cat2, cat3), one for each
>> category, with either the value 1 or 0, and added these to the model so
>> that they work as different intercepts:
>>
>> model <- nls(rates ~ f*cat1 + d*cat2 + e*cat3 + ((a * prey)/((1 + b * prey) * (1 + c * (pred-1)))), start = list(a = 0.14, b = 0.009, c = 0.66, d = 0.8, e = -0.04, f = 1.4), trace = TRUE)
>>
>> Please correct me if I'm wrong. It seems to work and the model is
>> satisfactory. It gives me results similar to those from the glm with all
>> the variables log-transformed. I first tried to fit predator and prey
>> for each category in a model like this, so that I would get a different
>> curve for each category:
>>
>> model <- nls(rates ~ ((a*cat1) * ((b * prey)/((1 + c * prey) * (1 + d * (pred-1))) + ((e*cat1) * ((f * prey)/((1 + g * prey) * (1 + h * (pred-1))) ........
>>
>> But it did not work; moreover, it would have resulted in too many
>> parameters for the size of the dataset. It would never outperform any of
>> the simpler models.
>>
>> I also got some advice about the gam model from someone at
>> stats.stackexchange:
>>
>> "The gam model starts with knots set at 10, which you do not have enough
>> data for, however you have densities of 5 and 6 groups - you can set
>> your knots 1 fewer and the model should run. Something like
>> model4<-gam(rates~s(pred,k=5)+s(prey,k=4),data=data)"
>>
>> This also worked fine. However, I would prefer an nls model and, by the
>> look of it, so do my residuals.
>>
>> With kind regards
>> Robbie
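The dummy-variable model quoted above can be reproduced on simulated data roughly as follows. This is a sketch under stated assumptions: the data frame `dat`, the factor `type`, the intercepts in `shift` and the other "true" values are invented for illustration, and only the model formula and parameter names (`a` to `f`) follow the message.

```r
set.seed(3)

# Simulated design: prey density x predator density x prey type (assumed).
dat <- expand.grid(prey = c(5, 10, 15, 20, 25, 30), pred = 1:5,
                   type = factor(c("A", "B", "C")))

# 0/1 indicator columns, one per prey type, as described in the message.
dat$cat1 <- as.numeric(dat$type == "A")
dat$cat2 <- as.numeric(dat$type == "B")
dat$cat3 <- as.numeric(dat$type == "C")

# Hypothetical per-type intercepts added to a shared functional response.
shift <- c(1.4, 0.8, 0)
dat$rates <- with(dat, shift[as.integer(type)] +
                    0.5 * prey / ((1 + 0.05 * prey) * (1 + 0.3 * (pred - 1)))) +
  rnorm(nrow(dat), sd = 0.1)

# f, d and e act as category-specific intercepts; a, b and c are shared.
model <- nls(rates ~ f * cat1 + d * cat2 + e * cat3 +
               (a * prey) / ((1 + b * prey) * (1 + c * (pred - 1))),
             data = dat,
             start = list(a = 0.4, b = 0.04, c = 0.2, d = 1, e = 0, f = 1))

print(round(coef(model), 3))
```

Because the functional-response term is zero at zero prey density, the three intercepts can all be estimated without a separate overall constant; adding one would make the model unidentifiable.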
Prof J C Nash (U30A)
2013-Jul-05 13:16 UTC
[R] Non-linear modelling with several variables including a categorical variable
Off list I sent the OP a note that wrapnls() from nlmrt calls nls after nlxb finishes. This is not, of course, super-efficient, but returns the nls-structured answer.

JN