I'm trying to fit the Bass Diffusion Model using the nls function in R, but I'm running into a strange problem. The model has either two or three parameters, depending on how it's parameterized: p (coefficient of innovation), q (coefficient of imitation), and sometimes m (maximum market share). Regardless of how I parameterize the model, I get an error saying that the step factor has decreased below its minimum. I have tried resetting the minimum in nls.control, but that doesn't seem to fix the problem. Likewise, I have run through a variety of start values in the past few days, all to no avail. Looking at the trace output, it appears that R believes I always have one more parameter than I actually have (i.e., when the model is parameterized with p and q, R seems to be seeing three parameters; when m is also included, R seems to be seeing four). My experience with nls is limited; can someone explain to me why it's doing this? I've included the data set I'm working with (published in Michalakelis et al. 2008) and some example code.

## Assign relevant variables
adoption <- c(167000,273000,531000,938000,2056452,3894103,5932090,7963742,9314687,10469060,11393302,11976340)
time <- seq(from = 1, to = 12, by = 1)
## Models
Bass.Model <- adoption ~ ((p + q)^2/p) * (exp(-(p + q) * time)/((q / p) * exp(-(p + q) * time) + 1)^2)
## Starting Parameters
Bass.Params <- list(p = 0.1, q = 0.1)
## Model fitting
Bass.Fit <- nls(formula = Bass.Model, start = Bass.Params, algorithm = "plinear", trace = TRUE)

Chris Hulme-Lowe
University of Minnesota
Department of Psychology
Quant. Methods and Psychometrics
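A note on the step-factor setting mentioned in the question: the control function is `nls.control()` (singular), and its values only take effect when the resulting list is passed to `nls()` through the `control` argument. The sketch below shows only how the list is built; the specific values (200 iterations, a minimum step factor of 1e-10) are illustrative assumptions, not recommendations. A smaller `minFactor` may simply postpone the same error if the model or the start values are the real problem.

```r
# Default control settings used by nls()
default_ctrl <- nls.control()
default_ctrl$minFactor   # 1/1024, the default minimum step factor
default_ctrl$maxiter     # 50, the default iteration limit

# A custom control list (values chosen purely for illustration)
ctrl <- nls.control(maxiter = 200, minFactor = 1e-10)

# The list is then handed to nls() via its control argument, e.g.
#   nls(formula = Bass.Model, start = Bass.Params, control = ctrl)
```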
On Wed, Jun 15, 2011 at 11:06 AM, Christopher Hulme-Lowe <hulme005 at umn.edu> wrote:
> I'm trying to fit the Bass Diffusion Model using the nls function in R but
> I'm running into a strange problem.
> [...]

Using the default nls algorithm (which means we must specify m in the formula and in the starting values) rather than "plinear", and using commonly found values of p and q as starting values:

> Bass.Model <- adoption ~ m * ((p + q)^2/p) * (exp(-(p + q) * time)/((q / p) *
+ exp(-(p + q) * time) + 1)^2)
> nls(formula = Bass.Model, start = c(p = 0.03, q = 0.4, m = max(adoption)))
Nonlinear regression model
  model: adoption ~ m * ((p + q)^2/p) * (exp(-(p + q) * time)/((q/p) * exp(-(p + q) * time) + 1)^2)
   data: parent.frame()
                p                 q                 m
2.70842174019e-03 4.56307730094e-01 1.02730314877e+08
 residual sum-of-squares: 2922323788247

Number of iterations to convergence: 14
Achieved convergence tolerance: 3.05692430520e-06

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
There may be two issues here. The first might be that, if I understand the Bass model correctly, the formula you are trying to estimate gives the adoption in a given time period. What you supply as data, however, is the cumulative adoption up to that time period.

The second issue might be that the "plinear" algorithm may fail, and that it may be preferable to use Gauss-Newton (the nls default), as this may provide better values in the iterations.

If you do both, i.e., you run nls on per-period adoption and use the default algorithm, you get an estimate. Though I am of course not sure whether that is "correct" in the sense that it is what you would expect to find.

adoption <- c(167000,273000,531000,938000,2056452,3894103,5932090,7963742,9314687,10469060,11393302,11976340)
time <- seq(from = 1, to = 12, by = 1)

## Per-period adoption as a share of the largest cumulative value
adoption2 <- c(0, adoption[1:(length(adoption) - 1)])
S <- (adoption - adoption2) / max(adoption)

## Models
Bass.Model <- S ~ M * ((p + q)^2/p) * (exp(-(p + q) * time)/((q / p) * exp(-(p + q) * time) + 1)^2)
## Starting Parameters
Bass.Params <- list(p = 0.1, q = 0.1, M = 1)
## Model fitting
Bass.Fit <- nls(formula = Bass.Model, start = Bass.Params)
summary(Bass.Fit)

c.hulmelowe wrote:
> I'm trying to fit the Bass Diffusion Model using the nls function in R but
> I'm running into a strange problem.
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
View this message in context: http://r.789695.n4.nabble.com/Problems-with-nls-tp3600409p3600697.html
Sent from the R help mailing list archive at Nabble.com.
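The point about per-period versus cumulative adoption can be checked numerically. The sketch below assumes the standard closed form of the cumulative Bass curve, F(t) = (1 - exp(-(p+q)t)) / (1 + (q/p) exp(-(p+q)t)), which does not appear in the thread; the function names `bass_density` and `bass_cdf` and the values p = 0.03, q = 0.4 are chosen here only for illustration. It shows that the formula from the original post integrates to that cumulative curve, and that differencing the cumulative data reproduces the per-period series used in the reply above.

```r
# Per-period (density) form of the Bass model, as written in the thread
# but without the scale factor m
bass_density <- function(t, p, q) {
  e <- exp(-(p + q) * t)
  ((p + q)^2 / p) * e / ((q / p) * e + 1)^2
}

# Cumulative form (standard closed form; an assumption, not from the thread)
bass_cdf <- function(t, p, q) {
  e <- exp(-(p + q) * t)
  (1 - e) / (1 + (q / p) * e)
}

# Integrating the density from 0 to t should reproduce the cumulative curve
p <- 0.03
q <- 0.4
area <- integrate(bass_density, lower = 0, upper = 5, p = p, q = q)$value
all.equal(area, bass_cdf(5, p, q), tolerance = 1e-6)

# The data are cumulative, so per-period adoption is obtained by differencing
adoption <- c(167000, 273000, 531000, 938000, 2056452, 3894103, 5932090,
              7963742, 9314687, 10469060, 11393302, 11976340)
per_period <- diff(c(0, adoption))       # same as adoption - adoption2 above
all.equal(cumsum(per_period), adoption)  # summing up recovers the original
```

Note that `diff(c(0, adoption))` keeps all 12 periods by treating adoption before the first period as zero, whereas a plain `diff(adoption)` drops the first observation and leaves 11 values.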
On Wed, Jun 15, 2011 at 6:05 PM, Daniel Malter <daniel at umd.edu> wrote:
> There may be two issues here. The first might be that, if I understand the
> Bass model correctly, the formula you are trying to estimate is the adoption
> in a given time period. What you supply as data, however, is the cumulative
> adoption by that time period.
> [...]

If your hypothesis regarding cumulative vs. per-period adoptions is correct, then it may be that the poster wants this:

> S <- diff(adoption)
> time <- seq_along(S)
> Bass2 <- S ~ m * ((p + q)^2/p) * (exp(-(p + q) * time)/((q / p) *
+ exp(-(p + q) * time) + 1)^2)
> nls(formula = Bass2, start = c(p = 0.03, q = 0.4, m = max(S)))
Nonlinear regression model
  model: S ~ m * ((p + q)^2/p) * (exp(-(p + q) * time)/((q/p) * exp(-(p + q) * time) + 1)^2)
   data: parent.frame()
                p                 q                 m
8.65635536465e-03 6.52817192695e-01 1.23485254536e+07
 residual sum-of-squares: 321990186229

Number of iterations to convergence: 16
Achieved convergence tolerance: 8.10600476229e-06

> # or equivalently in terms of "plinear" where S, time and Bass2 are
> # as written just above
> m <- 1 # set m to 1 since we are using .lin instead
> nls(formula = Bass2, start = c(p = 0.03, q = 0.4), algorithm = "plinear")
Nonlinear regression model
  model: S ~ m * ((p + q)^2/p) * (exp(-(p + q) * time)/((q/p) * exp(-(p + q) * time) + 1)^2)
   data: parent.frame()
                p                 q              .lin
8.65637919209e-03 6.52816636341e-01 1.23485299874e+07
 residual sum-of-squares: 321990186247

Number of iterations to convergence: 9
Achieved convergence tolerance: 5.6090474901e-06

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
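The `.lin` column in the output above also bears on the original question about R "seeing" an extra parameter. With `algorithm = "plinear"`, nls estimates a linear coefficient that multiplies the right-hand side, in addition to the parameters named in `start`. Separately, as documented in `?nls`, each line printed by `trace = TRUE` begins with the residual sum of squares before the parameter values, which can also look like one parameter too many. The sketch below is one way to see the extra coefficient; it drops `m` from the formula rather than setting `m <- 1` globally, and the names `d`, `bass_shape` and `fit` are assumptions made here for illustration.

```r
adoption <- c(167000, 273000, 531000, 938000, 2056452, 3894103, 5932090,
              7963742, 9314687, 10469060, 11393302, 11976340)

# Per-period adoption and its time index, kept together in a data frame
S <- diff(adoption)
d <- data.frame(S = S, time = seq_along(S))

# Right-hand side without m: "plinear" supplies the multiplier itself
bass_shape <- S ~ ((p + q)^2 / p) *
  (exp(-(p + q) * time) / ((q / p) * exp(-(p + q) * time) + 1)^2)

# Only the nonlinear parameters p and q get start values
fit <- nls(bass_shape, data = d, start = c(p = 0.03, q = 0.4),
           algorithm = "plinear")

# Three coefficients come back although only two were given start values
names(coef(fit))
coef(fit)
```

Here `.lin` plays the role of m, so a two-parameter formula yields three estimates, and a formula that already contains m would yield four.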
Those are in fact the values of p and q that the authors of the paper estimate, though their M is different. Who knows what they did with that.

The data source was:

Michalakelis et al. (2008). Diffusion models of mobile telephony in Greece. Telecommunications Policy [ISSN 0308-5961], vol. 32, iss. 3-4, p. 234.

Gabor Grothendieck wrote:
> If your hypothesis regarding the cumulative vs. adoptions is correct
> then it may be that poster wants this:
>
>> S <- diff(adoption)
>> time <- seq_along(S)
>> nls(formula = Bass2, start = c(p = 0.03, q = 0.4, m = max(S)))
>                 p                 q                 m
> 8.65635536465e-03 6.52817192695e-01 1.23485254536e+07
> [...]
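As a rough plausibility check on the estimates discussed above (not something done in the thread), the per-period Bass curve peaks at t* = log(q/p) / (p + q), which follows from setting the derivative of the density to zero. The sketch below plugs in the p and q reported in the earlier reply and compares the implied peak with the period in which the differenced data are largest. The variable names are chosen here for illustration.

```r
# Estimates reported earlier in the thread for the per-period fit
p <- 8.65635536465e-03
q <- 6.52817192695e-01

# Time of peak per-period adoption implied by the Bass density
t_peak <- log(q / p) / (p + q)
t_peak

# Period with the largest observed per-period adoption,
# using the same time index as the fit (time <- seq_along(S))
adoption <- c(167000, 273000, 531000, 938000, 2056452, 3894103, 5932090,
              7963742, 9314687, 10469060, 11393302, 11976340)
S <- diff(adoption)
observed_peak <- which.max(S)
observed_peak
```

The implied peak falls between periods 6 and 7, which is where the two largest observed increments sit, so the fitted p and q are at least consistent with the shape of the data.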