thr3ads.net - R help - [R] nls, convergence and starting values [Mar 2009]

Patrick Giraudoux

2009-Mar-27 19:38 UTC

[R] nls, convergence and starting values

"In non-linear modelling, finding appropriate starting values is
something like an art"... (maybe from somewhere in Crawley, 2007). Here
a colleague and I just want to compare different response models to a
null model. This has worked OK for almost all the other data sets except
this one (dumped below). Whatever we try, with different algorithms and
even after subsetting the data (to check whether some singular point was
the cause of the mess), we either fail to reach convergence or run into
singular gradients (?), etc.

eg:

nls(pourcma ~ SSlogis(transat, Asym, xmid, scal),
    start = c(Asym = 30, xmid = 0.07, scal = 0.02),
    data = bdd, weights = sqrt(nbfeces),
    trace = TRUE, algorithm = "plinear")

Does anyone have a hint about an alternative approach to fitting the
model? Or an idea for how to show that such a model cannot be fitted to
the data?


bdd <-
structure(list(transat = c(0.0697, 0.13079, 0.314265, 0.241613,
0.039319, 0, 0, 0, 0, 0, 0.0805, 0.41, 0.30585, 0.27465, 0.06085,
0.09114, 0.05766, 0.036983, 0.093186, 0.046624, 0, 0, 0, 0, 0.000616,
0, 0.0025, 0.0325, 0.03125, 0.04599, 0.38398, 0.524505, 0.450337,
0.061831, 0.133926, 0.091806, 0.00928, 0.25114, 0.3074, 0.431056,
0.026158), transma = c(0.04141, 0.01599, 0.101803, 0.002378,
0.039319, 0.00472459016393443, 0.0031016393442623, 0.000178524590163934,
0.00255704918032787, 0.000346229508196721, 0.0665, 0.012, 0.0553,
0.0045, 0.0056, 0.00155, 0.00124, 0.011966, 0.001736, 0.004712,
3.62903225806452e-05, 9.79838709677419e-05, 2.20161290322581e-05,
0.00462, 0.0100644444444444, 0.00213111111111111, 0.046, 0.005,
0.01195, 0.07154, 0.08468, 0.141182, 0.086578, 0.027959, 0.003159,
0.003081, 0.13862, 0.00754, 0.078648, 0.068324, 0.025288), nbfeces = c(22L,
26L, 43L, 30L, 35L, 25L, 21L, 36L, 34L, 37L, 23L, 32L, 40L, 35L,
30L, 16L, 25L, 37L, 37L, 34L, 31L, 35L, 41L, 31L, 34L, 39L, 5L,
14L, 31L, 13L, 21L, 34L, 32L, 36L, 36L, 40L, 31L, 35L, 39L, 29L,
32L), pourcma = c(50, 34.6153846153846, 27.9069767441860, 43.3333333333333,
65.7142857142857, 32, 28.5714285714286, 22.2222222222222, 50,
10.8108108108108, 26.0869565217391, 40.625, 12.5, 22.8571428571429,
43.3333333333333, 6.25, 4, 10.8108108108108, 16.2162162162162,
23.5294117647059, 25.8064516129032, 45.7142857142857, 39.0243902439024,
25.8064516129032, 41.6666666666667, 27.5, 20, 14.2857142857143,
22.5806451612903, 15.3846153846154, 38.0952380952381, 17.6470588235294,
78.125, 61.1111111111111, 25, 37.5, 22.5806451612903, 40, 17.9487179487179,
41.3793103448276, 50), pourcat = c(22.7272727272727, 30.7692307692308,
41.8604651162791, 56.6666666666667, 5.71428571428571, 0, 0, 0,
0, 0, 30.4347826086957, 15.625, 45, 74.2857142857143, 13.3333333333333,
50, 12, 18.9189189189189, 27.0270270270270, 20.5882352941176,
0, 0, 0, 0, 0, 5, 40, 0, 0, 7.69230769230769, 9.52380952380952,
38.2352941176471, 59.375, 5.55555555555556, 41.6666666666667,
42.5, 9.67741935483871, 14.2857142857143, 51.2820512820513,
79.3103448275862,
6.25)), .Names = c("transat", "transma",
"nbfeces", "pourcma",
"pourcat"), class = "data.frame", row.names = c(NA, -41L))

Bert Gunter

2009-Mar-27 19:51 UTC

Based on a simple scatterplot of pourcma vs  transat, a 4 parameter logistic
looks like wild overfitting, and that may be the source of your problems.
Given the huge scatter, a straight line is about as much as would seem
sensible. I think this falls into the "Why ever would you want to do such a
thing?" category.

-- Bert


Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374
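
The scatterplot check and the straight-line alternative mentioned above can be sketched roughly as follows. This is only an illustration: `compare_line_to_null` is a hypothetical helper, not something posted in the thread, and it assumes the same `sqrt(nbfeces)` weights as the original `nls()` call and an intercept-only model as the null.

```r
# Sketch: weighted straight-line fit versus an intercept-only (null) model.
# compare_line_to_null() is a hypothetical helper, not part of the thread.
compare_line_to_null <- function(dat) {
  dat$w <- sqrt(dat$nbfeces)  # same weights as in the original nls() call
  fit0 <- lm(pourcma ~ 1, data = dat, weights = w)        # null model
  fit1 <- lm(pourcma ~ transat, data = dat, weights = w)  # straight line
  list(null = fit0, line = fit1,
       anova = anova(fit0, fit1),  # F test of the slope
       aic = AIC(fit0, fit1))      # data frame with one row per model
}

# Only runs when the data frame from the original post has been loaded.
if (exists("bdd")) {
  res <- compare_line_to_null(bdd)
  plot(pourcma ~ transat, data = bdd)  # the "simple scatterplot"
  abline(res$line)                     # weighted straight-line fit
  print(res$anova)
}
```

With `bdd` loaded, the plot gives a quick visual impression of the scatter, and the ANOVA table shows whether even the slope is supported by the data.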

Patrick Giraudoux

2009-Mar-27 22:08 UTC

Bert Gunter wrote:
> Based on a simple scatterplot of pourcma vs  transat, a 4 parameter logistic
> looks like wild overfitting, and that may be the source of your problems.
> Given the huge scatter, a straight line is about as much as would seem
> sensible. I think this falls into the "Why ever would you want to do such a
> thing?" category.
>
> -- Bert

Right, well, the general idea was just to show that the "straight line"
was indeed the best model (in the other data sets, model comparison
clearly showed the logistic one to be the best...). Can the fact that
convergence cannot be obtained be an acceptable and sufficient reason to
select the null model (the straight line)?

Patrick
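
One way to keep a record of the failure while still comparing whatever could be fitted is sketched below. It is an illustration under assumptions, not an answer to the question: `try_logistic` and `compare_to_line` are hypothetical helpers, the logistic is fitted with the self-starting `SSlogis()` rather than hand-picked start values, and AIC is assumed to be an acceptable comparison criterion.

```r
# Hypothetical helper: try the self-starting logistic fit and return NULL
# (with a message) instead of stopping when nls() signals an error.
try_logistic <- function(dat) {
  dat$w <- sqrt(dat$nbfeces)  # weights used in the original call
  tryCatch(
    nls(pourcma ~ SSlogis(transat, Asym, xmid, scal),
        data = dat, weights = w),
    error = function(e) {
      message("nls() failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Hypothetical helper: AIC of the straight line, plus the AIC of the
# logistic model whenever that fit succeeded.
compare_to_line <- function(dat) {
  dat$w <- sqrt(dat$nbfeces)
  fits <- list(line = lm(pourcma ~ transat, data = dat, weights = w),
               logistic = try_logistic(dat))
  fits <- Filter(Negate(is.null), fits)  # drop the model that failed
  sapply(fits, AIC)
}

# Only runs when the data frame from the original post has been loaded.
if (exists("bdd")) print(compare_to_line(bdd))
```

If the logistic fit fails, only the straight line appears in the result and the error message is printed, so the non-convergence is documented rather than silently treated as a model comparison.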

Christian Ritz

2009-Mar-30 13:48 UTC

Hi Patrick,

there is specialized functionality in R that offers both automated
calculation of starting values and relatively robust optimization, and it
can be used successfully in many common cases of nonlinear regression,
including your data:

library(drc)  # on CRAN

## Fitting 3-parameter logistic model
## (slightly different parameterization from SSlogis())
bdd.m1 <- drm(pourcma~transat, weights=sqrt(nbfeces), data=bdd, fct=L.3())

plot(bdd.m1, broken=TRUE, conLevel=0.0001)

summary(bdd.m1)
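
The note about a slightly different parameterization can be made concrete with a small base-R check. It assumes, from the drc documentation rather than from anything stated in this thread, that `L.3()` describes the curve d / (1 + exp(b * (x - e))); under that assumption it matches `SSlogis()` with d = Asym, e = xmid and b = -1/scal. `l3_curve` and `sslogis_to_l3` are hypothetical stand-ins written for illustration, not drc functions.

```r
# Hypothetical base-R stand-in for the curve assumed for drc's L.3():
#   f(x) = d / (1 + exp(b * (x - e)))
l3_curve <- function(x, b, d, e) d / (1 + exp(b * (x - e)))

# Convert SSlogis() parameters to the assumed L.3() parameters.
sslogis_to_l3 <- function(Asym, xmid, scal) {
  c(b = -1 / scal, d = Asym, e = xmid)
}

x <- seq(0, 0.5, by = 0.05)

# Start values taken from the nls() call in the original post.
p <- sslogis_to_l3(Asym = 30, xmid = 0.07, scal = 0.02)

y_ss <- as.vector(SSlogis(x, 30, 0.07, 0.02))
y_l3 <- l3_curve(x, b = p[["b"]], d = p[["d"]], e = p[["e"]])

# Compares the two curves on the grid, up to numerical tolerance.
all.equal(y_ss, y_l3)
```

Under the stated assumption, a negative slope parameter b in the `drm()` output corresponds to a positive `scal` in `SSlogis()`, which is worth keeping in mind when comparing estimates from the two fits.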


Of course, standard errors are huge as the data do not really support this
model (as already pointed out by other replies to this post).


Christian
