Joerg van den Hoff
2007-Sep-25 13:56 UTC
[Rd] non-linear fitting (nls) and confidence limits
dear list members, my question concerns computation of confidence intervals in nonlinear fits with `nls' when weigthing the fit. the seemingly correct procedure does not work as (I) expected. I'm posting this here since: (A) the problem might suggest a modification to the `m' component in the return argument of `nls' (making this post formally OK for this list) and (B) I got no response on r-help (my "secret" motivation for posting here, hoping for some clarifications...): for _unweighted_ fits using `nls' one can compute confidence intervals for the fitted model function via #------------------- se.fit <- sqrt(apply(res$m$gradient(), 1, function(x) sum(vcov(res)*outer(x,x)))) luconf <- yfit + outer(se.fit, qnorm(c(probex, 1 - probex))) #------------------- where `res' contains an `nls' object, `x' is the independent variable vector, `yfit' the corresponding model prediction (`fitted(res)'), `se.fit' the corresponding standard error and `luconf' the lower and upper confidence limits at some level specified by `probex'. when using the same approach with _weighted_ fits (even if all weights are equal (but not equal to one)), I noted that the confidence limits depend on the weights in an counter-intuitive, probably erroneous way. I have tracked my problem down to the fact that `res$m$gradient()' does _not_ contain the actual gradients w.r.t. the parameters but rather the gradients multiplied by the sqrt of the weights. question 1: is the fact that `m$gradient' in an `nls' object does compute the "scaled" gradients documented somewhere (I did'nt find any remark)? if no, can someone please give me a hint what the rationale behind this approach is? question 2: am I right to understand, that the above approach to compute `se.fit' (in this compact form proposed by p. daalgaard on r-help some time ago) is erroneous (for weighted nls) that I have to undo the multiplication by sqrt(weights) which is included in `m$gradient' (or compute the true gradients of my model function otherwise, e.g. with `deriv')? here is a simple example demonstrating the problem (yes, I know that this actually is a linar model :-)): #----------------------------------------------------------------------------------- probex <- 0.05 x <- 1:10 y <- rnorm(x, x, .8) w1 <- rep(1, 10) w2 <- w1; w2[7:10] <- 0.01 * w2[7:10] res1 <- nls(y ~ a*x + b, data = list(x = x, y = y), start = list(a = 1, b = 0), weights = w1) res2 <- nls(y ~ a*x + b, data = list(x = x, y = y), start = list(a = 1, b = 0), weights = w2) yfit1 <- fitted(res1) yfit2 <- fitted(res2) se.fit1 <- sqrt(apply(res1$m$gradient(), 1, function(x) sum(vcov(res1)*outer(x,x)))) luconf1 <- yfit1 + outer(se.fit1, qnorm(c(probex, 1 - probex))) se.fit2 <- sqrt(apply(res2$m$gradient(), 1, function(x) sum(vcov(res2)*outer(x,x)))) luconf2 <- yfit2 + outer(se.fit2, qnorm(c(probex, 1 - probex))) op <- par(mfrow = c(2,1)) matplot(x, cbind(y, yfit1,luconf1), type = 'blll', pch = 1, col = c(1,2,4,4), lty = 1) matplot(x, cbind(y, yfit2,luconf2), type = 'blll', pch = 1, col = c(1,2,4,4), lty = 1) par(op, no.readonly = T) #----------------------------------------------------------------------------------- the first fit uses unit weights for all data points (i.e. effectively is unweighted) and yields reasonable confidence limits. the second fit uses unequal weights (where the last 4 points have very small weights and are next to irrelevant for the fit result). the computed confidence intervals in this case are only fine up to point no. 6, but nonsense afterwards. question 3: computing confidence limits for the fitted model is a rather frequent requirement (and it occures by and then on r-help). would it not be a reasonable (and small) modification to `nls' to add a further argument "conf = NA" so that if `nls' is called, e.g. with "conflim = 0.95", the confidence limits are computed and are returned in a new component `res$m$conf' of the `m' component of an nls object? (alternatively, computation of `se.fit' (in the notation used above) would suffice. this could also be achieved by making the `se.fit' flag in `predict.nls' operational. either way would seem a valuable improvement to the nls functionality, I believe.) any feedback appreciated. thanks, joerg