Joerg van den Hoff
2007-Aug-31 09:20 UTC
[R] non-linear fitting (nls) and confidence limits
dear list members, I apologize in advance for posting a second time, but probably after one week chances are, the first try went down the sink.. my question concerns computation of confidence intervals in nonlinear fits with `nls' when weigthing the fit. the seemingly correct procedure does not work as expected, as is detailed in my original post below. any remarks appreciated. greetings joerg original post: -------------- for unweighted fits using `nls' I compute confidence intervals for the fitted model function by using: #------------------- se.fit <- sqrt(apply(rr$m$gradient(), 1, function(x) sum(vcov(rr)*outer(x,x)))) luconf <- yfit + outer(se.fit, qnorm(c(probex, 1 - probex))) #------------------- where `rr' contains an `nls' object, `x' is the independent variable vector, `yfit' the corresponding model prediction (`fitted(rr)'), `se.fit' the corresponding standard error and `luconf' the lower and upper confidence limits at some level specified by `probex'. when using the same approach with weighted fits (even if all weights are equal (but not equal to one)), I noted that the confidence limits depend on the weights in an counter-intuitive, probably erroneous way: consider the following simple example (yes, I know that this actually is a linar model :-)): #----------------------------------------------------------------------------------- probex <- 0.05 x <- 1:10 y <- rnorm(x, x, .8) w1 <- rep(1, 10) w2 <- w1; w2[7:10] <- 0.01 * w2[7:10] rr1 <- nls(y ~ a*x + b, data = list(x = x, y = y), start = list(a = 1, b = 0), weights = w1) rr2 <- nls(y ~ a*x + b, data = list(x = x, y = y), start = list(a = 1, b = 0), weights = w2) yfit1 <- fitted(rr1) yfit2 <- fitted(rr2) se.fit1 <- sqrt(apply(rr1$m$gradient(), 1, function(x) sum(vcov(rr1)*outer(x,x)))) luconf1 <- yfit1 + outer(se.fit1, qnorm(c(probex, 1 - probex))) se.fit2 <- sqrt(apply(rr2$m$gradient(), 1, function(x) sum(vcov(rr2)*outer(x,x)))) luconf2 <- yfit2 + outer(se.fit2, qnorm(c(probex, 1 - probex))) op <- par(mfrow = c(2,1)) matplot(x, cbind(y, yfit1,luconf1), type = 'blll', pch = 1, col = c(1,2,4,4), lty = 1) matplot(x, cbind(y, yfit2,luconf2), type = 'blll', pch = 1, col = c(1,2,4,4), lty = 1) par(op, no.readonly = T) #----------------------------------------------------------------------------------- the second fit uses unequal weights where the last 4 points are next to unimportant for the fit. the confidence intervals in this case are fine up to point no. 6, but nonsense afterwards I believe. I have tracked this down to the fact that `m$gradient' does _not_ contain the actual gradients w.r.t. the parameters but rather the gradients multiplied by the sqrt of the weights. without trying to understand the inner workings of the `nls*' functions in detail my questions are: 1. is the fact that `m$gradient' in an `nls' object does contain the scaled gradients documented somewhere (I did'nt find it)? if yes, where is it, if no, can someone please give me a hint what the rationale behind this approach is? 2. am I right to understand, that the above approach to compute `se.fit' (essentially in this compact form proposed by p. daalgaard on r-help some time ago) is erroneous (for weighted nls) in computing the confidence limits and that I have to undo the multiplication by sqrt(weights) which is included in `m$gradient' (or compute the gradients of my model function myself, e.g. with `deriv')? thanks, joerg ______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.