It has bothered me for quite some time that a smoothing spline
fit does not give access to residuals or fitted values in
general, since after
    fit <- smooth.spline(x, y, ...)
the resulting fit$x contains only the unique (up to 1e-6
precision) sorted original x values, and fit$yin (and fit$y)
correspond to those unique values.
There are several possible ways to implement the missing
feature. My current implementation would add a new argument
'keep.data' which, when set to TRUE, would make sure that the
original (x, y, w) are kept, such that fitted values and (weighted
or unweighted) residuals are sensibly available from the result.
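
To make the gap concrete, here is a minimal sketch of how fitted values and
unweighted residuals at the *original* x can be recovered by hand today. It
relies only on the existing predict() method for smooth.spline fits; the
simulated data and the names 'fitted_full' and 'res' are made up for
illustration and are not part of any proposed interface.

```r
# Illustration only: simulated data with many tied x values (assumption).
set.seed(1)
x <- sort(sample(seq(0, 1, by = 0.1), 200, replace = TRUE))
y <- sin(2 * pi * x) + rnorm(length(x)) / 4

fit <- smooth.spline(x, y)

# fit$x holds only the unique x values, so it is shorter than x.
n_unique <- length(fit$x)

# Evaluate the fitted spline at every original x, then form residuals.
fitted_full <- predict(fit, x)$y
res <- y - fitted_full
```

With 'keep.data = TRUE' as described above, the intent is that this
bookkeeping would no longer have to be done by the user.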
My main RFC (:= request for comments) is about whether it is
acceptable for the new behavior to become the *default*
(i.e. 'keep.data = TRUE' would be the default), such that
residuals(smooth.spline(...)) will work by default.
The drawback of the new default behavior would be that
a 'fit' can potentially become quite a bit larger than before, e.g.
in the following extremely artificial example
    x0 <- seq(0, 1, by = 0.1)
    x <- sort(sample(x0, 1000, replace = TRUE))   # only 11 distinct values
    ff <- function(x) 10*(x - 1/4)^2 + sin(7*pi*x)
    y <- ff(x) + rnorm(length(x)) / 2
    fit <- smooth.spline(x, y)
but typically the size increase will be less than about 40%.
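
A rough way to gauge the overhead for a given data set is to compare the
size of the bare fit with the size of the data that would be stored. The
sketch below assumes the kept data is a plain list of (x, y, w) with unit
weights, which is a guess at the storage layout and not a description of the
actual implementation; 'size_fit', 'size_data' and 'ratio' are illustrative
names.

```r
# Illustration only: same artificial setup as above, with a fixed seed.
set.seed(1)
x0 <- seq(0, 1, by = 0.1)
x  <- sort(sample(x0, 1000, replace = TRUE))
y  <- 10 * (x - 1/4)^2 + sin(7 * pi * x) + rnorm(length(x)) / 2

fit <- smooth.spline(x, y)
# Drop a stored 'data' component, if the fit happens to carry one,
# so that only the bare fit is measured (no-op otherwise).
fit$data <- NULL

size_fit  <- as.numeric(object.size(fit))
# Assumed layout of the kept data: original x, y and unit weights w.
size_data <- as.numeric(object.size(list(x = x, y = y,
                                         w = rep(1, length(x)))))

# Relative size increase from keeping the data alongside the fit.
ratio <- size_data / size_fit
```

The ratio is large here because 1000 observations collapse to very few
unique x values; with mostly distinct x values it will be much smaller.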
Comments are welcome.
Martin Maechler, ETH Zurich