Hi,

If I set df=2 in my smooth.spline function, is that equivalent to running a linear regression through my data? It appears that df=# of data points gives the interpolating spline and that df = 2 gives the linear regression, but I just want to confirm this.

Thank you,
Steven
The help page for 'smooth.spline' says the argument 'df' is 'the desired equivalent number of degrees of freedom (trace of the smoother matrix)'. It also explains that the output of 'smooth.spline' includes a component 'fit', two of whose components are 'knot' and 'coef'. To learn more, you can run the examples on that help page and examine 'str(cars.spl)' and the other objects they produce. You can also read more in the references cited there.

If you would like further help from this group, please submit another question, preferably after first reading the posting guide, "www.R-project.org/posting-guide.html". There is substantial but anecdotal evidence that posts more consistent with that guide tend to get better answers more quickly. For example, if the above does NOT answer your question, I believe you would have gotten a better reply if you had provided a simple, self-contained example, rather than having me rely on one from the 'example' section of the 'smooth.spline' help page.

Hope this helps.
Spencer Graves

Steven Shechter wrote:
> If I set df=2 in my smooth.spline function, is that equivalent to running
> a linear regression through my data? [...]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Steven,

I cannot vouch for the behaviour of the function smooth.spline(), but the theoretical answer to your question is yes. If g = Sy is the transformation from data vector y to spline vector g, where S is the n x n smoothing matrix, the equivalent degrees of freedom are usually defined as

  EDF = trace(S) = sum_i 1/(1 + theta*lambda_i),   i = 1, ..., n,

where theta is the smoothing parameter and lambda_1 to lambda_n are the eigenvalues of the roughness penalty matrix K in S = (I + theta*K)^(-1); the eigenvalues of S itself are the terms 1/(1 + theta*lambda_i). Two of the lambda_i are zero, so

  EDF = 2 + sum_i 1/(1 + theta*lambda_i),   i = 3, ..., n.

Setting theta = 0 (no smoothing) gives EDF = n and produces the interpolating spline. Letting theta -> infinity gives EDF = 2 and a straight-line fit.

See either Green and Silverman, Nonparametric Regression and Generalized Linear Models (p. 37), or Hastie and Tibshirani, Generalized Additive Models (p. 52).

On Sat, Jun 24, 2006 at 11:35:16AM -0400, Steven Shechter wrote:
> If I set df=2 in my smooth.spline function, is that equivalent to running
> a linear regression through my data? [...]

--
I. White
University of Edinburgh
Ashworth Laboratories, West Mains Road
Edinburgh EH9 3JT
Fax: 0131 650 6564 Tel: 0131 650 5490
E-mail: i.m.s.white at ed.ac.uk
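The theoretical claim can be checked empirically. The following is only a minimal sketch, assuming simulated data (the variables x and y and the seed are made up for illustration, not taken from the thread). It requests df = 2 from smooth.spline() and compares the result with an ordinary least-squares line from lm(). Because smooth.spline() searches for a smoothing parameter within a bounded range, the achieved degrees of freedom may come out slightly above 2 rather than exactly 2, so the two fits should be close but need not be identical.

```r
# Sketch only: simulated data, not the original poster's data.
set.seed(1)
x <- sort(runif(40))                        # 40 distinct design points
y <- sin(2 * pi * x) + rnorm(40, sd = 0.3)  # deliberately nonlinear signal

# Request 2 equivalent degrees of freedom (trace of the smoother matrix).
ss <- smooth.spline(x, y, df = 2)

# Ordinary least-squares straight line for comparison.
ols <- lm(y ~ x)

# Evaluate both fits at the observed x values.
spline_fit <- predict(ss, x)$y
line_fit   <- unname(fitted(ols))

print(ss$df)                            # EDF actually achieved
print(max(abs(spline_fit - line_fit)))  # largest gap between the two fits
```

If the largest gap is small relative to the spread of y, that is consistent with the df = 2 fit being, in practice, the least-squares line.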