I'm in teaching mode, kernel densities.
{History: density() was newly introduced in version 0.15, 19 Dec 1996;
 most probably by Ross or Robert}

When I was telling the students about different kernels (and why their
choice is not so important, and "equivalent bandwidths" etc., etc.),
I wondered about the "Cosine" in my teaching notes, which is defined there as

    k(x) = pi/4 * cos(pi/2 * x) * I{ |x| <= 1 }

i.e., in R,

    Kcos <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)

Now, R has instead
(for bandwidth h <- bw/1.135724, which makes the bandwidth Gaussian
 equivalent; here just h == 1/pi to be similar to above)

    Kcosine <- function(x) ifelse(abs(x) < 1, (1+cos(x*pi))/2, 0)

I've looked in Dave Scott's book (and Haerdle's "Smoothing... in S");
Silverman doesn't mention any cosine kernel. Both define the cosine
kernel as I have it in my notes.

With the above R code, look at

    x <- seq(-1.2, 1.2, len=501)
    matplot(x, cbind(Kcos(x), Kcosine(x)), type='l', lty=1)

The big difference:

 - R's version is smooth (differentiable at the border of the support).
 - Scott's (not really "his", of course!) version is not differentiable
   there, but looks much closer to the Epanechnikov kernel and is hence
   almost as `good' (less than half a percent of MSE loss w.r.t.
   Epanechnikov).

Problem:

 - An average user knowing some statistics literature will most probably
   assume that a "cosine" kernel means the one in the literature,
   *NOT* the one we have in R now.

Proposition / Possibilities / RFC [= Request For Comments]:

 - We CHANGE the behavior of density(* , kernel="cosine")
   to use the cosine from the literature.

 - Provide the current "cosine" as kernel = "smoothcosine"
   {I'd like to keep the possibility of 1-initial-letter abbreviation}

Enhancement (easy, I'll do that):

 - We further provide both
   Epanechnikov and "quartic" aka "biweight" additionally
   in any case.

Martin Maechler <maechler@stat.math.ethz.ch>  http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO D10  Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-1-632-3408  fax: ...-1228  <><
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
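The claims in the message above can be checked numerically. The following is only a sketch: `Kcos` and `Kcosine` are copied from the message, while `Kepa`, the helper `kstats()` and the efficiency measure sqrt(mu2) * R(K) (smaller is better, Epanechnikov as reference) are assumptions added here for illustration, not code from R's density().

```r
# Literature cosine kernel, as defined in the message
Kcos    <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)
# Kernel used by density(kernel = "cosine") at the time, rescaled to [-1, 1]
Kcosine <- function(x) ifelse(abs(x) <  1, (1 + cos(x * pi))/2, 0)
# Epanechnikov kernel on [-1, 1] (assumed scaling, used as reference only)
Kepa    <- function(x) ifelse(abs(x) <= 1, 3/4 * (1 - x^2), 0)

# Hypothetical helper: total mass, variance mu2, roughness R = int K^2,
# and the scale-free constant C = sqrt(mu2) * R for a kernel on [-1, 1]
kstats <- function(K) {
  int  <- function(f) integrate(f, -1, 1)$value
  mass <- int(K)
  mu2  <- int(function(x) x^2 * K(x))
  R    <- int(function(x) K(x)^2)
  c(mass = mass, mu2 = mu2, R = R, C = sqrt(mu2) * R)
}

s <- sapply(list(cos = Kcos, cosine = Kcosine, epa = Kepa), kstats)

# Efficiency relative to Epanechnikov (1 = as good as Epanechnikov)
eff <- s["C", "epa"] / s["C", ]

# Standard deviation of the smooth cosine when its support is [-pi, pi];
# this should reproduce the constant 1.135724 quoted in the message
sd_factor <- pi * sqrt(s["mu2", "cosine"])

print(round(s, 4))
print(round(eff, 4))
print(sd_factor)
```

Under these assumptions, both cosine kernels integrate to 1, the literature cosine comes out well within half a percent of Epanechnikov, and the smooth version is roughly one percent behind.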
Prof Brian D Ripley
1999-Dec-01 10:09 UTC
density(kernel = "cosine") .. the `wrong cosine' ..
On Wed, 1 Dec 1999, Martin Maechler wrote:

> Problem:
>
>  - An average user knowing some statistics literature will most probably
>    assume that a "cosine" kernel means the one in the literature,
>    *NOT* the one we have in R now.

Or they know what is in S (or what V&R say it is).
(Yes, we knew of the discrepancy, so defined it.)

> Proposition / Possibilities / RFC [= Request For Comments]:
>
>  - We CHANGE the behavior of density(* , kernel="cosine")
>    to use the cosine from the literature.

I am against that. S compatibility and all that.

>  - provide the current "cosine" as kernel = "smoothcosine"
>    {I'd like to keep the possibility of 1-initial-letter abbreviation}

OK, or

3) As it is confusing and never used(?), drop it altogether.

> Enhancement (easy, I'll do that):
>
>  - We further provide both
>    Epanechnikov and "quartic" aka "biweight" additionally
>    in any case.

You may find it hard to get agreement on what those are (the problem
being the scale factors).

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
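The scale-factor problem raised in the reply can be made concrete. The sketch below assumes the common textbook forms of the Epanechnikov and biweight kernels on [-1, 1] and one possible convention, namely that a bandwidth of 1 means a kernel standard deviation of 1. The helper `unit_sd()` is hypothetical; other conventions (for instance, bandwidth = half-width of the support) would give different scale factors.

```r
# Textbook forms on [-1, 1] (assumed here; other scalings are in use)
Kepa <- function(x) ifelse(abs(x) <= 1, 3/4 * (1 - x^2), 0)
Kbiw <- function(x) ifelse(abs(x) <= 1, 15/16 * (1 - x^2)^2, 0)

# Hypothetical helper: rescale a kernel supported on [-1, 1] so that it
# has standard deviation 1; returns the rescaled kernel and the
# half-width of its new support
unit_sd <- function(K) {
  s <- sqrt(integrate(function(x) x^2 * K(x), -1, 1)$value)  # sd on [-1, 1]
  list(K = function(x) s * K(s * x), halfwidth = 1 / s)
}

e <- unit_sd(Kepa)
b <- unit_sd(Kbiw)

# Variance of a rescaled kernel, integrated over its own support
kvar <- function(k) {
  integrate(function(x) x^2 * k$K(x), -k$halfwidth, k$halfwidth)$value
}

print(c(epa = e$halfwidth, biweight = b$halfwidth))
print(c(epa = kvar(e), biweight = kvar(b)))
```

With this convention the Epanechnikov support becomes [-sqrt(5), sqrt(5)] and the biweight support [-sqrt(7), sqrt(7)], so the same nominal bandwidth means visibly different amounts of smoothing than under the [-1, 1] definitions.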