송상은
2026-Mar-19 14:00 UTC
[R] Query on ksmooth(): Theoretical step function vs. plotted diagonal segments
Dear R-help members,
Hello,
I am studying kernel regression and experimenting with the ksmooth()
function in R.
When using a box kernel, the kernel function is an indicator function
(weight = 1 inside the bandwidth and 0 outside). Based on this definition,
I expected the Nadaraya–Watson estimator to produce a step-like function:
the estimate should remain constant while the set of included points is
unchanged, and then jump when a point enters or leaves the bandwidth window.
However, when I run the following code using the cars dataset:
par(mfrow = c(1, 1))
with(cars, {
  plot(speed, dist)
  lines(ksmooth(speed, dist, "normal", bandwidth = 2),
        col = "blue", lwd = 3)
  lines(ksmooth(speed, dist, "box", bandwidth = 2),
        col = "darkorange", lwd = 3)
})
legend("topleft", c("Normal Kernel with h=2", "Box Kernel with h=2"),
       lwd = c(2, 2),
       col = c("blue", "darkorange"), cex = 2)
the curve produced by the box kernel (dark orange) appears to contain
diagonal line segments rather than the step-like shape I expected. I have
attached the resulting plot for reference.
My understanding is that the theoretical estimator should behave like a
step function because the kernel weights are either 0 or 1. Therefore, I
was wondering whether the diagonal segments arise from how ksmooth()
evaluates the estimator on a grid of x values and then connects those
points with straight lines for plotting, or if there is another
implementation detail that explains this behavior.
Could you please clarify whether this is expected behavior?
Thank you very much for your time.
Best regards,
Sangeun Song
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot_ksmooth_box.png
Type: image/png
Size: 167309 bytes
Desc: not available
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20260319/d0cb1687/attachment.png>
Jeff Newmiller
2026-Mar-19 15:42 UTC
[R] Query on ksmooth(): Theoretical step function vs. plotted diagonal segments
You used the lines function to plot the data. Of course it is going to show straight sloped lines between points.
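To see why lines() draws sloped segments here, it can help to look at what ksmooth() returns. The following is only a minimal sketch using the cars data from the original post; the names fit and jumps are illustrative and not part of any API.

```r
# Sketch: inspect the object returned by ksmooth() for the box kernel.
fit <- with(cars, ksmooth(speed, dist, "box", bandwidth = 2))

str(fit)               # a list with components x (grid) and y (estimates)
length(fit$x)          # number of evaluation points (the n.points argument)
range(diff(fit$x))     # spacing between neighbouring grid points

# lines(fit) joins consecutive (x, y) pairs with straight segments, so any
# change in y between two neighbouring grid points is drawn as a slope.
# Grid points whose window contains no data give NA and are skipped by which().
jumps <- which(diff(fit$y) != 0)
head(cbind(x_left  = fit$x[jumps], x_right = fit$x[jumps + 1],
           y_left  = fit$y[jumps], y_right = fit$y[jumps + 1]))
```

Each row of the printed matrix is one pair of neighbouring grid points between which the estimate changes, which is where a diagonal segment appears in the plot.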
Rui Barradas
2026-Mar-19 16:29 UTC
[R] Query on ksmooth(): Theoretical step function vs. plotted diagonal segments
Hello,

Use type = "s"; see help("plot.default") for the possible values of the argument type. The documentation of lines() links to that help page.

?plot.default
?lines

In the code below, I use it only in the second lines() call.

# If this is the first call to par(), it changes nothing
# (the graphics device goes from rows*cols == 1*1 to 1*1).
# Save the value anyway so that the previous setting can be restored later.
old_par <- par(mfrow = c(1, 1))
# see what mfrow was before
old_par
#> $mfrow
#> [1] 1 1

with(cars, {
  plot(speed, dist)
  lines(ksmooth(speed, dist, "normal", bandwidth = 2),
        col = "blue", lwd = 3)
  lines(ksmooth(speed, dist, "box", bandwidth = 2),
        type = "s", col = "darkorange", lwd = 3)
})
legend("topleft", c("Normal Kernel with h=2", "Box Kernel with h=2"),
       lwd = c(2, 2),
       col = c("blue", "darkorange"), cex = 2)

# reset the previous par values
par(old_par)

Hope this helps,

Rui Barradas
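For reference, ?plot.default describes two step types: "s" moves horizontally first and then vertically, while "S" moves vertically first. The sketch below overlays both on the same fit; it assumes the default evaluation grid, so either type can only place a jump at a grid point and the drawn step positions are approximate.

```r
# Sketch: compare the two step types on the default ksmooth() grid.
fit <- with(cars, ksmooth(speed, dist, "box", bandwidth = 2))

with(cars, plot(speed, dist))
lines(fit, type = "s", col = "darkorange", lwd = 3)  # horizontal, then vertical
lines(fit, type = "S", col = "grey40", lty = 2)      # vertical, then horizontal
legend("topleft", c('type = "s"', 'type = "S"'),
       col = c("darkorange", "grey40"), lty = c(1, 2), lwd = c(3, 1),
       bty = "n")
```

Where the two curves differ, the true jump of the estimator lies somewhere between the two neighbouring grid points.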
Martin Maechler
2026-Mar-20 08:28 UTC
[R] Query on ksmooth(): Theoretical step function vs. plotted diagonal segments
> Could you please clarify whether this is expected behavior?

Definitely, as Jeff and Rui explained.

As a solution, e.g. for teaching or illustration (otherwise the box kernel should *never* be used!), I recommend simply evaluating the resulting function on a finer grid, e.g., at 1000 points instead of the default 100:

## MM: evaluate the curve on a much finer grid, using n.points = 1000 (default was 100)
with(cars, {
  plot(speed, dist)
  lines(ksmooth(speed, dist, "norm", bandwidth = 2, n.points = 1000),
        col = "blue", lwd = 3)
  lines(ksmooth(speed, dist, "box", bandwidth = 2, n.points = 1000),
        col = "darkorange", lwd = 3)
})
legend("topleft", c("Normal Kernel with h=2", "Box Kernel with h=2"),
       lwd = 2, col = c("blue", "darkorange"), bty = "n")

Martin
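As a complement to the finer grid, the step shape described in the original question can also be drawn without any grid. This is a hand-written sketch, not how ksmooth() is implemented. The helper nw_box() and the names breaks, mids and level are made up for illustration. It assumes a window half-width of 0.5 * bandwidth, following the scaling described in ?ksmooth (kernel quartiles at +/- 0.25 * bandwidth).

```r
# Sketch: box-kernel Nadaraya-Watson estimate evaluated directly.
# nw_box() is a hypothetical helper, not part of the stats package.
nw_box <- function(x0, x, y, h) {
  vapply(x0, function(u) {
    inside <- abs(x - u) <= h            # indicator weights: 1 inside, 0 outside
    if (any(inside)) mean(y[inside]) else NA_real_
  }, numeric(1))
}

h <- 1   # assumed half-width, i.e. bandwidth = 2 in ksmooth() terms

# The set of included points can only change at speed - h or speed + h.
breaks <- with(cars, sort(unique(c(speed - h, speed + h))))
mids   <- head(breaks, -1) + diff(breaks) / 2   # one point inside each interval
level  <- with(cars, nw_box(mids, speed, dist, h))

with(cars, plot(speed, dist))
# One horizontal segment per constant piece; pieces with an empty window
# (level is NA) are not drawn.
segments(head(breaks, -1), level, breaks[-1], level,
         col = "darkorange", lwd = 3)
```

Between breakpoints the estimate is constant, as the question anticipated. The value exactly at a breakpoint depends on whether the window boundary is treated as inclusive, which this sketch does by assumption.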