You should definitely read Loader's book. In the meantime, you should look at
an introductory paper that you will find on the Locfit web page. I think
that you can set Locfit to estimate at all the sample points, which it does
not do by default, and also to use a prespecified constant bandwidth, but note
that its definition of the h parameter is not the standard one.
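
To see why the bandwidth convention matters, here is a minimal sketch in base
R only (it does not call Locfit). It assumes, from my recollection of Loader's
book, that Locfit's Gaussian weight is exp(-(2.5*u)^2/2), so that its h is 2.5
times the standard deviation used by dnorm(). Please treat the factor 2.5 as
an assumption to check against the Locfit documentation. The function name
nw_gauss and its scale argument are made up for this illustration.

```r
# Nadaraya-Watson (local constant) fit evaluated at every sample point.
# `scale` is a hypothetical argument: the kernel is exp(-(scale * u)^2 / 2),
# so scale = 1 is the standard Gaussian and scale = 2.5 mimics the ASSUMED
# Locfit convention.
nw_gauss <- function(y, x, h, scale = 1) {
  u <- outer(x, x, "-") / h          # u[i, j] = (x[i] - x[j]) / h
  K <- exp(-(scale * u)^2 / 2)       # weight of observation j at x[i]
  as.vector(K %*% y) / rowSums(K)    # weighted mean of y[j] for each i
}

set.seed(1)
n <- 10
h <- 0.5
x <- rnorm(n)
y <- x + rnorm(n, mean = 0, sd = 0.5)

m_std    <- nw_gauss(y, x, h)                     # standard definition of h
m_same_h <- nw_gauss(y, x, h, scale = 2.5)        # same h, rescaled kernel
m_match  <- nw_gauss(y, x, 2.5 * h, scale = 2.5)  # h adjusted for the scaling

print(cbind(x, m_std, m_same_h, m_match))
```

If the assumption holds, passing the same numerical h to both estimators
compares two different amounts of smoothing, which alone would give results
that are close but not equal.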
Hope this helps,
Miguel A.
On Thursday 14 April 2005 10:47, Jacho-Chavez,DT (pgr) wrote:
> Dear R-users,
>
> One of the main reasons I moved from GAUSS to R (as an econometrician) was
> because of the existence of the library LOCFIT for local polynomial
> regression. While doing some checking between my former `GAUSS code' and my
> new `R code', I came to realize LOCFIT is not quite doing what I want. I
> wrote the following example script:
>
> #--------------------------------------------------------------------------
> # Plain Vanilla NADARAYA-WATSON estimator (or Local Constant regression,
> # e.g. deg=0) with gaussian kernel & fixed bandwidth
>
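> mkern<-function(y,x,h){
>   n    <- length(y)
>   Mx   <- matrix(x,nrow=n,ncol=n,byrow=TRUE)   # Mx[i,j] = x[j]
>   My   <- matrix(y,nrow=n,ncol=n,byrow=TRUE)   # My[i,j] = y[j]
>   Mxh  <- (1/h)*dnorm((x-Mx)/h)                # kernel weight of x[j] at x[i]
>   # Weight y[j], not y[i]: a bare y*Mxh recycles y down the rows, so the
>   # ratio below would simply return y.
>   Myxh <- My*Mxh
>   yh   <- rowMeans(Myxh)/rowMeans(Mxh)
>   return(yh)
> }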
>
> # Generating the design Y=m(x)+e
> n <- 10
> h <- 0.5
> x <- rnorm(n)
> y <- x + rnorm(n,mean=0,sd=0.5)
>
> # This is what I really want!
> mhat <- mkern(y,x,h)
>
> library(locfit)
> yhl.raw <- locfit(y~x,alpha=c(0,h),kern="gauss",ev="data",deg=0,link="ident")
>
> # This is what I get with LOCFIT
> print(cbind(x,mhat,residuals(yhl.raw,type="fit"),knots(yhl.raw,what="coef")))
> #--------------------------------------------------------------------------
>
> Questions:
> 1) Why are residuals(.) & knots(.) results different from one another? If I
> want m^(x[i]) at each evaluation point i=1,...,n, which one should I use? I
> do not want interpolation whatsoever.
> 2) Why are they `close' but not equal to what I want?
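
On the interpolation point, a small base-R sketch (again not Locfit itself)
shows how a fit computed on a few evaluation points and then interpolated can
differ from a direct fit at every x[i]. The helper name nw_at, the 5-point
grid and the linear interpolation are choices made only for this
illustration; Locfit's own evaluation structures and interpolation scheme are
described in its documentation and may behave differently.

```r
# Direct Nadaraya-Watson fit at arbitrary evaluation points x0
# (standard Gaussian kernel; nw_at is a hypothetical helper name).
nw_at <- function(x0, y, x, h) {
  K <- dnorm(outer(x0, x, "-") / h)   # K[i, j]: weight of (x[j], y[j]) at x0[i]
  as.vector(K %*% y) / rowSums(K)
}

set.seed(1)
n <- 10
h <- 0.5
x <- rnorm(n)
y <- x + rnorm(n, mean = 0, sd = 0.5)

# Direct evaluation at every sample point: no interpolation involved.
m_direct <- nw_at(x, y, x, h)

# Fit on a coarse grid of 5 points, then interpolate linearly to the data.
grid     <- seq(min(x), max(x), length.out = 5)
m_grid   <- nw_at(grid, y, x, h)
m_interp <- approx(grid, m_grid, xout = x, rule = 2)$y

print(cbind(x, m_direct, m_interp, diff = m_interp - m_direct))
```

The two columns agree at the grid points and drift apart in between, so any
interpolation step gives values that are close but not equal to the direct
estimate.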
>
> I can accept differences for higher degrees and multidimensional data at
> the boundary of the support (given the way we must do the regression in
> areas with sparse data). But why are these differences present for deg=0
> inside the support as well as at the boundary? The computer would still
> give us a result even with a close-to-zero random denominator (admittedly,
> not a reliable one). Unfortunately, I cannot get access to a copy of
> "Loader, C. (1999) Local Regression and Likelihood, Springer" from my local
> library, so a small explanation or advice would be greatly appreciated.
>
> I do not mind using an improved version of `what I want', but I would like
> to understand what I am doing.
>
>
> Thanks in advance for your help,
>
>
> David Jacho-Chávez