thr3ads.net - R help - [R] Local polynomial regression using locfit [Jun 2007]

If this information is useful, please help other people find it:
Share via:

Caskenette, Amanda

2007-Jun-04 18:27 UTC

[R] Local polynomial regression using locfit

I have a dataset of pregnancy values for multiple years (and ages, not
included) with missing years. I would like to use local polynomial
regression to smooth the values and estimate for the missing years. I
would also like to use GCV to justify the smoothing parameter selection.


When using locfit() with lp() I found that the gcvplot function does not
work as it is looking for an alpha value to replace so I used gcv():

########################################################################
#
library(locfit)
### approx. my data
Yearc(1954,1965,1966,1967,1968,1969,1970,1978,1979,1980,1981,1982,1985,1987,
1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2002,20
03,2004)
Num.sampledc(3,44,9,19,19,25,13,38,15,2,4,5,3,8,6,9,7,11,11,17,16,13,6,4,10,7,9,4,3
,4)
Num.pregc(1,5,1,4,6,4,3,23,8,1,3,2,1,3,1,1,1,2,3,2,2,6,0,0,3,0,3,1,2,1)

x =  Year
y = Num.preg/Num.sampled

gcv.fit = c(0) 
b = seq(from = 0.4, to = 1, by = 0.01)
for(i in 1:length(b))
{
fit = locfit(y~lp(x, nn = b[i]))         
gcv.fit[i] =  gcv(fit)[4]
}
gcv.fit = round(gcv.fit,4)
b1 = b[min(which.min(gcv.fit))]
fit = locfit(y~lp(x, nn = b1))

## plots and predicting missing years
xg = 1954:2004
plot(x,y,xlab="Years",ylab="Proportion
Pregnant",pch=16,cex=0.5)
lines(xg,predict(fit,newdata=xg),col=4)
x11()
plot(b,gcv.fit) 
########################################################################
#

This seems to work (correct me if I am wrong), however, now I am looking
for a way to penalise pregnancy rates that are derived from low sample
sizes. There isn't much documentation on the arguments to lp() 
The locfit function has the argument "weights", I tried this function
outside of lp() and it had no effect.  It is not listed as an argument
to lp()...however I tried using it as an argument and it changed the
output:

for(i in 1:length(b))
{
fit = locfit(y~lp(x, nn = b[i], weights = n))    
gcv.fit[i] =  gcv(fit)[4]
}
gcv.fit = round(gcv.fit,3)
b1 = b[min(which.min(gcv.fit))]
fit = locfit(y~lp(x, nn = b1, weights = n))    

However, this does not give "pretty" results, so I have a feeling that
it really isn't doing what I think it is doing. (I also tried log(n)).
Can weights be a vector? Should I be altering the sample size is some
way? There isn't much documentation on the weights argument, can it even
be used within the lp()?


Thank you
Amanda


	[[alternative HTML version deleted]]

Maybe Matching Threads

Search for more seemingly similar threads

R help - Jun 2007 - Local polynomial regression using locfit

[R] Local polynomial regression using locfit

Maybe Matching Threads

Wisdom of the Ancients