btyner@stat.purdue.edu
2005-Jun-18 04:19 UTC
[Rd] loess returns different standard errors for identical models (PR#7956)
Full_Name: Benjamin Tyner
Version: 2.1.0, 4/18/2005
OS: i686-redhat-linux-gnu
Submission from: (NULL) (4.64.8.220)
# Just run my.test() below in a newly opened R session. Once too many models
have been fit (~20 on my system), the computed standard error jumps to a
different value. This is (superficially) due to a different residual sum of
squares, not a different one.delta. No other aspect of the fit is affected, just
the computed value of s (I've run extensive testing with all.equal() to make
sure). Issuing a garbage collection before doing a loess fit appears to
"solve"
the problem, which makes me think this is not a problem in loessc.c or loessf.f.
My point is that a few loess fits in one session should not cause the estimated
standard error computation go awry with no warning.
y<-rnorm(50)
x<-seq(0,1,length=50)
my.test<-function(){
counter<-0
go<-0
new.s<-0
while(go<2){
counter<-counter+1
old.s<-new.s
fit<-loess(y~x,family="symmetric")
new.s<-fit$s
if(new.s!=old.s)go<-go+1
print(paste("s = ",fit$s))
}
print(paste("Fit number ",counter," is
different!",sep=""))
}
# If there does turn out to be a way to fix this is loessc.c or loessf.f, I
would be happy to collaborate on that.
Peter Dalgaard
2005-Jun-18 08:27 UTC
[Rd] loess returns different standard errors for identical models (PR#7956)
btyner at stat.purdue.edu writes:> Full_Name: Benjamin Tyner > Version: 2.1.0, 4/18/2005 > OS: i686-redhat-linux-gnu > Submission from: (NULL) (4.64.8.220) > > > # Just run my.test() below in a newly opened R session. Once too many models > have been fit (~20 on my system), the computed standard error jumps to a > different value. This is (superficially) due to a different residual sum of > squares, not a different one.delta. No other aspect of the fit is affected, just > the computed value of s (I've run extensive testing with all.equal() to make > sure). Issuing a garbage collection before doing a loess fit appears to "solve" > the problem, which makes me think this is not a problem in loessc.c or loessf.f. > My point is that a few loess fits in one session should not cause the estimated > standard error computation go awry with no warning.Right. Valgrind has this to say:> my.test()==22986== Use of uninitialised value of size 8 ==22986== at 0x1C97051B: lowesb_ (loessf.f:1542) ==22986== by 0x1C95B399: loess_raw (loessc.c:98) ==22986== by 0x809C9AE: do_dotCode (dotcode.c:1709) ==22986== by 0x80B368F: Rf_eval (eval.c:405) [1] "s = 0.857141235910414" [1] "s = 0.857141235910414" and that certainly fits the pattern. Unfortunately this seems to be in the call to ehg31() in this passage end if setlf=(iv(27).ne.iv(25)) call ehg131(xx,yy,ww,trl,diagl,iv(20),iv(29),iv(3),iv(2),iv(5), + iv(17),iv(4),iv(6),iv(14),iv(19),wv(1),iv(iv(7)),iv(iv(8)), + iv(iv(9)),iv(iv(10)),iv(iv(22)),iv(iv(27)),wv(iv(11)), + iv(iv(23)),wv(iv(13)),wv(iv(12)),wv(iv(15)),wv(iv(16)), + wv(iv(18)),ifloor(iv(3)*wv(2)),wv(3),wv(iv(26)),wv(iv(24)), + wv(4),iv(30),iv(33),iv(32),iv(41),iv(iv(25)),wv(iv(34)), + setlf) if(iv(14).lt.iv(6)+DBLE(iv(4))/2.D0)then call ehg183('k-d tree limited by memory; nvmax=', + iv(14),1,1) (line numbers in optimized code are somewhat unreliable), so there are quite a few items to check. Dumping out the iv and wv arrays at that point is probably a good start if you want to chip in with a bit of debugging. Do yourself a favour and use set.seed() with a value that gives you a minimal repeat count when you start R in a clean state. -- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907