btyner@stat.purdue.edu
2005-Jun-18 04:19 UTC
[Rd] loess returns different standard errors for identical models (PR#7956)
Full_Name: Benjamin Tyner Version: 2.1.0, 4/18/2005 OS: i686-redhat-linux-gnu Submission from: (NULL) (4.64.8.220) # Just run my.test() below in a newly opened R session. Once too many models have been fit (~20 on my system), the computed standard error jumps to a different value. This is (superficially) due to a different residual sum of squares, not a different one.delta. No other aspect of the fit is affected, just the computed value of s (I've run extensive testing with all.equal() to make sure). Issuing a garbage collection before doing a loess fit appears to "solve" the problem, which makes me think this is not a problem in loessc.c or loessf.f. My point is that a few loess fits in one session should not cause the estimated standard error computation go awry with no warning. y<-rnorm(50) x<-seq(0,1,length=50) my.test<-function(){ counter<-0 go<-0 new.s<-0 while(go<2){ counter<-counter+1 old.s<-new.s fit<-loess(y~x,family="symmetric") new.s<-fit$s if(new.s!=old.s)go<-go+1 print(paste("s = ",fit$s)) } print(paste("Fit number ",counter," is different!",sep="")) } # If there does turn out to be a way to fix this is loessc.c or loessf.f, I would be happy to collaborate on that.
Peter Dalgaard
2005-Jun-18 08:27 UTC
[Rd] loess returns different standard errors for identical models (PR#7956)
btyner at stat.purdue.edu writes:> Full_Name: Benjamin Tyner > Version: 2.1.0, 4/18/2005 > OS: i686-redhat-linux-gnu > Submission from: (NULL) (4.64.8.220) > > > # Just run my.test() below in a newly opened R session. Once too many models > have been fit (~20 on my system), the computed standard error jumps to a > different value. This is (superficially) due to a different residual sum of > squares, not a different one.delta. No other aspect of the fit is affected, just > the computed value of s (I've run extensive testing with all.equal() to make > sure). Issuing a garbage collection before doing a loess fit appears to "solve" > the problem, which makes me think this is not a problem in loessc.c or loessf.f. > My point is that a few loess fits in one session should not cause the estimated > standard error computation go awry with no warning.Right. Valgrind has this to say:> my.test()==22986== Use of uninitialised value of size 8 ==22986== at 0x1C97051B: lowesb_ (loessf.f:1542) ==22986== by 0x1C95B399: loess_raw (loessc.c:98) ==22986== by 0x809C9AE: do_dotCode (dotcode.c:1709) ==22986== by 0x80B368F: Rf_eval (eval.c:405) [1] "s = 0.857141235910414" [1] "s = 0.857141235910414" and that certainly fits the pattern. Unfortunately this seems to be in the call to ehg31() in this passage end if setlf=(iv(27).ne.iv(25)) call ehg131(xx,yy,ww,trl,diagl,iv(20),iv(29),iv(3),iv(2),iv(5), + iv(17),iv(4),iv(6),iv(14),iv(19),wv(1),iv(iv(7)),iv(iv(8)), + iv(iv(9)),iv(iv(10)),iv(iv(22)),iv(iv(27)),wv(iv(11)), + iv(iv(23)),wv(iv(13)),wv(iv(12)),wv(iv(15)),wv(iv(16)), + wv(iv(18)),ifloor(iv(3)*wv(2)),wv(3),wv(iv(26)),wv(iv(24)), + wv(4),iv(30),iv(33),iv(32),iv(41),iv(iv(25)),wv(iv(34)), + setlf) if(iv(14).lt.iv(6)+DBLE(iv(4))/2.D0)then call ehg183('k-d tree limited by memory; nvmax=', + iv(14),1,1) (line numbers in optimized code are somewhat unreliable), so there are quite a few items to check. Dumping out the iv and wv arrays at that point is probably a good start if you want to chip in with a bit of debugging. Do yourself a favour and use set.seed() with a value that gives you a minimal repeat count when you start R in a clean state. -- O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907