--AZaLVt6Pw+ Content-Type: text/plain; charset=us-ascii Content-Description: message body text Content-Transfer-Encoding: 7bit Dear all, I was looking at "splinefun" and the underlying C code and believe that there is a memory access error in the C routine "spline_eval". Specifically, on line 368 and following the following code appears: if(ul < x[i] || x[i+1] < ul) { i = 0; j = *n; do { k = (i+j)/2; if(ul < x[k]) j = k; else i = k; } while(j > i+1); } 'ul' is the point at which the spline is to be evaluated and the array 'x' contains the knots. The variable 'i' points to the knots between which the previous point, at which the spline was evaluated, lies. The 'if' clause checks whether the next point lies between the same knots, if not, the correct pair of knots is searched by a binary search method. However, if the previous point was larger than the right most knots, then 'i' will contain the value n-1. If the next point is also larger than the right most knot, then the first part of the logical expression will be false and, hence, the second part evaluated. In that case x[i+1] will access memory beyond the memory that is actually allocated for this array. In an environment in which memory access beyond allocated space is checked, this should lead to a run time error. However, it took me some time to realise that this problem has no impact on the correct working of the routine. If the byte patter at x[i+1] is interpreted as a number larger than ul, then the second part of the logical expression evaluates to true, we use the 'old' i and everything is o.k. If that byte pattern is interpreted as a number smaller than ul, then the logical expression evaluates to false and the binary search will end up finding that we should use i=n-1. Below I attach a small sample program that illustrates the problem. If this program is stored in, say, spline.R, then the command R -d "valgrind --tool=memcheck --leak-check=yes" --vanilla < spline.R produces on my machine the output attached below. Note that valgrind complains about the use of an uninitialised value in splines.c. As far as I can tell, the easiest solution is to safeguard the second comparison in the logical statement and only do this comparison if i is smaller than n-1. I made such a change to the SVN source of the R development tree that I downloaded (patch is attached). After recompiling R with this change, running the above command again produces an output (attached below) in which valgrind does not complain anymore about the use of an uninitialised value. Cheers, Berwin --AZaLVt6Pw+ Content-Type: text/plain Content-Description: Sample Program Content-Disposition: inline; filename="spline.R" Content-Transfer-Encoding: 7bit x <- 0:10/10 y <- (x-0.5)^2 tt <- splinefun(x, y, method="fmm") x1 <- seq(from=-0.5, to=1.5, length=101) y1 <- tt(x1) plot(x1, y1, type="l") points(x,y) --AZaLVt6Pw+ Content-Type: text/plain Content-Description: Output with original splines.c Content-Disposition: inline; filename="Out1.txt" Content-Transfer-Encoding: 7bit ==5085== Memcheck, a memory error detector for x86-linux. ==5085== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al. ==5085== Using valgrind-2.4.0, a program supervision framework for x86-linux. ==5085== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al. ==5085== For more details, rerun with: -v ==5085= R : Copyright 2005, The R Foundation for Statistical Computing Version 2.2.0 Under development (unstable) (2005-07-21 r35000) ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for a HTML browser interface to help. Type 'q()' to quit R.> x <- 0:10/10 > y <- (x-0.5)^2 > > tt <- splinefun(x, y, method="fmm") > > x1 <- seq(from=-0.5, to=1.5, length=101) > y1 <- tt(x1)==5085== Use of uninitialised value of size 8 ==5085== at 0x8172B72: spline_eval (splines.c:368) ==5085== by 0x80A690B: do_dotCode (dotcode.c:1674) ==5085== by 0x80BDA22: Rf_eval (eval.c:405) ==5085== by 0x80C0EB0: Rf_DispatchOrEval (eval.c:1697) ==5085== by 0x8157749: do_subset3 (subset.c:921) ==5085== by 0x80BDAAD: Rf_eval (eval.c:382) ==5085== by 0x80BF2D4: do_begin (eval.c:1057) ==5085== by 0x80BDAAD: Rf_eval (eval.c:382) ==5085== by 0x80BDE02: Rf_applyClosure (eval.c:570) ==5085== by 0x80BD834: Rf_eval (eval.c:417) ==5085== by 0x80BFC90: do_set (eval.c:1293) ==5085== by 0x80BDAAD: Rf_eval (eval.c:382)> > plot(x1, y1, type="l") > points(x,y) >==5085===5085== ERROR SUMMARY: 24 errors from 1 contexts (suppressed: 41 from 1) ==5085== malloc/free: in use at exit: 12984429 bytes in 6622 blocks. ==5085== malloc/free: 27051 allocs, 20429 frees, 33057360 bytes allocated. ==5085== For counts of detected errors, rerun with: -v ==5085== searching for pointers to 6622 not-freed blocks. ==5085== checked 13843736 bytes. ==5085===5085===5085== 436 bytes in 20 blocks are definitely lost in loss record 18 of 36 ==5085== at 0x1B90659D: malloc (vg_replace_malloc.c:130) ==5085== by 0x8060C28: Putenv (Renviron.c:123) ==5085== by 0x8060DF2: process_Renviron (Renviron.c:181) ==5085== by 0x8060F31: process_system_Renviron (Renviron.c:203) ==5085== by 0x816068B: Rf_initialize_R (system.c:160) ==5085== by 0x805DA1A: main (Rmain.c:30) ==5085===5085== LEAK SUMMARY: ==5085== definitely lost: 436 bytes in 20 blocks. ==5085== possibly lost: 0 bytes in 0 blocks. ==5085== still reachable: 12983993 bytes in 6602 blocks. ==5085== suppressed: 0 bytes in 0 blocks. ==5085== Reachable blocks (those to which a pointer was found) are not shown. ==5085== To see them, rerun with: --show-reachable=yes --AZaLVt6Pw+ Content-Type: text/plain Content-Description: Output with modified splines.c Content-Disposition: inline; filename="Out2.txt" Content-Transfer-Encoding: 7bit ==20221== Memcheck, a memory error detector for x86-linux. ==20221== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al. ==20221== Using valgrind-2.4.0, a program supervision framework for x86-linux. ==20221== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al. ==20221== For more details, rerun with: -v ==20221= R : Copyright 2005, The R Foundation for Statistical Computing Version 2.2.0 Under development (unstable) (2005-07-21 r35000) ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for a HTML browser interface to help. Type 'q()' to quit R.> x <- 0:10/10 > y <- (x-0.5)^2 > > tt <- splinefun(x, y, method="fmm") > > x1 <- seq(from=-0.5, to=1.5, length=101) > y1 <- tt(x1) > > plot(x1, y1, type="l") > points(x,y) >==20221===20221== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 41 from 1) ==20221== malloc/free: in use at exit: 12984429 bytes in 6622 blocks. ==20221== malloc/free: 27051 allocs, 20429 frees, 33057360 bytes allocated. ==20221== For counts of detected errors, rerun with: -v ==20221== searching for pointers to 6622 not-freed blocks. ==20221== checked 13843736 bytes. ==20221===20221== 436 bytes in 20 blocks are definitely lost in loss record 18 of 36 ==20221== at 0x1B90659D: malloc (vg_replace_malloc.c:130) ==20221== by 0x8060C28: Putenv (Renviron.c:123) ==20221== by 0x8060DF2: process_Renviron (Renviron.c:181) ==20221== by 0x8060F31: process_system_Renviron (Renviron.c:203) ==20221== by 0x816068B: Rf_initialize_R (system.c:160) ==20221== by 0x805DA1A: main (Rmain.c:30) ==20221===20221== LEAK SUMMARY: ==20221== definitely lost: 436 bytes in 20 blocks. ==20221== possibly lost: 0 bytes in 0 blocks. ==20221== still reachable: 12983993 bytes in 6602 blocks. ==20221== suppressed: 0 bytes in 0 blocks. ==20221== Reachable blocks (those to which a pointer was found) are not shown. ==20221== To see them, rerun with: --show-reachable=yes --AZaLVt6Pw+ Content-Type: text/plain Content-Description: patch to modify splines.c Content-Disposition: inline; filename="splines.patch" Content-Transfer-Encoding: 7bit Index: src/appl/splines.c ==================================================================--- src/appl/splines.c (revision 35020) +++ src/appl/splines.c (working copy) @@ -348,6 +348,7 @@ int i, j, k, l; double ul, dx, tmp; + const int nm1 = *n - 1; if(*method == 1 && *n > 1) { dx = x[*n - 1] - x[0]; @@ -365,7 +366,7 @@ i = 0; for(l = 0; l < *nu; l++) { ul = v[l]; - if(ul < x[i] || x[i+1] < ul) { + if(ul < x[i] || (i < nm1 && x[i+1] < ul)) { i = 0; j = *n; do { --AZaLVt6Pw+--
maechler@stat.math.ethz.ch
2005-Jul-25 07:41 UTC
[Rd] Buglet in src/appl/splines.c (PR#8030)
>>>>> "BeT" == Berwin A Turlach <berwin at maths.uwa.edu.au> >>>>> on Sun, 24 Jul 2005 16:24:51 +0200 (CEST) writes:BeT> I was looking at "splinefun" and the underlying C code BeT> and believe that there is a memory access error in the BeT> C routine "spline_eval". ........... ........... Excellent, Berwin! Specifically the part about why the bug never really triggers wrong behavior. Your mail should be saved in "The Annals of R" ... I'm fixing according to your proposal -- for "R-devel" or (SVN) R/trunk/ only. Martin Maechler, ETH Zurich