Suharto Anggono Suharto Anggono
2017-Aug-11 17:11 UTC
[Rd] Issues of R_pretty in src/appl/pretty.c
See https://stat.ethz.ch/pipermail/r-devel/2017-August/074746.html for the origin of the example here. That pretty(c(-1,1)*1e300, n = 1e9, min.n = 1) gave 20 intervals, far from 1e9, but pretty(c(-1,1)*1e300, n = 1e6, min.n = 1) gave 1000000 intervals (on a machine), made me trace through the code to function 'R_pretty' in https://svn.r-project.org/R/trunk/src/appl/pretty.c . *lo is -1e300, *up is 1e300. cell = fmax2(fabs(*lo),fabs(*up)); 'cell' is 1e300. i_small = dx < cell * U * imax2(1,*ndiv) * DBL_EPSILON *3; When *ndiv is (int) 1e9, apparently cell * U * imax2(1,*ndiv) overflows to infinity and 'i_small' is 1 (true). It doesn't happen when *ndiv is (int) 1e6. Putting parentheses may avoid the floating point overflow. For example, i_small = dx < cell * (U * imax2(1,*ndiv) * DBL_EPSILON) *3; The part U = (1 + (h5 >= 1.5*h+.5)) ? 1/(1+h) : 1.5/(1+h5); is strange. Because (h5 >= 1.5*h+.5) is 1 or 0, (1 + (h5 >= 1.5*h+.5)) is never zero and 1/(1+h) will always be chosen. The comment for 'rounding_eps' says "1e-7 is consistent with seq.default()". Currently, seq.default() uses 1e-10 as fuzz.
>>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org> >>>>> on Fri, 11 Aug 2017 17:11:06 +0000 writes: >>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org> >>>>> on Fri, 11 Aug 2017 17:11:06 +0000 writes:> See https://stat.ethz.ch/pipermail/r-devel/2017-August/074746.html for the origin of the example here. > That > pretty(c(-1,1)*1e300, n = 1e9, min.n = 1) gave 20 intervals, far from 1e9, but > pretty(c(-1,1)*1e300, n = 1e6, min.n = 1) gave 1000000 intervals > (on a machine), made me trace through the code to function 'R_pretty' in https://svn.r-project.org/R/trunk/src/appl/pretty.c . thank you. > *lo is -1e300, *up is 1e300. > cell = fmax2(fabs(*lo),fabs(*up)); > 'cell' is 1e300. > i_small = dx < cell * U * imax2(1,*ndiv) * DBL_EPSILON *3; > When *ndiv is (int) 1e9, apparently cell * U * imax2(1,*ndiv) overflows to infinity and 'i_small' is 1 (true). It doesn't happen when *ndiv is (int) 1e6. well spotted! > Putting parentheses may avoid the floating point overflow. For example, > i_small = dx < cell * (U * imax2(1,*ndiv) * DBL_EPSILON) *3; yes... but only if the compiler optimization steps "keep the parentheses". AFAIK, there is no guarantee for that. To make sure, I'd replace the above by U *= imax2(1,*ndiv) * DBL_EPSILON; i_small = dx < cell * U * 3; > The part > U = (1 + (h5 >= 1.5*h+.5)) ? 1/(1+h) : 1.5/(1+h5); > is strange. Because (h5 >= 1.5*h+.5) is 1 or 0, (1 + (h5 >= 1.5*h+.5)) is never zero and 1/(1+h) will always be chosen. Yes, strange indeed! here was as a change (not by me!) adding wrong parentheses there (or maybe adding what the previously "missing" parens implied, but not what they intended!). The original code had been U = 1 + (h5 >= 1.5*h+.5) ? 1/(1+h) : 1.5/(1+h5); and "of course" was intended to mean U = 1 + ((h5 >= 1.5*h+.5) ? 1/(1+h) : 1.5/(1+h5)); and this what I'll change it to, now. > The comment for 'rounding_eps' says "1e-7 is consistent with seq.default()". Currently, seq.default() uses 1e-10 as fuzz. Hmm, yes, thank you; this was correct when written, but seq.default had been changed in the mean time, namely in svn r51095 | 2010-02-03 Usually we are cautious / reluctant to change such things w/o any bug that we see to fix. OTOH, we did have bug cases we'd wanted to amend for seq() / seq.int(); and I'll look into updating the "pretty - epsilon" also to 1e-10. Thank you for your analysis and suggestions! Martin
>>>>> Martin Maechler <maechler at stat.math.ethz.ch> >>>>> on Mon, 14 Aug 2017 11:46:07 +0200 writes:>>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org> >>>>> on Fri, 11 Aug 2017 17:11:06 +0000 writes: >>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org> >>>>> on Fri, 11 Aug 2017 17:11:06 +0000 writes:>> See https://stat.ethz.ch/pipermail/r-devel/2017-August/074746.html for the origin of the example here. >> That >> pretty(c(-1,1)*1e300, n = 1e9, min.n = 1) gave 20 intervals, far from 1e9, but >> pretty(c(-1,1)*1e300, n = 1e6, min.n = 1) gave 1000000 intervals >> (on a machine), made me trace through the code to function 'R_pretty' in https://svn.r-project.org/R/trunk/src/appl/pretty.c . > thank you. >> *lo is -1e300, *up is 1e300. >> cell = fmax2(fabs(*lo),fabs(*up)); >> 'cell' is 1e300. >> i_small = dx < cell * U * imax2(1,*ndiv) * DBL_EPSILON *3; >> When *ndiv is (int) 1e9, apparently cell * U * imax2(1,*ndiv) overflows to infinity and 'i_small' is 1 (true). It doesn't happen when *ndiv is (int) 1e6. > well spotted! >> Putting parentheses may avoid the floating point overflow. For example, >> i_small = dx < cell * (U * imax2(1,*ndiv) * DBL_EPSILON) *3; > yes... but only if the compiler optimization steps "keep the parentheses". > AFAIK, there is no guarantee for that. > To make sure, I'd replace the above by > U *= imax2(1,*ndiv) * DBL_EPSILON; > i_small = dx < cell * U * 3; >> The part >> U = (1 + (h5 >= 1.5*h+.5)) ? 1/(1+h) : 1.5/(1+h5); >> is strange. Because (h5 >= 1.5*h+.5) is 1 or 0, (1 + (h5 >= 1.5*h+.5)) is never zero and 1/(1+h) will always be chosen. > Yes, strange indeed! > here was as a change (not by me!) adding wrong parentheses > there (or maybe adding what the previously "missing" parens > implied, but not what they intended!). > The original code had been > U = 1 + (h5 >= 1.5*h+.5) ? 1/(1+h) : 1.5/(1+h5); > and "of course" was intended to mean > U = 1 + ((h5 >= 1.5*h+.5) ? 1/(1+h) : 1.5/(1+h5)); > and this what I'll change it to, now. >> The comment for 'rounding_eps' says "1e-7 is consistent with seq.default()". Currently, seq.default() uses 1e-10 as fuzz. > Hmm, yes, thank you; this was correct when written, > but seq.default had been changed in the mean time, > namely in svn r51095 | 2010-02-03 > Usually we are cautious / reluctant to change such things w/o > any bug that we see to fix. > OTOH, we did have bug cases we'd wanted to amend for seq() / > seq.int(); > and I'll look into updating the "pretty - epsilon" also to > 1e-10. > Thank you for your analysis and suggestions! I've committed now what I think has been suggested above ... to R-devel only : ------------------------------------------------------------------------ r73094 | maechler | 2017-08-15 09:10:27 +0200 (Tue, 15. Aug 2017) | 1 Zeile Ge?nderte Pfade: M doc/NEWS.Rd M src/appl/pretty.c M src/main/engine.c M tests/reg-large.R M tests/reg-tests-2.Rout.save pretty(x, n) fix overflow for large n suggested by Suhartu Aggano, R-devel, 2017-08-11