Full_Name: Morten Welinder Version: snapshot OS: Submission from: (NULL) (65.213.85.129) Two things are wrong. 1. There is nan test outside IEEE_754. 2. The meat part of the function should really be something like... if (!lower_tail) x = -x; if (fabs (x) > 1) { double temp = atan (1 / x) / M_PI; return (x > 0) ? R_D_Clog (temp) : R_D_val (-temp); } else return R_D_val (0.5 + atan (x) / M_PI); ...instead of the current heavily truncated series expansion. The above is much simpler and more precise. (Thanks to Ian Smith for pointing the in-hindsight- obvious 1/x trick out.)
>>>>> "Morten" == Morten Welinder <terra@gnome.org> >>>>> on Sun, 11 Apr 2004 19:04:01 +0200 (CEST) writes:Morten> Full_Name: Morten Welinder Version: snapshot OS:.. Morten> Submission from: (NULL) (65.213.85.129) Morten> Two things are wrong. Morten> 1. There is nan test outside IEEE_754. No, that's not wrong. nmath.h *does* define ISNAN() also "outside IEEE_754" Morten> 2. The meat part of the function should really be something Morten> like... Morten> if (!lower_tail) x = -x; good idea for code simplification. No bug in the current code though. Morten> if (fabs (x) > 1) { double temp = atan (1 / x) / Morten> M_PI; return (x > 0) ? R_D_Clog (temp) : R_D_val Morten> (-temp); } else return R_D_val (0.5 + atan (x) / Morten> M_PI); Morten> ...instead of the current heavily truncated series Morten> expansion. The above is much simpler "simpler": yes. Morten> and more precise. That's quite a strong claim which you haven't supported by any evidence! I don't think there's a precision loss in the current code, particularly not where the "heavily truncated series" is used. I've been very careful to use it only where it's fully accurate. I would expect that your code may be more accurate (2-3 digits) when 1 <= |x| <= x_big := 5478.3. But the current code might well be more accurate for |x| >= x_big - particularly for cases where the C "libm" atan() implementation isn't quite perfect. Morten> (Thanks to Ian Smith for pointing the Morten> in-hindsight- obvious 1/x trick out.) I agree that's a cute idea / "trick". Still, I haven't seen any bug in the current code! Martin Maechler <maechler@stat.math.ethz.ch> http://stat.ethz.ch/~maechler/ Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-3408 fax: ...-1228 <><
The current code has xbig such that the x^5 term is beyond the precision limit (although I would use 2.5 instead of 5 in the xbig formula). You get xbig ~5478, I get xbig ~179513609. That is about log_2 (5478) = 12.5 bits lost to cancelation in your case and 27.5 bits in my case. If you want those numbers down and still use a series expansion, you need to have a lot more terms so you can get xbig down. By the way, your xbig point is valid only for log_p==FALSE. For log_p==TRUE it needs to be higher. I don't see why you would be worried over atan's performance in the [-1;+1] interval given that the main branch already relies on atan in that interval. Numbers... There is nothing like numbers. I hacked up a comparison in IEEE double space, i.e., 53-bit mantissa. I calculated a reference value using 113-bit mantissa arithmetic (using the 1/x method and Sun's high- performance libraries) with the result cast back to 53-bit. Fix loc=0 and scale=1 and let eR be the current R method's relative error and let eN be the proposed method's relative error. Then we have the table below. Conclusions... 1. The series expansion loses 3-4 digits just below xbig. 2. The series expansion loses 2-3 digits just above xbig for log_p==TRUE. 3. The series version isn't even used for x>xbig and lower==TRUE. That makes R's log_p==TRUE performance awful. Morten troll:~> for x in 1 10 100 1000 5478 5459 6000 7000 8000 9000 10000 100000 1000000 10000000 100000000; do ./a.out $x 0; done x= 1 y= -0.2876820725 lt=1 log=1 eR= -0 eN= -0 x= 1 y= 0.75 lt=1 log=0 eR= 0 eN= 0 x= 1 y= -1.386294361 lt=0 log=1 eR= -0 eN= -0 x= 1 y= 0.25 lt=0 log=0 eR= 0 eN= 0 x= 10 y= -0.03223967553 lt=1 log=1 eR= -1.721827231e-15 eN= -0 x= 10 y= 0.9682744826 lt=1 log=0 eR= 0 eN= 0 x= 10 y= -3.450633956 lt=0 log=1 eR= 3.860935836e-16 eN= -0 x= 10 y= 0.03172551743 lt=0 log=0 eR= -1.531015449e-15 eN= 2.187164928e-16 x= 100 y= -0.003188069262 lt=1 log=1 eR= -2.122106203e-14 eN= -0 x= 100 y= 0.9968170072 lt=1 log=0 eR= 1.113768141e-16 eN= 0 x= 100 y= -5.749933404 lt=0 log=1 eR= 3.707222428e-15 eN= -0 x= 100 y= 0.003182992765 lt=0 log=0 eR= -2.111865771e-14 eN= 0 x= 1000 y= -0.0003183604514 lt=1 log=1 eR= -1.323068058e-13 eN= 1.702790293e-16 x= 1000 y= 0.9996816902 lt=1 log=0 eR= 0 eN= 0 x= 1000 y= -8.052485498 lt=0 log=1 eR= 1.632420278e-14 eN= -0 x= 1000 y= 0.0003183097801 lt=0 log=0 eR= -1.323278675e-13 eN= 1.703061358e-16 x= 5478 y= -5.81086402e-05 lt=1 log=1 eR= -1.604954363e-12 eN= 1.166137007e-16 x= 5478 y= 0.999941893 lt=1 log=0 eR= 1.11028754e-16 eN= 0 x= 5478 y= -9.753225247 lt=0 log=1 eR= 1.644635682e-13 eN= -0 x= 5478 y= 5.810695193e-05 lt=0 log=0 eR= -1.605000994e-12 eN= 1.166170889e-16 x= 5459 y= -5.831089269e-05 lt=1 log=1 eR= -3.7128847e-13 eN= -0 x= 5459 y= 0.9999416908 lt=1 log=0 eR= 0 eN= 0 x= 5459 y= -9.749750799 lt=0 log=1 eR= 3.807877628e-14 eN= -0 x= 5459 y= 5.830919264e-05 lt=0 log=0 eR= -3.711830826e-13 eN= 1.16212612e-16 x= 6000 y= -5.305305449e-05 lt=1 log=1 eR= 1.294887935e-12 eN= -0 x= 6000 y= 0.9999469484 lt=1 log=0 eR= -1.110281927e-16 eN= 0 x= 6000 y= -9.844244643 lt=0 log=1 eR= -0 eN= -0 x= 6000 y= 5.305164721e-05 lt=0 log=0 eR= 0 eN= 0 x= 7000 y= -4.54738745e-05 lt=1 log=1 eR= 9.137564969e-13 eN= 1.49014432e-16 x= 7000 y= 0.9999545272 lt=1 log=0 eR= 0 eN= 0 x= 7000 y= -9.998395321 lt=0 log=1 eR= 1.776641933e-16 eN= -0 x= 7000 y= 4.547284057e-05 lt=0 log=0 eR= 0 eN= 1.490178201e-16 x= 8000 y= -3.978952716e-05 lt=1 log=1 eR= -2.308452986e-12 eN= -0 x= 8000 y= 0.9999602113 lt=1 log=0 eR= 1.110267201e-16 eN= 0 x= 8000 y= -10.13192671 lt=0 log=1 eR= -0 eN= -0 x= 8000 y= 3.978873557e-05 lt=0 log=0 eR= 0 eN= 0 x= 9000 y= -3.536839044e-05 lt=1 log=1 eR= 1.170620714e-12 eN= -0 x= 9000 y= 0.9999646322 lt=1 log=0 eR= 0 eN= 0 x= 9000 y= -10.24970975 lt=0 log=1 eR= -1.733080139e-16 eN= -0 x= 9000 y= 3.536776499e-05 lt=0 log=0 eR= 0 eN= 1.915943397e-16 x= 10000 y= -3.183149513e-05 lt=1 log=1 eR= 1.955295556e-12 eN= 2.128792113e-16 x= 10000 y= 0.999968169 lt=1 log=0 eR= -1.110258365e-16 eN= 0 x= 10000 y= -10.35507026 lt=0 log=1 eR= -0 eN= -0 x= 10000 y= 3.183098851e-05 lt=0 log=0 eR= 0 eN= 2.128825995e-16 x= 100000 y= -3.183103928e-06 lt=1 log=1 eR= -5.496886055e-12 eN= 1.330514125e-16 x= 100000 y= 0.9999968169 lt=1 log=0 eR= 0 eN= 0 x= 100000 y= -12.65765535 lt=0 log=1 eR= -1.403385374e-16 eN= -1.403385374e-16 x= 100000 y= 3.183098862e-06 lt=0 log=0 eR= 1.330516242e-16 eN= 1.330516242e-16 x= 1000000 y= -3.183099368e-07 lt=1 log=1 eR= 1.358965789e-10 eN= -1.663145038e-16 x= 1000000 y= 0.9999996817 lt=1 log=0 eR= 0 eN= 0 x= 1000000 y= -14.96024044 lt=0 log=1 eR= -0 eN= -0 x= 1000000 y= 3.183098862e-07 lt=0 log=0 eR= 1.663145303e-16 eN= 0 x= 10000000 y= -3.183098912e-08 lt=1 log=1 eR= -3.352301937e-09 eN= -0 x= 10000000 y= 0.9999999682 lt=1 log=0 eR= 1.11022306e-16 eN= 0 x= 10000000 y= -17.26282554 lt=0 log=1 eR= -0 eN= -0 x= 10000000 y= 3.183098862e-08 lt=0 log=0 eR= 0 eN= 0 x= 100000000 y= -3.183098867e-09 lt=1 log=1 eR= 1.059916874e-08 eN= -0 x= 100000000 y= 0.9999999968 lt=1 log=0 eR= 0 eN= 0 x= 100000000 y= -19.56541063 lt=0 log=1 eR= -0 eN= -0 x= 100000000 y= 3.183098862e-09 lt=0 log=0 eR= 1.299332268e-16 eN= 0 x=1000000000 y= -3.183098862e-10 lt=1 log=1 eR= -1.986729412e-07 eN= -0 x=1000000000 y= 0.9999999997 lt=1 log=0 eR= 1.110223025e-16 eN= 0 x=1000000000 y= -21.86799572 lt=0 log=1 eR= -0 eN= -0 x=1000000000 y= 3.183098862e-10 lt=0 log=0 eR= 1.624165335e-16 eN= 1.624165335e-16 x= 1e+10 y= -3.183098862e-11 lt=1 log=1 eR= -1.986729413e-07 eN= -0 x= 1e+10 y= 1 lt=1 log=0 eR= 0 eN= 0 x= 1e+10 y= -24.17058082 lt=0 log=1 eR= -0 eN= -0 x= 1e+10 y= 3.183098862e-11 lt=0 log=0 eR= 2.030206668e-16 eN= 2.030206668e-16 x= 1e+11 y= -3.183098862e-12 lt=1 log=1 eR= 6.777064055e-06 eN= -0 x= 1e+11 y= 1 lt=1 log=0 eR= 0 eN= 0 x= 1e+11 y= -26.47316591 lt=0 log=1 eR= -0 eN= -0 x= 1e+11 y= 3.183098862e-12 lt=0 log=0 eR= 0 eN= 0
>>>>> "Morten" == Morten Welinder <terra@gnome.org> >>>>> on Tue, 13 Apr 2004 16:15:15 +0200 (CEST) writes:Morten> The current code has xbig such that the x^5 term is Morten> beyond the precision limit (although I would use 2.5 Morten> instead of 5 in the xbig formula). You get xbig Morten> ~5478, I get xbig ~179513609. That is about log_2 Morten> (5478) = 12.5 bits lost to cancelation in your case Morten> and 27.5 bits in my case. If you want those numbers Morten> down and still use a series expansion, you need to Morten> have a lot more terms so you can get xbig down. Morten> By the way, your xbig point is valid only for Morten> log_p==FALSE. For log_p==TRUE it needs to be higher. ok, that's an important point. Morten> I don't see why you would be worried over atan's Morten> performance in the [-1;+1] interval given that the Morten> main branch already relies on atan in that interval. I wasn't worried really; just I assume log() and log1p() to be very accurate (only loose 1 or 2 bits) where (bad) system's atan()'s may loose more. Morten> Numbers... There is nothing like numbers. Indeed, that was the evidence I've been asking for. Morten> I hacked up a comparison in IEEE double space, Morten> i.e., 53-bit mantissa. I calculated a reference Morten> value using 113-bit mantissa arithmetic (using the Morten> 1/x method (which has an obvious "design bias" though -- but I agree it's not relevant here) Morten> and Sun's high- performance libraries) Morten> with the result cast back to 53-bit. Morten> Fix loc=0 and scale=1 and let eR be the current R Morten> method's relative error and let eN be the proposed Morten> method's relative error. Then we have the table Morten> below. Morten> Conclusions... Morten> 1. The series expansion loses 3-4 digits just below xbig. Morten> 2. The series expansion loses 2-3 digits just above xbig for log_p==TRUE. Morten> 3. The series version isn't even used for x>xbig and Morten> lower==TRUE. That makes R's log_p==TRUE performance awful. "awful" is strong; but otherwise I agree. Together with the elegance of your proposition, I now agree with you. Thank you, Morten! Martin