Nick.Efthymiou@schwab.com
2003-Dec-30 01:55 UTC
[Rd] Accuracy: Correct sums in rowSums(), colSums() (PR#6196)
Full_Name: Nick Efthymiou Version: R1.5.0 and above OS: Red Hat Linux Submission from: (NULL) (162.93.14.73) With the introduction of the functions rowSums(), colSums(), rowMeans() and colMeans() in R1.5.0, function "SEXP do_colsum(SEXP call, SEXP op, SEXP args, SEXP rho)" was added to perform the fast summations. We have an excellent opportunity to improve the accuracy by implementing Kahan summation here. Kahan summation is described in @article{ goldberg91what, author = "David Goldberg", title = "What Every Computer Scientist Should Know About Floating-Point Arithmetic", journal = "ACM Computing Surveys", volume = "23", number = "1", pages = "5--48", year = "1991", url = "citeseer.nj.nec.com/goldberg91what.html" } (http://citeseer.nj.nec.com/goldberg91what.html) and in the article "Floating-point Summation", by Evan Manning, in C/C++ User's Journal, Sept. 1996. The proposed improvement has been tested in !IEEE_754 mode, the patch is attached. It is intended to be applied against R-1.7.1/src/main/array.c --------- Cut here ---------- *** array.c.old Mon Dec 15 17:33:23 2003 --- array.c Mon Dec 15 17:33:45 2003 *************** *** 1016,1022 **** int OP, n, p, cnt = 0, i, j, type; Rboolean NaRm, keepNA; int *ix; ! double *rx, sum = 0.0; checkArity(op, args); x = CAR(args); args = CDR(args); --- 1016,1022 ---- int OP, n, p, cnt = 0, i, j, type; Rboolean NaRm, keepNA; int *ix; ! double *rx, sum = 0.0, correction = 0.0; checkArity(op, args); x = CAR(args); args = CDR(args); *************** *** 1046,1062 **** switch (type) { case REALSXP: rx = REAL(x) + n*j; #ifdef IEEE_754 if (keepNA) ! for (sum = 0., i = 0; i < n; i++) sum += *rx++; else { ! for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++) ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} else if (keepNA) {sum = NA_REAL; break;} } #else ! for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++) ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} else if (keepNA) {sum = NA_REAL; break;} #endif break; --- 1046,1082 ---- switch (type) { case REALSXP: rx = REAL(x) + n*j; + sum = 0.; #ifdef IEEE_754 if (keepNA) ! for (i = 0; i < n; i++) sum += *rx++; else { ! for (cnt = 0, i = 0; i < n; i++, rx++) ! if (!ISNAN(*rx)) { ! /* TODO: Kahan summation */ ! cnt++; sum += *rx; ! } else if (keepNA) {sum = NA_REAL; break;} } #else ! i = cnt = 0; ! if (i < n) { ! if (ISNAN(*rx)) { ! if (keepNA) {sum = NA_REAL; break /* switch */;} ! } else { ! cnt++; sum = *rx; ! } ! i++; rx++; ! } ! for (; i < n; i++, rx++) ! if (!ISNAN(*rx)) { ! /* Kahan summation */ ! double yhilo = *rx - correction; ! double tsum = sum + yhilo; ! correction = (tsum - sum) - yhilo; ! sum = tsum; ! cnt++; ! } else if (keepNA) {sum = NA_REAL; break;} #endif break; *************** *** 1121,1137 **** switch (type) { case REALSXP: rx = REAL(x) + i; #ifdef IEEE_754 if (keepNA) ! for (sum = 0., j = 0; j < p; j++, rx += n) sum += *rx; else { ! for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n) ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} else if (keepNA) {sum = NA_REAL; break;} } #else ! for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n) ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} else if (keepNA) {sum = NA_REAL; break;} #endif break; --- 1141,1177 ---- switch (type) { case REALSXP: rx = REAL(x) + i; + sum = 0.; #ifdef IEEE_754 if (keepNA) ! for (j = 0; j < p; j++, rx += n) sum += *rx; else { ! for (cnt = 0, j = 0; j < p; j++, rx += n) ! if (!ISNAN(*rx)) { ! /* TODO: Kahan summation */ ! cnt++; sum += *rx; ! } else if (keepNA) {sum = NA_REAL; break;} } #else ! j = cnt = 0; ! if (j < p) { ! if (ISNAN(*rx)) { ! if (keepNA) {sum = NA_REAL; break /* switch */;} ! } else { ! cnt++; sum = *rx; ! } ! j++; rx += n; ! } ! for (; j < p; j++, rx += n) ! if (!ISNAN(*rx)) { ! /* Kahan summation */ ! double yhilo = *rx - correction; ! double tsum = sum + yhilo; ! correction = (tsum - sum) - yhilo; ! sum = tsum; ! cnt++; ! } else if (keepNA) {sum = NA_REAL; break;} #endif break; ---------end of patch ------- Since someone is certainly wondering why I bothered to bring this up - I was trying to optimize the order of a Chebyshev approximation to a specific function so that I accomplish the most power-efficient calculation at one ulp accuracy (multiplications and additions consume battery power). The optimization was based on minimizing the relative error between the true function and the approximation. It involved a lot of sums and was not converging properly. With the patch, I got the convergence I needed. Thanks, - Nick -
Prof Brian Ripley
2003-Dec-30 09:29 UTC
[Rd] Accuracy: Correct sums in rowSums(), colSums() (PR#6196)
Could you please prepare a patch against the current R-devel sources? 1.7.1 is several versions old. I was puzzled as to why you are recommending improving the accuracy of the `fast' methods and not the slow one, applying apply() to mean() and sum(). The improvement might be larger in those cases: what Goldberg calls `a dramatic improvement' is from N eps down to 2 eps, and that is for the worst case and is only dramatic if N is dramatically bigger than 2. It is also not clear that it is effective on machines with extended-precision registers unless N is very large. Some of us are well aware of the issues of floating point arithmetic, but R is normally used for statistical computations, where errors in the data (rounding, measurement etc) normally are orders of magnitude bigger than those in floating-point arithmetic. So implementation effort has been spent on handling the statistical errors first. On Tue, 30 Dec 2003 Nick.Efthymiou@schwab.com wrote:> Full_Name: Nick Efthymiou > Version: R1.5.0 and above > OS: Red Hat Linux > Submission from: (NULL) (162.93.14.73) > > > With the introduction of the functions rowSums(), colSums(), rowMeans() and > colMeans() in R1.5.0, function "SEXP do_colsum(SEXP call, SEXP op, SEXP args, > SEXP rho)" was added to perform the fast summations. We have an excellent > opportunity to improve the accuracy by implementing Kahan summation here. > > Kahan summation is described in > > @article{ goldberg91what, > author = "David Goldberg", > title = "What Every Computer Scientist Should Know About Floating-Point > Arithmetic", > journal = "ACM Computing Surveys", > volume = "23", > number = "1", > pages = "5--48", > year = "1991", > url = "citeseer.nj.nec.com/goldberg91what.html" } > > > (http://citeseer.nj.nec.com/goldberg91what.html) > > and in the article "Floating-point Summation", by Evan Manning, in C/C++ User's > Journal, Sept. 1996. > > The proposed improvement has been tested in !IEEE_754 mode, the patch is > attached. > It is intended to be applied against R-1.7.1/src/main/array.c > > --------- Cut here ---------- > *** array.c.old Mon Dec 15 17:33:23 2003 > --- array.c Mon Dec 15 17:33:45 2003 > *************** > *** 1016,1022 **** > int OP, n, p, cnt = 0, i, j, type; > Rboolean NaRm, keepNA; > int *ix; > ! double *rx, sum = 0.0; > > checkArity(op, args); > x = CAR(args); args = CDR(args); > --- 1016,1022 ---- > int OP, n, p, cnt = 0, i, j, type; > Rboolean NaRm, keepNA; > int *ix; > ! double *rx, sum = 0.0, correction = 0.0; > > checkArity(op, args); > x = CAR(args); args = CDR(args); > *************** > *** 1046,1062 **** > switch (type) { > case REALSXP: > rx = REAL(x) + n*j; > #ifdef IEEE_754 > if (keepNA) > ! for (sum = 0., i = 0; i < n; i++) sum += *rx++; > else { > ! for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++) > ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} > else if (keepNA) {sum = NA_REAL; break;} > } > #else > ! for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++) > ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} > else if (keepNA) {sum = NA_REAL; break;} > #endif > break; > --- 1046,1082 ---- > switch (type) { > case REALSXP: > rx = REAL(x) + n*j; > + sum = 0.; > #ifdef IEEE_754 > if (keepNA) > ! for (i = 0; i < n; i++) sum += *rx++; > else { > ! for (cnt = 0, i = 0; i < n; i++, rx++) > ! if (!ISNAN(*rx)) { > ! /* TODO: Kahan summation */ > ! cnt++; sum += *rx; > ! } > else if (keepNA) {sum = NA_REAL; break;} > } > #else > ! i = cnt = 0; > ! if (i < n) { > ! if (ISNAN(*rx)) { > ! if (keepNA) {sum = NA_REAL; break /* switch */;} > ! } else { > ! cnt++; sum = *rx; > ! } > ! i++; rx++; > ! } > ! for (; i < n; i++, rx++) > ! if (!ISNAN(*rx)) { > ! /* Kahan summation */ > ! double yhilo = *rx - correction; > ! double tsum = sum + yhilo; > ! correction = (tsum - sum) - yhilo; > ! sum = tsum; > ! cnt++; > ! } > else if (keepNA) {sum = NA_REAL; break;} > #endif > break; > *************** > *** 1121,1137 **** > switch (type) { > case REALSXP: > rx = REAL(x) + i; > #ifdef IEEE_754 > if (keepNA) > ! for (sum = 0., j = 0; j < p; j++, rx += n) sum += *rx; > else { > ! for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n) > ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} > else if (keepNA) {sum = NA_REAL; break;} > } > #else > ! for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n) > ! if (!ISNAN(*rx)) {cnt++; sum += *rx;} > else if (keepNA) {sum = NA_REAL; break;} > #endif > break; > --- 1141,1177 ---- > switch (type) { > case REALSXP: > rx = REAL(x) + i; > + sum = 0.; > #ifdef IEEE_754 > if (keepNA) > ! for (j = 0; j < p; j++, rx += n) sum += *rx; > else { > ! for (cnt = 0, j = 0; j < p; j++, rx += n) > ! if (!ISNAN(*rx)) { > ! /* TODO: Kahan summation */ > ! cnt++; sum += *rx; > ! } > else if (keepNA) {sum = NA_REAL; break;} > } > #else > ! j = cnt = 0; > ! if (j < p) { > ! if (ISNAN(*rx)) { > ! if (keepNA) {sum = NA_REAL; break /* switch */;} > ! } else { > ! cnt++; sum = *rx; > ! } > ! j++; rx += n; > ! } > ! for (; j < p; j++, rx += n) > ! if (!ISNAN(*rx)) { > ! /* Kahan summation */ > ! double yhilo = *rx - correction; > ! double tsum = sum + yhilo; > ! correction = (tsum - sum) - yhilo; > ! sum = tsum; > ! cnt++; > ! } > else if (keepNA) {sum = NA_REAL; break;} > #endif > break; > ---------end of patch ------- > > > Since someone is certainly wondering why I bothered to bring this up - I was > trying to optimize the order of a Chebyshev approximation to a specific function > so that I accomplish the most power-efficient calculation at one ulp accuracy > (multiplications and additions consume battery power). The optimization was > based on minimizing the relative error between the true function and the > approximation. It involved a lot of sums and was not converging properly. With > the patch, I got the convergence I needed. > > Thanks, > > - Nick - > > ______________________________________________ > R-devel@stat.math.ethz.ch mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-devel > >-- Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595