In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); if (*info == 0){ F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); ........ This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! I googled for C and nan and found a work-around: Change 'if ...' to if (*info == 0 & (hessian[0][0] == hessian[0][0])){ which works as a test of hessian[0][0] (not) being NaN. I'm using the .C interface for calling C code. Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? G?ran Brostr?m PS. Yes, I will switch to .Call! Soon...
> On 9 Feb 2017, at 16:00, G?ran Brostr?m <goran.brostrom at umu.se> wrote: > > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: > > F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); > if (*info == 0){ > F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); > ........ > > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! > > I googled for C and nan and found a work-around: Change 'if ...' to > > if (*info == 0 & (hessian[0][0] == hessian[0][0])){ > > which works as a test of hessian[0][0] (not) being NaN. > > I'm using the .C interface for calling C code. > > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?You should/could use macro R_FINITE to test each entry of the hessian. In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: for (j = 0; j < *n; j++) for (i = 0; i < *n; i++) { if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; } There may be a more compact way with a macro in the R headers. I feel that If other code can't handle non-finite values: then test. Berend Hasselman
On 02/09/2017 05:05 PM, Berend Hasselman wrote:>> On 9 Feb 2017, at 16:00, G?ran Brostr?m <goran.brostrom at umu.se> wrote: >> >> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >> >> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >> if (*info == 0){ >> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >> ........ >> >> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >> >> I googled for C and nan and found a work-around: Change 'if ...' to >> >> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >> >> which works as a test of hessian[0][0] (not) being NaN. >> >> I'm using the .C interface for calling C code. >> >> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? > You should/could use macro R_FINITE to test each entry of the hessian. > In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; > } > > There may be a more compact way with a macro in the R headers. > I feel that If other code can't handle non-finite values: then test. > > Berend HasselmanAnd if performance was of importance, you could use the trick from mayHaveNaNOrInf in array.c (originally from pqR), but be careful to test the individual operands of the sum. mayHaveNaNOrInf does not do the test for performance reasons, but as a result it can have false positives. Rboolean hasNaNOrInf(double *x, R_xlen_t n) { if ((n&1) != 0 && !R_FINITE(x[0])) return TRUE; for (R_xlen_t i = n&1; i < n; i += 2) if (!R_FINITE(x[i]+x[i+1])&& (!R_FINITE(x[i]) || !R_FINITE(x[i+1])) return TRUE; return FALSE; } Tomas> > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
> > On 9 Feb 2017, at 16:00, G?ran Brostr?m <goran.brostrom at umu.se> wrote: > > > > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: > > > > F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); > > if (*info == 0){ > > F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); > > ........ > > > > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! > > > > I googled for C and nan and found a work-around: Change 'if ...' to > > > > if (*info == 0 & (hessian[0][0] == hessian[0][0])){ > > > > which works as a test of hessian[0][0] (not) being NaN. > > > > I'm using the .C interface for calling C code. > > > > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?> You should/could use macro R_FINITE to test each entry of the hessian. > In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:> for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; > }A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in the R sources themselves, that is not the case in package code. Hence, not only nicer to read but even faster is double *fj = REAL(sexp_fjac); for (j = 0; j < *n; j++) for (i = 0; i < *n; i++) { if( !R_FINITE(fj[(*n)*j + i]) ) error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); rjac[(*ldr)*j + i] = fj[(*n)*j + i]; }> There may be a more compact way with a macro in the R headers. > I feel that If other code can't handle non-finite values: then test.> Berend HasselmanIndeed: do test. Much better safe than going for speed and losing in rare cases! The latter is a recipe to airplanes falling out of the sky ( ;-\ ) and is unfortunately used by some (in)famous "optimized" (fast but sometimes wrong!!) Lapack/BLAS libraries. The NEWS about the next version of R (3.4.0 due in April) has a new 2-paragraph entry related to this: --------------------------------------------------------------------------------- SIGNIFICANT USER-VISIBLE CHANGES: [...........] * Matrix products now consistently bypass BLAS when the inputs have NaN/Inf values. Performance of the check of inputs has been improved. Performance when BLAS is used is improved for matrix/vector and vector/matrix multiplication (DGEMV is now used instead of DGEMM). One can now choose from alternative matrix product implementations via options(matprod = ). The "internal" implementation is unoptimized but consistent in precision with other summation in R (uses long double accumulators). "blas" calls BLAS directly for best performance, yet usually with undefined behavior for inputs with NaN/Inf. --------------------------------------------------------------------------------- Last but not least : If you are not afraid of +/- Inf, but really only of NA/NaN's (as the OP said), then note that "THE manual" (= "Writing R Extensions") does mention ISNAN(.) almost in the same place as the first occurence of R_FINITE(.). Best regards, Martin Maechler, ETH Zurich