> > On 9 Feb 2017, at 16:00, G?ran Brostr?m <goran.brostrom at umu.se> wrote: > > > > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: > > > > F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); > > if (*info == 0){ > > F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); > > ........ > > > > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! > > > > I googled for C and nan and found a work-around: Change 'if ...' to > > > > if (*info == 0 & (hessian[0][0] == hessian[0][0])){ > > > > which works as a test of hessian[0][0] (not) being NaN. > > > > I'm using the .C interface for calling C code. > > > > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?> You should/could use macro R_FINITE to test each entry of the hessian. > In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:> for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; > }A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in the R sources themselves, that is not the case in package code. Hence, not only nicer to read but even faster is double *fj = REAL(sexp_fjac); for (j = 0; j < *n; j++) for (i = 0; i < *n; i++) { if( !R_FINITE(fj[(*n)*j + i]) ) error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); rjac[(*ldr)*j + i] = fj[(*n)*j + i]; }> There may be a more compact way with a macro in the R headers. > I feel that If other code can't handle non-finite values: then test.> Berend HasselmanIndeed: do test. Much better safe than going for speed and losing in rare cases! The latter is a recipe to airplanes falling out of the sky ( ;-\ ) and is unfortunately used by some (in)famous "optimized" (fast but sometimes wrong!!) Lapack/BLAS libraries. The NEWS about the next version of R (3.4.0 due in April) has a new 2-paragraph entry related to this: --------------------------------------------------------------------------------- SIGNIFICANT USER-VISIBLE CHANGES: [...........] * Matrix products now consistently bypass BLAS when the inputs have NaN/Inf values. Performance of the check of inputs has been improved. Performance when BLAS is used is improved for matrix/vector and vector/matrix multiplication (DGEMV is now used instead of DGEMM). One can now choose from alternative matrix product implementations via options(matprod = ). The "internal" implementation is unoptimized but consistent in precision with other summation in R (uses long double accumulators). "blas" calls BLAS directly for best performance, yet usually with undefined behavior for inputs with NaN/Inf. --------------------------------------------------------------------------------- Last but not least : If you are not afraid of +/- Inf, but really only of NA/NaN's (as the OP said), then note that "THE manual" (= "Writing R Extensions") does mention ISNAN(.) almost in the same place as the first occurence of R_FINITE(.). Best regards, Martin Maechler, ETH Zurich
> On 9 Feb 2017, at 17:44, Martin Maechler <maechler at stat.math.ethz.ch> wrote: >[snip] ...>> You should/could use macro R_FINITE to test each entry of the hessian. >> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > >> for (j = 0; j < *n; j++) >> for (i = 0; i < *n; i++) { >> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >> } > > A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in > the R sources themselves, that is not the case in package code. > > Hence, not only nicer to read but even faster is > > double *fj = REAL(sexp_fjac); > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(fj[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = fj[(*n)*j + i]; > } >Tom, Martin, Thanks for both of your suggestions. Very helpful. Berend Hasselman
Thanks to all who answered my third question. I learned something, but: On 2017-02-09 17:44, Martin Maechler wrote:> >>> On 9 Feb 2017, at 16:00, G?ran Brostr?m <goran.brostrom at umu.se> wrote: >>> >>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >>> >>> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >>> if (*info == 0){ >>> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >>> ........ >>> >>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >>> >>> I googled for C and nan and found a work-around: Change 'if ...' to >>> >>> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >>> >>> which works as a test of hessian[0][0] (not) being NaN. >>> >>> I'm using the .C interface for calling C code. >>> >>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? > >> You should/could use macro R_FINITE to test each entry of the hessian. >> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > >> for (j = 0; j < *n; j++) >> for (i = 0; i < *n; i++) { >> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >> } > > A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in > the R sources themselves, that is not the case in package code. > > Hence, not only nicer to read but even faster is > > double *fj = REAL(sexp_fjac); > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(fj[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = fj[(*n)*j + i]; > } >[...] isn't this even easier to read (and correct?): for (j = 0; j < n*; j++) for (i = 0; i < n*; i++){ if ( !R_FINITE(hessian[i][j]) ) error("blah...") } ? In .C land, that is. (And sure, I'm afraid of ?Inf in this context.) Thanks again, G?ran
> On 10 Feb 2017, at 14:53, G?ran Brostr?m <goran.brostrom at umu.se> wrote: > > Thanks to all who answered my third question. I learned something, but: > > On 2017-02-09 17:44, Martin Maechler wrote: >> >>>> On 9 Feb 2017, at 16:00, G?ran Brostr?m <goran.brostrom at umu.se> wrote: >>>> >>>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >>>> >>>> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >>>> if (*info == 0){ >>>> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >>>> ........ >>>> >>>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >>>> >>>> I googled for C and nan and found a work-around: Change 'if ...' to >>>> >>>> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >>>> >>>> which works as a test of hessian[0][0] (not) being NaN. >>>> >>>> I'm using the .C interface for calling C code. >>>> >>>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? >> >>> You should/could use macro R_FINITE to test each entry of the hessian. >>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: >> >>> for (j = 0; j < *n; j++) >>> for (i = 0; i < *n; i++) { >>> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >>> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >>> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >>> } >> >> A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in >> the R sources themselves, that is not the case in package code. >> >> Hence, not only nicer to read but even faster is >> >> double *fj = REAL(sexp_fjac); >> for (j = 0; j < *n; j++) >> for (i = 0; i < *n; i++) { >> if( !R_FINITE(fj[(*n)*j + i]) ) >> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >> rjac[(*ldr)*j + i] = fj[(*n)*j + i]; >> } >> > [...] > > isn't this even easier to read (and correct?): > > for (j = 0; j < n*; j++) > for (i = 0; i < n*; i++){ > if ( !R_FINITE(hessian[i][j]) ) error("blah...") > } > > ? In .C land, that is. (And sure, I'm afraid of ?Inf in this context.) >Only if you have lda and n equal (which you indeed have; but still worth mentioning) when calling dpoco. Berend