>>>>> William Dunlap via R-help <r-help at r-project.org> >>>>> on Sun, 6 Nov 2016 20:53:17 -0800 writes:> Perhaps the C function Rf_logspace_sum(double *x, int n) would help in > computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, avoiding > unnecessary under- and overflow. Indeed! I had thought more than twice to also export it to the R level notably as we have been using two R level versions in a package I maintain ('copula'). They are vectorized there in a way that seemed particularly useful to our (Marius Hofert and my) use cases. More on this -- making these available in R, how exactly? -- probably should move to the R-devel list. Thank you Bill for bringing it up! Martin > Bill Dunlap > TIBCO Software > wdunlap tibco.com > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote: >> On 07/11/16 13:07, William Dunlap wrote: >> >>> Have you tried reparameterizing, using logb (=log(b)) instead of b? >>> >> >> Uh, no. I don't think that that makes any sense in my context. >> >> The "b" values are probabilities and must satisfy a "sum-to-1" >> constraint. To accommodate this constraint I re-parametrise via a >> "logistic" style parametrisation --- basically >> >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n >> >> with the parameters that the optimiser works with being z_1, ..., z_{n-1} >> (and with z_n == 0 for identifiability). The objective function is of the >> form sum_i(a_i * log(b_i)), so I transform back >> from the z_i to the b_i in order calculate the value of the objective >> function. But when the z_i get moderately large-negative, the b_i become >> numerically 0 and then log(b_i) becomes -Inf. And the optimiser falls over. >> >> cheers, >> >> Rolf >> >> >>> Bill Dunlap >>> TIBCO Software >>> wdunlap tibco.com <http://tibco.com> >>> >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <r.turner at auckland.ac.nz >>> <mailto:r.turner at auckland.ac.nz>> wrote: >>> >>> >>> I am trying to deal with a maximisation problem in which it is >>> possible for the objective function to (quite legitimately) return >>> the value -Inf, which causes the numerical optimisers that I have >>> tried to fall over. >>> >>> The -Inf values arise from expressions of the form "a * log(b)", >>> with b = 0. Under the *starting* values of the parameters, a must >>> equal equal 0 whenever b = 0, so we can legitimately say that a * >>> log(b) = 0 in these circumstances. However as the maximisation >>> algorithm searches over parameters it is possible for b to take the >>> value 0 for values of >>> a that are strictly positive. (The values of "a" do not change during >>> this search, although they *do* change between "successive searches".) >>> >>> Clearly if one is *maximising* the objective then -Inf is not a value >>> of >>> particular interest, and we should be able to "move away". But the >>> optimising function just stops. >>> >>> It is also clear that "moving away" is not a simple task; you can't >>> estimate a gradient or Hessian at a point where the function value >>> is -Inf. >>> >>> Can anyone suggest a way out of this dilemma, perhaps an optimiser >>> that is equipped to cope with -Inf values in some sneaky way? >>> >>> Various ad hoc kludges spring to mind, but they all seem to be >>> fraught with peril. >>> >>> I have tried changing the value returned by the objective function >>> from >>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite. >>> However this seemed to flatten out the objective surface too much, >>> and the search stalled at the 0 value, which is the antithesis of >>> optimal. >>> >>> The problem arises in a context of applying the EM algorithm where >>> the M-step cannot be carried out explicitly, whence numerical >>> optimisation. >>> I can give more detail if anyone thinks that it could be relevant. >>> >>> I would appreciate advice from younger and wiser heads! :-) >>> >>> cheers, >>> >>> Rolf Turner >>> >>> -- >>> Technical Editor ANZJS >>> Department of Statistics >>> University of Auckland >>> Phone: +64-9-373-7599 ext. 88276 <tel:%2B64-9-373-7599%20ext.%2 088276> >>> >>> ______________________________________________ >>> R-help at r-project.org <mailto:R-help at r-project.org> mailing list -- >>> To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> <https://stat.ethz.ch/mailman/listinfo/r-help> >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> <http://www.R-project.org/posting-guide.html> >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >>> >> >> -- >> Technical Editor ANZJS >> Department of Statistics >> University of Auckland >> Phone: +64-9-373-7599 ext. 88276 >> > [[alternative HTML version deleted]] > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and Rf_logspace_sub were available as R functions. (I wish the '_sub' were '_subtract' because 'sub' means too many things in R.) I think Rf_logspace_sum in R could be a little better. E.g., using the C code #include <R.h> #include <Rinternals.h> #include <Rmath.h> SEXP Call_logspace_sum(SEXP x) { if (TYPEOF(x) != REALSXP) { Rf_error("'x' must be a numeric vector"); } return ScalarReal(Rf_logspace_sum(REAL(x), length(x))); } and the R functions logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x)) and test <- function (x) { x <- as.numeric(x) rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))), Rf_logspace_sum = logspace_sum(x), subtract_xmax = log(sum(exp(x - max(x)))) + max(x), naive = log(sum(exp(x)))) } R-3.3.2 on Linux gives, after options(digits=17)> test(c(0, -50))[,1] Rmpfr 1.9287498479639178e-22 Rf_logspace_sum 1.9287498479639178e-22 subtract_xmax 0.0000000000000000e+00 naive 0.0000000000000000e+00 which is nice, but also the not so nice> test(c(0, -50, -50))[,1] Rmpfr 3.8574996959278356e-22 Rf_logspace_sum 0.0000000000000000e+00 subtract_xmax 0.0000000000000000e+00 naive 0.0000000000000000e+00 With TERR the second test has Rmpfr==Rf_logspace_sum for that example. Bill Dunlap TIBCO Software wdunlap tibco.com On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:> >>>>> William Dunlap via R-help <r-help at r-project.org> > >>>>> on Sun, 6 Nov 2016 20:53:17 -0800 writes: > > > Perhaps the C function Rf_logspace_sum(double *x, int n) would help > in > > computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, > avoiding > > unnecessary under- and overflow. > > Indeed! > > I had thought more than twice to also export it to the R level > notably as we have been using two R level versions in a package > I maintain ('copula'). They are vectorized there in a way that > seemed particularly useful to our (Marius Hofert and my) use cases. > > More on this -- making these available in R, how exactly? -- > probably should move to the R-devel list. > > Thank you Bill for bringing it up! > Martin > > > Bill Dunlap > > TIBCO Software > > wdunlap tibco.com > > > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.turner at auckland.ac.nz> > wrote: > > >> On 07/11/16 13:07, William Dunlap wrote: > >> > >>> Have you tried reparameterizing, using logb (=log(b)) instead of b? > >>> > >> > >> Uh, no. I don't think that that makes any sense in my context. > >> > >> The "b" values are probabilities and must satisfy a "sum-to-1" > >> constraint. To accommodate this constraint I re-parametrise via a > >> "logistic" style parametrisation --- basically > >> > >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n > >> > >> with the parameters that the optimiser works with being z_1, ..., > z_{n-1} > >> (and with z_n == 0 for identifiability). The objective function is > of the > >> form sum_i(a_i * log(b_i)), so I transform back > >> from the z_i to the b_i in order calculate the value of the > objective > >> function. But when the z_i get moderately large-negative, the b_i > become > >> numerically 0 and then log(b_i) becomes -Inf. And the optimiser > falls over. > >> > >> cheers, > >> > >> Rolf > >> > >> > >>> Bill Dunlap > >>> TIBCO Software > >>> wdunlap tibco.com <http://tibco.com> > >>> > >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner < > r.turner at auckland.ac.nz > >>> <mailto:r.turner at auckland.ac.nz>> wrote: > >>> > >>> > >>> I am trying to deal with a maximisation problem in which it is > >>> possible for the objective function to (quite legitimately) return > >>> the value -Inf, which causes the numerical optimisers that I have > >>> tried to fall over. > >>> > >>> The -Inf values arise from expressions of the form "a * log(b)", > >>> with b = 0. Under the *starting* values of the parameters, a must > >>> equal equal 0 whenever b = 0, so we can legitimately say that a * > >>> log(b) = 0 in these circumstances. However as the maximisation > >>> algorithm searches over parameters it is possible for b to take the > >>> value 0 for values of > >>> a that are strictly positive. (The values of "a" do not change > during > >>> this search, although they *do* change between "successive > searches".) > >>> > >>> Clearly if one is *maximising* the objective then -Inf is not a > value > >>> of > >>> particular interest, and we should be able to "move away". But the > >>> optimising function just stops. > >>> > >>> It is also clear that "moving away" is not a simple task; you can't > >>> estimate a gradient or Hessian at a point where the function value > >>> is -Inf. > >>> > >>> Can anyone suggest a way out of this dilemma, perhaps an optimiser > >>> that is equipped to cope with -Inf values in some sneaky way? > >>> > >>> Various ad hoc kludges spring to mind, but they all seem to be > >>> fraught with peril. > >>> > >>> I have tried changing the value returned by the objective function > >>> from > >>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite. > >>> However this seemed to flatten out the objective surface too much, > >>> and the search stalled at the 0 value, which is the antithesis of > >>> optimal. > >>> > >>> The problem arises in a context of applying the EM algorithm where > >>> the M-step cannot be carried out explicitly, whence numerical > >>> optimisation. > >>> I can give more detail if anyone thinks that it could be relevant. > >>> > >>> I would appreciate advice from younger and wiser heads! :-) > >>> > >>> cheers, > >>> > >>> Rolf Turner > >>> > >>> -- > >>> Technical Editor ANZJS > >>> Department of Statistics > >>> University of Auckland > >>> Phone: +64-9-373-7599 ext. 88276 <tel:%2B64-9-373-7599%20ext.%2 > 088276> > >>> > >>> ______________________________________________ > >>> R-help at r-project.org <mailto:R-help at r-project.org> mailing list -- > >>> To UNSUBSCRIBE and more, see > >>> https://stat.ethz.ch/mailman/listinfo/r-help > >>> <https://stat.ethz.ch/mailman/listinfo/r-help> > >>> PLEASE do read the posting guide > >>> http://www.R-project.org/posting-guide.html > >>> <http://www.R-project.org/posting-guide.html> > >>> and provide commented, minimal, self-contained, reproducible code. > >>> > >>> > >>> > >> > >> -- > >> Technical Editor ANZJS > >> Department of Statistics > >> University of Auckland > >> Phone: +64-9-373-7599 ext. 88276 > >> > > > [[alternative HTML version deleted]] > > > ______________________________________________ > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/ > posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Using log1p instead of log improves the accuracy of the 'subtract xmax' algorithm when the largest x is very close to zero. Perhaps that is what is missing in the Rf_logspace_add. test <- function (x) { x <- as.numeric(x) i <- which.max(x) rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))), Rf_logspace_sum = logspace_sum(x), subtract_xmax_log1p = log1p(sum(exp(x[-i] - x[i]))) + x[i], subtract_xmax_naive = log(sum(exp(x - x[i]))) + x[i], naive = log(sum(exp(x)))) }> test(c(-1e-50, -46, -47))[,1] Rmpfr 1.4404614986241e-20 Rf_logspace_sum -1.0000000000000e-50 subtract_xmax_log1p 1.4404614986241e-20 subtract_xmax_naive -1.0000000000000e-50 naive 0.0000000000000e+00> test(c(0, -46, -47))[,1] Rmpfr 1.4404614986241e-20 Rf_logspace_sum 0.0000000000000e+00 subtract_xmax_log1p 1.4404614986241e-20 subtract_xmax_naive 0.0000000000000e+00 naive 0.0000000000000e+00 Bill Dunlap TIBCO Software wdunlap tibco.com On Mon, Nov 7, 2016 at 11:09 AM, William Dunlap <wdunlap at tibco.com> wrote:> It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and > Rf_logspace_sub were available as R functions. (I wish the '_sub' were > '_subtract' because 'sub' means too many things in R.) > > I think Rf_logspace_sum in R could be a little better. E.g., using the C > code > > #include <R.h> > #include <Rinternals.h> > #include <Rmath.h> > > SEXP Call_logspace_sum(SEXP x) > { > if (TYPEOF(x) != REALSXP) > { > Rf_error("'x' must be a numeric vector"); > } > return ScalarReal(Rf_logspace_sum(REAL(x), length(x))); > } > > and the R functions > > logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x)) > > and > > test <- function (x) { > x <- as.numeric(x) > rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, > precBits=5000))))), > Rf_logspace_sum = logspace_sum(x), > subtract_xmax = log(sum(exp(x - max(x)))) + max(x), > naive = log(sum(exp(x)))) > } > > > R-3.3.2 on Linux gives, after options(digits=17) > > test(c(0, -50)) > [,1] > Rmpfr 1.9287498479639178e-22 > Rf_logspace_sum 1.9287498479639178e-22 > subtract_xmax 0.0000000000000000e+00 > naive 0.0000000000000000e+00 > > which is nice, but also the not so nice > > > test(c(0, -50, -50)) > [,1] > Rmpfr 3.8574996959278356e-22 > Rf_logspace_sum 0.0000000000000000e+00 > subtract_xmax 0.0000000000000000e+00 > naive 0.0000000000000000e+00 > > With TERR the second test has Rmpfr==Rf_logspace_sum for that example. > > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler < > maechler at stat.math.ethz.ch> wrote: > >> >>>>> William Dunlap via R-help <r-help at r-project.org> >> >>>>> on Sun, 6 Nov 2016 20:53:17 -0800 writes: >> >> > Perhaps the C function Rf_logspace_sum(double *x, int n) would help >> in >> > computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, >> avoiding >> > unnecessary under- and overflow. >> >> Indeed! >> >> I had thought more than twice to also export it to the R level >> notably as we have been using two R level versions in a package >> I maintain ('copula'). They are vectorized there in a way that >> seemed particularly useful to our (Marius Hofert and my) use cases. >> >> More on this -- making these available in R, how exactly? -- >> probably should move to the R-devel list. >> >> Thank you Bill for bringing it up! >> Martin >> >> > Bill Dunlap >> > TIBCO Software >> > wdunlap tibco.com >> >> > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner < >> r.turner at auckland.ac.nz> wrote: >> >> >> On 07/11/16 13:07, William Dunlap wrote: >> >> >> >>> Have you tried reparameterizing, using logb (=log(b)) instead of >> b? >> >>> >> >> >> >> Uh, no. I don't think that that makes any sense in my context. >> >> >> >> The "b" values are probabilities and must satisfy a "sum-to-1" >> >> constraint. To accommodate this constraint I re-parametrise via a >> >> "logistic" style parametrisation --- basically >> >> >> >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n >> >> >> >> with the parameters that the optimiser works with being z_1, ..., >> z_{n-1} >> >> (and with z_n == 0 for identifiability). The objective function >> is of the >> >> form sum_i(a_i * log(b_i)), so I transform back >> >> from the z_i to the b_i in order calculate the value of the >> objective >> >> function. But when the z_i get moderately large-negative, the b_i >> become >> >> numerically 0 and then log(b_i) becomes -Inf. And the optimiser >> falls over. >> >> >> >> cheers, >> >> >> >> Rolf >> >> >> >> >> >>> Bill Dunlap >> >>> TIBCO Software >> >>> wdunlap tibco.com <http://tibco.com> >> >>> >> >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner < >> r.turner at auckland.ac.nz >> >>> <mailto:r.turner at auckland.ac.nz>> wrote: >> >>> >> >>> >> >>> I am trying to deal with a maximisation problem in which it is >> >>> possible for the objective function to (quite legitimately) return >> >>> the value -Inf, which causes the numerical optimisers that I have >> >>> tried to fall over. >> >>> >> >>> The -Inf values arise from expressions of the form "a * log(b)", >> >>> with b = 0. Under the *starting* values of the parameters, a must >> >>> equal equal 0 whenever b = 0, so we can legitimately say that a * >> >>> log(b) = 0 in these circumstances. However as the maximisation >> >>> algorithm searches over parameters it is possible for b to take >> the >> >>> value 0 for values of >> >>> a that are strictly positive. (The values of "a" do not change >> during >> >>> this search, although they *do* change between "successive >> searches".) >> >>> >> >>> Clearly if one is *maximising* the objective then -Inf is not a >> value >> >>> of >> >>> particular interest, and we should be able to "move away". But >> the >> >>> optimising function just stops. >> >>> >> >>> It is also clear that "moving away" is not a simple task; you >> can't >> >>> estimate a gradient or Hessian at a point where the function value >> >>> is -Inf. >> >>> >> >>> Can anyone suggest a way out of this dilemma, perhaps an optimiser >> >>> that is equipped to cope with -Inf values in some sneaky way? >> >>> >> >>> Various ad hoc kludges spring to mind, but they all seem to be >> >>> fraught with peril. >> >>> >> >>> I have tried changing the value returned by the objective function >> >>> from >> >>> "v" to exp(v) --- which maps -Inf to 0, which is nice and finite. >> >>> However this seemed to flatten out the objective surface too much, >> >>> and the search stalled at the 0 value, which is the antithesis of >> >>> optimal. >> >>> >> >>> The problem arises in a context of applying the EM algorithm where >> >>> the M-step cannot be carried out explicitly, whence numerical >> >>> optimisation. >> >>> I can give more detail if anyone thinks that it could be relevant. >> >>> >> >>> I would appreciate advice from younger and wiser heads! :-) >> >>> >> >>> cheers, >> >>> >> >>> Rolf Turner >> >>> >> >>> -- >> >>> Technical Editor ANZJS >> >>> Department of Statistics >> >>> University of Auckland >> >>> Phone: +64-9-373-7599 ext. 88276 <tel:%2B64-9-373-7599%20ext.%2 >> 088276> >> >>> >> >>> ______________________________________________ >> >>> R-help at r-project.org <mailto:R-help at r-project.org> mailing list >> -- >> >>> To UNSUBSCRIBE and more, see >> >>> https://stat.ethz.ch/mailman/listinfo/r-help >> >>> <https://stat.ethz.ch/mailman/listinfo/r-help> >> >>> PLEASE do read the posting guide >> >>> http://www.R-project.org/posting-guide.html >> >>> <http://www.R-project.org/posting-guide.html> >> >>> and provide commented, minimal, self-contained, reproducible code. >> >>> >> >>> >> >>> >> >> >> >> -- >> >> Technical Editor ANZJS >> >> Department of Statistics >> >> University of Auckland >> >> Phone: +64-9-373-7599 ext. 88276 >> >> >> >> > [[alternative HTML version deleted]] >> >> > ______________________________________________ >> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide http://www.R-project.org/posti >> ng-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >[[alternative HTML version deleted]]