On 07/11/16 13:07, William Dunlap wrote:
> Have you tried reparameterizing, using logb (=log(b)) instead of b?

Uh, no. I don't think that that makes any sense in my context.

The "b" values are probabilities and must satisfy a "sum-to-1" constraint. To accommodate this constraint I re-parametrise via a "logistic" style parametrisation --- basically

    b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n

with the parameters that the optimiser works with being z_1, ..., z_{n-1} (and with z_n == 0 for identifiability). The objective function is of the form sum_i(a_i * log(b_i)), so I transform back from the z_i to the b_i in order to calculate the value of the objective function. But when the z_i get moderately large-negative, the b_i become numerically 0 and then log(b_i) becomes -Inf. And the optimiser falls over.

cheers,

Rolf

> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>
>     I am trying to deal with a maximisation problem in which it is
>     possible for the objective function to (quite legitimately) return
>     the value -Inf, which causes the numerical optimisers that I have
>     tried to fall over.
>
>     The -Inf values arise from expressions of the form "a * log(b)",
>     with b = 0. Under the *starting* values of the parameters, a must
>     equal 0 whenever b = 0, so we can legitimately say that a * log(b) = 0
>     in these circumstances. However as the maximisation algorithm
>     searches over parameters it is possible for b to take the value 0
>     for values of a that are strictly positive. (The values of "a" do
>     not change during this search, although they *do* change between
>     "successive searches".)
>
>     Clearly if one is *maximising* the objective then -Inf is not a
>     value of particular interest, and we should be able to "move away".
>     But the optimising function just stops.
>
>     It is also clear that "moving away" is not a simple task; you can't
>     estimate a gradient or Hessian at a point where the function value
>     is -Inf.
>
>     Can anyone suggest a way out of this dilemma, perhaps an optimiser
>     that is equipped to cope with -Inf values in some sneaky way?
>
>     Various ad hoc kludges spring to mind, but they all seem to be
>     fraught with peril.
>
>     I have tried changing the value returned by the objective function
>     from "v" to exp(v) --- which maps -Inf to 0, which is nice and
>     finite. However this seemed to flatten out the objective surface
>     too much, and the search stalled at the 0 value, which is the
>     antithesis of optimal.
>
>     The problem arises in a context of applying the EM algorithm where
>     the M-step cannot be carried out explicitly, whence numerical
>     optimisation. I can give more detail if anyone thinks that it could
>     be relevant.
>
>     I would appreciate advice from younger and wiser heads! :-)
>
>     cheers,
>
>     Rolf Turner
>
>     ______________________________________________
>     R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>     https://stat.ethz.ch/mailman/listinfo/r-help
>     PLEASE do read the posting guide
>     http://www.R-project.org/posting-guide.html
>     and provide commented, minimal, self-contained, reproducible code.

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
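The failure described in this message can be reproduced in a few lines. The following is only a minimal sketch, written in Python rather than R for illustration; the function name `naive_objective` and the example values of `a` and `z_free` are hypothetical and not taken from the thread. It forms the b_i explicitly, exactly as the message describes, and shows the objective dropping to -Inf once one z_i is negative enough for exp(z_i) to underflow.

```python
import math

def naive_objective(a, z_free):
    """sum_i a_i * log(b_i), computed by forming the b_i explicitly."""
    z = list(z_free) + [0.0]           # z_n is fixed at 0 for identifiability
    w = [math.exp(zi) for zi in z]     # exp(z_i); underflows to 0.0 for very negative z_i
    total = sum(w)
    b = [wi / total for wi in w]       # b_i = exp(z_i) / sum_j exp(z_j)
    value = 0.0
    for ai, bi in zip(a, b):
        if ai == 0.0:
            continue                   # convention from the post: 0 * log(0) = 0
        # Python's math.log(0.0) raises an error, so -inf is returned by hand
        # to mimic what R's log(0) gives.
        value += ai * (math.log(bi) if bi > 0.0 else -math.inf)
    return value

a = [1.0, 2.0, 3.0]                          # illustrative weights only
print(naive_objective(a, [-1.0, 0.5]))       # a finite value
print(naive_objective(a, [-800.0, 0.5]))     # -inf, because exp(-800) underflows to 0
```

The point of the sketch is that the -Inf here is a floating-point artefact of going through b_i, not a property of the objective itself at that z.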
On Mon, 7 Nov 2016, Rolf Turner wrote:

> On 07/11/16 13:07, William Dunlap wrote:
>> Have you tried reparameterizing, using logb (=log(b)) instead of b?
>
> Uh, no. I don't think that that makes any sense in my context.
>
> The "b" values are probabilities and must satisfy a "sum-to-1" constraint.
> To accommodate this constraint I re-parametrise via a "logistic" style
> parametrisation --- basically
>
>     b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
>
> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
> (and with z_n == 0 for identifiability). The objective function is of the
> form sum_i(a_i * log(b_i)),

This is

    sum_i(a_i * z_i) - sum(a_i)*log(sum_j(exp(z_j))),

isn't it? (Because log(b_i) = z_i - log(sum_j(exp(z_j))).)

So you don't need to evaluate b_i here, do you?

Large values of z_j will lead to exp(z_j) == Inf, but using

    sum_i(a_i * (z_i-max.z)) - sum(a_i)*log(sum_j(exp(z_j-max.z))),

with max.z the largest of the z_j, will handle that.

HTH,

Chuck

p.s. Regarding "advice from younger and wiser heads", I probably cannot claim to be either.
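The shifted formula in this reply can be sketched as follows. This is an illustrative Python version, not code from the thread; the name `stable_objective` and the test values are assumptions. It never forms b_i, and it subtracts the largest z_j before exponentiating, so every exponent is at most 0 and the sum inside the log is at least 1.

```python
import math

def stable_objective(a, z_free):
    """sum_i a_i*(z_i - max_z) - sum(a) * log(sum_j exp(z_j - max_z))."""
    z = list(z_free) + [0.0]                 # z_n is fixed at 0 for identifiability
    max_z = max(z)
    shifted = [zi - max_z for zi in z]       # all entries <= 0, one is exactly 0
    # The sum contains exp(0) = 1, so it lies in [1, n] and its log is finite.
    log_sum = math.log(sum(math.exp(s) for s in shifted))
    return sum(ai * si for ai, si in zip(a, shifted)) - sum(a) * log_sum

a = [1.0, 2.0, 3.0]                          # illustrative weights only
print(stable_objective(a, [-800.0, 0.5]))    # finite, although exp(-800) underflows
print(stable_objective(a, [1000.0, 0.5]))    # finite, although exp(1000) would overflow
```

The max_z terms cancel algebraically, so the value agrees with sum_i(a_i * log(b_i)) wherever the latter can be computed directly; only the floating-point behaviour differs.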
On 07/11/16 15:46, Charles C. Berry wrote:
> On Mon, 7 Nov 2016, Rolf Turner wrote:
>
>> The "b" values are probabilities and must satisfy a "sum-to-1"
>> constraint. To accommodate this constraint I re-parametrise via a
>> "logistic" style parametrisation --- basically
>>
>>     b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
>>
>> with the parameters that the optimiser works with being z_1, ...,
>> z_{n-1} (and with z_n == 0 for identifiability). The objective
>> function is of the form sum_i(a_i * log(b_i)),
>
> This is sum_i(a_i * z_i) - sum(a_i)*log(sum_j(exp(z_j))), isn't it?
>
> So you don't need to evaluate b_i here, do you?
>
> Large values of z_j will lead to exp(z_j) == Inf, but using
>
>     sum_i(a_i * (z_i-max.z)) - sum(a_i)*log(sum_j(exp(z_j-max.z)))
>
> will handle that.

Wow!!! That looks like it will work!!! I won't completely believe it until I've programmed it up and tried it --- but for the first time in days I'm feeling hopeful.

> HTH,
>
> Chuck
>
> p.s. Regarding "advice from younger and wiser heads", I probably cannot
> claim to be either.

On present evidence you certainly appear to be one hell of a lot wiser!!!

Thanks.

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
Perhaps the C function Rf_logspace_sum(double *x, int n) would help in computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, avoiding unnecessary under- and overflow.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
> The "b" values are probabilities and must satisfy a "sum-to-1"
> constraint. To accommodate this constraint I re-parametrise via a
> "logistic" style parametrisation --- basically
>
>     b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
>
> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
> (and with z_n == 0 for identifiability). The objective function is of the
> form sum_i(a_i * log(b_i)), so I transform back from the z_i to the b_i
> in order to calculate the value of the objective function. But when the
> z_i get moderately large-negative, the b_i become numerically 0 and then
> log(b_i) becomes -Inf. And the optimiser falls over.
>
> cheers,
>
> Rolf
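To make concrete what a log-space sum provides, here is a minimal Python sketch of the same idea. It is not R's implementation of Rf_logspace_sum, and the names `logspace_sum` and `log_probs` are hypothetical; it only illustrates how log(b_i) can be obtained as z_i minus a stably computed log(sum_j(exp(z_j))).

```python
import math

def logspace_sum(x):
    """log(sum(exp(x_i))), computed by factoring out max(x).

    Assumes x is a non-empty sequence of floats.
    """
    m = max(x)
    if math.isinf(m):
        return m                   # all entries -inf gives -inf; a +inf entry gives +inf
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def log_probs(z_free):
    """log(b_i) = z_i - log(sum_j exp(z_j)), without ever forming b_i."""
    z = list(z_free) + [0.0]       # z_n is fixed at 0 for identifiability
    lse = logspace_sum(z)
    return [zi - lse for zi in z]

print(log_probs([-800.0, 0.5]))    # every entry is finite, including the first
```

With log(b_i) available directly, the objective is just sum_i(a_i * log(b_i)) evaluated on these values, so no -Inf arises from underflow in b_i.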
>>>>> William Dunlap via R-help <r-help at r-project.org>
>>>>>     on Sun, 6 Nov 2016 20:53:17 -0800 writes:

    > Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
    > computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, avoiding
    > unnecessary under- and overflow.

Indeed!

I have thought more than twice about also exporting it to the R level, notably as we have been using two R-level versions in a package I maintain ('copula'). They are vectorized there in a way that seemed particularly useful for our (Marius Hofert's and my) use cases.

More on this (making these available in R, and how exactly?) should probably move to the R-devel list.

Thank you Bill for bringing it up!

Martin

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com