thr3ads.net - R help - [R] Dealing with -Inf in a maximisation problem. [Nov 2016]

If this information is useful, please help other people find it:
Share via:

Martin Maechler

2016-Nov-07 11:08 UTC

[R] Dealing with -Inf in a maximisation problem.

>>>>> William Dunlap via R-help <r-help at r-project.org>
>>>>>     on Sun, 6 Nov 2016 20:53:17 -0800 writes:
    > Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
    > computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n,
avoiding
    > unnecessary under- and overflow.

Indeed!

I had thought more than twice to also export it to the R level
notably as we have been using two R level versions in a package
I maintain ('copula'). They are vectorized there in a way that
seemed particularly useful to our (Marius Hofert and my) use cases.

More on this -- making these available in R, how exactly? --
probably should move to the R-devel list.

Thank you Bill for bringing it up!
Martin

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com

    > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.turner at
auckland.ac.nz> wrote:

    >> On 07/11/16 13:07, William Dunlap wrote:
    >> 
    >>> Have you tried reparameterizing, using logb (=log(b)) instead
of b?
    >>> 
    >> 
    >> Uh, no.  I don't think that that makes any sense in my context.
    >> 
    >> The "b" values are probabilities and must satisfy a
"sum-to-1"
    >> constraint.  To accommodate this constraint I re-parametrise via a
    >> "logistic" style parametrisation --- basically
    >> 
    >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n
    >> 
    >> with the parameters that the optimiser works with being z_1, ...,
z_{n-1}
    >> (and with z_n == 0 for identifiability).  The objective function is
of the
    >> form sum_i(a_i * log(b_i)), so I transform back
    >> from the z_i to the b_i in order calculate the value of the
objective
    >> function.  But when the z_i get moderately large-negative, the b_i
become
    >> numerically 0 and then log(b_i) becomes -Inf.  And the optimiser
falls over.
    >> 
    >> cheers,
    >> 
    >> Rolf
    >> 
    >> 
    >>> Bill Dunlap
    >>> TIBCO Software
    >>> wdunlap tibco.com <http://tibco.com>
    >>> 
    >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <r.turner at
auckland.ac.nz
    >>> <mailto:r.turner at auckland.ac.nz>> wrote:
    >>> 
    >>> 
    >>> I am trying to deal with a maximisation problem in which it is
    >>> possible for the objective function to (quite legitimately)
return
    >>> the value -Inf, which causes the numerical optimisers that I
have
    >>> tried to fall over.
    >>> 
    >>> The -Inf values arise from expressions of the form "a *
log(b)",
    >>> with b = 0.  Under the *starting* values of the parameters, a
must
    >>> equal equal 0 whenever b = 0, so we can legitimately say that a
*
    >>> log(b) = 0 in these circumstances.  However as the maximisation
    >>> algorithm searches over parameters it is possible for b to take
the
    >>> value 0 for values of
    >>> a that are strictly positive.  (The values of "a" do
not change during
    >>> this search, although they *do* change between "successive
searches".)
    >>> 
    >>> Clearly if one is *maximising* the objective then -Inf is not a
value
    >>> of
    >>> particular interest, and we should be able to "move
away".  But the
    >>> optimising function just stops.
    >>> 
    >>> It is also clear that "moving away" is not a simple
task; you can't
    >>> estimate a gradient or Hessian at a point where the function
value
    >>> is -Inf.
    >>> 
    >>> Can anyone suggest a way out of this dilemma, perhaps an
optimiser
    >>> that is equipped to cope with -Inf values in some sneaky way?
    >>> 
    >>> Various ad hoc kludges spring to mind, but they all seem to be
    >>> fraught with peril.
    >>> 
    >>> I have tried changing the value returned by the objective
function
    >>> from
    >>> "v" to exp(v) --- which maps -Inf to 0, which is nice
and finite.
    >>> However this seemed to flatten out the objective surface too
much,
    >>> and the search stalled at the 0 value, which is the antithesis
of
    >>> optimal.
    >>> 
    >>> The problem arises in a context of applying the EM algorithm
where
    >>> the M-step cannot be carried out explicitly, whence numerical
    >>> optimisation.
    >>> I can give more detail if anyone thinks that it could be
relevant.
    >>> 
    >>> I would appreciate advice from younger and wiser heads! :-)
    >>> 
    >>> cheers,
    >>> 
    >>> Rolf Turner
    >>> 
    >>> --
    >>> Technical Editor ANZJS
    >>> Department of Statistics
    >>> University of Auckland
    >>> Phone: +64-9-373-7599 ext. 88276
<tel:%2B64-9-373-7599%20ext.%2
    088276> 
    >>> 
    >>> ______________________________________________
    >>> R-help at r-project.org <mailto:R-help at r-project.org>
mailing list --
    >>> To UNSUBSCRIBE and more, see
    >>> https://stat.ethz.ch/mailman/listinfo/r-help
    >>> <https://stat.ethz.ch/mailman/listinfo/r-help>
    >>> PLEASE do read the posting guide
    >>> http://www.R-project.org/posting-guide.html
    >>> <http://www.R-project.org/posting-guide.html>
    >>> and provide commented, minimal, self-contained, reproducible
code.
    >>> 
    >>> 
    >>> 
    >> 
    >> --
    >> Technical Editor ANZJS
    >> Department of Statistics
    >> University of Auckland
    >> Phone: +64-9-373-7599 ext. 88276
    >> 

    > [[alternative HTML version deleted]]

    > ______________________________________________
    > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.

William Dunlap

2016-Nov-07 19:09 UTC

head link

[R] Dealing with -Inf in a maximisation problem.

It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
Rf_logspace_sub were available as R functions.  (I wish the '_sub' were
'_subtract' because 'sub' means too many things in R.)

I think Rf_logspace_sum in R could be a little better. E.g., using the C
code

#include <R.h>
#include <Rinternals.h>
#include <Rmath.h>

SEXP Call_logspace_sum(SEXP x)
{
    if (TYPEOF(x) != REALSXP)
    {
        Rf_error("'x' must be a numeric vector");
    }
    return ScalarReal(Rf_logspace_sum(REAL(x), length(x)));
}

and the R functions

logspace_sum <- function (x) .Call("Call_logspace_sum",
as.numeric(x))

and

test <- function (x) {
    x <- as.numeric(x)
    rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax = log(sum(exp(x - max(x)))) + max(x),
          naive = log(sum(exp(x))))
}


R-3.3.2 on Linux gives, after options(digits=17)> test(c(0, -50))                                  [,1]
Rmpfr           1.9287498479639178e-22
Rf_logspace_sum 1.9287498479639178e-22
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00

which is nice, but also the not so nice
> test(c(0, -50, -50))                                  [,1]
Rmpfr           3.8574996959278356e-22
Rf_logspace_sum 0.0000000000000000e+00
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00

With TERR the second test has Rmpfr==Rf_logspace_sum for that example.



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler <maechler at
stat.math.ethz.ch>
wrote:
> >>>>> William Dunlap via R-help <r-help at
r-project.org>
> >>>>>     on Sun, 6 Nov 2016 20:53:17 -0800 writes:
>
>     > Perhaps the C function Rf_logspace_sum(double *x, int n) would
help
> in
>     > computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n,
> avoiding
>     > unnecessary under- and overflow.
>
> Indeed!
>
> I had thought more than twice to also export it to the R level
> notably as we have been using two R level versions in a package
> I maintain ('copula'). They are vectorized there in a way that
> seemed particularly useful to our (Marius Hofert and my) use cases.
>
> More on this -- making these available in R, how exactly? --
> probably should move to the R-devel list.
>
> Thank you Bill for bringing it up!
> Martin
>
>     > Bill Dunlap
>     > TIBCO Software
>     > wdunlap tibco.com
>
>     > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.turner at
auckland.ac.nz>
> wrote:
>
>     >> On 07/11/16 13:07, William Dunlap wrote:
>     >>
>     >>> Have you tried reparameterizing, using logb (=log(b))
instead of b?
>     >>>
>     >>
>     >> Uh, no.  I don't think that that makes any sense in my
context.
>     >>
>     >> The "b" values are probabilities and must satisfy a
"sum-to-1"
>     >> constraint.  To accommodate this constraint I re-parametrise
via a
>     >> "logistic" style parametrisation --- basically
>     >>
>     >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n
>     >>
>     >> with the parameters that the optimiser works with being z_1,
...,
> z_{n-1}
>     >> (and with z_n == 0 for identifiability).  The objective
function is
> of the
>     >> form sum_i(a_i * log(b_i)), so I transform back
>     >> from the z_i to the b_i in order calculate the value of the
> objective
>     >> function.  But when the z_i get moderately large-negative, the
b_i
> become
>     >> numerically 0 and then log(b_i) becomes -Inf.  And the
optimiser
> falls over.
>     >>
>     >> cheers,
>     >>
>     >> Rolf
>     >>
>     >>
>     >>> Bill Dunlap
>     >>> TIBCO Software
>     >>> wdunlap tibco.com <http://tibco.com>
>     >>>
>     >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <
> r.turner at auckland.ac.nz
>     >>> <mailto:r.turner at auckland.ac.nz>> wrote:
>     >>>
>     >>>
>     >>> I am trying to deal with a maximisation problem in which
it is
>     >>> possible for the objective function to (quite
legitimately) return
>     >>> the value -Inf, which causes the numerical optimisers that
I have
>     >>> tried to fall over.
>     >>>
>     >>> The -Inf values arise from expressions of the form "a
* log(b)",
>     >>> with b = 0.  Under the *starting* values of the
parameters, a must
>     >>> equal equal 0 whenever b = 0, so we can legitimately say
that a *
>     >>> log(b) = 0 in these circumstances.  However as the
maximisation
>     >>> algorithm searches over parameters it is possible for b to
take the
>     >>> value 0 for values of
>     >>> a that are strictly positive.  (The values of
"a" do not change
> during
>     >>> this search, although they *do* change between
"successive
> searches".)
>     >>>
>     >>> Clearly if one is *maximising* the objective then -Inf is
not a
> value
>     >>> of
>     >>> particular interest, and we should be able to "move
away".  But the
>     >>> optimising function just stops.
>     >>>
>     >>> It is also clear that "moving away" is not a
simple task; you can't
>     >>> estimate a gradient or Hessian at a point where the
function value
>     >>> is -Inf.
>     >>>
>     >>> Can anyone suggest a way out of this dilemma, perhaps an
optimiser
>     >>> that is equipped to cope with -Inf values in some sneaky
way?
>     >>>
>     >>> Various ad hoc kludges spring to mind, but they all seem
to be
>     >>> fraught with peril.
>     >>>
>     >>> I have tried changing the value returned by the objective
function
>     >>> from
>     >>> "v" to exp(v) --- which maps -Inf to 0, which is
nice and finite.
>     >>> However this seemed to flatten out the objective surface
too much,
>     >>> and the search stalled at the 0 value, which is the
antithesis of
>     >>> optimal.
>     >>>
>     >>> The problem arises in a context of applying the EM
algorithm where
>     >>> the M-step cannot be carried out explicitly, whence
numerical
>     >>> optimisation.
>     >>> I can give more detail if anyone thinks that it could be
relevant.
>     >>>
>     >>> I would appreciate advice from younger and wiser heads!
:-)
>     >>>
>     >>> cheers,
>     >>>
>     >>> Rolf Turner
>     >>>
>     >>> --
>     >>> Technical Editor ANZJS
>     >>> Department of Statistics
>     >>> University of Auckland
>     >>> Phone: +64-9-373-7599 ext. 88276
<tel:%2B64-9-373-7599%20ext.%2
>     088276>
>     >>>
>     >>> ______________________________________________
>     >>> R-help at r-project.org <mailto:R-help at
r-project.org> mailing list --
>     >>> To UNSUBSCRIBE and more, see
>     >>> https://stat.ethz.ch/mailman/listinfo/r-help
>     >>> <https://stat.ethz.ch/mailman/listinfo/r-help>
>     >>> PLEASE do read the posting guide
>     >>> http://www.R-project.org/posting-guide.html
>     >>> <http://www.R-project.org/posting-guide.html>
>     >>> and provide commented, minimal, self-contained,
reproducible code.
>     >>>
>     >>>
>     >>>
>     >>
>     >> --
>     >> Technical Editor ANZJS
>     >> Department of Statistics
>     >> University of Auckland
>     >> Phone: +64-9-373-7599 ext. 88276
>     >>
>
>     > [[alternative HTML version deleted]]
>
>     > ______________________________________________
>     > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
>     > https://stat.ethz.ch/mailman/listinfo/r-help
>     > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
>     > and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

William Dunlap

2016-Nov-07 20:08 UTC

head link

[R] Dealing with -Inf in a maximisation problem.

Using log1p instead of log improves the accuracy of the 'subtract xmax'
algorithm when the largest x is very close to zero.  Perhaps that is what
is missing in the Rf_logspace_add.

test <- function (x) {
    x <- as.numeric(x)
    i <- which.max(x)
    rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax_log1p = log1p(sum(exp(x[-i] - x[i]))) + x[i],
          subtract_xmax_naive = log(sum(exp(x - x[i]))) + x[i],
          naive = log(sum(exp(x))))
}
> test(c(-1e-50, -46, -47))                                    [,1]
Rmpfr                1.4404614986241e-20
Rf_logspace_sum     -1.0000000000000e-50
subtract_xmax_log1p  1.4404614986241e-20
subtract_xmax_naive -1.0000000000000e-50
naive                0.0000000000000e+00> test(c(0, -46, -47))                                   [,1]
Rmpfr               1.4404614986241e-20
Rf_logspace_sum     0.0000000000000e+00
subtract_xmax_log1p 1.4404614986241e-20
subtract_xmax_naive 0.0000000000000e+00
naive               0.0000000000000e+00


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Nov 7, 2016 at 11:09 AM, William Dunlap <wdunlap at tibco.com>
wrote:
> It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
> Rf_logspace_sub were available as R functions.  (I wish the '_sub'
were
> '_subtract' because 'sub' means too many things in R.)
>
> I think Rf_logspace_sum in R could be a little better. E.g., using the C
> code
>
> #include <R.h>
> #include <Rinternals.h>
> #include <Rmath.h>
>
> SEXP Call_logspace_sum(SEXP x)
> {
>     if (TYPEOF(x) != REALSXP)
>     {
>         Rf_error("'x' must be a numeric vector");
>     }
>     return ScalarReal(Rf_logspace_sum(REAL(x), length(x)));
> }
>
> and the R functions
>
> logspace_sum <- function (x) .Call("Call_logspace_sum",
as.numeric(x))
>
> and
>
> test <- function (x) {
>     x <- as.numeric(x)
>     rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x,
> precBits=5000))))),
>           Rf_logspace_sum = logspace_sum(x),
>           subtract_xmax = log(sum(exp(x - max(x)))) + max(x),
>           naive = log(sum(exp(x))))
> }
>
>
> R-3.3.2 on Linux gives, after options(digits=17)
> > test(c(0, -50))
>                                   [,1]
> Rmpfr           1.9287498479639178e-22
> Rf_logspace_sum 1.9287498479639178e-22
> subtract_xmax   0.0000000000000000e+00
> naive           0.0000000000000000e+00
>
> which is nice, but also the not so nice
>
> > test(c(0, -50, -50))
>                                   [,1]
> Rmpfr           3.8574996959278356e-22
> Rf_logspace_sum 0.0000000000000000e+00
> subtract_xmax   0.0000000000000000e+00
> naive           0.0000000000000000e+00
>
> With TERR the second test has Rmpfr==Rf_logspace_sum for that example.
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler <
> maechler at stat.math.ethz.ch> wrote:
>
>> >>>>> William Dunlap via R-help <r-help at
r-project.org>
>> >>>>>     on Sun, 6 Nov 2016 20:53:17 -0800 writes:
>>
>>     > Perhaps the C function Rf_logspace_sum(double *x, int n) would
help
>> in
>>     > computing log(b).  It computes log(sum(exp(x_i))) for i in
1..n,
>> avoiding
>>     > unnecessary under- and overflow.
>>
>> Indeed!
>>
>> I had thought more than twice to also export it to the R level
>> notably as we have been using two R level versions in a package
>> I maintain ('copula'). They are vectorized there in a way that
>> seemed particularly useful to our (Marius Hofert and my) use cases.
>>
>> More on this -- making these available in R, how exactly? --
>> probably should move to the R-devel list.
>>
>> Thank you Bill for bringing it up!
>> Martin
>>
>>     > Bill Dunlap
>>     > TIBCO Software
>>     > wdunlap tibco.com
>>
>>     > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <
>> r.turner at auckland.ac.nz> wrote:
>>
>>     >> On 07/11/16 13:07, William Dunlap wrote:
>>     >>
>>     >>> Have you tried reparameterizing, using logb (=log(b))
instead of
>> b?
>>     >>>
>>     >>
>>     >> Uh, no.  I don't think that that makes any sense in my
context.
>>     >>
>>     >> The "b" values are probabilities and must
satisfy a "sum-to-1"
>>     >> constraint.  To accommodate this constraint I
re-parametrise via a
>>     >> "logistic" style parametrisation --- basically
>>     >>
>>     >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ... n
>>     >>
>>     >> with the parameters that the optimiser works with being
z_1, ...,
>> z_{n-1}
>>     >> (and with z_n == 0 for identifiability).  The objective
function
>> is of the
>>     >> form sum_i(a_i * log(b_i)), so I transform back
>>     >> from the z_i to the b_i in order calculate the value of
the
>> objective
>>     >> function.  But when the z_i get moderately large-negative,
the b_i
>> become
>>     >> numerically 0 and then log(b_i) becomes -Inf.  And the
optimiser
>> falls over.
>>     >>
>>     >> cheers,
>>     >>
>>     >> Rolf
>>     >>
>>     >>
>>     >>> Bill Dunlap
>>     >>> TIBCO Software
>>     >>> wdunlap tibco.com <http://tibco.com>
>>     >>>
>>     >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <
>> r.turner at auckland.ac.nz
>>     >>> <mailto:r.turner at auckland.ac.nz>> wrote:
>>     >>>
>>     >>>
>>     >>> I am trying to deal with a maximisation problem in
which it is
>>     >>> possible for the objective function to (quite
legitimately) return
>>     >>> the value -Inf, which causes the numerical optimisers
that I have
>>     >>> tried to fall over.
>>     >>>
>>     >>> The -Inf values arise from expressions of the form
"a * log(b)",
>>     >>> with b = 0.  Under the *starting* values of the
parameters, a must
>>     >>> equal equal 0 whenever b = 0, so we can legitimately
say that a *
>>     >>> log(b) = 0 in these circumstances.  However as the
maximisation
>>     >>> algorithm searches over parameters it is possible for
b to take
>> the
>>     >>> value 0 for values of
>>     >>> a that are strictly positive.  (The values of
"a" do not change
>> during
>>     >>> this search, although they *do* change between
"successive
>> searches".)
>>     >>>
>>     >>> Clearly if one is *maximising* the objective then -Inf
is not a
>> value
>>     >>> of
>>     >>> particular interest, and we should be able to
"move away".  But
>> the
>>     >>> optimising function just stops.
>>     >>>
>>     >>> It is also clear that "moving away" is not a
simple task; you
>> can't
>>     >>> estimate a gradient or Hessian at a point where the
function value
>>     >>> is -Inf.
>>     >>>
>>     >>> Can anyone suggest a way out of this dilemma, perhaps
an optimiser
>>     >>> that is equipped to cope with -Inf values in some
sneaky way?
>>     >>>
>>     >>> Various ad hoc kludges spring to mind, but they all
seem to be
>>     >>> fraught with peril.
>>     >>>
>>     >>> I have tried changing the value returned by the
objective function
>>     >>> from
>>     >>> "v" to exp(v) --- which maps -Inf to 0,
which is nice and finite.
>>     >>> However this seemed to flatten out the objective
surface too much,
>>     >>> and the search stalled at the 0 value, which is the
antithesis of
>>     >>> optimal.
>>     >>>
>>     >>> The problem arises in a context of applying the EM
algorithm where
>>     >>> the M-step cannot be carried out explicitly, whence
numerical
>>     >>> optimisation.
>>     >>> I can give more detail if anyone thinks that it could
be relevant.
>>     >>>
>>     >>> I would appreciate advice from younger and wiser
heads! :-)
>>     >>>
>>     >>> cheers,
>>     >>>
>>     >>> Rolf Turner
>>     >>>
>>     >>> --
>>     >>> Technical Editor ANZJS
>>     >>> Department of Statistics
>>     >>> University of Auckland
>>     >>> Phone: +64-9-373-7599 ext. 88276
<tel:%2B64-9-373-7599%20ext.%2
>>     088276>
>>     >>>
>>     >>> ______________________________________________
>>     >>> R-help at r-project.org <mailto:R-help at
r-project.org> mailing list
>> --
>>     >>> To UNSUBSCRIBE and more, see
>>     >>> https://stat.ethz.ch/mailman/listinfo/r-help
>>     >>> <https://stat.ethz.ch/mailman/listinfo/r-help>
>>     >>> PLEASE do read the posting guide
>>     >>> http://www.R-project.org/posting-guide.html
>>     >>> <http://www.R-project.org/posting-guide.html>
>>     >>> and provide commented, minimal, self-contained,
reproducible code.
>>     >>>
>>     >>>
>>     >>>
>>     >>
>>     >> --
>>     >> Technical Editor ANZJS
>>     >> Department of Statistics
>>     >> University of Auckland
>>     >> Phone: +64-9-373-7599 ext. 88276
>>     >>
>>
>>     > [[alternative HTML version deleted]]
>>
>>     > ______________________________________________
>>     > R-help at r-project.org mailing list -- To UNSUBSCRIBE and
more, see
>>     > https://stat.ethz.ch/mailman/listinfo/r-help
>>     > PLEASE do read the posting guide
http://www.R-project.org/posti
>> ng-guide.html
>>     > and provide commented, minimal, self-contained, reproducible
code.
>>
>
>
	[[alternative HTML version deleted]]

R help - Nov 2016 - Dealing with -Inf in a maximisation problem.

[R] Dealing with -Inf in a maximisation problem.

[R] Dealing with -Inf in a maximisation problem.

[R] Dealing with -Inf in a maximisation problem.