It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
Rf_logspace_sub were available as R functions. (I wish the '_sub' were
'_subtract' because 'sub' means too many things in R.)
I think Rf_logspace_sum in R could be a little better. E.g., using the C
code
#include <R.h>
#include <Rinternals.h>
#include <Rmath.h>

SEXP Call_logspace_sum(SEXP x)
{
    if (TYPEOF(x) != REALSXP) {
        Rf_error("'x' must be a numeric vector");
    }
    return ScalarReal(Rf_logspace_sum(REAL(x), length(x)));
}
and the R functions

logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x))
and
test <- function (x) {
    x <- as.numeric(x)
    rbind(Rmpfr           = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits = 5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax   = log(sum(exp(x - max(x)))) + max(x),
          naive           = log(sum(exp(x))))
}
R-3.3.2 on Linux gives, after options(digits=17),

> test(c(0, -50))
                                  [,1]
Rmpfr           1.9287498479639178e-22
Rf_logspace_sum 1.9287498479639178e-22
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00
which is nice, but also the not-so-nice

> test(c(0, -50, -50))
                                  [,1]
Rmpfr           3.8574996959278356e-22
Rf_logspace_sum 0.0000000000000000e+00
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00
With TERR, Rf_logspace_sum agrees with Rmpfr on that second example.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

> >>>>> William Dunlap via R-help <r-help at r-project.org>
> >>>>>     on Sun, 6 Nov 2016 20:53:17 -0800 writes:
>
>     > Perhaps the C function Rf_logspace_sum(double *x, int n) would help
>     > in computing log(b).  It computes log(sum(exp(x_i))) for i in 1..n,
>     > avoiding unnecessary under- and overflow.
>
> Indeed!
>
> I had thought more than twice to also export it to the R level,
> notably as we have been using two R level versions in a package
> I maintain ('copula').  They are vectorized there in a way that
> seemed particularly useful to our (Marius Hofert and my) use cases.
>
> More on this -- making these available in R, how exactly? --
> probably should move to the R-devel list.
>
> Thank you, Bill, for bringing it up!
> Martin
>
>     > Bill Dunlap
>     > TIBCO Software
>     > wdunlap tibco.com
>
>     > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>
>     >> On 07/11/16 13:07, William Dunlap wrote:
>     >>
>     >>> Have you tried reparameterizing, using logb (= log(b)) instead of b?
>     >>
>     >> Uh, no.  I don't think that that makes any sense in my context.
>     >>
>     >> The "b" values are probabilities and must satisfy a "sum-to-1"
>     >> constraint.  To accommodate this constraint I re-parametrise via a
>     >> "logistic" style parametrisation --- basically
>     >>
>     >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
>     >>
>     >> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
>     >> (and with z_n == 0 for identifiability).  The objective function is
>     >> of the form sum_i(a_i * log(b_i)), so I transform back from the z_i
>     >> to the b_i in order to calculate the value of the objective function.
>     >> But when the z_i get moderately large-negative, the b_i become
>     >> numerically 0 and then log(b_i) becomes -Inf.  And the optimiser
>     >> falls over.
>     >>
>     >> cheers,
>     >>
>     >> Rolf
>     >>
>     >>> Bill Dunlap
>     >>> TIBCO Software
>     >>> wdunlap tibco.com
>     >>>
>     >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>     >>>
>     >>>     I am trying to deal with a maximisation problem in which it is
>     >>>     possible for the objective function to (quite legitimately)
>     >>>     return the value -Inf, which causes the numerical optimisers
>     >>>     that I have tried to fall over.
>     >>>
>     >>>     The -Inf values arise from expressions of the form "a * log(b)",
>     >>>     with b = 0.  Under the *starting* values of the parameters, a
>     >>>     must equal 0 whenever b = 0, so we can legitimately say that
>     >>>     a * log(b) = 0 in these circumstances.  However, as the
>     >>>     maximisation algorithm searches over parameters it is possible
>     >>>     for b to take the value 0 for values of a that are strictly
>     >>>     positive.  (The values of "a" do not change during this search,
>     >>>     although they *do* change between "successive searches".)
>     >>>
>     >>>     Clearly if one is *maximising* the objective then -Inf is not a
>     >>>     value of particular interest, and we should be able to "move
>     >>>     away".  But the optimising function just stops.
>     >>>
>     >>>     It is also clear that "moving away" is not a simple task; you
>     >>>     can't estimate a gradient or Hessian at a point where the
>     >>>     function value is -Inf.
>     >>>
>     >>>     Can anyone suggest a way out of this dilemma, perhaps an
>     >>>     optimiser that is equipped to cope with -Inf values in some
>     >>>     sneaky way?
>     >>>
>     >>>     Various ad hoc kludges spring to mind, but they all seem to be
>     >>>     fraught with peril.
>     >>>
>     >>>     I have tried changing the value returned by the objective
>     >>>     function from "v" to exp(v) --- which maps -Inf to 0, which is
>     >>>     nice and finite.  However, this seemed to flatten out the
>     >>>     objective surface too much, and the search stalled at the 0
>     >>>     value, which is the antithesis of optimal.
>     >>>
>     >>>     The problem arises in a context of applying the EM algorithm
>     >>>     where the M-step cannot be carried out explicitly, whence
>     >>>     numerical optimisation.  I can give more detail if anyone
>     >>>     thinks that it could be relevant.
>     >>>
>     >>>     I would appreciate advice from younger and wiser heads!  :-)
>     >>>
>     >>>     cheers,
>     >>>
>     >>>     Rolf Turner
>     >>>
>     >>>     --
>     >>>     Technical Editor ANZJS
>     >>>     Department of Statistics
>     >>>     University of Auckland
>     >>>     Phone: +64-9-373-7599 ext. 88276
>
>     > ______________________________________________
>     > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>     > https://stat.ethz.ch/mailman/listinfo/r-help
>     > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>     > and provide commented, minimal, self-contained, reproducible code.
Using log1p instead of log improves the accuracy of the 'subtract_xmax'
algorithm when the largest x is very close to zero. Perhaps that is what
is missing in Rf_logspace_sum.
test <- function (x) {
    x <- as.numeric(x)
    i <- which.max(x)
    rbind(Rmpfr               = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits = 5000))))),
          Rf_logspace_sum     = logspace_sum(x),
          subtract_xmax_log1p = log1p(sum(exp(x[-i] - x[i]))) + x[i],
          subtract_xmax_naive = log(sum(exp(x - x[i]))) + x[i],
          naive               = log(sum(exp(x))))
}
> test(c(-1e-50, -46, -47))
                                    [,1]
Rmpfr                1.4404614986241e-20
Rf_logspace_sum     -1.0000000000000e-50
subtract_xmax_log1p  1.4404614986241e-20
subtract_xmax_naive -1.0000000000000e-50
naive                0.0000000000000e+00

> test(c(0, -46, -47))
                                   [,1]
Rmpfr               1.4404614986241e-20
Rf_logspace_sum     0.0000000000000e+00
subtract_xmax_log1p 1.4404614986241e-20
subtract_xmax_naive 0.0000000000000e+00
naive               0.0000000000000e+00
Bill Dunlap
TIBCO Software
wdunlap tibco.com