I was in a presentation of optimizations fitted with both MPlus and SAS yesterday. In a batch of 1000 bootstrap samples, between 300 and 400 of the estimations did not converge. The authors spoke as if this were the ordinary cost of doing business, and pointed to some publications in which the nonconvergence rate was as high or higher.

I just don't believe that's right. If a problem is posed so that the estimate is not obtained in such a large fraction of applications, it means the problem is either badly asked or badly answered. But I've got no traction unless I can actually do better...

Perhaps I can use this opportunity to learn about R functions like optim, or perhaps maxLik.

From reading r-help, it seems to me there are some basic tips for optimization, such as:

1. It is wise to scale the data so that all columns have the same range before running an optimizer.

2. With estimates of variance parameters, don't try to estimate sigma directly; instead estimate log(sigma), because that puts the domain of the solution onto the real number line. (A small sketch of this follows my signature.)

3. With estimates of proportions, estimate the logit instead, for the same reason.

Are these mistaken generalizations? Are there other tips that everybody ought to know?

I understand this is a vague question; perhaps the answers are just in the folklore. But if somebody has written them out, I would be glad to know.

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
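P.S. To make tip 2 concrete, here is a minimal sketch with optim, assuming a simple normal log-likelihood (the data and starting values are made up for illustration):

    y <- rnorm(100, mean = 5, sd = 2)

    ## negative log-likelihood parameterized as (mu, log(sigma)),
    ## so the optimizer searches over the whole real line
    negll <- function(par, y) {
        mu    <- par[1]
        sigma <- exp(par[2])   # back-transform inside the function
        -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
    }

    fit <- optim(c(0, 0), negll, y = y, method = "BFGS")
    fit$convergence                               # 0 means optim reports convergence
    c(mu = fit$par[1], sigma = exp(fit$par[2]))   # back on the sigma scale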
This really depends on more than just the optimizer; a lot can depend on what the data look like and what question is being asked.

In bootstrapping it is possible to draw bootstrap samples for which there is no unique correct answer to converge to. For example, if a category ends up with no data in a given bootstrap sample, but you still want to estimate a parameter for that category, then there are infinitely many answers that are all equal in likelihood, so there will be a lack of convergence on that parameter.

A stratified bootstrap or semi-parametric bootstrap can be used to avoid this problem (but may change the assumptions being made as well), or you can just throw out all the samples that don't have a full answer (which could be what your presenter did). A sketch of the stratified version is below my signature.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
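P.S. A minimal sketch of the stratified idea, assuming a data frame dat with a grouping factor grp (the names and data here are invented for illustration):

    ## resample rows within each level of grp, so no level comes back empty
    strat.boot <- function(dat) {
        idx <- unlist(lapply(split(seq_len(nrow(dat)), dat$grp),
                             function(i) sample(i, length(i), replace = TRUE)))
        dat[idx, ]
    }

    dat <- data.frame(grp = factor(rep(c("a", "b"), c(30, 5))),
                      y   = rnorm(35))
    b1 <- strat.boot(dat)
    table(b1$grp)   # each level keeps its original count

The strata argument to boot() in the boot package gives you the same behavior within that package's interface.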
cberry at tajo.ucsd.edu
2011-Dec-16 01:14 UTC
[R] fundamental guide to use of numerical optimizers?
Paul Johnson <pauljohn32 at gmail.com> writes:

> I just don't believe that's right, and if some problem is posed so
> that the estimate is not obtained in such a large sample of
> applications, it either means the problem is badly asked or badly
> answered. But I've got no traction unless I can actually do better...

A few years back there was a brouhaha in which a too-lax convergence criterion in the Splus gam() function resulted in wrong results. See

http://www.ihapss.jhsph.edu/publications/Results/nmmaps_faq.htm

I think this was also reported in the lay press. IIRC, at the time there was an assertion that gam() was buggy, but it turned out that the particular problem needed a more stringent tolerance than the default provided. The original report used results that hadn't actually converged. (A small sketch of the tolerance point follows my signature.)

<rant>
The trouble is there are many instances of monkey-see, monkey-do data analysis. It seems that some authors do not really want to dig into their data if the story it tells is not simple and firmly supported. And not understanding why many bootstrap samples fail to converge seems like an instance of sweeping data-dirt under the rug.
</rant>

The questions you ask fall under the rubric of 'numerical analysis'. You might look here to start:

http://en.wikipedia.org/wiki/Numerical_analysis

Chuck

--
Charles C. Berry
Dept of Family/Preventive Medicine
ccberry at ucsd edu
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
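P.S. On the tolerance point, a minimal sketch with optim's default Nelder-Mead (the test function is the standard Rosenbrock function, not the gam() problem; optim's default reltol is about 1e-8):

    f <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2   # minimum at (1, 1)

    loose <- optim(c(-1.2, 1), f, control = list(reltol = 1e-4))
    tight <- optim(c(-1.2, 1), f, control = list(reltol = 1e-12, maxit = 5000))

    loose$par   # "converged" by the loose criterion, typically short of (1, 1)
    tight$par   # much closer to (1, 1), at the cost of more iterations

Declaring convergence is cheap; the criterion by which you declare it is what matters.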