thr3ads.net - R help - [R] normalmixEM gives widely divergent results. [Jan 2016]


John Sorkin

2016-Jan-27 16:51 UTC

[R] normalmixEM gives widely divergent results.

I am running normalmixEM:
mixmdlscaled <- normalmixEM(data$FCWg)
summary(mixmdlscaled)
plot(mixmdlscaled,which=2)
 
If I run the program multiple times, I get widely different results:
> mixmdlscaled <- normalmixEM(data$FCWg)
number of iterations= 41
> summary(mixmdlscaled)
summary of normalmixEM object:
          comp 1   comp 2
lambda 0.0818928 0.918107
mu     0.6575938 0.740870
sigma  0.0070562 0.178410
loglik at estimate:  56.87445
> plot(mixmdlscaled,which=2)
> mixmdlscaled <- normalmixEM(data$FCWg)
number of iterations= 357
> summary(mixmdlscaled)
summary of normalmixEM object:
         comp 1    comp 2
lambda 0.959912 0.0400879
mu     0.722022 1.0220719
sigma  0.165454 0.0131391
loglik at estimate:  53.66051
> plot(mixmdlscaled,which=2)
 
 
I understand that when run without specifying various parameters (e.g. mu, or
sigma) values are chosen randomly from a normal distribution with center(s)
determined from binning the data. Despite this, would not one expect the results
to be similar? If one is not to expect similar results, how can I get a solution
in which I can have confidence? Should I run the program multiple times and take
the average of the results? Should I look for the solution with the best log
likelihood?
 
Thank you,
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

Confidentiality Statement:
This email message, including any attachments, is for the sole use of the
intended recipient(s) and may contain confidential and privileged information.
Any unauthorized use, disclosure or distribution is prohibited. If you are not
the intended recipient, please contact the sender by reply email and destroy all
copies of the original message.

William Dunlap

2016-Jan-27 17:36 UTC

You could start by sorting the components, by lambda (size) or by mu (mean)
since, if you don't supply starting values, the order of the components is
random.  You could use the following to sort normalmixEM's output:

sort.mixEM <- function (x, decreasing = FALSE, ..., by = "lambda")
{
    stopifnot(inherits(x, "mixEM"), is.element(by, names(x)))
    o <- order(x[[by]], decreasing = decreasing)
    x$lambda <- x$lambda[o]
    x$sigma <- x$sigma[o]
    x$mu <- x$mu[o]
    x$posterior[] <- x$posterior[, o, drop = FALSE]
    x
}
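
As a rough illustration of why sorting helps, the sketch below aligns the two posted fits by mean using only base R. The lists run1 and run2 and the helper by_mu are hypothetical names introduced here; the numbers are copied from the two summaries in the original question. Once the components are in the same order, any differences that remain reflect different solutions and not just relabelled components.

```r
# Estimates copied from the two summaries posted in the question.
run1 <- list(lambda = c(0.0818928, 0.918107),
             mu     = c(0.6575938, 0.740870),
             sigma  = c(0.0070562, 0.178410))
run2 <- list(lambda = c(0.959912, 0.0400879),
             mu     = c(0.722022, 1.0220719),
             sigma  = c(0.165454, 0.0131391))

# Hypothetical helper: reorder every parameter vector by increasing mean.
by_mu <- function(run) {
  o <- order(run$mu)
  lapply(run, function(p) p[o])
}

s1 <- by_mu(run1)
s2 <- by_mu(run2)

# One row per run, components aligned by mean, for a side-by-side look.
aligned <- rbind(run1 = unlist(s1), run2 = unlist(s2))
print(aligned)

# Component-wise gaps between the two runs after alignment.
mu_gap <- abs(s1$mu - s2$mu)
print(mu_gap)
```

In this case the narrow component sits near 0.66 in the first run and near 1.02 in the second, so sorting alone does not reconcile the two fits.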


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Jan 27, 2016 at 8:51 AM, John Sorkin <JSorkin at grecc.umaryland.edu> wrote:
> I am running normalmixEM:
> mixmdlscaled <- normalmixEM(data$FCWg)
> summary(mixmdlscaled)
> plot(mixmdlscaled,which=2)
>
> If I run the program multiple times, I get widely different results:
>
> > mixmdlscaled <- normalmixEM(data$FCWg)
> number of iterations= 41
> > summary(mixmdlscaled)
> summary of normalmixEM object:
>           comp 1   comp 2
> lambda 0.0818928 0.918107
> mu     0.6575938 0.740870
> sigma  0.0070562 0.178410
> loglik at estimate:  56.87445
> > plot(mixmdlscaled,which=2)
> > mixmdlscaled <- normalmixEM(data$FCWg)
> number of iterations= 357
> > summary(mixmdlscaled)
> summary of normalmixEM object:
>          comp 1    comp 2
> lambda 0.959912 0.0400879
> mu     0.722022 1.0220719
> sigma  0.165454 0.0131391
> loglik at estimate:  53.66051
> > plot(mixmdlscaled,which=2)
>
>
>
> I understand that when run without specifying various parameters (e.g. mu,
> or sigma) values are chosen randomly from a normal distribution with
> center(s) determined from binning the data. Despite this, would not one
> expect the results to be similar? If one is not to expect similar results,
> how can I get a solution in which I can have confidence? Should I run the
> program multiple times and take the average of the results? Should I look
> for the solution with the best log likelihood?
>
> Thank you,
> John
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
> Confidentiality Statement:
> This email message, including any attachments, is for ...{{dropped:16}}

Ranjan Maitra

2016-Jan-27 18:07 UTC

On Wed, 27 Jan 2016 11:51:07 -0500 John Sorkin <JSorkin at grecc.umaryland.edu> wrote:
> I am running normalmixEM:
> mixmdlscaled <- normalmixEM(data$FCWg)
> summary(mixmdlscaled)
> plot(mixmdlscaled,which=2)
>  
> If I run the program multiple times, I get widely different results:
>  
> > mixmdlscaled <- normalmixEM(data$FCWg)
> number of iterations= 41 
> > summary(mixmdlscaled)
> summary of normalmixEM object:
>           comp 1   comp 2
> lambda 0.0818928 0.918107
> mu     0.6575938 0.740870
> sigma  0.0070562 0.178410
> loglik at estimate:  56.87445 
> > plot(mixmdlscaled,which=2)
> > mixmdlscaled <- normalmixEM(data$FCWg)
> number of iterations= 357 
> > summary(mixmdlscaled)
> summary of normalmixEM object:
>          comp 1    comp 2
> lambda 0.959912 0.0400879
> mu     0.722022 1.0220719
> sigma  0.165454 0.0131391
> loglik at estimate:  53.66051 
> > plot(mixmdlscaled,which=2)
> 
>  
>  
> I understand that when run without specifying various parameters (e.g. mu,
> or sigma) values are chosen randomly from a normal distribution with
> center(s) determined from binning the data.

I don't know what this means or what the mechanics are.

> Despite this, would not one expect the results to be similar? If one is not
> to expect similar results, how can I get a solution in which I can have
> confidence? Should I run the program multiple times and take the average of
> the results? Should I look for the solution with the best log likelihood?

But if a likelihood has several local maxima with respect to its parameters,
isn't this what you would expect? I don't know how familiar you are with
statistics, so maybe I am repeating something that you already know, but an MLE
(note the indefinite article) is what EM, or any iterative/root-finding method,
finds in the vicinity of its initialization.

Your best bet is to use a package such as EMCluster. If you want to keep using
normalmixEM, you should make several runs and then choose the one which gives
a stable solution and the highest loglikelihood value. EMCluster does this for
you.
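
A minimal sketch of the several-runs approach with normalmixEM is below. It assumes the mixtools package is installed, and it uses simulated data as a stand-in because data$FCWg is not available here; n_starts, fits, logliks and best are names chosen for illustration only.

```r
library(mixtools)

# Simulated stand-in for data$FCWg (an assumption, not the poster's data).
set.seed(1)
x <- c(rnorm(150, mean = 0.7, sd = 0.15), rnorm(50, mean = 1.1, sd = 0.05))

# Refit from several random initializations; each call draws new starts.
n_starts <- 25
fits <- lapply(seq_len(n_starts), function(i) normalmixEM(x, k = 2))

# Collect the final log-likelihood of every run and keep the largest.
logliks <- vapply(fits, function(f) f$loglik, numeric(1))
best <- fits[[which.max(logliks)]]

# Inspect the spread across runs before trusting the winner.
print(sort(round(logliks, 3), decreasing = TRUE))
print(rbind(lambda = best$lambda, mu = best$mu, sigma = best$sigma))
```

The highest log-likelihood is not automatically the most sensible fit: with unequal variances, a component with a very small sigma sitting on a handful of points can inflate the likelihood. It is worth checking that several starts reach roughly the same maximum and that no sigma has collapsed toward zero.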

HTH!

Best wishes,
Ranjan


> Thank you,
> John
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing) 
> 
> Confidentiality Statement:
> This email message, including any attachments, is for ...{{dropped:26}}
