I have data that may be the mixture of two normal distributions (one contained
within the other) vs. a single normal.
I used normalmixEM to get estimates of parameters assuming two normals:
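library(mixtools)  # provides normalmixEM()
# Drop missing values, then standardize to mean 0 and sd 1
GLUT <- as.numeric(scale(na.omit(data[,"FCW_glut"])))
GLUT
# Fit a two-component normal mixture (k=2 matches the two-component output below)
mixmdl = normalmixEM(GLUT,k=2,arbmean=TRUE)
summary(mixmdl)
# Fitted mixture densities, with a kernel density estimate of the same data overlaid
plot(mixmdl,which=2)
lines(density(GLUT), lty=2, lwd=2)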
summary of normalmixEM object:
           comp 1   comp 2
lambda  0.7035179 0.296482
mu     -0.0592302 0.140545
sigma   1.1271620 0.536076
loglik at estimate:  -110.8037
I would like to see if the two normal distributions are a better fit than one
normal. I have two problems:
(1) normalmixEM does not seem to want to fit a single normal (even if I address
the error message produced):
> mixmdl = normalmixEM(GLUT,k=1)
Error in normalmix.init(x = x, lambda = lambda, mu = mu, s = sigma, k = k, :
  arbmean and arbvar cannot both be FALSE
> mixmdl = normalmixEM(GLUT,k=1,arbmean=TRUE)
Error in normalmix.init(x = x, lambda = lambda, mu = mu, s = sigma, k = k, :
  arbmean and arbvar cannot both be FALSE
(2) Even if I had the loglik from a single normal, I am not sure how many DFs to
use when computing the -2LL ratio test.
Any suggestions for comparing the two-normal vs. one normal distribution would
be appreciated.
Thanks
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Two normals will **always** be a better fit than one (or at least an equally
good one), as the latter must be a subset of the former (with identical
parameters for both normals).

Cheers,
Bert

Bert Gunter
"Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom."
-- Clifford Stoll
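The nesting argument, and a way around problem (1), can be sketched in base R. Since normalmixEM rejects k=1, the single-normal log-likelihood can be computed directly from the maximum-likelihood estimates. This is only an illustration: `x` is a simulated stand-in for GLUT, and `mix_loglik` is a hypothetical helper, not part of mixtools.

```r
# Simulated stand-in for the poster's GLUT vector (assumption: real data not available)
set.seed(1)
x <- as.numeric(scale(rnorm(80)))

# Single normal: the MLEs are the sample mean and the sd with divisor n (not n - 1)
mu_hat    <- mean(x)
sigma_hat <- sqrt(mean((x - mu_hat)^2))
loglik1   <- sum(dnorm(x, mean = mu_hat, sd = sigma_hat, log = TRUE))

# Hypothetical helper: log-likelihood of a two-component normal mixture
# evaluated at given (not estimated) parameters
mix_loglik <- function(x, lambda, mu, sigma) {
  dens <- lambda[1] * dnorm(x, mu[1], sigma[1]) +
          lambda[2] * dnorm(x, mu[2], sigma[2])
  sum(log(dens))
}

# With identical components the mixture collapses to the single normal,
# so the maximized two-component log-likelihood can never be lower than loglik1
loglik2_same <- mix_loglik(x, c(0.7, 0.3), rep(mu_hat, 2), rep(sigma_hat, 2))
all.equal(loglik1, loglik2_same)
```

The mixing proportions c(0.7, 0.3) are arbitrary here; any pair summing to 1 gives the same value when the two components coincide.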
Bert,

Better, perhaps, but will something like the LR test be significant? Adding an
extra parameter to a linear regression almost always improves the R^2, but if
one compares models, the model with the extra parameter is not always
significantly better.

John

P.S. Please forgive the appeal to "significantly better" . . .
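On the degrees-of-freedom question: the single normal sits on the boundary of the two-component parameter space, so the usual chi-square reference distribution for -2LL is not reliable here. One commonly used alternative is a parametric bootstrap of the likelihood ratio statistic. The sketch below assumes the mixtools package is installed; `x` is a simulated stand-in for GLUT, `lrt_stat` is a hypothetical helper, and B is kept small for speed.

```r
library(mixtools)  # provides normalmixEM()

set.seed(42)
x <- as.numeric(scale(rnorm(80)))  # simulated stand-in for GLUT (assumption)

# Hypothetical helper: -2 log likelihood ratio, one normal vs. two-component mixture
lrt_stat <- function(x) {
  s   <- sqrt(mean((x - mean(x))^2))              # MLE of sigma
  ll1 <- sum(dnorm(x, mean(x), s, log = TRUE))    # single-normal log-likelihood
  # capture.output() only silences the iteration message printed by normalmixEM
  invisible(capture.output(fit <- normalmixEM(x, k = 2)))
  2 * (fit$loglik - ll1)
}

obs <- lrt_stat(x)

# Simulate from the fitted single normal (the null model) and recompute the statistic
B <- 99
null_stats <- replicate(B, {
  xb <- rnorm(length(x), mean(x), sqrt(mean((x - mean(x))^2)))
  lrt_stat(xb)
})

# Bootstrap p-value: share of null statistics at least as large as the observed one
p_boot <- (1 + sum(null_stats >= obs)) / (B + 1)
p_boot
```

EM can stop at a local maximum, so individual statistics may be slightly off (even negative); several restarts per fit would make this more robust. mixtools also provides boot.comp(), which automates a similar parametric bootstrap for choosing the number of components.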
That's true, but if he uses a criterion such as AIC or BIC that penalizes the
number of parameters, then he might see something else. This (comparing
mixtures to non-mixtures) is not something I deal with, so I'm just throwing it
out there.
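A rough sketch of the AIC/BIC comparison, in base R. It relies on two observations: a vector standardized with scale() has a single-normal log-likelihood that depends only on n, and the two-component fit (free means and variances) has 5 free parameters against 2 for one normal. The sample size is not given in the thread, so `n` below is a placeholder to be replaced with the real value; the function names are illustrative.

```r
# Information criteria from a maximized log-likelihood
aic <- function(loglik, n_par) -2 * loglik + 2 * n_par
bic <- function(loglik, n_par, n) -2 * loglik + log(n) * n_par

# For a vector standardized with scale(), the single-normal MLE has mean 0 and
# variance (n - 1) / n, so its log-likelihood is a function of n alone
loglik_single_scaled <- function(n) -n / 2 * (log(2 * pi * (n - 1) / n) + 1)

n   <- 80          # PLACEHOLDER: the sample size is not stated in the thread
ll1 <- loglik_single_scaled(n)
ll2 <- -110.8037   # two-component log-likelihood reported in the summary output

# Free parameters: 2 for one normal (mu, sigma);
# 5 for two normals (one mixing proportion, two means, two sigmas)
data.frame(
  model  = c("one normal", "two normals"),
  loglik = c(ll1, ll2),
  AIC    = c(aic(ll1, 2), aic(ll2, 5)),
  BIC    = c(bic(ll1, 2, n), bic(ll2, 5, n))
)
```

Lower AIC or BIC is preferred. These criteria penalize the extra parameters but are not a formal test of one component against two.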