markleeds at verizon.net
2008-Sep-02 00:31 UTC
[R] multinomial estimation output stat question - not R question
I am estimating a multinomial model with two quantitative predictors, X1 and X2, and 3 responses. The responses are called neutral, positive and negative, with neutral being the baseline. There are actually many models being estimated because I estimate the model over time and also for various parameter sets, but that's not important.

When I estimate a model, since neutral is the baseline and there is no interaction term, I get back coefficients

X1 negative
X2 negative

X1 positive
X2 positive

Usually the signs of the coefficients are what I would expect. Also, I've read about Anova, so I think that I kind of understand what that is doing. But what I'm confused about is the following: in some of the models, I can get back Wald statistics for X1, say, where both the X1 negative Wald stat and the X1 positive Wald stat are not significant. Yet the p-value from the Anova for the X1 variable overall is significant. Is this possible? I think I'm not understanding the Anova output as well as I thought because, to me, that seems inconsistent.

I understand that the Wald statistics for the particular variables are kind of analogous to the t-stats in a regular regression, in that they are a function of the decrease in deviance conditional on all the rest of the variables being in the model. The p-value in the Anova table I thought was kind of doing the same thing, except not differentiating between the factors and just calculating the decrease in deviance due to X1 overall without regard to the particular factor response?

If I'm right in my interpretation of the Anova output, then can that still happen?

If I'm wrong about my interpretation, and it can happen, can someone tell me where to look for an explanation of why that can happen and possibly explain where my interpretation is wrong? I just want to understand my output as best as I can.

If it can't happen, then it's puzzling because it is happening.

Thanks for any insights, comments or references. The output is not easily reproducible or else I would reproduce it here.
Greg Snow
2008-Sep-02 17:55 UTC
[R] multinomial estimation output stat question - not R question
Mark,

There are a couple of possible things that could be going on here:

In regular ANOVA cases you can have a situation where you have 3 groups, A, B, and C, where A and C are significantly different from each other, but B lies between them in such a way that we cannot say that B is significantly different from A or C (the variation is large enough that the mean of B could equal that of A or C). Clearly B cannot equal both A and C if C and A are not the same; it is just a matter of lack of evidence. B could be the same as A, or it could be the same as C, or it could be something different from either.

So in your case it could be that there is evidence to show a significant difference between positive and negative, but not enough to show how neutral compares to them. You could try refitting your model with a different baseline to see if there is a significant difference between the new baseline and one of the other levels of the factor.

Another possibility is that in some cases of logistic regression (and that could easily carry over to multinomial regression) you get a large coefficient that is very meaningful, but due to the flatness of the likelihood in that region, the Wald test overestimates the variance by quite a bit and results in non-significant conclusions. Look at the size of the coefficient and the size of the standard error estimate. If both are large, then this could be the case, and you should ignore the Wald test and look more at other types of tests.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
(801) 408-8111
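
The first explanation in the reply (a significant difference between positive and negative, with neutral in between) can be illustrated with a little arithmetic. The sketch below is only a sketch under stated assumptions: the coefficients, standard errors and correlation are made-up numbers, not output from the poster's models, and it uses a joint Wald chi-square test as a stand-in for the overall Anova test, which for a fitted multinomial model may well be a likelihood-ratio test instead. The function names `two_sided_p` and `joint_wald_2df` are hypothetical helpers introduced for the illustration.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def joint_wald_2df(b1, b2, se1, se2, corr):
    """Joint Wald chi-square (2 df) for H0: b1 = b2 = 0.

    Computes b' V^{-1} b for the 2x2 covariance matrix V implied by the
    two standard errors and the correlation between the two estimates.
    """
    z1, z2 = b1 / se1, b2 / se2
    stat = (z1 * z1 - 2.0 * corr * z1 * z2 + z2 * z2) / (1.0 - corr * corr)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)


# Hypothetical X1 coefficients relative to the neutral baseline (assumed values).
b_neg, b_pos = -0.30, 0.30
se_neg, se_pos = 0.20, 0.20
# Assumed positive correlation: both estimates share the same baseline category.
corr = 0.5

p_neg = two_sided_p(b_neg / se_neg)
p_pos = two_sided_p(b_pos / se_pos)
stat, p_joint = joint_wald_2df(b_neg, b_pos, se_neg, se_pos, corr)

print("Wald p-value, X1 negative:", round(p_neg, 4))
print("Wald p-value, X1 positive:", round(p_pos, 4))
print("Joint 2-df statistic:", round(stat, 3), "p-value:", round(p_joint, 4))
```

With these assumed numbers each coefficient has |z| of about 1.5 and an individual p-value of roughly 0.13, while the joint 2-df test has a p-value of roughly 0.011. The coefficients point in opposite directions and their estimates are positively correlated, so the data say much more about the positive-versus-negative contrast than about either category's contrast with neutral.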