thr3ads.net - R help - [R] [OT] 1 vs 2-way anova technical question [Nov 2011]


Giovanni Azua

2011-Nov-21 11:04 UTC

[R] [OT] 1 vs 2-way anova technical question

Hello,

I know there are plenty of people in this group who can give me a good answer :)

I have a 2^k model where k=4 like this:
Model 1) R~A*B*C*D

If I use the "*" operator in R among all terms, I understand it to mean: explore
all interactions and include them in the model, i.e. I think this would be the
so-called 2-way anova. However, if I do this, the model assumptions are violated:
homoscedasticity is violated, the normality assumption for the errors (i.e. the
residuals) is violated, etc. I tried correcting the issues using different
standard transformations (log, sqrt, Box-Cox forms, etc.) but none really
improves the result. In this case, even though the model assumptions do not hold,
some of the interactions are found to significantly influence the response
variable. But then, should I trust the results of this Model 1) given that the
assumptions do not hold?
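As a side note on the formula syntax: the sketch below expands both formulas with `terms()` to list the terms each one contains. It needs no data, only the symbols A, B, C, D and R from the post. The third formula, `(A + B + C + D)^2`, is not from the post; it is shown as one possible middle ground that keeps main effects and two-way interactions only. Note that the "way" in "n-way anova" usually refers to the number of factors rather than the interaction order, so both models in the post would normally be called four-way.

```r
# Expand the formulas to see which terms each one contains.
full_terms     <- attr(terms(R ~ A * B * C * D), "term.labels")
additive_terms <- attr(terms(R ~ A + B + C + D), "term.labels")

print(full_terms)      # main effects plus all 2-, 3- and 4-way interactions
print(additive_terms)  # main effects only

# Assumption (not in the original post): main effects plus two-way interactions.
two_way_terms <- attr(terms(R ~ (A + B + C + D)^2), "term.labels")
print(two_way_terms)

# Number of terms in each model.
cat(length(full_terms), length(additive_terms), length(two_way_terms), "\n")
```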

Then I try this other model where I exclude the interactions (is this the 1-way
anova?):
Model 2) R~A+B+C+D

In this one the model assumptions hold, except for some outliers and a slightly
heavy tail in the QQ-plot.
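The checks mentioned here can be sketched roughly as follows. Everything in this block is an assumption made for illustration: the data frame `demo` is simulated (it is not the poster's `throughput` data), the log transform is just one of the transformations named above, and the choice of Shapiro-Wilk and Bartlett tests is one common option among several.

```r
# Simulated balanced 2^4 design with 3 replicates per cell (illustration only).
set.seed(1)
demo <- expand.grid(A = factor(c("a1", "a2")), B = factor(c("b1", "b2")),
                    C = factor(c("c1", "c2")), D = factor(c("d1", "d2")),
                    replicate_id = 1:3)
demo$R <- exp(1 + 0.5 * (demo$A == "a2") + rnorm(nrow(demo), sd = 0.3))

# Additive model on the log scale, as in Model 2).
fit <- aov(log(R) ~ A + B + C + D, data = demo)
demo$res  <- residuals(fit)
demo$cell <- interaction(demo$A, demo$B, demo$C, demo$D)

# Graphical checks: spread of residuals against fitted values, and a QQ-plot.
plot(fitted(fit), demo$res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(demo$res)
qqline(demo$res)

# Formal checks: normality of residuals, equal variance across design cells.
sw <- shapiro.test(demo$res)
bt <- bartlett.test(res ~ cell, data = demo)
print(sw)
print(bt)
```

Formal tests like these become very sensitive with large samples, so the plots are usually the more informative part.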

Given that the assumptions for Model 1) do not hold, I assume I should ignore
the results of Model 1) altogether. Or can I instead safely use the Sum Sq.
of Model 1) to get my table of percent of variation?
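For reference, a table of percent of variation can be derived from the `Sum Sq` column of an `aov` summary roughly as below. This is a minimal sketch on simulated data (the data frame `demo` and its effect sizes are invented for illustration), not the poster's analysis. The decomposition itself is plain arithmetic; it is the F tests and p-values that lean on the normality and equal-variance assumptions. In an unbalanced design the sequential sums of squares reported by `aov` also depend on the order of the terms in the formula.

```r
# Simulated balanced 2^4 design with 3 replicates per cell (illustration only).
set.seed(2)
demo <- expand.grid(A = factor(1:2), B = factor(1:2),
                    C = factor(1:2), D = factor(1:2),
                    replicate_id = 1:3)
demo$R <- as.numeric(demo$A) + 0.5 * as.numeric(demo$B) + rnorm(nrow(demo))

# Full factorial model, as in Model 1).
fit <- aov(R ~ A * B * C * D, data = demo)

# summary() of an aov fit is a list; its first element is the ANOVA table.
tab <- summary(fit)[[1]]

# Percent of total variation attributed to each term (and to the residuals).
pct <- 100 * tab[["Sum Sq"]] / sum(tab[["Sum Sq"]])
names(pct) <- trimws(rownames(tab))
print(round(pct, 1))
```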

This was a bit counter-intuitive to me, since I assumed that if there was
collinearity among factors (and there is, e.g. I(A*B*C)) and I included those
interactions as in Model 1), my model would be more accurate ... OK, this has
turned into a brand new topic of model selection, but I am mostly interested in
the question: if the model assumptions are violated, can I or must I not use the
results, e.g. the Sum Sq. for that model?

Can anyone advise please?

btw I have bought most books on R and statistical analysis. I have gone through
them all and the ANOVA coverage is very shallow in most of them, especially in
the R-sy ones; they just offer a slightly pimped-up version of the R help.

I am also unofficially following a course on ANOVA at the university I am
registered at, and most examples are too simplistic: either the assumptions
just hold easily, or the assumptions don't hold and nothing happens.

Thanks in advance,
Best regards,
Giovanni

Giovanni Azua

2011-Nov-21 13:02 UTC


[R] [OT] 1 vs 2-way anova technical question

Hello,

A couple of clarifications:
- A, B, C, D are factors and I am also interested in possible interactions, but
the model that comes out of aov with R~A*B*C*D violates the model assumptions.
- My 2^k design is unbalanced, i.e. there is missing data, and I also include an
additional level in one of the factors, i.e. C.
- In the OP I was referring to the 4-way interactions and not 2-way; I'm
sorry for the confusion.
- I tried to create an aov model with fewer interactions this way, but I get the
following error:

model.aov <- aov(log(R)~A+B+I(A*B)+C+D,data=throughput)
Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") : 
  contrasts can be applied only to factors with 2 or more levels
In addition: Warning message:
In Ops.factor(A, B) : * not meaningful for factors

Here I was trying to say: do a one-way anova except for the A and B factors for
which I would like to get their 2-way interactions ...
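The warning in the output above indicates that `*` inside `I()` is taken as arithmetic multiplication, which is not defined for factors. In R's formula syntax an interaction term is written with `:`, so a model with all main effects plus only the A-by-B interaction could be sketched as below. The data frame `throughput_demo` is simulated and hypothetical (the real `throughput` data is not shown in the thread); C is given three levels only to mirror the extra level mentioned above.

```r
# Simulated stand-in for the poster's data (illustration only).
set.seed(3)
throughput_demo <- expand.grid(A = factor(c("a1", "a2")),
                               B = factor(c("b1", "b2")),
                               C = factor(c("c1", "c2", "c3")),
                               D = factor(c("d1", "d2")),
                               replicate_id = 1:2)
throughput_demo$R <- exp(rnorm(nrow(throughput_demo)))

# Main effects of A, B, C, D plus the single A:B interaction.
model.aov <- aov(log(R) ~ A + B + A:B + C + D, data = throughput_demo)
print(summary(model.aov))

# Equivalent shorthand: A * B expands to A + B + A:B.
model.aov2 <- aov(log(R) ~ A * B + C + D, data = throughput_demo)

model_terms  <- attr(terms(model.aov),  "term.labels")
model_terms2 <- attr(terms(model.aov2), "term.labels")
print(model_terms)
```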

Thanks in advance,
Best regards,
Giovanni


