Hello,
A couple of clarifications:
- A, B, C and D are factors and I am also interested in possible interactions,
but the model that aov fits for R~A*B*C*D violates the model assumptions.
- My 2^k design is unbalanced, i.e. there is missing data, and one of the
factors (C) also includes an additional level.
- In the OP I was referring to the 4-way interactions and not the 2-way ones;
I'm sorry for the confusion.
- I tried to create an aov model with fewer interactions this way, but I get the
following error:
model.aov <- aov(log(R)~A+B+I(A*B)+C+D,data=throughput)
Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
contrasts can be applied only to factors with 2 or more levels
In addition: Warning message:
In Ops.factor(A, B) : * not meaningful for factors
Here I was trying to say: do a one-way anova, except for the A and B factors,
for which I would also like to get the 2-way interaction ...
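
For reference, a minimal sketch of how this intent is usually written in R's
formula syntax: the interaction of two factors is spelled A:B (or A*B as
shorthand for A + B + A:B), whereas I(A*B) asks R to multiply the factors
arithmetically, which is what the Ops.factor warning complains about. The data
frame built below is a made-up stand-in for the real throughput data; its
level names, sample size and response values are assumptions for illustration
only.

```r
# Hypothetical stand-in for the real `throughput` data frame:
# A, B, D have two levels, C has three, R is a positive response.
set.seed(1)
throughput <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"),
                          C = c("c1", "c2", "c3"), D = c("d1", "d2"),
                          rep = 1:3, stringsAsFactors = TRUE)
throughput$R <- exp(rnorm(nrow(throughput), mean = 5, sd = 0.2))

# Drop a few rows to mimic the unbalanced design (missing observations).
throughput <- throughput[-c(3, 17, 40), ]

# Main effects for all four factors plus only the A:B interaction.
# Equivalent shorthand: log(R) ~ A*B + C + D
model.aov <- aov(log(R) ~ A + B + A:B + C + D, data = throughput)
summary(model.aov)
```

Note that aov reports sequential sums of squares, so with an unbalanced design
the Sum Sq of each term can depend on the order of the terms in the formula.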
Thanks in advance,
Best regards,
Giovanni
On Nov 21, 2011, at 12:04 PM, Giovanni Azua wrote:
> Hello,
>
> I know there are plenty of people in this group who can give me a good
> answer :)
>
> I have a 2^k model where k=4 like this:
> Model 1) R~A*B*C*D
>
> If I use "*" between all the terms in R, I understand it to mean: explore all
> interactions and include them in the model, i.e. I think this would be the
> so-called 2-way anova. However, if I do this, the model assumptions are
> violated: homoscedasticity does not hold, the residuals are not normally
> distributed, etc. I tried correcting these issues with different standard
> transformations (log, sqrt, Box-Cox forms, etc.), but none really improves the
> result. Even though the model assumptions do not hold, some of the
> interactions are found to significantly influence the response variable. But
> should I trust the results of Model 1) given that the assumptions do not hold?
>
> Then I try this other model where I exclude the interactions (is this the
> 1-way anova?):
> Model 2) R~A+B+C+D
>
> In this one the model assumptions hold, except for the existence of some
> outliers and a slightly heavy tail in the QQ-plot.
>
> Given that the assumptions for Model 1) do not hold, I assume I should
> ignore the results of Model 1) altogether. Or can I instead safely use the
> Sum Sq. of Model 1) to get my table of percent of variation?
>
> This was a bit counter-intuitive to me, since I assumed that if there was
> collinearity among factors (and there is, e.g. I(A*B*C)) and I included those
> interactions as in Model 1), my model would be more accurate ... ok, this has
> turned into a brand new topic of model selection, but I am mostly interested
> in the question: if the model assumptions are violated, can I or must I not
> use the results, e.g. the Sum Sq., for that model?
>
> Can anyone advise please?
>
> btw I have bought most books on R and statistical analysis. I have
> researched them all and the ANOVA coverage is very shallow in most of them,
> especially in the R-sy ones; they just offer a slightly pimped up version of
> the R-help.
>
> I am also unofficially following a course on ANOVA from the university I am
> registered in and most examples are too simplistic and either the assumptions
> just hold easily or the assumptions don't hold and nothing happens.
>
> Thanks in advance,
> Best regards,
> Giovanni
>