Hi,

I have been having some trouble using aov to do an anova, probably because I'm not understanding how to use this function correctly. For some reason it always tells me that "Estimated effects may be unbalanced", though I'm not sure what this means. Is the formula I am using written incorrectly? Below is the code I am using along with the data:

> my.data
         response species sex line replicate plate
1   -7.092854e-03       1   1    1         1     1
2   -8.663481e-04       1   2    1         1     1
3   -5.797276e-03       1   1    2         1     1
4   -2.598078e-03       1   2    2         1     1
5    7.832551e-04       2   1    1         1     1
6    1.333361e-03       2   2    1         1     1
7   -8.972490e-04       2   1    2         1     1
8   -2.834589e-03       2   2    2         1     1
9    5.655464e-04       3   1    1         1     1
10   7.371403e-03       3   2    1         1     1
11   3.160040e-03       3   1    2         1     1
12  -4.110653e-03       1   1    2         2     2
13  -2.262314e-03       1   2    2         2     2
14  -3.259483e-03       1   1    3         1     2
15  -5.671712e-03       1   2    3         1     2
16  -3.636077e-03       2   1    2         2     2
17  -3.904864e-03       2   2    2         2     2
18   1.025440e-03       2   1    3         1     2
19  -3.789292e-03       2   2    3         1     2
20   3.396270e-03       3   1    2         2     2
21   8.807778e-03       3   2    2         2     2
22   5.456604e-03       3   2    3         1     2
23  -1.134216e-02       1   1    3         2     3
24  -7.725740e-03       1   2    3         2     3
25  -1.589719e-03       1   1    4         1     3
26   4.574659e-04       1   2    4         1     3
27  -2.899983e-03       2   1    3         2     3
28  -4.310185e-03       2   2    3         2     3
29  -3.200475e-05       2   1    4         1     3
30   3.166308e-03       3   1    3         2     3
31   5.697712e-03       3   2    3         2     3
32   6.058486e-03       3   1    4         1     3
33   6.941016e-03       3   2    4         1     3
34  -2.794982e-03       1   1    4         2     4
35  -4.416711e-03       1   1    5         1     4
36  -4.062832e-03       1   2    5         1     4
37   1.763941e-03       2   1    4         2     4
38  -2.928930e-03       2   2    4         2     4
39  -2.869975e-03       2   2    5         1     4
40   6.949621e-03       3   1    4         2     4
41   5.766447e-03       3   2    4         2     4
42   2.510278e-03       3   1    5         1     4
43   5.507496e-03       3   2    5         1     4
44  -1.197325e-02       1   2    5         2     5
45  -6.556955e-03       1   1    6         1     5
46   3.622169e-04       2   1    5         2     5
47  -1.288784e-03       2   2    5         2     5
48  -2.863541e-03       2   1    6         1     5
49  -7.082933e-03       2   2    6         1     5
50   3.813700e-03       3   1    5         2     5
51   9.593295e-03       3   2    5         2     5
52   9.881930e-03       3   2    6         1     5
53  -1.081725e-02       1   1    6         2     6
54  -8.870041e-03       1   2    6         2     6
55  -5.305931e-04       1   2    7         1     6
56   2.835570e-03       2   1    6         2     6
57   4.541555e-03       2   2    6         2     6
58  -5.909101e-03       2   1    7         1     6
59  -2.768342e-03       2   2    7         1     6
60   8.835976e-03       3   1    6         2     6
61   1.234038e-02       3   2    6         2     6
62   2.015527e-03       3   1    7         1     6
63   6.485565e-03       3   2    7         1     6
64  -8.372922e-03       1   1    7         2     7
65  -9.439749e-03       1   2    7         2     7
66  -3.782672e-03       1   1    8         1     7
67  -2.576470e-03       1   2    8         1     7
68   2.878789e-03       2   1    7         2     7
69  -9.458139e-04       2   2    7         2     7
70  -3.993852e-03       2   2    8         1     7
71   5.997718e-03       3   1    7         2     7
72  -9.595505e-05       3   1    8         1     7
73   8.167411e-03       3   2    8         1     7
74  -1.181158e-02       1   1    8         2     8
75  -1.072585e-02       1   2    8         2     8
76  -2.856532e-03       1   1    9         1     8
77  -4.944013e-03       1   2    9         1     8
78   2.558783e-03       2   1    8         2     8
79   3.393314e-03       2   2    8         2     8
80  -4.466758e-03       2   1    9         1     8
81  -5.667622e-03       2   2    9         1     8
82   7.491253e-03       3   2    8         2     8
83   4.380724e-03       3   1    9         1     8
84   2.827233e-03       3   2    9         1     8
85  -7.433928e-03       1   2    9         2     9
86  -9.177664e-03       1   1   10         1     9
87  -6.040020e-04       1   2   10         1     9
88   1.394224e-03       2   1    9         2     9
89  -7.455449e-04       2   1   10         1     9
90  -2.251806e-03       2   2   10         1     9
91   7.865773e-03       3   1    9         2     9
92   6.287781e-03       3   2    9         2     9
93   7.734405e-03       3   1   10         1     9
94   9.757342e-03       3   2   10         1     9
95  -6.876948e-03       1   1    1         2    10
96  -4.974144e-03       1   2    1         2    10
97  -2.959226e-03       1   1   10         2    10
98   8.058296e-04       1   2   10         2    10
99   5.314729e-03       2   1    1         2    10
100  1.251126e-03       2   2    1         2    10
101  4.012311e-03       2   2   10         2    10
102  3.479155e-03       3   1    1         2    10
103  1.144813e-02       3   2    1         2    10
104  4.090214e-03       3   1   10         2    10
105  5.196910e-03       3   2   10         2    10
106 -9.038264e-03       1   2    6         1    11
107 -6.184877e-03       1   1    7         1    11
108  4.255164e-03       2   2    4         1    11
109  8.291281e-03       2   1    5         1    11
110 -5.368315e-04       2   2    9         2    11
111 -7.792906e-04       2   1   10         2    11
112  6.312335e-03       3   1    3         1    11
113  1.243561e-02       3   2    7         2    11
114  6.223999e-04       3   1    8         2    11
115 -6.517484e-03       1   2    4         2    12
116 -1.009622e-02       1   1    5         2    12
117 -4.414381e-04       1   1    9         2    12
118  2.221470e-03       2   1    8         1    12
119  1.041818e-02       3   2    2         1    12
120  3.384938e-04       3   1    6         1    12

I am treating all the variables as factors (except for response, obviously).

formula<-response~species+line%in%species+replicate%in%line+sex%in%species+plate
model<-aov(formula, data=my.data)

This is the output:

> model
Call:
   aov(formula = formula, data = my.data)

Terms:
                     species        plate species:line line:replicate
Sum of Squares  0.0026469288 0.0000945202 0.0003320255   0.0002008000
Deg. of Freedom            2           11           27             10
                 species:sex    Residuals
Sum of Squares  0.0001383116 0.0006315465
Deg. of Freedom            3           66

Residual standard error: 0.003093362
Estimated effects may be unbalanced

Any help would be greatly appreciated as the R help documentation for aov does not address this issue.

Thanks!

-- 
Brooke LaFlamme

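The remark above about treating every variable except the response as a factor can be sketched as follows. This is only an illustration, assuming the columns are read in as plain numbers: it uses the first 11 rows of the posted data (plate 1), and the names `d` and `grouping` are made up for the example.

```r
# First 11 rows of the posted data (all on plate 1), entered as numbers.
d <- data.frame(
  response  = c(-7.092854e-03, -8.663481e-04, -5.797276e-03, -2.598078e-03,
                 7.832551e-04,  1.333361e-03, -8.972490e-04, -2.834589e-03,
                 5.655464e-04,  7.371403e-03,  3.160040e-03),
  species   = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  sex       = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1),
  line      = c(1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2),
  replicate = 1,
  plate     = 1
)

# Convert every column except the response to a factor, so that a
# model formula treats them as categorical rather than as covariates.
grouping <- setdiff(names(d), "response")
d[grouping] <- lapply(d[grouping], factor)

str(d)                    # confirm the column types
table(d$species, d$sex)   # observations per species-by-sex cell
```

In this subset, species 3 contributes three observations while species 1 and 2 contribute four each, so the cell counts are already unequal within a single plate.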
On Fri, 14 Sep 2007, Brooke LaFlamme wrote:

> Hi, I have been having some trouble using aov to do an anova, probably
> because I'm not understanding how to use this function correctly. For
> some reason it always tells me that "Estimated effects may be
> unbalanced", though I'm not sure what this means. Is the formula I am
> using written incorrectly? Below is the code I am using along with the
> data:

[...]

> I am treating all the variables as factors (except for response, obviously).
>
> formula<-response~species+line%in%species+replicate%in%line+sex%in%species+plate
> model<-aov(formula, data=my.data)
>
> This is the output:
>
>> model
> Call:
>    aov(formula = formula, data = my.data)
>
> Terms:
>                      species        plate species:line line:replicate
> Sum of Squares  0.0026469288 0.0000945202 0.0003320255   0.0002008000
> Deg. of Freedom            2           11           27             10
>                  species:sex    Residuals
> Sum of Squares  0.0001383116 0.0006315465
> Deg. of Freedom            3           66
>
> Residual standard error: 0.003093362
> Estimated effects may be unbalanced
>
> Any help would be greatly appreciated as the R help documentation for
> aov does not address this issue.

For the benefit of those who are unable to appreciate
fortunes::fortune("WTFM"), the help page actually says

     'aov' is designed for balanced designs, and the results can be
     hard to interpret without balance: beware that missing values in
     the response(s) will likely lose the balance.  If there are two or
     more error strata, the methods used are statistically inefficient
     without balance, and it may be better to use 'lme'.

     Balance can be checked with the 'replications' function.

So let's do as it suggests:

> replications(formula, data=my.data)
$species
[1] 40

$plate
plate
 1  2  3  4  5  6  7  8  9 10 11 12
11 11 11 10  9 11 10 11 10 11  9  6

$`species:line`
[1] 4

$`line:replicate`
[1] 6

$`species:sex`
[1] 20

and the problem will be clear to those who have read ?replications.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

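To make the balance check above concrete, here is a minimal sketch on a toy design. The data frames `balanced` and `unbalanced`, the factors `A` and `B`, and the helper `is_balanced` are invented for this illustration and are not part of the thread. It relies on the behaviour described in ?replications: the result is a plain vector when every term has equal replication, and a list as soon as one term has unequal counts (as `$plate` does in the output above, with counts from 6 to 11).

```r
# Toy design: two crossed factors with two observations per cell (12 rows).
balanced <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:2)
balanced$rep <- NULL

# Equal replication for every term: replications() returns a named vector.
r_bal <- replications(~ A + B, data = balanced)
print(r_bal)

# Drop one row so that the counts per level are no longer equal.
unbalanced <- balanced[-1, ]
r_unb <- replications(~ A + B, data = unbalanced)
print(r_unb)   # now a list, with a table of counts for each unequal term

# Hypothetical helper wrapping the idiom for testing balance.
is_balanced <- function(formula, data) {
  !is.list(replications(formula, data = data))
}
is_balanced(~ A + B, balanced)
is_balanced(~ A + B, unbalanced)
```

Applied to the posted data, the same idiom would report the design as unbalanced, since the replications output for `plate` is a table rather than a single number.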
Thank you for the suggestions from Professor Ripley and Steve Elliot. I see now why my data are unbalanced even though I don't have any missing data. I think I should use other methods designed for unbalanced data, but does using lme with plate as a random effect also help to fix this problem? I am still very new at this type of analysis.

Thank you for the help.

Brooke

-----Original Message-----
> Date: Tue Sep 18 12:38:37 EDT 2007
> From: "Prof Brian Ripley" <ripley at stats.ox.ac.uk>
> Subject: Re: [R] unbalanced effects in aov
> To:
>
> [...]
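
For reference, a call to lme with plate as a random effect might look like the sketch below. It is an illustration on synthetic data, not an analysis of the posted data: the data frame `sim`, the simulated effect sizes, and the choice of fixed terms are all assumptions made for the example, and it leaves out the nested line and replicate terms from the original formula. lme (from the nlme package) fits by maximum likelihood or REML and does not need equal cell counts; whether this particular model suits the design is a separate question that the thread does not settle.

```r
library(nlme)

set.seed(1)

# Synthetic stand-in: 3 species x 2 sexes on 12 plates, with one row
# removed so that the plate counts are unequal.
sim <- expand.grid(species = factor(1:3), sex = factor(1:2),
                   plate = factor(1:12))
sim <- sim[-1, ]

# Simulated response: a species effect, a random shift per plate, and noise.
plate_effect <- rnorm(12, sd = 1)
sim$response <- 2 * as.integer(sim$species) +
  plate_effect[as.integer(sim$plate)] +
  rnorm(nrow(sim), sd = 1)

# Fixed effects for species and sex, random intercept for each plate.
fit <- lme(response ~ species + sex, random = ~ 1 | plate, data = sim)

summary(fit)   # variance components and fixed-effect estimates
anova(fit)     # tests for the fixed terms
```

The `random = ~ 1 | plate` argument is what makes plate a random intercept; nesting would be expressed with a formula such as `~ 1 | plate/line`, which is not attempted here.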