Hello all,

I was wondering if anyone could help me formulate a two-way ANOVA for unbalanced repeated-measures data?

We have a new study method aimed at helping students study for tests using computers. (I am a computer scientist, hence my soon-to-be-apparent lack of statistical knowledge.) To test this study method we devised a user study in which 30 participants attended 2 lectures, lecture1 and lecture2. Two tests were created, test1 and test2: test1 corresponds to the material in lecture1 and test2 corresponds to the material in lecture2.

The 30 participants were split into two groups, group1 and group2:

group1 used our new study method to review for lecture1 and their existing study method to review the material from lecture2.
group2 used our new study method to review for lecture2 and their existing study method to review the material from lecture1.

Each group then took the two tests.

This is a repeated-measures experiment because we have 2 exam scores for each participant, one after studying with our new method and one after studying without it. The data are unbalanced because participants did not take the same test twice. From what I understand, balanced data would look like this:

ID  TEST  SYSTEM  SCORE
1   1     1       80
1   1     0       70
1   2     1       90
1   2     0       95
2   1     1       70
2   1     0       75
2   2     1       80
2   2     0       75

But instead our data look like this:

ID  TEST  SYSTEM  SCORE
1   1     1       80
1   2     0       95
2   1     0       75
2   2     1       80

So participant 2 never took test1 using our system.

Anyway, I want to see whether our new study method had an impact on test results. I also want to see whether the test number had an impact on the exam results.
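To make the layout concrete, here is a minimal sketch that tabulates the four example rows above. The names `crossover` and `cells` are only illustrative, and the rows are the made-up example values, not our real data. It counts how many scores exist for each participant/test/system combination.

```r
# Illustrative rows copied from the second example table above
# (the object names are assumptions made for this sketch).
crossover <- data.frame(
  ID     = factor(c(1, 1, 2, 2)),
  TEST   = factor(c(1, 2, 1, 2)),
  SYSTEM = factor(c(1, 0, 0, 1)),
  SCORE  = c(80, 95, 75, 80)
)

# Count the observations in every ID x TEST x SYSTEM cell.
# Cells with a count of 0 are combinations that were never observed.
cells <- xtabs(~ ID + TEST + SYSTEM, data = crossover)

# Flatten the three-way table so it prints as a single grid.
print(ftable(cells))
```

Each participant appears once per test and once per system, but only half of the participant/test/system cells are filled, which is what I mean by "unbalanced" here.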
Here is some sample data:

------------
dataSet <- data.frame(
  particID    = factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8)),
  whichExam   = factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
  studyMethod = factor(c(1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,1)),
  score       = c(90,80,75,70,70,58,73,68,69,87,68,79,80,80,99,95))
------------

From what I have read, this should be how to compute an ANOVA on this data:

------------
> summary(aov(score~whichExam*studyMethod+Error(particID),data=dataSet))

Error: particID
                      Df  Sum Sq Mean Sq F value Pr(>F)
whichExam:studyMethod  1  333.06  333.06  1.8211 0.2259
Residuals              6 1097.38  182.90

Error: Within
            Df  Sum Sq Mean Sq F value  Pr(>F)
whichExam    1   3.062   3.062  0.1072 0.75445
studyMethod  1 203.062 203.062  7.1094 0.03721 *
Residuals    6 171.375  28.562
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
------------

Is this the correct way to do an ANOVA for this data?

From what I can tell, this means that the study method did have a statistically significant impact on the scores (p = 0.037). Is that correct? It also seems to show that it did not matter which test the subject took (p = 0.754), which I take to mean that the two tests were equally difficult.

What exactly do the titles "Error: ..." mean? What are "Residuals"?

Can anyone recommend a good book on R that covers this material? All I can find are books on SPSS.
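In case it helps anyone answering, here is a self-contained sketch of the same fit stored in an object, so the pieces behind the printed "Error: ..." headings can be inspected. The names `fit` and `methodMeans` are my own additions; the model formula and data are exactly those above. I am only assuming that each "Error:" heading corresponds to one named component of the fitted object.

```r
# Same sample data as above, repeated so this block runs on its own.
dataSet <- data.frame(
  particID    = factor(c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8)),
  whichExam   = factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
  studyMethod = factor(c(1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,1)),
  score       = c(90,80,75,70,70,58,73,68,69,87,68,79,80,80,99,95))

# Store the fit instead of printing the summary directly.
fit <- aov(score ~ whichExam * studyMethod + Error(particID), data = dataSet)

# With an Error() term, aov() returns a list with one component per stratum;
# the component names line up with the "Error: ..." headings in the summary.
print(class(fit))
print(names(fit))

# Raw mean score under each study method (0 = existing, 1 = new),
# as a sanity check on the direction of the studyMethod effect.
methodMeans <- tapply(dataSet$score, dataSet$studyMethod, mean)
print(methodMeans)

print(summary(fit))
```

Printing `fit[["Within"]]` or `fit[["particID"]]` on its own shows the terms and residual degrees of freedom assigned to that stratum.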