I guess this isn't so much of a help request as a show-and-tell from a non-statistician homebrewer who has been fumbling around with R. If nothing else it provides yet another data set. I hope it is not out of line. Anyway, the plots I have produced are at http://brewiki.org/BatchSparge#poll The polling method is somewhat simple, its just one of those multiple choice style polls you can create on various web forums. The poll was prompted by the ongoing claim from fly spargers that "their" method is more efficient, but I had never seen data to support that. I thought maybe it was a bit of snobbery. Maybe they are right. However if I conveniently ignore that annoying bump on the left of the batch sparge histogram then the two groups start to look very similar. I was going to go out on a limb and say I learn heaps from reading the posts here so please don't ruin my delusion too much if my output violates all principles of good statistics. OTOH if you can suggest other cool looking graphs please feel free. The more difficult to pronounce the names are, the better :-) The data set is (efficiency is the low end of its bin): method efficiency count source fly 95 0 bb fly 90 0 bb fly 85 2 bb fly 80 8 bb fly 75 13 bb fly 70 8 bb fly 65 3 bb fly 60 0 bb fly 55 0 bb batch 95 0 bb batch 90 0 bb batch 85 4 bb batch 80 3 bb batch 75 15 bb batch 70 10 bb batch 65 6 bb batch 60 7 bb batch 55 1 bb And the R code: # Crunch some stuff with brewboard (and similar polls). # Data is already tabulated. x = read.csv("SpargeEff.csv") # Shift value to centre of bin x$efficiency = x$efficiency + 2.5 # Ignore rows with no votes (NA), zeros are ok though y = x[which(!is.na(x$count)),] r = rep(row.names(y), y$count) z = y[r,] z$count = 1 par(mfrow=c(2,2)) barplot(table(z$method), main="number of responses") barplot(table(z$method, z$efficiency), beside=T, legend=T, main="Mash efficiency by method", sub="paul sorenson 2005 brewiki.org") boxplot(z$efficiency ~ z$method, main="Mash efficiency") z.h = hist(z$efficiency, prob=T, main="Efficiencies,\n all methods combined", xlab="efficiency") z.md = max(z.h$density) lines(density(z$efficiency, bw=3.0), col='blue') #qqnorm(x$efficiency) t.test(efficiency ~ method, data=z) #by(z, z$method, summary) zs = split(z, z$method) summary(zs$batch) summary(zs$fly) # fit a normal distribution require(MASS) z.fit = fitdistr(z$efficiency, 'normal') q = 55:95 + 2.5 lines(q, dnorm(q, z.fit$estimate['mean'], z.fit$estimate['sd']), col='red') legend('topleft', legend=c('density', 'fitted'), col=c('blue', 'red'), lwd=1, inset=0.05) # Factor out lowball values. z.f = z[which(z$efficiency >= 65),] by(z.f, z.f$method, summary)