I have a simple R script that uses the psych package to calculate weighted Cohen's kappa from a CSV file. The file has two columns of ordinal data: one rated by an experienced observer and the other by a novice who is undergoing training. I would like to compute kappa for subsets of this data (say 30-50 ratings each) and check whether the agreement between the novice and the experienced observer improves with training.

Which metric should I use to compare two kappas? At the moment the 95% CI for 100 observations is quite wide: 0.31 to 0.71.
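To make the comparison concrete, here is a minimal sketch of one possible approach: split the ratings into an early and a late block, compute weighted kappa in each, and bootstrap a confidence interval for the difference. Everything in it is an assumption for illustration rather than part of the original script: the data are simulated, the 5-point scale and the block size of 50 are arbitrary, and `weighted_kappa` and `kappa_diff` are hand-written helpers using quadratic disagreement weights, standing in for whatever the psych package computes on the real file.

```r
# Quadratic-weighted Cohen's kappa for two ordinal ratings on levels 1..k.
# Hypothetical helper written in base R for this sketch.
weighted_kappa <- function(a, b, k) {
  obs <- table(factor(a, levels = 1:k), factor(b, levels = 1:k))
  obs <- obs / sum(obs)                           # observed proportions
  expd <- outer(rowSums(obs), colSums(obs))       # expected under independence
  d <- outer(1:k, 1:k, function(i, j) (i - j)^2)  # disagreement weights
  1 - sum(d * obs) / sum(d * expd)
}

# Simulated stand-in for the CSV: 100 items rated on an assumed 5-point scale.
set.seed(1)
k <- 5
n <- 100
expert <- sample(1:k, n, replace = TRUE)
# Assumed novice behaviour: noisier in the first 50 ratings than in the last 50.
noise <- c(sample(-2:2, 50, replace = TRUE), sample(-1:1, 50, replace = TRUE))
novice <- pmin(k, pmax(1, expert + noise))
ratings <- data.frame(expert, novice)

early <- ratings[1:50, ]
late <- ratings[51:100, ]

# Difference in weighted kappa: late block minus early block.
kappa_diff <- function(e, l) {
  weighted_kappa(l$expert, l$novice, k) - weighted_kappa(e$expert, e$novice, k)
}

observed_diff <- kappa_diff(early, late)

# Percentile bootstrap: resample rated items with replacement within each block.
boot_diffs <- replicate(2000, {
  e <- early[sample(nrow(early), replace = TRUE), ]
  l <- late[sample(nrow(late), replace = TRUE), ]
  kappa_diff(e, l)
})
ci <- quantile(boot_diffs, c(0.025, 0.975), na.rm = TRUE)

print(observed_diff)
print(ci)
```

If the interval for the difference excludes zero, that would suggest agreement changed between the two blocks. With only 30-50 ratings per block the interval will probably be wide, so a trend across several successive blocks may be more informative than any single pairwise comparison.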