Hi all,

Maybe someone knows a way to solve this anomaly in sample(): I would like to draw a sample (n = 100), with replications, from a population of 2500 units, but when I draw repeated samples from it I do not get what seems to be a representative sample when I look at other partitions of the population.

Enclosed are the population g99, with 4 columns (units, partition 1 (site), partition 2 (type), weights), and my R program.

The problem: some categories of partition 2 (type), which I use to check for representativeness, deviate by up to 20 percentage points from the population. As a criterion I computed the sum of the absolute differences and the SD of the differences in relative frequencies between sample and population. What mean deviation is to be expected?

Thanks for any ideas,
W. Polasek

rownames(g99) <- paste(g99[, 1])   # label rows by unit id
# weighted sample of 100 units without replacement (prob = weights in column 4)
s1 <- g99[paste(sample(g99[, 1], 100, FALSE, g99[, 4])), 1:4]
dim(s1)
j2g <- table(g99[, 3])   # population counts by type
# sample counts by type; fixed levels keep categories that are absent from the sample
j2 <- table(factor(s1[, 3], levels = names(j2g)))
chisq.test(j2, p = j2g / sum(j2g))   # goodness of fit of the sample against population shares
p2 <- 100 * j2g / sum(j2g)   # relative frequency in population (%)
pd <- p2 - 100 * j2 / sum(j2)   # difference in relative frequency, population minus sample
round(rbind(j2g, p2, pd), 2)
sum(abs(pd)); sd(pd)   # look for the 'best' representative sample
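One way to get a feeling for the deviation to expect is to repeat the draw many times and look at the distribution of the criterion. The following is only a sketch under stated assumptions: g99 is not reproduced here, so `pop` is a hypothetical stand-in with the same four columns, and the names `pop`, `pop_share`, `one_dev`, `dev_equal` and `dev_weighted` are made up for illustration.

```r
set.seed(1)

# Hypothetical stand-in for g99 (units, site, type, weights); not the real data
N <- 2500
pop <- data.frame(
  unit   = seq_len(N),
  site   = sample(1:10, N, replace = TRUE),
  type   = sample(1:6, N, replace = TRUE),
  weight = runif(N, 0.5, 1.5)
)

# Population shares of each type, in percent
pop_share <- 100 * table(pop$type) / N

# Summed absolute deviation (percentage points) for one sample of size n,
# drawn without replacement; prob = NULL means equal probabilities
one_dev <- function(n = 100, prob = NULL) {
  idx <- sample(N, n, replace = FALSE, prob = prob)
  smp <- table(factor(pop$type[idx], levels = names(pop_share)))
  sum(abs(pop_share - 100 * smp / n))
}

dev_equal    <- replicate(1000, one_dev())                   # equal-probability draws
dev_weighted <- replicate(1000, one_dev(prob = pop$weight))  # weighted draws

summary(dev_equal)
summary(dev_weighted)
```

As a rough guide for equal-probability sampling, the standard error of a single category share is about sqrt(p(1-p)/n), which is 4 percentage points for p = 0.2 and n = 100. If the weights are related to type, a weighted draw would be expected to differ systematically from the unweighted population shares, so comparing the two summaries may help separate sampling noise from an effect of the weights.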