Dear help list,

I think I found a bug in the R randomForest package. Hopefully you are able to reproduce it. I use R version 2.7.2 and randomForest version 4.5-27.

This is a minimal example that shows the problem:

    library(randomForest)

    tries <- 20
    dimension <- 20
    n <- 200
    outlyingness <- rep(NaN, tries)

    for (o_number in 1:tries) {
      # Generate features: n uncorrelated, normally distributed points
      features <- matrix(rnorm(n * dimension, 0, 1), n, dimension)
      # Compute the random forest, including the proximity matrix
      outlier.rf <- randomForest(features, ntree = 100, proximity = TRUE)
      # Compute the mean proximity for each of the n points
      outlyingness_all <- apply(outlier.rf$proximity, 2, mean)
      # Compute the rank of a given point according to its mean proximity
      better <- sum(outlyingness_all[1] < outlyingness_all)
      outlyingness[o_number] <- 1 + better
    }

    outlyingness

Point number 1 plays a special role in this code fragment. A typical value for "outlyingness" is

    200 200 200 200 196 200 200 200 200 200 200 200 200 200 200 200 199 200 200 200

that is, point 1 almost always has the lowest mean proximity of all 200 points, whereas one obtains what one would expect for any other point. For example, if

    better <- sum(outlyingness_all[1] < outlyingness_all)

is replaced by

    better <- sum(outlyingness_all[17] < outlyingness_all)

one gets

    194 7 184 76 25 40 175 174 137 75 49 146 175 150 148 118 100 88 121 14

Is this a bug or am I confused? Can anybody help me? Does anybody know the problem?

Best regards
Jens Roeder