I've taken a look at this. After updating the distances, the R code
recalculates the nearest neighbours and their distances for all clusters
other than the new one, which it attempted to handle on the fly.

The problem is that merging two clusters can make distances to the merged
cluster go up, so what was a nearest neighbour may stop being one, and
the reverse can happen too. So I am not convinced that the correction is
in fact enough (what if i2 had previously been the nearest neighbour of
k?), although this may not affect the later steps.

It is just as fast to update all the nearest neighbours, and I have changed
the R code to do so. It is a lot easier to convince oneself that the code
gives the correct answer!

I now get an answer much closer to yours, and hope the remaining difference
is due to rounding errors in dumping the dataset. (It is also what I got
by incorporating the fix in the C code you pointed us to.)

BDR

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
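
A small illustration of the stale nearest-neighbour point made above, offered as a sketch rather than as the actual R or C code. It is written in Python, and it assumes complete linkage (distance to the merged cluster is the larger of the two old distances) purely because that makes distances go up; the function names, the 4x4 distance matrix, and the choice of linkage are all made up for the example. Cluster 2 starts with cluster 0 as its nearest neighbour; after 0 and 1 are merged, that cached entry is wrong, and recomputing every nearest neighbour fixes it.

```python
import math

def nearest_neighbours(dist, active):
    """Return {i: (nearest other active cluster, distance)} for each active i."""
    nn = {}
    for i in active:
        best_j, best_d = None, math.inf
        for j in active:
            if j != i and dist[i][j] < best_d:
                best_j, best_d = j, dist[i][j]
        nn[i] = (best_j, best_d)
    return nn

def merge_complete_linkage(dist, active, i2, j2):
    """Merge cluster j2 into i2 (hypothetical helper, complete linkage assumed).

    The distance from the merged cluster to any other cluster k becomes
    max(d(i2, k), d(j2, k)), so it can only stay the same or go up.
    """
    for k in active:
        if k not in (i2, j2):
            d = max(dist[i2][k], dist[j2][k])
            dist[i2][k] = dist[k][i2] = d
    active.remove(j2)

# Made-up symmetric distance matrix for four singleton clusters.
dist = [
    [0, 1, 2, 5],
    [1, 0, 10, 6],
    [2, 10, 0, 3],
    [5, 6, 3, 0],
]
active = [0, 1, 2, 3]

# Nearest neighbours cached before the merge: cluster 2 points at cluster 0.
stale = nearest_neighbours(dist, active)

# Merge the closest pair (0 and 1); the merged cluster keeps index 0.
merge_complete_linkage(dist, active, 0, 1)

# Recompute every nearest neighbour from the updated distances.
fresh = nearest_neighbours(dist, active)

print("cached before merge:", stale[2])
print("recomputed after merge:", fresh[2])
```

Recomputing all nearest neighbours after each merge costs a pass over the active clusters, but it removes the need to reason about which cached entries a merge might have invalidated.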