Dear All,

I need to calculate the optimal number of clusters for a classification based on a large number of observations (tens of thousands). Tibshirani et al. proposed the gap statistic for this purpose. I tried the R code developed by R. Jörnsten, but R hangs with this amount of data. Is any other (optimised) code available?

Any help would be appreciated, including suggestions for alternative ways of selecting the optimal number of clusters in large datasets.

Thanks,

Néstor Fernández, PhD.
Department of Ecological Modelling
UFZ - Centre for Environmental Research
PF 500136, DE-04301, Leipzig, Germany.
Tel: +49 341-2352034
E-mail: nestor.fernandez at ufz.de