Hello,

I am using the function mda() from the mda library to discriminate 4 groups with 8 explanatory variables. I only have 66 observations. I tested all possible combinations of those variables and ran a Mixture Discriminant Analysis for each. For some iterations, I got this error message:

    "error in kmeans(xx, start): initial centers are not distinct"

I understand that kmeans(), which is called by mda(), randomly chooses the initial centers to start the clustering procedure. As I aim to bootstrap this function and need a lot of random selections, I'd like to avoid the effect of replicated centers by keeping the initial centers constant.

When debugging, it seems that mda() is linked to kmeans() by the following condition:

    if (inherits(weights, "mda")) {
        if (is.null(weights$weights))
            weights <- predict(weights, x, type = "weights", g = fg)
        else weights <- weights$weights
    }

As far as I can tell, this condition calls mda.start() if "weights" is null. kmeans() is then called in mda.start() through starter(), where the arguments for kmeans() (xx and start) are computed. The problem arises in the call to sample() in starter(), which randomly samples rows of the data set. For example, I can obtain duplicated rows such as the following:

    debug: start <- xx[sample(1:nrow(xx), size = nc), ]
    debug: TT <- kmeans(xx, start)
    Browse[1]> start
              etm5      etm6  elevation    slope         SI     NDVI        EVI
    28   0.7746975 0.4611835 -0.5566161 1.646738  4.5260250 1.519095  0.2501180
    28.1 0.7746975 0.4611835 -0.5566161 1.646738  4.5260250 1.519095  0.2501180
    30.1 0.4137596 0.2615745 -0.5367707 1.889310 -0.2040883 0.824643 -0.1526292

It seems that sampling without replacement is the default in sample(). Yet in the case above, the same row (28) appears twice in the starting centers. So this is still a black box for me. Even though the help page of mda() says that "the 'weights' argument need never be accessed", do you think it is possible to avoid this duplicated sampling?
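In case it helps, here is a minimal sketch that reproduces the same kind of duplicated start rows without mda at all. It rests on an assumption I have not confirmed in the mda source: that the duplicates come from my bootstrap resample (rows drawn with replacement before calling mda(), which the "28" / "28.1" row names would suggest) rather than from sample() itself. The toy data and the names `boot_idx`, `ux` and `k` are made up for illustration.

```r
set.seed(1)

# Toy stand-in for my data: 66 observations, 3 made-up variables
x <- as.data.frame(matrix(rnorm(66 * 3), nrow = 66))

# Bootstrap resample WITH replacement (assumed to mirror my own bootstrap step)
boot_idx <- sample(nrow(x), replace = TRUE)
xx <- x[boot_idx, ]   # repeated rows get row names such as "28" and "28.1"

k <- 3

# sample() without replacement returns distinct row indices ...
idx <- sample(1:nrow(xx), size = k)
anyDuplicated(idx)    # 0: no index is repeated

# ... but distinct indices can still point at identical rows of xx
any(duplicated(xx))   # TRUE here, because xx was drawn with replacement

# Possible workaround (a sketch only): draw the starting centers
# from the unique rows, so kmeans() never receives identical centers
ux <- unique(xx)
start <- ux[sample(nrow(ux), size = k), , drop = FALSE]
km <- kmeans(as.matrix(xx), as.matrix(start))
```

If this is indeed what happens, the sampling inside starter() would be behaving as documented, and the fix would have to be applied to the data passed in (or to the starting centers) rather than to sample().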
Thanks in advance for your ideas,

Amelie Vaniscotte
University of Franche-Comté
25000 Besançon