I have a training set for known classes, with 5 observations of 12 variables for each class. I want to use this information to group new data into classes that are known to differ from those in the training set; each new class may contain one or more observations. The distribution of within-class distances is expected to be similar for all classes, and this is the case for the training data.

I have tried using the maximum within-class distance from the training data as the h argument of cutree for the clustered new data. This appears to work well for the "average" and "complete" clustering methods but not for Ward's method, because the distance (height) axis of the dendrogram does not correspond directly to the distances between observations.

Can anyone advise on how to choose the h value for cutree when using Ward's method, or is there a better approach to this type of classification problem?

Thanks
Mike White
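
For reference, a minimal sketch of the approach described above, using simulated data in place of the real measurements. The class counts, the object names (train, new_data, max_within) and the use of Euclidean distance are all assumptions made for illustration; only the 5 observations by 12 variables layout comes from the description.

```r
set.seed(1)

# Hypothetical training data: 4 known classes, 5 observations x 12 variables each
n_class <- 4; n_obs <- 5; n_var <- 12
centres <- matrix(rnorm(n_class * n_var, sd = 10), n_class, n_var)
train <- centres[rep(seq_len(n_class), each = n_obs), ] +
  matrix(rnorm(n_class * n_obs * n_var), ncol = n_var)
train_class <- rep(seq_len(n_class), each = n_obs)

# Maximum within-class distance over all training classes (Euclidean assumed)
max_within <- max(sapply(split(as.data.frame(train), train_class),
                         function(x) max(dist(x))))

# Hypothetical new data: 3 unseen classes with 1, 3 and 2 observations
new_sizes <- c(1, 3, 2)
new_centres <- matrix(rnorm(length(new_sizes) * n_var, sd = 10), ncol = n_var)
new_data <- new_centres[rep(seq_along(new_sizes), new_sizes), ] +
  matrix(rnorm(sum(new_sizes) * n_var), ncol = n_var)

# Cluster the new data and cut the tree at the training-derived height
hc <- hclust(dist(new_data), method = "complete")
groups <- cutree(hc, h = max_within)
```

With method = "complete" the merge height is the largest pairwise distance inside the merged cluster, so cutting at max_within keeps every cluster's diameter at or below that value. The Ward methods use a different height scale, which is where this choice of h stops being directly comparable.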