Hi,

I have a regression model where the explanatory variables are factors, and I want to include interaction terms, but some combinations of levels occur very infrequently in the data. Hence, I'm using hclust and cutree to hierarchically cluster the levels and obtain new, combined levels to regress on.

Ideally, I would like to cut the tree so that every cluster contains at least k observations. That is, cut at an appropriate height for each branch, combining nodes only when they have fewer than k observations.

AFAIK, cutree cuts at a uniform height, and there's no easy way of extracting the number of observations per cluster from hclust (except by assigning the new levels to the data and then counting the occurrences).

Does anyone know of code that does this already?

Thanks,
Gad

--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham