Dear All,

I have some data in the form of an N*M matrix. All I want is to classify it into, say, 3 classes. Which method(s) do you think would be appropriate for this purpose? Any references would be welcome. Thanks!

Best,
Baoqiang Cao

If you want to classify rows or columns, read:

    ?hclust
    ?kmeans
    library(cluster)
    ?pam

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Thanks! I tried kmeans, but the results were not very encouraging. Anyway, thanks, Jacques! Please let me know if you have any other thoughts.

Best regards,
Baoqiang Cao

======= At 2006-03-29, 00:08:44 you wrote: =======
> if you want to classify rows or columns, read: [...]

Baoqiang Cao
caobg at email.uc.edu
2006-03-29

Try this (suppose mat is your matrix):

    library(cluster)  # provides daisy() and pam()

    # Hierarchical clustering on Manhattan distances between rows
    hc <- hclust(dist(mat, "manhattan"), "ward.D")  # "ward" is called "ward.D" since R 3.1.0
    plot(hc, hang = -1)
    (x <- identify(hc))  # click on the dendrogram to pick clusters; right-click to stop
    cutree(hc, 3)        # cluster membership when the tree is cut into 3 groups

    # k-means with 3 centers
    km <- kmeans(mat, 3)
    km$cluster
    km$centers

    # Partitioning around medoids on a Manhattan dissimilarity matrix
    pam(daisy(mat, metric = "manhattan"), k = 3, diss = TRUE)$clustering
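
For readers who want to try the three calls above without their own data, here is a minimal sketch on simulated data. The matrix `mat` below is a made-up stand-in (three well-separated groups of 20 rows), not the original poster's data, and `nstart = 25` is an added assumption to make kmeans less sensitive to its random starting centers.

```r
library(cluster)  # pam()

# Hypothetical stand-in for the N*M matrix: 3 groups of 20 rows, 4 columns.
set.seed(1)
mat <- rbind(
  matrix(rnorm(80, mean = 0),  ncol = 4),
  matrix(rnorm(80, mean = 5),  ncol = 4),
  matrix(rnorm(80, mean = 10), ncol = 4)
)

# Manhattan distances between rows, shared by hclust() and pam().
d <- dist(mat, method = "manhattan")

hc_groups  <- cutree(hclust(d, method = "ward.D"), k = 3)
km_groups  <- kmeans(mat, centers = 3, nstart = 25)$cluster
pam_groups <- pam(d, k = 3, diss = TRUE)$clustering

# Cluster numbers are arbitrary labels, so compare the partitions by
# cross-tabulation: agreement shows up as one non-zero count per row.
table(hclust = hc_groups, kmeans = km_groups)
table(hclust = hc_groups, pam = pam_groups)
```

If the three methods disagree strongly on real data, that may be a sign that the data do not contain three clear groups, or that the columns need scaling before distances are computed.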

We have to be careful here. Classification (which is the terminology that the original poster used) is NOT the same as clustering, although the two are often confused. If the original poster wants to do clustering and examine the results for the presence of three clusters, that is fine, and there are many clustering methods that could be used. However, classification will require a different set of tools. If the clustering tools already pointed out are not doing what is needed, then perhaps a further explanation of the problem would help clarify whether Cao is actually interested in clustering or in classification.

Sean

On 3/29/06 1:46 AM, "Jacques VESLOT" <jacques.veslot at cirad.fr> wrote:
> try this (suppose mat is your matrix): [...]
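
To make the distinction concrete, here is a small sketch contrasting the two tasks on R's built-in `iris` data. The choice of data set, of `lda()` from the recommended MASS package as the classifier, and of a 100/50 train/test split are all assumptions for illustration; nothing in the thread says which tools would suit the original poster's data.

```r
library(MASS)  # lda(); MASS is a recommended package normally installed with R

set.seed(42)
train_rows <- sample(nrow(iris), 100)
train <- iris[train_rows, ]
test  <- iris[-train_rows, ]

# Clustering (unsupervised): the species labels are never shown to the
# algorithm; it only groups rows that lie close together.
km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)
table(cluster = km$cluster, species = iris$Species)

# Classification (supervised): known labels are used to fit a rule,
# which is then applied to rows the model has not seen.
fit  <- lda(Species ~ ., data = train)
pred <- predict(fit, newdata = test)$class
mean(pred == test$Species)  # proportion of held-out rows classified correctly
```

The practical question is whether class labels are already known for some rows. If they are, a classifier can be trained on them; if not, only clustering applies.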

>On Wed, 29 Mar 2006, Sean Davis wrote:
>
>> We have to be careful here. Classification (which is the terminology that
>> the original poster used) is NOT the same as clustering, although the two
>> are often confused.
>
>Well, in one of its two English senses it is the same. From a recent talk
>of mine (GfKL30), quoting the Concise Oxford Dictionary:
>
>\emph{Classification} has two senses:
>
>\begin{itemize}
>\item `to arrange in classes or categories'
>\item `assign (a thing) to a class or category'
>\end{itemize}
>
>There is a community (q.v. the International Federation of Classification
>Societies and Journal of Classification as well as the entry in the
>original Encyclopedia of Statistical Sciences) that means (almost)
>entirely the first sense.
>
>To add to this, the similar words to classification in e.g. French or
>German have (I am told) different shades of meaning.
>
>> If the original poster wants to do clustering and
>> examine the results for the presence of three clusters, that is fine and
>> there are many methods for clustering that could be used. However,
>> classification will require a different set of tools. If the clustering
>> tools already pointed out are not doing what is needed (that is, that Cao
>> actually is interested in clustering and not classification), then perhaps a
>> further explanation of what the problem is would help clarify.
>
>Yes, further explanation would help.

My intention is to arrange all the samples in classes. As a non-native English speaker, I should have checked the word before using it to express myself. The quotation makes perfect sense to me. Much appreciated! Thank you Jacques and Martin, your comments and suggestions are well received!

Best,
Baoqiang Cao

>--
>Brian D. Ripley, ripley at stats.ox.ac.uk
>Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>University of Oxford, Tel: +44 1865 272861 (self)
>1 South Parks Road, +44 1865 272866 (PA)
>Oxford OX1 3TG, UK Fax: +44 1865 272595

In addition to Brian's comment, Gordon's book, already in its 2nd edition, is all about clustering, but the title is simply `Classification'.

Andy

From: Sean Davis
> We have to be careful here. Classification (which is the
> terminology that the original poster used) is NOT the same as
> clustering, although the two are often confused. [...]