Hello there. Is there any function in R that can cluster a data set that has both categorical and numerical variables? Thanks.

siangli
You could have a look at library(analogue), function ?distance, and library(cluster), function ?agnes.

B.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

View this message in context: http://www.nabble.com/Cluster-on-both-categorical-and-numerical-data-tp17979370p17980106.html
Hi Chua Siang,

I think the mclust package is what you need.

Regards,

Peng Jiang
Ph.D. Candidate
Antai College of Economics & Management
Department of Mathematics
Shanghai Jiaotong University (Minhang Campus)
800 Dongchuan Road
200240 Shanghai
P. R. China
Okay, when you cluster information, you can have two kinds of input: raw data, which the algorithm converts into a matrix and then processes, or a pre-processed matrix, which you create yourself and pass to the package.

Essentially, packages have a default assumption about the data or the type of matrix you are using. These matrices are often described, in simple terms, as either a similarity or a dissimilarity matrix. Think of a correlation matrix as an example of a matrix that represents similarity.

I think you will need to create a dissimilarity matrix. Think of something like a correlation matrix, which holds pairwise similarities, and then take the opposite of it (technically not correct, but you get the idea, I hope).

I use Clustan Graphics for all my clustering needs, and Gower's coefficient is the input I use when I have mixed variables. If you pre-process (create a dissimilarity matrix) using Gower's coefficient and then tell the package that the input is a dissimilarity matrix, everything should work fine. Once you get this sorted, it should all be straightforward.

PD

----- Original Message -----
From: "Chua Siang Li" <siang.li.chua at acceval-intl.com>
To: <r-help at r-project.org>
Sent: Wednesday, June 18, 2008 7:46 PM
Subject: [R] Cluster on both categorical and numerical data
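For readers who want to try the Gower route described above in R itself, here is a minimal sketch. It assumes the cluster package (daisy() for the Gower dissimilarity, agnes() for the hierarchical clustering); the data frame dat and its columns are made up purely for illustration, and the choice of average linkage and two groups is arbitrary rather than something suggested in the thread.

```r
# Sketch only: Gower dissimilarity on mixed data, then hierarchical clustering.
library(cluster)  # recommended package, normally installed with R

# Hypothetical toy data: two numeric columns and one categorical column.
set.seed(1)
dat <- data.frame(
  income = c(rnorm(5, mean = 30, sd = 2), rnorm(5, mean = 60, sd = 2)),
  age    = c(rnorm(5, mean = 25, sd = 2), rnorm(5, mean = 50, sd = 2)),
  region = factor(rep(c("north", "south"), each = 5))
)

# daisy() with metric = "gower" combines numeric and factor columns
# into a single dissimilarity object.
d <- daisy(dat, metric = "gower")

# agnes() recognises the dissimilarity object and clusters on it directly.
fit <- agnes(d, method = "average")

# Cut the resulting tree into two groups (k = 2 is an arbitrary choice here).
groups <- cutree(as.hclust(fit), k = 2)
print(table(groups, dat$region))
```

The same dissimilarity object d could be passed to other functions that accept a precomputed dissimilarity, such as pam() in the same package, if a partitioning method is preferred over a hierarchy.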