thr3ads.net - R help - [R] calculating dissimilarities in R [Sep 2006]


virgin

2006-Sep-26 05:48 UTC

[R] calculating dissimilarities in R

Dear All,
I've got a statistical question on calculating
dissimilarities in R.
I want to calculate the different types of dissimilarities
on the "flower" dataset found in the package
"cluster". Flower is a data frame with 18 observations
on 8 variables. Variable 1 and 2 are binary, variable 3 is
asymmetric binary, variable 4 is nominal, variable 5 and 6
are ordered and variable 7 and 8 are interval scaled.

Commands to load the dataset in R.
library(cluster)
data(flower)
flower


What are the different types of dissimilarities that can be
calculated on such a dataset?  
Do I need to group the types of variables first i.e. all
binary together then run the calculation?  Do I use
dissimilarity indices such as Jaccard or should it be
classification function such as "daisy" which should be
used? 

Many thanks,

Elvina Payet (MSc)
University of La Reunion

Martin Maechler

2006-Sep-26 07:55 UTC


[R] calculating dissimilarities in R

Hi Elvina,
>>>>> "Elvina" == Elvina Payet <virgin at seychelles.sc>
>>>>>     on Tue, 26 Sep 2006 05:48:01 GMT writes:
    Elvina> Dear All,
    Elvina> I've got a statistical question on calculating
    Elvina> dissimilarities in R.
    Elvina> I want to calculate the different types of dissimilarities
    Elvina> on the "flower" dataset found in the package
    Elvina> "cluster". Flower is a data frame with 18 observations
    Elvina> on 8 variables. Variable 1 and 2 are binary, variable 3 is
    Elvina> asymmetric binary, variable 4 is nominal, variable 5 and 6
    Elvina> are ordered and variable 7 and 8 are interval scaled.

    Elvina> Commands to load the dataset in R.

      > library(cluster)
      > data(flower)

or  data(flower, package = "cluster")
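
To check the variable types described above, something like the
following sketch could be used. It assumes the cluster package is
installed and that the columns carry their default names V1 to V8.

```r
# Load the data set without attaching the whole package.
data(flower, package = "cluster")

# Number of observations and variables.
dim(flower)

# First class of each column: factor, ordered, or numeric.
sapply(flower, function(col) class(col)[1])
```

Note that the data frame itself does not record which binary
variables are meant to be asymmetric; that has to be declared when
the dissimilarities are computed.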


    Elvina> What are the different types of dissimilarities that can be
    Elvina> calculated on such a dataset?  
    Elvina> Do I need to group the types of variables first i.e. all
    Elvina> binary together then run the calculation?  Do I use
    Elvina> dissimilarity indices such as Jaccard or should it be
    Elvina> classification function such as "daisy" which should be
    Elvina> used? 


Yes, you should use  daisy() to calculate dissimilarities,
particularly when you are interested in the difference between
symmetric and asymmetric binary.

Do read  help(daisy)  and look at its examples.
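
As a rough illustration, adapted from the examples in help(daisy), a
call on the flower data might look like the sketch below. The object
names d_default and d_asym are arbitrary, and treating only variable 3
as asymmetric binary follows the description in the question.

```r
library(cluster)
data(flower)

# Default call: because the columns are of mixed type, daisy() switches
# to Gower's coefficient; two-level factors are handled as nominal,
# which amounts to symmetric binary.
d_default <- daisy(flower)

# Declare variable 3 as asymmetric binary via the 'type' argument.
d_asym <- daisy(flower, type = list(asymm = 3))

summary(d_asym)

# Put the first few pairwise dissimilarities side by side.
head(cbind(default = as.vector(d_default), asymm = as.vector(d_asym)))
```

There is no need to group the variables by type beforehand; all eight
columns are passed in one call and the 'type' list only overrides the
defaults for the columns named in it.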

Maybe this will answer all your questions; if not, it will help
you to ask a much more specific question, as suggested by the
posting guide (see link below!)

      [.........]

    virgin> ______________________________________________

      [.........]
    virgin> PLEASE do read the posting guide

    virgin> http://www.R-project.org/posting-guide.html 
	    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    virgin> and provide commented, minimal, self-contained, reproducible code.

Regards,
Martin Maechler, ETH Zurich
