Pierre Antoine DuBoDeNa
2013-Mar-28  22:16 UTC
[R] hierarchical clustering with pearson's coefficient
Hello,

I want to use Pearson's correlation as the distance between observations and then use any centroid-based linkage distance (e.g. Ward's distance).

When linkage distances are formed through the Lance-Williams recursive formulation, they only require the initial distances between observations. See here: http://en.wikipedia.org/wiki/Ward%27s_method

It says there that you have to use Euclidean distance between the initial observations. However, I have found this:

http://research.stowers-institute.org/efg/R/Visualization/cor-cluster/

where they use Pearson's correlation for hierarchical clustering.

Any idea whether anything is violated if Pearson's correlation is used with Ward's linkage function?

The dissimilarity based on Pearson's correlation can be defined as d = sqrt(1 - pearsonsimilarity^2). Can that be considered a norm1 distance, and thus norm2 if we square it, so that Wikipedia's statement "To apply a recursive algorithm under this objective function, the initial distance between individual objects must be (proportional to) squared Euclidean distance." is valid?

Best,
Pierre
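To make the question concrete, here is a minimal R sketch (not part of the original post). It relies on one standard identity: if each observation is centered and scaled to unit length, the squared Euclidean distance between two observations equals 2 * (1 - r), where r is their Pearson correlation. So sqrt(2 * (1 - r)) is an ordinary Euclidean distance on the standardized data. The toy matrix x and all variable names are made up for illustration, and method = "ward.D2" assumes R >= 3.1.0. The sketch also computes the sqrt(1 - r^2) dissimilarity from the question for comparison, but it does not settle whether that one meets Ward's requirement.

```r
# Toy data (assumed for illustration): 6 observations in rows, 10 variables.
set.seed(1)
x <- matrix(rnorm(6 * 10), nrow = 6,
            dimnames = list(paste0("obs", 1:6), NULL))

# Pearson correlation between observations (rows), hence the transpose.
r <- cor(t(x))

# Center each row and scale it to unit length.
z <- t(apply(x, 1, function(v) {
  v <- v - mean(v)
  v / sqrt(sum(v^2))
}))

# Euclidean distance between the standardized rows ...
d_euc <- dist(z)

# ... and the same quantity computed directly from the correlations.
# pmax() guards against tiny negative values caused by rounding.
d_cor <- as.dist(sqrt(2 * pmax(1 - r, 0)))

# The two should agree up to floating-point error.
same <- isTRUE(all.equal(as.vector(d_euc), as.vector(d_cor)))

# Dissimilarity from the question; it treats r and -r alike,
# so it is a different quantity from d_cor.
d_q <- as.dist(sqrt(pmax(1 - r^2, 0)))

# "ward.D2" expects unsquared distances and squares them internally.
hc <- hclust(d_cor, method = "ward.D2")
```

Calling plot(hc) draws the dendrogram. Passing d_q instead of d_cor also runs without error, since hclust() accepts any dissimilarity object; whether the result still minimizes Ward's variance criterion is exactly what the post asks.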
Pierre Antoine DuBoDeNa
2013-Mar-30  02:58 UTC
[R] hierarchical clustering with pearson's coefficient
Anyone for that question?
I am not sure about your question, but I did find this:

http://research.med.helsinki.fi/corefacilities/proteinchem/hierarchical_clustering_basics.pdf

It seems to address all three topics, so perhaps the answer is in there?