Dear All,

I have spatial data (presence/absence for 4000 squares) on 250 bird species and would like to use a model-based clustering technique to test for species associations. Is there any way of passing a distance/correlation matrix to mclust as with hclust, rather than the actual data? Or alternatively, is there a way of getting mclust to handle binary data?

I'd appreciate any suggestions!

Cheers,

Jarrod
On Thu, 18 Dec 2003, Jarrod Hadfield wrote:

> Dear All,
>
> I have spatial data (presence/absence for 4000 squares) on 250 bird
> species and would like to use a model-based clustering technique to
> test for species associations. Is there any way of passing a
> distance/correlation matrix to mclust as with hclust, rather than the
> actual data? Or alternatively, is there a way of getting mclust to
> handle binary data?
>
> I'd appreciate any suggestions!

Why not simply use dist() and hclust()? Starting with presence/absence data, what could mclust() possibly do that is different from hclust()?

> Cheers,
>
> Jarrod

- tom blackwell - u michigan medical school - ann arbor -
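For reference, the dist()/hclust() route suggested in this reply might look like the sketch below. It is only an illustration under stated assumptions: the matrix `pa` is simulated (rows are squares, columns are species, and it is much smaller than the 4000 x 250 data in the question), and the choices of the "binary" distance, average linkage, and k = 4 groups are arbitrary rather than anything recommended in the thread.

```r
# Simulated stand-in for the real data (an assumption, not the poster's data):
# rows = squares, columns = species, entries = 1 (present) or 0 (absent).
set.seed(1)
n_squares <- 200
n_species <- 20
pa <- matrix(rbinom(n_squares * n_species, size = 1, prob = 0.3),
             nrow = n_squares,
             dimnames = list(NULL, paste0("sp", seq_len(n_species))))

# dist() computes distances between rows, so transpose to compare species.
# method = "binary": among squares where at least one of the two species
# occurs, the proportion where only one of them occurs.
d <- dist(t(pa), method = "binary")

# Hierarchical clustering of species; the linkage method is an arbitrary choice.
hc <- hclust(d, method = "average")

# Cut the tree into k groups of species (k = 4 is arbitrary).
groups <- cutree(hc, k = 4)
table(groups)
```

Calling plot(hc) draws the dendrogram. Note that this gives a grouping of species but no likelihood or model to compare against alternatives.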
Thomas W Blackwell wrote:

> [...]
> Why not simply use dist() and hclust() ? Starting with
> presence/absence data, what could mclust() possibly do that
> is different from hclust() ?

Um, fit a statistical model.

> - tom blackwell - u michigan medical school - ann arbor -
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help

--
Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz Fax 7 838 4155
Phone +64 7 838 4773 wk +64 7 849 6486 home Mobile 021 1395 862
You could just convert your binary spatial data to numeric 0/1 or -1/1 and give it to Mclust. That would violate the assumptions of the Gaussian model in Mclust, so you should be very careful about interpreting the results. However, if the results are at least as "interesting" as those you get from a non-model-based hierarchical clustering run, then that could be an indication that the approach has merit, and then you could investigate how to build a model-based clustering algorithm that is appropriate for your data.

(I don't think it would be that hard to write down some equations giving the probability of each presence matrix being generated for each component of the mixture model, but I don't know how hard it would be to implement the EM search for an appropriate mixture model.)

-- Tony Plate

At Thursday 08:55 PM 12/18/2003 +0000, Jarrod Hadfield wrote:

>Dear All,
>
>I have spatial data (presence/absence for 4000 squares) on 250 bird
>species and would like to use a model-based clustering technique to test
>for species associations. Is there any way of passing a
>distance/correlation matrix to mclust as with hclust, rather than the
>actual data? Or alternatively, is there a way of getting mclust to handle
>binary data?
>
>I'd appreciate any suggestions!
>
>Cheers,
>
>Jarrod
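As a rough illustration of the kind of mixture model this reply has in mind, here is a minimal base-R sketch of EM for a mixture of independent Bernoulli variables. It is not part of mclust and not anyone's method from this thread: the function name `bernoulli_mixture_em`, its arguments, the small smoothing constant `eps`, and the simulated data are all assumptions made for the example. Rows of `x` are the items being clustered (for species associations, one row per species and one column per square).

```r
# EM for a G-component mixture of independent Bernoullis (hypothetical helper).
bernoulli_mixture_em <- function(x, G, n_iter = 200, tol = 1e-8, eps = 1e-6) {
  n <- nrow(x)
  # Random soft start: z[i, g] = responsibility of component g for row i.
  z <- matrix(runif(n * G), n, G)
  z <- z / rowSums(z)
  loglik_old <- -Inf
  for (iter in seq_len(n_iter)) {
    # M-step: mixing proportions and per-component presence probabilities.
    # eps smooths theta slightly so it stays strictly inside (0, 1).
    pi_g  <- colMeans(z)
    theta <- (t(z) %*% x + eps) / (colSums(z) + 2 * eps)   # G x p

    # E-step: log p(x_i | g) = sum_j [x_ij log theta_gj + (1 - x_ij) log(1 - theta_gj)]
    log_px    <- x %*% t(log(theta)) + (1 - x) %*% t(log(1 - theta))  # n x G
    log_joint <- sweep(log_px, 2, log(pi_g), "+")
    m         <- apply(log_joint, 1, max)                  # for a stable log-sum-exp
    log_norm  <- m + log(rowSums(exp(log_joint - m)))
    z         <- exp(log_joint - log_norm)

    # Stop when the observed-data log-likelihood has stopped changing.
    loglik <- sum(log_norm)
    if (abs(loglik - loglik_old) < tol * abs(loglik)) break
    loglik_old <- loglik
  }
  list(pi = pi_g, theta = theta, z = z, loglik = loglik,
       classification = max.col(z, ties.method = "first"))
}

# Simulated example (an assumption): two groups with opposite presence patterns.
set.seed(42)
n_per <- 30
p <- 50
theta_true <- rbind(rep(c(0.8, 0.1), each = p / 2),
                    rep(c(0.1, 0.8), each = p / 2))
truth <- rep(1:2, each = n_per)
x <- matrix(rbinom(2 * n_per * p, size = 1, prob = theta_true[truth, ]),
            nrow = 2 * n_per)

fit <- bernoulli_mixture_em(x, G = 2)
table(truth, fit$classification)
```

EM can stop at a local optimum, so in practice one would try several random starts and compare `loglik`, and compare different values of G with a criterion such as BIC. With as many columns as the real data has, the number of parameters per component is large, so this sketch would need some form of constraint or dimension reduction before it is used in earnest.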