Karen R. Khar
2011-Jun-27 07:43 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
Hi,

I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a distance matrix I created from the data on my own and called it D10.dist. I loaded the cluster package. Then I tried the following command...

> agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,
        method = "average", par.method, keep.diss = n < 1000, keep.data = !diss)

And it responded...

Error in agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,  :
  x is not and cannot be converted to class dissimilarity

D10.dist has the following data...

D1   0
D2   0.608392  0
D3   0.497451  0.537662  0
D4   0.634548  0.393343  0.537426  0
D5   0.558785  0.543399  0.632221  0.726633  0
D6   0.659483  0.701778  0.741425  0.668624  0.655914  0
D7   0.603012  0.659173  0.571776  0.687599  0.383712  0.683948  0
D8   0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039  0
D9   0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598  0
D10  0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559  0

I would appreciate any suggestions. Please assume I know virtually nothing about R.

Thanks,
Karen

PS I'll eventually be using ~10,000 "species" to cluster. I'll need to have within and between cluster distance info and I'll want a plot colored by cluster. Is agnes the right R tool to use?

--
View this message in context: http://r.789695.n4.nabble.com/New-to-R-trying-to-use-agnes-but-can-t-load-my-ditance-matrix-tp3627154p3627154.html
Sent from the R help mailing list archive at Nabble.com.
Karen R. Khar
2011-Jun-27 12:57 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
I also tried...

     D1        D2        D3        D4        D5        D6        D7        D8        D9
D2   0.608392
D3   0.497451  0.537662
D4   0.634548  0.393343  0.537426
D5   0.558785  0.543399  0.632221  0.726633
D6   0.659483  0.701778  0.741425  0.668624  0.655914
D7   0.603012  0.659173  0.571776  0.687599  0.383712  0.683948
D8   0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039
D9   0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598
D10  0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559
Sarah Goslee
2011-Jun-27 13:48 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
On Mon, Jun 27, 2011 at 3:43 AM, Karen R. Khar <karen.khar at gmail.com> wrote:
> Hi,
>
> I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a
> distance matrix I created from the data on my own and called it D10.dist. I
> loaded the cluster package. Then tried the following command...
>
>> agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,
>> method = "average", par.method, keep.diss = n < 1000, keep.data = !diss)
>
> And it responded...
>
> Error in agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE, :
> x is not and cannot be converted to class dissimilarity

At a guess, you need to actually import D10.dist into R. It looks like you're trying to give agnes() a string that contains the path to the file, but agnes() just sees that as a string, and of course can't figure out what to do with it.

> D10.dist has the following data...
>
> D1   0
> D2   0.608392  0
> D3   0.497451  0.537662  0
> D4   0.634548  0.393343  0.537426  0
> D5   0.558785  0.543399  0.632221  0.726633  0
> D6   0.659483  0.701778  0.741425  0.668624  0.655914  0
> D7   0.603012  0.659173  0.571776  0.687599  0.383712  0.683948  0
> D8   0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039  0
> D9   0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598  0
> D10  0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559  0

If you can convince whatever software you used to write out a full symmetric matrix instead of a lower-triangular matrix, you can easily use read.table() to import it.

> I would appreciate any suggestions. Please assume I know virtually nothing
> about R.

Then, Karen, my first suggestion is that you read one of the many excellent intro to R guides available online.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
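For readers following along, the import step discussed above might look roughly like the sketch below. It is only one possible approach, not something posted in the thread: it reads the lower-triangular file line by line instead of using read.table(), fills a square matrix, and converts it with as.dist(). The temporary file and the four-row excerpt of the data are assumptions made so the example runs on its own; a real session would point `path` at the actual file.

```r
library(cluster)  # provides agnes()

# Assumption: a temporary file holding the first four rows of the posted data,
# so the sketch is self-contained. Replace `path` with the real file location.
path <- tempfile(fileext = ".dist")
writeLines(c(
  "D1 0",
  "D2 0.608392 0",
  "D3 0.497451 0.537662 0",
  "D4 0.634548 0.393343 0.537426 0"
), path)

# Split every line into a label followed by its numeric entries.
fields <- strsplit(trimws(readLines(path)), "[[:space:]]+")
labels <- vapply(fields, `[`, character(1), 1)
n <- length(fields)

# Fill the lower triangle (including the zero diagonal) row by row.
m <- matrix(0, n, n, dimnames = list(labels, labels))
for (i in seq_len(n)) {
  m[i, seq_len(i)] <- as.numeric(fields[[i]][-1])
}

# as.dist() reads only the lower triangle, so the upper half can stay zero.
d <- as.dist(m)

# Pass the dissimilarity object itself, not a file name, to agnes().
ag <- agnes(d, diss = TRUE, method = "average")
```

The key difference from the original command is that the first argument is now an R object of class "dist" rather than a character string.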
Bill.Venables at csiro.au
2011-Jun-27 21:50 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
The first problem is that you are using a character string as the first argument to agnes().

The help information for agnes says that its first argument, x, is

       x: data matrix or data frame, or dissimilarity matrix, depending
          on the value of the 'diss' argument.

Not a character string.

So first you have to read your data into R and hold it as a "data matrix or data frame". Then you have a choice. Either you can calculate your own distance matrix with it and then call agnes() with that as the first argument (and with diss = TRUE) or you can get agnes() to calculate the distance matrix for you, in which case you need to specify how, using the metric = argument.

With 10000 entities to cluster, your distance matrix will require

> 10000*9999/2
[1] 49995000

numbers to be stored at once. I hope you are using a 64-bit OS!

With such large numbers of entities to cluster, the usual advice is to try something more suited to the job. clara() is designed for this kind of problem.

It might be useful to keep in mind that R is not a package. (Repeat: R is NOT a package - I cannot stress that strongly enough.) It is a programming language. To use it effectively you really need to know something about how it works, first. It might pay you to spend a little time getting used to the protocols, how to do simple things in R like reading in data and manipulating it, before you tackle such a large and potentially tricky clustering problem.

Bill Venables.
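As a rough illustration of the two points above (the storage estimate and clara()), here is a small sketch. The 8 bytes per number is the size of an R double; the two-group synthetic data set, the choice of k = 2, and samples = 20 are assumptions made purely for the demonstration. Note that clara() works from the raw observations rather than a precomputed dissimilarity matrix, so it only applies when the original measurements are available.

```r
library(cluster)  # provides clara()

# Storage needed for the pairwise distances among 10000 entities.
n <- 10000
n_pairs <- n * (n - 1) / 2   # 49995000 distances
mib <- n_pairs * 8 / 1024^2  # doubles take 8 bytes each; roughly 380 MiB
# Working copies made during clustering can push actual memory use higher.

# Assumption: synthetic two-dimensional data with two well-separated groups,
# standing in for real measurements.
set.seed(1)
x <- rbind(matrix(rnorm(400, mean = 0), ncol = 2),
           matrix(rnorm(400, mean = 5), ncol = 2))

# clara() clusters subsamples of the rows, so it never builds the full
# n-by-n distance matrix.
cl <- clara(x, k = 2, samples = 20)

# One way to get a plot colored by cluster membership.
plot(x, col = cl$clustering, pch = 19)
```

The `clustering` component holds one integer label per row, which is what makes the colored plot straightforward.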