Karen R. Khar
2011-Jun-27 07:43 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
Hi,
I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a distance matrix I created from the data on my own and called it D10.dist. I loaded the cluster package. Then tried the following command...
> agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,
> method = "average", par.method, keep.diss = n < 1000, keep.data = !diss)
And it responded...
Error in agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE, :
x is not and cannot be converted to class dissimilarity
D10.dist has the following data...
D1   0
D2   0.608392  0
D3   0.497451  0.537662  0
D4   0.634548  0.393343  0.537426  0
D5   0.558785  0.543399  0.632221  0.726633  0
D6   0.659483  0.701778  0.741425  0.668624  0.655914  0
D7   0.603012  0.659173  0.571776  0.687599  0.383712  0.683948  0
D8   0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039  0
D9   0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598  0
D10  0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559  0
I would appreciate any suggestions. Please assume I know virtually nothing about R.
Thanks,
Karen
PS I'll eventually be using ~10,000 "species" to cluster. I'll need to have within and between cluster distance info and I'll want a plot colored by cluster. Is agnes the right R tool to use?
--
View this message in context: http://r.789695.n4.nabble.com/New-to-R-trying-to-use-agnes-but-can-t-load-my-ditance-matrix-tp3627154p3627154.html
Sent from the R help mailing list archive at Nabble.com.
Karen R. Khar
2011-Jun-27 12:57 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
I also tried...
     D1        D2        D3        D4        D5        D6        D7        D8        D9
D2   0.608392
D3   0.497451  0.537662
D4   0.634548  0.393343  0.537426
D5   0.558785  0.543399  0.632221  0.726633
D6   0.659483  0.701778  0.741425  0.668624  0.655914
D7   0.603012  0.659173  0.571776  0.687599  0.383712  0.683948
D8   0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039
D9   0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598
D10  0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559
Sarah Goslee
2011-Jun-27 13:48 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
On Mon, Jun 27, 2011 at 3:43 AM, Karen R. Khar <karen.khar at gmail.com> wrote:
> Hi,
>
> I'm mighty new to R. I'm using it on Windows. I'm trying to cluster using a
> distance matrix I created from the data on my own and called it D10.dist. I
> loaded the cluster package. Then tried the following command...
>
>> agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE,
>> method = "average", par.method, keep.diss = n < 1000, keep.data = !diss)
>
> And it responded...
>
> Error in agnes("E:D10.dist", diss = TRUE, metric = "euclidean", stand = FALSE, :
> x is not and cannot be converted to class dissimilarity
>
At a guess, you need to actually import D10.dist into R. It looks like you're trying to give agnes() a string that contains the path to the file, but agnes() just sees that as a string, and of course can't figure out what to do with it.
> D10.dist has the following data...
>
> D1   0
> D2   0.608392  0
> D3   0.497451  0.537662  0
> D4   0.634548  0.393343  0.537426  0
> D5   0.558785  0.543399  0.632221  0.726633  0
> D6   0.659483  0.701778  0.741425  0.668624  0.655914  0
> D7   0.603012  0.659173  0.571776  0.687599  0.383712  0.683948  0
> D8   0.611919  0.665357  0.526453  0.715093  0.457496  0.698213  0.317039  0
> D9   0.41501   0.652117  0.552011  0.68969   0.485988  0.702738  0.42819   0.442598  0
> D10  0.376512  0.600607  0.517857  0.673515  0.530421  0.667736  0.537025  0.48062   0.240559  0
If you can convince whatever software you used to write out a full symmetric matrix instead of a lower-triangular matrix, you can easily use read.table() to import it.
> I would appreciate any suggestions. Please assume I know virtually nothing
> about R.
Then Karen, my first suggestion is that you read one of the many excellent intro to R guides available online.
Sarah
--
Sarah Goslee
http://www.functionaldiversity.org
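As a rough sketch of the import step Sarah describes: if the file cannot be rewritten as a full symmetric matrix, a labelled lower-triangular file like the one quoted above can still be read line by line and turned into a "dist" object. The inline text below stands in for the first four rows of D10.dist, and the variable names (txt, fields, m, d) are made up for illustration. With a real file, the path would be passed to readLines() instead of a text connection.

```r
# Stand-in for the first rows of the file (label, then the lower-triangular values)
txt <- "D1 0
D2 0.608392 0
D3 0.497451 0.537662 0
D4 0.634548 0.393343 0.537426 0"

con   <- textConnection(txt)
lines <- readLines(con)
close(con)

fields <- strsplit(trimws(lines), "[[:space:]]+")  # split each row on whitespace
labels <- vapply(fields, `[`, character(1), 1)     # first field is the row label
n      <- length(fields)

# Fill the lower triangle of an n x n matrix row by row
m <- matrix(0, n, n, dimnames = list(labels, labels))
for (i in seq_len(n)) {
  vals <- as.numeric(fields[[i]][-1])              # row i holds i values, ending in the diagonal 0
  m[i, seq_along(vals)] <- vals
}

d <- as.dist(m)  # as.dist() reads only the lower triangle, so the empty upper half is ignored
print(d)
```

The resulting object d, not the file name, is what would then be handed to agnes() with diss = TRUE.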
Bill.Venables at csiro.au
2011-Jun-27 21:50 UTC
[R] New to R, trying to use agnes, but can't load my ditance matrix
The first problem is that you are using a character string as the first argument
to agnes().
The help information for agnes says that its first argument, x, is
x: data matrix or data frame, or dissimilarity matrix, depending
on the value of the 'diss' argument.
Not a character string. So first you have to read your data into R and hold it
as a "data matrix or data frame". Then you have a choice. Either you
can calculate your own distance matrix with it and then call agnes() with that
as the first argument (and with diss = TRUE) or you can get agnes() to calculate
the distance matrix for you, in which case you need to specify how, using the
metric = argument.
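The two routes described above can be sketched as follows. This is only an illustration on made-up data (the matrix x and its D1..D10 labels are invented here, not Karen's species data), and it assumes the cluster package is installed.

```r
library(cluster)

set.seed(1)
# Made-up data: 10 objects measured on 2 variables
x <- matrix(rnorm(20), nrow = 10, dimnames = list(paste0("D", 1:10), NULL))

# Route 1: compute the distances yourself, then tell agnes() they are dissimilarities
d   <- dist(x, method = "euclidean")
ag1 <- agnes(d, diss = TRUE, method = "average")

# Route 2: pass the raw data and let agnes() compute the distances via metric =
ag2 <- agnes(x, diss = FALSE, metric = "euclidean", stand = FALSE, method = "average")

# The merge heights of the two trees should agree up to floating-point error
print(all.equal(sort(ag1$height), sort(ag2$height)))

# Cut the tree into 3 groups (3 is an arbitrary choice for this sketch)
groups <- cutree(as.hclust(ag1), k = 3)
print(groups)
```

Note that metric = and stand = only matter on the second route. When diss = TRUE the distances are taken as given.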
With 10000 entities to cluster, your distance matrix will require
> 10000*9999/2
[1] 49995000
numbers to be stored at once. I hope you are using a 64-bit OS!
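To put that count in terms of memory, here is a back-of-the-envelope sketch. It assumes each distance is stored as an 8-byte double, which is how R holds numeric vectors, and it counts a single copy only. Intermediate copies made during clustering would add to this.

```r
n       <- 10000
n_pairs <- n * (n - 1) / 2   # number of pairwise distances
bytes   <- n_pairs * 8       # 8 bytes per double-precision number
mib     <- bytes / 2^20      # convert bytes to MiB

print(n_pairs)
print(round(mib, 1))
```

That works out to a few hundred megabytes for one copy of the lower triangle. A full 10000 x 10000 matrix would need about twice as much.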
With such large numbers of entities to cluster, the usual advice is to try
something more suited to the job. clara() is designed for this kind of problem.
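A minimal clara() sketch on made-up data follows. Two caveats, both worth checking against the cluster documentation: clara() expects the raw data matrix rather than a precomputed dissimilarity matrix, and it partitions the data around k medoids instead of building a tree the way agnes() does. The data, k = 2 and samples = 20 are arbitrary choices for illustration.

```r
library(cluster)

set.seed(42)
# Made-up data: 2000 points in two well-separated groups
x <- rbind(matrix(rnorm(2000, mean = 0), ncol = 2),
           matrix(rnorm(2000, mean = 5), ncol = 2))

# clara() clusters repeated subsamples, so it never builds the full distance matrix
cl <- clara(x, k = 2, samples = 20)

print(table(cl$clustering))  # cluster sizes
print(cl$medoids)            # one representative point per cluster
```

The clustering vector can be used directly for a plot colored by cluster, for example plot(x, col = cl$clustering).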
It might be useful to keep in mind that R is not a package. (Repeat: R is NOT a
package - I cannot stress that strongly enough.) It is a programming language.
To use it effectively you really need to know something about how it works,
first. It might pay you to spend a little time getting used to the protocols,
how to do simple things in R like reading in data and manipulating it, before
you tackle such a large and potentially tricky clustering problem.
Bill Venables.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.