thr3ads.net - R help - [R] 50993 point distance matrix, too big to as.matrix, looking for another way to calculate point-level summary [Jun 2009]


leif olson

2009-Jun-26 19:40 UTC

[R] 50993 point distance matrix, too big to as.matrix, looking for another way to calculate point-level summary

Hello, I'm working on a 50993 point count bird abundance dataset. I've
succeeded in calculating a distance matrix for this entire set, but I don't
have sufficient memory to convert this to a matrix, as below...
abun.dist <- dist(abun.mat[1:50993, 1:235])
test <- rowMeans(as.matrix(abun.dist))
Error in matrix(0, size, size) : too many elements specified

I've been able to run an hclust() clustering procedure, because hclust()
calls Fortran code, but I'd like to be able to generate a Calinski index
for each of the clusters to assess their validity.
Unfortunately, the validation routines I have found are all native R
code and usually call as.matrix, resulting in the same error I receive
above.
What I'd like to figure out is how to go through, one point at a time,
and calculate the values I need. But I've been unable to come up with code
to call the correct positions in the dist vector. Can anyone suggest some
code that might do this? Thanks...

...leif
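
For readers looking for the "correct positions in the dist vector": the help page for dist() documents that, for a dist object over n points, the distance between points i and j with i < j is stored at position n*(i-1) - i*(i-1)/2 + j-i. The following is a minimal sketch built on that formula, not code from this thread; the helper names dist_index and dist_to_point are made up for illustration.

```r
# Hypothetical helpers (names are assumptions, not part of any package).

# Position in a dist vector of the distance between points i and j (i != j),
# for a dist object of size n. Uses the formula from ?dist.
dist_index <- function(i, j, n) {
  stopifnot(all(i != j), all(i >= 1), all(j >= 1), all(i <= n), all(j <= n))
  n  <- as.numeric(n)            # doubles avoid integer overflow for large n
  lo <- as.numeric(pmin(i, j))   # smaller index of each pair
  hi <- as.numeric(pmax(i, j))   # larger index of each pair
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

# All distances from point i to every point (0 for the point itself),
# extracted without ever building the full n x n matrix.
dist_to_point <- function(d, i) {
  n <- attr(d, "Size")
  others <- setdiff(seq_len(n), i)
  out <- numeric(n)
  out[others] <- d[dist_index(i, others, n)]
  out
}
```

Under these assumptions, mean(dist_to_point(abun.dist, 1)) should give the same value as the first element of rowMeans(as.matrix(abun.dist)), while only ever holding one row of distances in memory.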

-- 
First they ignore you, then they laugh at you, then they fight you, then you
win
- Mohandas Gandhi


Romain Francois

2009-Jun-27 07:32 UTC


[R] 50993 point distance matrix, too big to as.matrix, looking for another way to calculate point-level summary

Hi,

If you are only interested in row means, you can work on the distance
matrix at the C level.

You might like to adapt this post:
http://tolstoy.newcastle.edu.au/R/e6/devel/09/04/1378.html

Romain
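
As a rough illustration of the row-means idea, here is a plain R sketch rather than the C-level approach suggested above. It assumes the dist object stores the lower triangle column by column, as documented in ?dist, and walks through it one column at a time so that only length-n vectors are allocated. The function name dist_row_means is hypothetical.

```r
# Hypothetical helper: row means of the full distance matrix, computed
# directly from the dist vector without calling as.matrix().
dist_row_means <- function(d) {
  n <- attr(d, "Size")
  sums <- numeric(n)
  start <- 1                       # position in d of the first entry of column j
  for (j in seq_len(n - 1)) {
    len <- n - j                   # column j holds distances to points j+1..n
    block <- d[seq.int(start, length.out = len)]
    sums[j] <- sums[j] + sum(block)        # contributions to row j
    rows <- (j + 1):n
    sums[rows] <- sums[rows] + block       # symmetric contributions to rows j+1..n
    start <- start + len
  }
  sums / n                         # diagonal zeros count, as in rowMeans(as.matrix(d))
}
```

The loop still touches every stored distance once, so it is slow in pure R for tens of thousands of points, which is the motivation for moving the same loop to C.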

On 06/26/2009 09:40 PM, leif olson wrote:
> Hello, Im working on a 50933 point count bird abundance dataset. I've
> succeeded in calculating a distance matrix for this entire set, but I don't
> have sufficient memory to convert this to a matrix, as below...
> abun.dist<- dist(abun.mat[1:50993,1:235)
> test<- rowMeans(as.matrix(abun.dist))
> Error in matrix(0, size, size) : too many elements specified
>
> ive been able to run a hclust() clustering procedure, due to the fact that
> hclust() makes a call to fortran code, but id like to be able to generate a
> calinski index for each of the clusters to assess the validity.
> Unfortunately, all the validation routines I have found are all native R
> code, and usually call as.matrix, resulting in the same error i receive
> above.
> What I'd like to figure out is how to just go through, one point at a time,
> and calculate the values i need. But I've been unable to come up with code
> to call the correct positions in the dist vector, can anyone suggest some
> code that might do this? Thanks...
>
> ...leif

-- 
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
