Displaying 3 results from an estimated 3 matches for "4000x3".
2004 May 24
1
non-hierarchical non-exclusive clustering of large data sets
Hi,
I'm trying to use R to cluster words with related meanings. Does anyone
know of a non-hierarchical clustering method in R that produces
non-exclusive clusters? By non-exclusive, I mean that words should be
allowed to be part of multiple clusters. So my data matrix would look
something like:
          T1  T2  T3
CLOWN_N    0   1   0
BANK_N     3   0   2
RIVER_N    0   0   2
FLOW_V     0   0   3
MONEY_N    2   0   0
PAY_V      2   0   0
The
2004 May 24
0
AW: non-hierarchical non-exclusive clustering of large data sets
...) outputs probabilities
> of membership in each cluster.
>
> > the one above, its dimensions would be in the order of (100000,
> > 100000). Does anyone know if this would cause practical problems,
> > perhaps very slow clustering?
>
> I had a much smaller matrix, 4000x3; fanny took about 4
> minutes of wall-clock time on a lightly loaded 1.4 GHz Athlon
> (there were many other processes, but none computational).
> It was completely CPU-bound.
>
> --
> bhaskar
>
> ______________________________________________
> R-help at stat.math.ethz...
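
For a sense of scale in the exchange above, here is a rough back-of-the-envelope sketch in Python (not taken from the thread). It assumes dense storage of 8-byte doubles, which is how R holds numeric matrices, and it assumes the clustering routine works from all n*(n-1)/2 pairwise dissimilarities, as dissimilarity-based methods typically do. The function names are hypothetical and chosen only for illustration.

```python
# Rough memory estimate. Assumption: every value is an 8-byte double
# stored densely, with no sparse representation.
BYTES_PER_DOUBLE = 8


def dense_matrix_bytes(rows, cols):
    """Bytes needed for a dense rows x cols matrix of doubles."""
    return rows * cols * BYTES_PER_DOUBLE


def pairwise_dissimilarity_bytes(n):
    """Bytes for the n*(n-1)/2 pairwise dissimilarities among n observations.

    Assumption: the clustering method keeps all of them in memory.
    """
    return n * (n - 1) // 2 * BYTES_PER_DOUBLE


if __name__ == "__main__":
    cases = [
        ("4000 x 3 data matrix", dense_matrix_bytes(4000, 3)),
        ("dissimilarities, n = 4000", pairwise_dissimilarity_bytes(4000)),
        ("100000 x 100000 data matrix", dense_matrix_bytes(100000, 100000)),
        ("dissimilarities, n = 100000", pairwise_dissimilarity_bytes(100000)),
    ]
    for label, nbytes in cases:
        print(f"{label}: {nbytes / 1e6:,.1f} MB")
```

Under these assumptions a dense 100000 x 100000 matrix alone needs 80 GB, so the question about practical problems is as much about memory as about running time.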
1998 Nov 30
1
[R] R functionality
...but practically infeasible in S, unless
> they are coded in C or Fortran and called from S. These problems seem to run
> fine in R.
Some but not all. Last night I was trying to run my survival examples
in R on Windows, including a large loop or alternatively a vectorized
version using many 4000x3 matrices. Both run happily in S-PLUS 4.5 in
about three minutes with less than 10 MB allocated. R needs a largish
heap (20 MB failed, 30 MB seemed enough) that got partially swapped out
(the machine has 32 MB RAM) and I gave up after an hour of thrashing on
the loop version, and the vectorized version...