similar to: kmeans.big.matrix

Displaying 20 results from an estimated 2000 matches similar to: "kmeans.big.matrix"

2009 Jul 18
1
Building a big.matrix using foreach
Hi there! I have become a big fan of the 'foreach' package allowing me to do a lot of stuff in parallel. For example, evaluating the function f on all elements in a vector x is easily accomplished: foreach(i=1:length(x),.combine=c) %dopar% f(x[i]) Here the .combine=c option tells foreach to combine output using the c()-function. That is, to return it as a vector. Today I discovered the
2010 Feb 24
1
Sparse KMeans/KDE/Nearest Neighbors?
Hi, I have a dataset (the Netflix dataset) with roughly 18k columns and a variable number of rows; let's assume 25 thousand for now. The dataset is very sparse. I was wondering how to do kmeans/nearest neighbors or kernel density estimation on it. I tried using the spMatrix function in the "Matrix" package. I think I'm able to create the matrix but as soon as I pass
2012 Jan 18
1
kmeans clustering on large but sparse matrix
Hi, I have a 60k*600k matrix, which exceeds the vector length limit of 2^31-1. But it's rather sparse; only 0.02% of the entries have values. So I saved it as a MatrixMarket (mm) file, which is about 300M in size. I use readMM in the Matrix package to read it in. If I do so, the data type becomes dgTMatrix from the 'Matrix' package instead of the common matrix type. The problem is, if I run k-means only on part of
2009 Jun 02
2
bigmemory - extracting submatrix from big.matrix object
I am using library(bigmemory) to handle large datasets, say 1 GB, and am facing the following problems. Any hints would be helpful. Problem 1: I am using the "read.big.matrix" function to create a filebacked big matrix of my data and get the following warning: > x = read.big.matrix("/home/utkarsh.s/data.csv",header=T,type="double",shared=T,backingfile
2013 Jul 26
1
variation in k-means results (Alfredo Alvarez)
Good day, I'm not sure whether I'm using the list correctly; this is my first time. If I'm doing it wrong, please correct me. Regarding your comment, Pedro, thank you very much. What I understand from your set.seed suggestion is that it fixes the results, but I'm not sure whether another grouping would work better. That is, I'm interested in a clustering method that produces the "best" grouping and since the
2010 Dec 17
1
[Fwd: adding more columns in big.matrix object of bigmemory package]
Hi, with reference to the mail below: I have large datasets from various sources, which I can read into filebacked big.matrix objects using library bigmemory. I want to merge them all into one 'big.matrix' object. (Later, I want to run regression using library 'biglm'). I have been trying unsuccessfully to do this for quite some time now. Can you please
2011 Sep 29
1
efficient coding with foreach and bigmemory
I recently learned about the bigmemory and foreach packages and am trying to use them to help me create a very large matrix. Without those packages, I can create the type of matrix that I want with 10 columns and 5e6 rows. I would like to be able to scale up to 5e9 rows, or more, if possible. I have created a simplified example of what I'm trying to do, below. The first part of the
2010 Apr 23
2
bigmemory package woes
I have fairly large data, matrices of 0.5 to 1.5 GB, so once I need to juggle several of them I need a disk cache. I am trying to use the bigmemory package but am running into problems that are hard to understand: I am getting seg faults and the machine just hangs. By the way, I work on Red Hat Linux, 64 bit, R version 10. The simplest problem is just saving matrices. When I do something like
2011 Apr 06
2
Help in kmeans
Hi all, I was using the following command to run kmeans on the Iris dataset: Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3) This gave proper results. But in my application we generate the R commands dynamically, and there is a requirement that column names be sent to the R commands instead of column indices. Hence, to incorporate this, I tried using the R
2003 Jun 05
1
kmeans (again)
Regarding a previous question concerning the kmeans function I've tried the same example and I also get a strange result (at least according to what is said in the help of the function kmeans). Apparently, the function is disregarding the initial cluster centers one gives it. According to the help of the function: centers: Either the number of clusters or a set of initial cluster
2013 Mar 13
1
Empty cluster / segfault using vanilla kmeans with version 2.15.2
Hello, here is a working reproducible example which crashes R using kmeans or gives empty clusters using the nstart option with R 2.15.2. library(cluster) kmeans(ruspini,4) kmeans(ruspini,4,nstart=2) kmeans(ruspini,4,nstart=4) kmeans(ruspini,4,nstart=10) ?kmeans Either we always get empty clusters or, after some further commands, a segfault. Regards, Detlef Groth ------------ [R] Empty
2004 May 11
1
AW: Probleme with Kmeans...
Sorry, to answer your question I tried: data(faithful) kmeans(faithful[c(1:20),1],10) Error: empty cluster: try a better set of initial centers But when I run this a second time it is OK. It seems that kmeans has problems choosing good starting points, because these initial points are chosen at random. With kmeans(data,k,centers=c(...) the problem can be solved.
2006 Jul 09
2
distance in kmeans algorithm?
Hello. Is it possible to choose the distance in the kmeans algorithm? I have m vectors of n components and I want to cluster them using kmeans algorithm but I want to use the Mahalanobis distance or another distance. How can I do it in R? If I use kmeans, I have no option to choose the distance. Thanks in advance, Arnau.
2003 Apr 14
2
kmeans clustering
Hi, I am using kmeans to cluster a dataset. I test this example: > data<-matrix(scan("data100.txt"),100,37,byrow=T) (my dataset is 100 rows and 37 columns--clustering rows) > c1<-kmeans(data,3,20) > c1 $cluster [1] 1 1 1 1 1 1 1 3 3 3 1 3 1 3 3 1 1 1 1 3 1 3 3 1 1 1 3 3 1 1 3 1 1 1 1 3 3 [38] 3 1 1 1 3 1 1 1 1 3 3 3 1 1 1 1 1 1 3 1 3 1 1 3 1 1 1 1 3 1 1 1 1 1 1 3
2006 Apr 07
2
cclust causes R to crash when using manhattan kmeans
Dear R users, When I run the following code, R crashes: require(cclust) x <- matrix(c(0,0,0,1.5,1,-1), ncol=2, byrow=TRUE) cclust(x, centers=x[2:3,], dist="manhattan", method="kmeans") While this works: cclust(x, centers=x[2:3,], dist="euclidean", method="kmeans") I'm posting this here because I am not sure if it is a bug. I've been searching
2012 Feb 27
2
kmeans: how to retrieve clusters
Hello, I'd like to classify data with kmeans algorithm. In my case, I should get 2 clusters in output. Here is my data colCandInd colCandMed 1 82 2950.5 2 83 1831.5 3 1192 2899.0 4 1193 2103.5 The first cluster is the two first lines the 2nd cluster is the two last lines Here is the code: x = colCandList$colCandInd y = colCandList$colCandMed m = matrix(c(x, y),
2003 Jun 06
1
Kmeans again
Dear helpers I'm sorry to insist but I still think there is something wrong with the function kmeans. For instance, let's try the same small example: > dados<-matrix(c(-1,0,2,2.5,7,9,0,3,0,6,1,4),6,2) I will choose observations 3 and 4 for initial centers and just one iteration. The results are > A<-kmeans(dados,dados[c(3,4),],1) > A $cluster [1] 1 1 1 1 2 2 $centers
2005 Jun 14
1
KMEANS output...
Using R 2.1.0 on Windows. Two questions: 1. Is there a way to parse the output from kmeans within R? 2. If the answer to 1. is convoluted or impossible, how do you save the output from kmeans in a plain text file for further processing outside R? Example: > ktx<-kmeans(x,12, nstart = 200) I would like to parse ktx within R to extract cluster sizes, sum-of-squares values, etc., OR save ktx in
2010 May 05
2
custom metric for dist for use with hclust/kmeans
Hi guys, I've been using the kmeans and hclust functions for some time now and was wondering if I could specify a custom metric when passing my data frame into hclust as a distance matrix. Actually, kmeans doesn't even take a distance matrix; it takes the data frame directly. I was wondering if there's a way or if there's a package that lets you create distance matrices from
2003 Jun 03
1
kmeans
Dear helpers, I was working with kmeans from package mva and found some strange situations. When I run the kmeans algorithm several times with the same dataset I get the same partition. I simulated a little example with 6 observations and ran kmeans giving the centers and making just one iteration. I expected that the algorithm would just allocate the observations to the nearest center but think this