I'm using bigkmeans from the 'biganalytics' package to cluster my
60,000 by 600,000 matrix on an 8-core Linux VM.
I have registered a parallel backend with
> registerDoMC()
and checked how many cores were registered with
> getDoParWorkers()
It returns 8, which is the number of cores I have on my machine.
I then ran the test below, whose results show improved speed due to
parallel execution.
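# Requires library(foreach) and the backend registered above.
# Toy workload: invert 1000 random 10x10 matrices (argument n is unused).
check <- function(n) {
  for (i in 1:1000) {
    sme <- matrix(rnorm(100), 10, 10)
    solve(sme)
  }
}
times <- 100  # number of times to run the loop

# Parallel run (%dopar%)
system.time(x <- foreach(j = 1:times) %dopar% check(j))
   user  system elapsed
  -----  ------       4

# Sequential run (%do%)
system.time(x <- foreach(j = 1:times) %do% check(j))
   user  system elapsed
  ----- -------      16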
But when I run my data through bigkmeans with
> ans <- bigkmeans(data, 200, nstart = 5, iter.max = 20)
I see only one R process in the system monitor, and only one CPU shows
high usage. I guess it's not really running in parallel.
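
One way to double-check that the registered backend itself is spreading
work across processes is to have each foreach task report the PID of the
process that ran it. The following is only a minimal sketch, assuming the
doMC backend described above; the task count of 32 and the variable names
are arbitrary choices for illustration. It shows whether foreach dispatches
to several workers, and says nothing about what bigkmeans does internally.

```r
library(foreach)
library(doMC)   # fork-based backend; works on Linux/macOS, not Windows

registerDoMC(cores = 8)

# Each task returns the PID of the process that executed it.
pids <- foreach(i = 1:32, .combine = c) %dopar% Sys.getpid()

# More than one distinct PID means tasks ran in several worker processes.
n_workers_used <- length(unique(pids))
print(n_workers_used)
```

If this reports several distinct PIDs while bigkmeans still keeps a single
CPU busy, the backend is probably fine and the question is how bigkmeans
uses it.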
I also tried doSNOW, though it's normally used for clusters.
> cl <- makeCluster(8, type = "SOCK")
> registerDoSNOW(cl)
> ans <- bigkmeans(data, 200, nstart = 30)
There are 8 R processes, but only 1 is running.
Is it because I have something misconfigured? Or does bigkmeans not
support parallel execution?
Thanks in advance for any advice.
Regards,
Lishu