Hi,

I am running my code on a cluster at Arizona State University. I have a huge climate data set, 66000 x 500, and I am not sure whether I can compute the correlation of such a large matrix on the cluster. Normally I allocate 20000M of memory and operate on 5 x 20000 pieces, and even that takes a lot of time. Is there any way I can compute cl <- cor(cdata) using the machines in the cluster (I am using 32 nodes)?

I am using the following code to set up the cluster:

library(snow)
cl <- makeCluster(64)
clusterExport(cl, "fakeData")
clusterEvalQ(cl, library(boot))
system.time(out2 <- clusterApplyLB(cl, pair, geneCor))
stopCluster(cl)

Here pair and geneCor are defined as:

pair <- combn(1:nrow(fakeData), 2, simplify = FALSE)
geneCor <- function(x, gene = fakeData) {
  cor(t(gene[x[1], ]), t(gene[x[2], ]))
}
# This computes the correlation for each pair of rows.

But I want cor(data) for a 2D matrix to be parallelized.

Regards,
Kumaraguru

--
View this message in context: http://r.789695.n4.nabble.com/Parallelizing-cor-for-large-data-set-using-Cluster-tp3245821p3245821.html
Sent from the R help mailing list archive at Nabble.com.
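One way to avoid enumerating all row pairs with combn() is to split the row indices into chunks and have each worker compute one horizontal slab of the full correlation matrix, using cor(x, y) to correlate its chunk of rows against all rows. The sketch below uses the same snow functions as the post; it is a minimal illustration with a small toy matrix standing in for the real 66000 x 500 data (the names fakeData, cl, and the chunk scheme are assumptions, not the poster's code). Note that for 66000 rows the full result is a 66000 x 66000 matrix (roughly 35 GB of doubles), so in practice each slab would likely be written to disk rather than collected with rbind.

```r
library(snow)  # same package as in the original post

# Toy stand-in for the real 66000 x 500 climate matrix
set.seed(1)
fakeData <- matrix(rnorm(200 * 30), nrow = 200)

cl <- makeCluster(4)             # adjust to the nodes actually available
clusterExport(cl, "fakeData")    # ship the data to every worker once

# Split the row indices into one consecutive chunk per worker
chunks <- clusterSplit(cl, seq_len(nrow(fakeData)))

# Each worker computes cor() of its rows (transposed to columns)
# against all rows, i.e. one horizontal slab of cor(t(fakeData))
slabs <- parLapply(cl, chunks, function(idx) {
  cor(t(fakeData[idx, , drop = FALSE]), t(fakeData))
})
stopCluster(cl)

# Stack the slabs back into the full 200 x 200 correlation matrix
cc <- do.call(rbind, slabs)

# Sanity check against the serial computation
stopifnot(isTRUE(all.equal(cc, cor(t(fakeData)))))
```

This does roughly twice the arithmetic of the pairwise approach (each off-diagonal correlation is computed in two slabs), but each worker makes a single vectorized cor() call instead of one call per pair, which is usually far faster in R than looping over combn() output.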