ivo welch
2013-May-31 16:14 UTC
[Rd] R 3.0.1 : parallel collection triggers "long vectors not supported yet"
Dear R developers:

...
7: lapply(seq_len(cores), inner.do)
8: FUN(1:3[[3]], ...)
9: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))

Selection: .....................Error in sendMaster(try(lapply(X = S, FUN = FUN,
    ...), silent = TRUE)) :
  long vectors not supported yet: memory.c:3100

admittedly, my outcome will be a very big list, with 30,000 elements, each
containing a data frame with 14 variables and around 200 to 5,000
observations (say, 64KB on average).  thus, I estimate that the resulting
list is 20GB.  the specific code that triggers this is

    exposures.list <- mclapply(1:length(crsp.list.by.permno),
                               FUN = function(i, NMO = NMO) {
                                   calcbeta.for.one.stock(crsp.list.by.permno[[i]], NMO = NMO)
                               },
                               NMO = NMO, mc.cores = 3)

the release notes for 3.0.0 suggest this error should occur primarily in
unusual situations, so it is not really a bug, but I thought I would point
it out.  maybe this is a forgotten updatelet.

regards,

/iaw

----
Ivo Welch (ivo.welch@gmail.com)
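(Not part of the original message: a minimal sketch of how one might check whether the result a single prescheduled worker ships back crosses the 2GB limit that triggers this error.  It assumes the objects crsp.list.by.permno, calcbeta.for.one.stock, and NMO from the code above, and approximates the interleaved split that mclapply's prescheduling uses for 3 cores.)

    ## Sketch only: estimate the size of what one of three prescheduled workers
    ## would send back in a single sendMaster() call.  Assumes crsp.list.by.permno,
    ## calcbeta.for.one.stock, and NMO exist as in the post above.
    idx   <- seq_along(crsp.list.by.permno)
    share <- idx[seq(1, length(idx), by = 3)]     # roughly one worker's share under prescheduling
    res   <- lapply(share, function(i)
        calcbeta.for.one.stock(crsp.list.by.permno[[i]], NMO = NMO))
    format(object.size(res), units = "Mb")        # rough in-memory size of one worker's result
    length(serialize(res, NULL))                  # serialized size in bytes; > 2^31 - 1 is the problem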
Simon Urbanek
2013-May-31 16:47 UTC
[Rd] R 3.0.1 : parallel collection triggers "long vectors not supported yet"
On May 31, 2013, at 12:14 PM, ivo welch wrote:

> Dear R developers:
>
> ...
> 7: lapply(seq_len(cores), inner.do)
> 8: FUN(1:3[[3]], ...)
> 9: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
>
> Selection: .....................Error in sendMaster(try(lapply(X = S, FUN = FUN,
>     ...), silent = TRUE)) :
>   long vectors not supported yet: memory.c:3100
>
> admittedly, my outcome will be a very big list, with 30,000 elements, each
> containing a data frame with 14 variables and around 200 to 5,000
> observations (say, 64KB on average).  thus, I estimate that the resulting
> list is 20GB.  the specific code that triggers this is
>
>     exposures.list <- mclapply(1:length(crsp.list.by.permno),
>                                FUN = function(i, NMO = NMO) {
>                                    calcbeta.for.one.stock(crsp.list.by.permno[[i]], NMO = NMO)
>                                },
>                                NMO = NMO, mc.cores = 3)
>
> the release notes for 3.0.0 suggest this error should occur primarily in
> unusual situations, so it is not really a bug, but I thought I would point
> it out.  maybe this is a forgotten updatelet.

mclapply uses sendMaster() to send the results (serialized into a raw vector)
from each worker back to the parent R session.  Apparently your serialized
result from one worker is more than 2GB.  The multicore part of parallel
currently doesn't support long vectors for this transmission, so the result
from one worker cannot exceed 2GB.  I'll put long vector support on my ToDo
list.

In your case you should be able to work around it by disabling pre-scheduling
(you may want to do some grouping if you have 30,000 short iterations,
though).

Cheers,
Simon
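(Not part of the original exchange: a minimal sketch of the workaround Simon describes, i.e. disabling pre-scheduling while grouping the 30,000 stocks into larger chunks, so that each forked job sends back a result well under the 2GB limit.  It assumes the same objects as in ivo's code: crsp.list.by.permno, calcbeta.for.one.stock, and NMO; the chunk count of 300 is an arbitrary choice.)

    library(parallel)

    ## Group the ~30,000 stocks into ~300 chunks of ~100 stocks each, so that
    ## turning off pre-scheduling does not mean 30,000 separate forks, and so
    ## that no single chunk's result approaches the 2GB sendMaster() limit.
    idx    <- seq_along(crsp.list.by.permno)
    chunks <- split(idx, cut(idx, 300, labels = FALSE))

    exposures.chunks <- mclapply(chunks,
                                 function(ii) lapply(crsp.list.by.permno[ii],
                                                     calcbeta.for.one.stock, NMO = NMO),
                                 mc.cores = 3,
                                 mc.preschedule = FALSE)   # one fork and one sendMaster() per chunk

    ## Flatten back to one list with one element per stock.
    exposures.list <- unlist(exposures.chunks, recursive = FALSE)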