Displaying 3 results from an estimated 3 matches for "multi_add".
2018 Sep 19 · 5 replies · segfault issue with parallel::mclapply and download.file() on Mac OS X
I have an lapply function call that I want to parallelize. Below is a very
simplified version of the code:

url_base <- "https://cloud.r-project.org/src/contrib/"
files <- c("A3_1.0.0.tar.gz", "ABC.RAP_0.9.0.tar.gz")
res <- parallel::mclapply(files, function(s) download.file(paste0(url_base, s), s))

Instead of downloading a couple of files in parallel, I get a ...
2018 Oct 04 · 1 reply · segfault issue with parallel::mclapply and download.file() on Mac OS X
...url::curl_version() for your local config. Don't count on this,
though: Apple might switch back to the fork-unsafe DarwinSSL once they
support ALPN, which is needed for HTTP/2.

As Gabor already suggested, libcurl has built-in support for
concurrent connections. The curl package exposes this via the
multi_add() function. Not only is this safer than forking, it is also
much faster, because it takes advantage of HTTP keep-alive and, where
supported, HTTP/2 multiplexing, which allows thousands of concurrent
HTTPS requests to be performed efficiently over a single TCP
connection.
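For illustration, here is a minimal sketch of that approach using the curl
package's pool API (new_pool(), new_handle(), multi_add(), multi_run()). The
URLs are reused from the original post; the callback style and error handling
are assumptions for the example, not code taken from this thread:

library(curl)

url_base <- "https://cloud.r-project.org/src/contrib/"
files <- c("A3_1.0.0.tar.gz", "ABC.RAP_0.9.0.tar.gz")

pool <- new_pool()  # shared connection pool: HTTP keep-alive / HTTP/2 multiplexing

invisible(lapply(files, function(f) {
  h <- new_handle(url = paste0(url_base, f))
  multi_add(h,
            # the done callback receives the buffered response; write it to disk
            done = function(res) writeBin(res$content, f),
            fail = function(msg) message(f, " failed: ", msg),
            pool = pool)
}))

multi_run(pool = pool)  # drives all transfers concurrently in a single R process

Note that this sketch buffers each response in memory before writing it out;
for very large files a streaming setup would be preferable.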
2018 Sep 20 · 0 replies · segfault issue with parallel::mclapply and download.file() on Mac OS X
...esses to perform HTTP in parallel is very often bad
practice, actually. Whenever you can, use I/O multiplexing instead,
since the main R process is not doing anything anyway, just waiting
for the data to come in. So you don't need more processes, you need
parallel I/O. Take a look at the curl::multi_add() etc. functions.

Btw. download.file() can actually download files in parallel if the
libcurl method is used: just give it the URLs in a character vector
(see the sketch below). This API is very restricted, though, so I
suggest looking at the curl package.
Gabor

On Thu, Sep 20, 2018 at 8:44 AM Seth Russell
<seth...
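For reference, a short sketch of the vectorized download.file() call described
above, reusing the URLs from the original post. With method = "libcurl",
download.file() accepts a character vector of URLs together with a destination
vector of the same length and fetches them simultaneously:

url_base <- "https://cloud.r-project.org/src/contrib/"
files <- c("A3_1.0.0.tar.gz", "ABC.RAP_0.9.0.tar.gz")

# one call, several files: the libcurl method performs the downloads concurrently
download.file(paste0(url_base, files), destfile = files, method = "libcurl")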