>>>>> Gábor Csárdi
>>>>> on Wed, 22 Jan 2020 22:56:17 +0000 writes:
> Hi All,
> I think there is a memory error in the libcurl connection code that
> typically happens when libcurl reads big chunks of data. This
> potentially affects all code that uses url() with the libcurl download
> method, which is the default in most builds. In practice it tends to
> happen more with HTTP/2 and if the connection is wrapped into a
> gzcon(). macOS Catalina has a libcurl build with HTTP/2 error, so many
> users that upgraded macOS are starting to see this.
> The workaround is to avoid using url(), if you can. If you need an
> HTTP stream, you can use curl::curl(), which is a drop-in replacement.
> To reproduce, the easiest is a libcurl build that has HTTP/2 support
> and a server with HTTP/2 as well, e.g. the cloud mirror:
> ------------------------------------------------
> ~ # R --slave -e 'options(internet.info = 0); foo <- readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))'
> * Trying 13.33.54.118:443...
> * TCP_NODELAY set
> * Connected to cran.rstudio.com (13.33.54.118) port 443 (#0)
> * ALPN, offering h2
> * ALPN, offering http/1.1
> * successfully set certificate verify locations:
> * CAfile: /etc/ssl/certs/ca-certificates.crt
> CApath: none
> * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
> * ALPN, server accepted to use h2
> * Server certificate:
> * subject: CN=cran.rstudio.com
> * start date: Jul 24 00:00:00 2019 GMT
> * expire date: Aug 24 12:00:00 2020 GMT
> * subjectAltName: host "cran.rstudio.com" matched cert's "cran.rstudio.com"
> * issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
> * SSL certificate verify ok.
> * Using HTTP2, server supports multi-use
> * Connection state changed (HTTP/2 confirmed)
> * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
> * Using Stream ID: 1 (easy handle 0x56303c2910e0)
>> GET /src/contrib/Meta/archive.rds HTTP/2
> Host: cran.rstudio.com
> User-Agent: R (3.4.4 x86_64-pc-linux-gnu x86_64 linux-gnu)
> Accept: */*
> * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
> < HTTP/2 200
> < content-length: 2483432
> < date: Wed, 22 Jan 2020 21:22:04 GMT
> < server: Apache/2.4.39 (Unix)
> < last-modified: Wed, 22 Jan 2020 17:10:22 GMT
> < etag: "25e4e8-59cbd998a0360"
> < accept-ranges: bytes
> < cache-control: max-age=1800
> < expires: Wed, 22 Jan 2020 21:52:04 GMT
> < x-cache: Hit from cloudfront
> < via: 1.1 6cbe48f9f9ff0c768f29d83804f75d4c.cloudfront.net (CloudFront)
> < x-amz-cf-pop: MAN50-C1
> < x-amz-cf-id: WwCQVQz9g8ZP6Az4m4n__h7aUW6vwlg0-AkiCv_DnVfGe10bzaFtfg=
> < age: 960
> <
> * 85 data bytes written
> Error in readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds"))) :
>   reference index out of range
> * stopped the pause stream!
> * Connection #0 to host cran.rstudio.com left intact
> Execution halted
> ------------------------------------------------
> Sometimes you get a crash, sometimes a corrupt stream, etc. Sometimes
> it actually works.
> It seems that the fix is simply this:
> ------------------------------------
> --- src/modules/internet/libcurl.c~
> +++ src/modules/internet/libcurl.c
> @@ -762,6 +762,7 @@
> void *newbuf = realloc(ctxt->buf, newbufsize);
> if (!newbuf) error("Failure in re-allocation in rcvData");
> ctxt->buf = newbuf; ctxt->bufsize = newbufsize;
> + ctxt->current = ctxt->buf;
> }
> memcpy(ctxt->buf + ctxt->filled, ptr, add);
> ------------------------------------
> Best,
> Gabor
Thanks a lot, Gábor!
I can reproduce the problem (on Linux Fedora 30) and confirm
that your patch works.
Even more, the patch looks "almost obvious",
because
ctxt->current = ctxt->buf
happens earlier in rcvData(), before realloc() may change ctxt->buf,
and so ctxt->current should be updated whenever buf is.
An even slightly "better" patch just moves that statement down
to after the if(add) { .. } clause.
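
To make the reasoning concrete, here is a minimal, self-contained C sketch of
the same pattern. It is not the code in libcurl.c: the names StreamBuf,
stream_init, stream_receive, stream_read and stream_free are hypothetical, and
the doubling growth policy is an assumption made for illustration. It only
shows why the read pointer has to be reset after any realloc(), following the
ordering suggested above (reset after the if block).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the connection context. */
typedef struct {
    char  *buf;      /* start of the heap buffer */
    char  *current;  /* next unread byte; must always point into buf */
    size_t bufsize;  /* allocated capacity in bytes */
    size_t filled;   /* number of unread bytes starting at current */
} StreamBuf;

/* Returns 0 on success, -1 on a zero size or allocation failure. */
static int stream_init(StreamBuf *s, size_t bufsize)
{
    if (bufsize == 0) return -1;
    s->buf = malloc(bufsize);
    if (!s->buf) return -1;
    s->current = s->buf;
    s->bufsize = bufsize;
    s->filled = 0;
    return 0;
}

/* Append len bytes; returns bytes stored, or 0 if growing the buffer fails. */
static size_t stream_receive(StreamBuf *s, const void *data, size_t len)
{
    assert(s->buf != NULL && s->bufsize > 0);

    /* Move unread data to the front of the buffer; regions may overlap. */
    if (s->filled) memmove(s->buf, s->current, s->filled);

    if (len) {
        if (s->filled + len > s->bufsize) {
            size_t newsize = s->bufsize;
            while (newsize < s->filled + len) newsize *= 2;
            char *newbuf = realloc(s->buf, newsize);
            if (!newbuf) {
                /* Old block is still valid; keep it consistent. */
                s->current = s->buf;
                return 0;
            }
            s->buf = newbuf;      /* the block may have moved */
            s->bufsize = newsize;
        }
        memcpy(s->buf + s->filled, data, len);
        s->filled += len;
    }

    /* Reset only after any realloc(), so current never refers to the
       old (possibly freed) block. */
    s->current = s->buf;
    return len;
}

/* Copy up to n unread bytes into out and advance the read pointer. */
static size_t stream_read(StreamBuf *s, void *out, size_t n)
{
    if (n > s->filled) n = s->filled;
    memcpy(out, s->current, n);
    s->current += n;
    s->filled -= n;
    return n;
}

static void stream_free(StreamBuf *s)
{
    free(s->buf);
    s->buf = s->current = NULL;
    s->bufsize = s->filled = 0;
}
```

If the reset were done only before the growth step, a large chunk that
triggers realloc() could leave current pointing at the old block, which would
match the symptoms reported (crash, corrupt stream, or apparent success when
the block happens not to move).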
I'll patch the sources, and will port to 'R 3.6.2 patched'.
Martin