Simon Andrews
2024-Sep-12 12:01 UTC
[Rd] Can gzfile be given the same method option as file
Recently my employer introduced a security system which generates SSL certificates on the fly so that it can see the content of https connections. To make this work they add a new root certificate to the Windows certificate store. In R this causes problems because the default library used to download data from URLs doesn't look at this store. However, the "wininet" download method works, so wherever it is used things work (albeit with a warning about future deprecation).

For functions like download.file this works great, but it fails when running readRDS:

> readRDS('https://seurat.nygenome.org/azimuth/references/homologs.rds')
Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file 'https://seurat.nygenome.org/azimuth/references/homologs.rds', probable reason 'Invalid argument'

After some debugging I see that the root cause is the gzfile function.

> gzfile('https://seurat.nygenome.org/azimuth/references/homologs.rds') -> g
> open(g, open="r")
Error in open.connection(g, open = "r") : cannot open the connection
In addition: Warning message:
In open.connection(g, open = "r") :
  cannot open compressed file 'https://seurat.nygenome.org/azimuth/references/homologs.rds', probable reason 'Invalid argument'

If this were not a compressed file then, using file rather than gzfile, we could make this work by setting the url.method option:

> options("url.method"="wininet")
> file('https://seurat.nygenome.org/azimuth/references/homologs.rds') -> g
> open(g, open="r")
Warning message:
In open.connection(g, open = "r") :
  the 'wininet' method of url() is deprecated for http:// and https:// URLs

So I get a warning, but it works.

I guess this boils down to two questions:

1. Is it possible to add the same "method" argument to gzfile that file uses, so that people in my situation have a workaround?

2. Given the warnings we're getting when using wininet, are there plans to make Windows certificates be supported in another way?

Thanks
Simon.
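Since download.file is reported above to work with the chosen method, one possible stopgap is to download to a temporary file and call readRDS on the local copy. The following is only a minimal sketch: read_rds_via_download is a hypothetical helper name, not a base R function, and it assumes the file is small enough to store temporarily on disk.

```r
# Hypothetical helper (not part of base R): fetch a remote .rds file with
# download.file(), then read the local copy with readRDS().
read_rds_via_download <- function(u,
                                  method = getOption("download.file.method", "auto")) {
  tmp <- tempfile(fileext = ".rds")
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary copy on exit
  # mode = "wb" keeps the binary file intact on Windows
  status <- download.file(u, tmp, method = method, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  readRDS(tmp)
}
```

In the situation described above this would be called as read_rds_via_download(u, method = "wininet"), with u being the https URL. Because readRDS opens a local path through gzfile, the usual compression detection for local files applies.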
On Thu, 12 Sep 2024 12:01:54 +0000, Simon Andrews via R-devel <r-devel at r-project.org> writes:

> readRDS('https://seurat.nygenome.org/azimuth/references/homologs.rds')
> Error in gzfile(file, "rb") : cannot open the connection

I don't think that gzfile works with URLs. gzcon(), on the other hand, does work with url() connections, and url() accepts the 'method' argument and the getOption('url.method') default.

h <- readRDS(url(
 'https://seurat.nygenome.org/azimuth/references/homologs.rds'
))

But that only works with gzip-compressed files. For example, CRAN's PACKAGES.rds is xz-compressed, and I don't see a way to read it the same way:

readBin(
 index <- file.path(
  contrib.url(getOption('repos')['CRAN']),
  'PACKAGES.rds'
 ),
 raw(), 5
) |> rawToChar()
# [1] "\xfd7zXZ" <-- note the "7zXZ" header
readRDS(url(index))
# Error in readRDS(url(index)) : unknown input format

> 2. Given the warnings we're getting when using wininet, are their
> plans to make windows certficates be supported in another way?

What does libcurlVersion() return for you? In theory, it should be possible to make libcurl use schannel and therefore the system certificate store for TLS verification purposes.

-- 
Best regards,
Ivan
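For the xz-compressed case, one possible route is to read the raw bytes from the connection, decompress them in memory with memDecompress(), and pass the result to unserialize(). This is only a sketch under assumptions: read_rds_from_connection is a hypothetical helper name, the connection must not already be open, and the whole file is held in memory.

```r
# Hypothetical helper (not part of base R): read every byte from an
# unopened connection, decompress in memory, then unserialize.
read_rds_from_connection <- function(con, type = "xz") {
  open(con, "rb")
  on.exit(close(con), add = TRUE)
  chunks <- list()
  repeat {
    chunk <- readBin(con, what = "raw", n = 65536L)
    if (length(chunk) == 0L) break   # nothing left to read
    chunks[[length(chunks) + 1L]] <- chunk
  }
  bytes <- do.call(c, chunks)        # concatenate into one raw vector
  unserialize(memDecompress(bytes, type = type))
}
```

With the index variable from the example above, the call would be read_rds_from_connection(url(index)), and a 'method' argument can be passed to url() as usual.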