Tyler Backman
2011-Mar-11  22:42 UTC
[R] is gzcon w/ urls not implemented or used differently on linux?
I wrote some code that reads a gzipped text file directly from the web with
gzcon(url()). It works perfectly on OS X, but I cannot get it to work on Linux
at all, despite trying several different R versions and Linux distributions.
Any ideas?
Here's an example of my code:
z <- gzcon(url("ftp://ftp-private.ncbi.nlm.nih.gov/pubchem/.fetch/8897497837079742771.sdf.gz"))
sdf <- readLines(z)
close(z)
On Linux it produces the following error:
Error in readLines(z) : cannot open the connection
Reading a non-gzipped file through url() works flawlessly on Linux:
con <- url("http://chemmine.ucr.edu/ChemMineToolsV2/static/example_db.sdf")
sdf <- readLines(con)
close(con)
By comparison, gzcon does work with local (non-URL) files on Linux:
system("wget ftp://ftp-private.ncbi.nlm.nih.gov/pubchem/.fetch/8897497837079742771.sdf.gz")
z <- gzcon(file("8897497837079742771.sdf.gz", "rb"))
sdf <- readLines(z)
close(z)
But this doesn't help me, because it relies on wget and I need my code to be cross-platform!
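A portable variant of the local-file approach can be sketched with base R alone. It swaps the wget call for download.file() and reads the temporary copy with gzfile(). This is a workaround under stated assumptions, not an explanation of why gzcon(url()) fails here; the helper name read_gz_url is hypothetical and not part of R.

```r
# Hypothetical helper (not part of base R): download a gzipped text file
# to a temporary location and return its contents as a character vector.
read_gz_url <- function(u) {
  tmp <- tempfile(fileext = ".gz")
  on.exit(unlink(tmp))                    # remove the temp file if the download fails

  # mode = "wb" keeps the compressed bytes intact on every platform
  download.file(u, tmp, mode = "wb", quiet = TRUE)

  con <- gzfile(tmp, "rt")                # decompress while reading as text
  on.exit({ close(con); unlink(tmp) })    # replaces the earlier handler
  readLines(con)
}

# Usage with the URL from the example above (requires network access):
# sdf <- read_gz_url("ftp://ftp-private.ncbi.nlm.nih.gov/pubchem/.fetch/8897497837079742771.sdf.gz")
```

The trade-off is a temporary file on disk instead of streaming directly from the connection, which may matter for very large files.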
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
[5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
[9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base 
> system("uname -a")
Linux biocluster 2.6.26-2-openvz-amd64 #1 SMP Tue Jan 25 06:04:33 UTC 2011 x86_64 GNU/Linux
Thank you,
Tyler William H Backman
Cheminformatics Programmer
Department of Botany and Plant Sciences
E-mail: tyler.backman at ucr.edu
1207E Genomics Building
University of California
Riverside, CA 92521
