Sam H
2020-Jul-15 20:16 UTC
[R-sig-Debian] read.csv fails in R console in Ubuntu terminal but works in RStudio after R 3.6.3 upgrade to R 4.0.2
Hi,

I am trying to download some data using read.csv and it works perfectly in RStudio and fails in the R console in the terminal in Ubuntu 18.04 after upgrading from R 3.6.3 to 4.0.2. Before upgrading this worked in the R console in the terminal also without any issues.

Why would that be? How to fix this?

Below please find R code output and sessionInfo().

*Works in RStudio*

> read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download", header=TRUE, as.is=TRUE, na="n/a")
  Symbol                                   Name LastSale MarketCap IPOyear
1    TXG                     10x Genomics, Inc.  87.4400     $8.6B    2019
2     YI                              111, Inc.   6.4800  $533.69M    2018
3    PIH 1347 Property Insurance Holdings, Inc.   4.5350   $27.52M    2014
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.0.2 tools_4.0.2

*Fails in R console in terminal*

> read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download", header=TRUE, as.is=TRUE, na="n/a")
Error in file(file, "rt") :
  cannot open the connection to 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
In addition: Warning message:
In file(file, "rt") :
  URL 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download': status was 'Failure when receiving data from the peer'
> traceback()
3: file(file, "rt")
2: read.table(file = file, header = header, sep = sep, quote = quote, dec = dec, fill = fill, comment.char = comment.char, ...)
1: read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download", header = TRUE, as.is = TRUE, na = "n/a")
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.0.2
>

I also asked this question here https://stackoverflow.com/questions/62898008/why-read-csv-fails-in-r-console-in-ubuntu-terminal-but-works-in-rstudio-after-r . Since there was no answer on stackoverflow I sent this question also to R-Help where I was advised to better ask this question on r-sig-debian.

Best regards,
Sam
Dirk Eddelbuettel
2020-Jul-15 20:35 UTC
On 15 July 2020 at 16:16, Sam H wrote:
| I am trying to download some data using read.csv and it works perfectly in
| RStudio and fails in the R console in the terminal in Ubuntu 18.04 after
| upgrading from R 3.6.3 to 4.0.2. Before upgrading this worked in the R
| console in the terminal also without any issues.
|
| Why would that be? How to fix this?
|
| Below please find R code output and sessionInfo().
|
| *Works in RStudio*
|
| > read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download", header=TRUE, as.is=TRUE, na="n/a")

Ok, let's stop right here. First off, for good debugging it helps to separate

 - downloading a file via R from
 - reading a file
 - maybe varying the arguments you give there

In my case this got easier. I clicked on the link (in Ubuntu 20.04) and it
downloaded it. From there few problems. `data.table::fread()` just reads it:

edd@rob:~/Downloads$ Rscript -e 'data.table::fread("companylist.csv", header=TRUE)'
Symbol Name LastSale MarketCap IPOyear Sector industry Summary Quote V9
1: TXG 10x Genomics, Inc. 88.91 $8.75B 2019 Capital Goods Biotechnology: Laboratory Analytical Instruments https://old.nasdaq.com/symbol/txg NA
2: YI 111, Inc. 6.64 $546.87M 2018 Health Care Medical/Nursing Services https://old.nasdaq.com/symbol/yi NA
3: PIH 1347 Property Insurance Holdings, Inc. 4.528 $27.48M 2014 Finance Property-Casualty Insurers https://old.nasdaq.com/symbol/pih NA
4: PIHPP 1347 Property Insurance Holdings, Inc. 24.8631 n/a n/a Finance Property-Casualty Insurers https://old.nasdaq.com/symbol/pihpp NA
5: TURN 180 Degree Capital Corp. 1.67 $51.97M n/a Finance Finance/Investors Services https://old.nasdaq.com/symbol/turn NA
---
3622: ZS Zscaler, Inc. 122.43 $15.98B 2018 Technology EDP Services https://old.nasdaq.com/symbol/zs NA
3623: ZUMZ Zumiez Inc. 25.55 $649.76M 2005 Consumer Services Clothing/Shoe/Accessory Stores https://old.nasdaq.com/symbol/zumz NA
3624: ZYNE Zynerba Pharmaceuticals, Inc. 3.41 $85.08M 2015 Health Care Major Pharmaceuticals https://old.nasdaq.com/symbol/zyne NA
3625: ZYXI Zynex, Inc. 26.22 $870.31M n/a Health Care Biotechnology: Electromedical & Electrotherapeutic Apparatus https://old.nasdaq.com/symbol/zyxi NA
3626: ZNGA Zynga Inc. 9.82 $10.54B 2011 Technology EDP Services https://old.nasdaq.com/symbol/znga NA
edd@rob:~/Downloads$

For kicks, same with read.csv:

edd@rob:~/Downloads$ Rscript -e 'str(read.csv("companylist.csv"))'
'data.frame':	3626 obs. of  9 variables:
 $ Symbol       : chr  "TXG" "YI" "PIH" "PIHPP" ...
 $ Name         : chr  "10x Genomics, Inc." "111, Inc." "1347 Property Insurance Holdings, Inc." "1347 Property Insurance Holdings, Inc." ...
 $ LastSale     : chr  "88.91" "6.64" "4.528" "24.8631" ...
 $ MarketCap    : chr  "$8.75B" "$546.87M" "$27.48M" "n/a" ...
 $ IPOyear      : chr  "2019" "2018" "2014" "n/a" ...
 $ Sector       : chr  "Capital Goods" "Health Care" "Finance" "Finance" ...
 $ industry     : chr  "Biotechnology: Laboratory Analytical Instruments" "Medical/Nursing Services" "Property-Casualty Insurers" "Property-Casualty Insurers" ...
 $ Summary.Quote: chr  "https://old.nasdaq.com/symbol/txg" "https://old.nasdaq.com/symbol/yi" "https://old.nasdaq.com/symbol/pih" "https://old.nasdaq.com/symbol/pihpp" ...
 $ X            : logi  NA NA NA NA NA NA ...
edd@rob:~/Downloads$

So in short, if you have a problem, it is not likely coming from the Ubuntu
binary for R 4.0.2 which I am running here.

Maybe start by downloading the file? You could have firewall or other
issues. We can't tell. And we can't reproduce the issue.

Good luck, Dirk

--
https://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
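The separation suggested above (download first, then read) can be sketched in base R roughly as follows. This is only an illustration: `fetch_then_read` is a hypothetical helper name, not something from the thread, and the demo deliberately uses a local `file://` URL with made-up sample rows so that it runs without network access. To test the real case, pass the nasdaq URL from the original post instead.

```r
# Step 1 downloads only, step 2 reads only, so a failure points at one step.
# fetch_then_read is a hypothetical helper name used for illustration.
fetch_then_read <- function(url, ...) {
  dest <- tempfile(fileext = ".csv")
  on.exit(unlink(dest), add = TRUE)

  # download.file() returns 0 (invisibly) on success
  status <- download.file(url, destfile = dest, quiet = TRUE)
  if (status != 0L) stop("download failed with status ", status)

  # Reading happens from the local copy; extra arguments go to read.csv()
  read.csv(dest, ...)
}

# Demo input: a small local CSV standing in for the remote file (assumption).
src <- tempfile(fileext = ".csv")
writeLines(c(
  "Symbol,Name,IPOyear",
  "TXG,\"10x Genomics, Inc.\",2019",
  "PIHPP,\"1347 Property Insurance Holdings, Inc.\",n/a"
), src)

demo <- fetch_then_read(paste0("file://", src),
                        header = TRUE, as.is = TRUE, na.strings = "n/a")
str(demo)
```

If the download step fails in the terminal but not in RStudio, the problem lies in how the connection is made rather than in `read.csv()` or its arguments.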
David Winsemius
2020-Jul-16 02:15 UTC
On 7/15/20 1:35 PM, Dirk Eddelbuettel wrote:
> On 15 July 2020 at 16:16, Sam H wrote:
> | I am trying to download some data using read.csv and it works perfectly in
> | RStudio and fails in the R console in the terminal in Ubuntu 18.04 after
> | upgrading from R 3.6.3 to 4.0.2. Before upgrading this worked in the R
> | console in the terminal also without any issues.
> |
> | Why would that be? How to fix this?
> |
> | Below please find R code output and sessionInfo().
> |
> | *Works in RStudio*
> |
> | > read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download", header=TRUE, as.is=TRUE, na="n/a")
>
> Ok, let's stop right here. First off, for good debugging it helps to separate
>
> - downloading a file via R from
> - reading a file
> - maybe varying the arguments you give there
>
> In my case this got easier. I clicked on the link (in Ubuntu 20.04) and it
> downloaded it. From there few problems. `data.table::fread()` just reads it:

In fact one can use the fread approach directly, rather than first using your system or your browser to download the copy:

z <- data.table::fread("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download", header=TRUE)
 Downloaded 486840 bytes...
> str(z)
Classes 'data.table' and 'data.frame':	3631 obs. of  9 variables:
 $ Symbol       : chr  "TXG" "YI" "PIH" "PIHPP" ...
 $ Name         : chr  "10x Genomics, Inc." "111, Inc." "1347 Property Insurance Holdings, Inc." "1347 Property Insurance Holdings, Inc." ...
 $ LastSale     : chr  "90.93" "6.31" "4.528" "24.35" ...
 $ MarketCap    : chr  "$8.94B" "$519.69M" "$27.48M" "n/a" ...
 $ IPOyear      : chr  "2019" "2018" "2014" "n/a" ...
 $ Sector       : chr  "Capital Goods" "Health Care" "Finance" "Finance" ...
 $ industry     : chr  "Biotechnology: Laboratory Analytical Instruments" "Medical/Nursing Services" "Property-Casualty Insurers" "Property-Casualty Insurers" ...
 $ Summary Quote: chr  "https://old.nasdaq.com/symbol/txg" "https://old.nasdaq.com/symbol/yi" "https://old.nasdaq.com/symbol/pih" "https://old.nasdaq.com/symbol/pihpp" ...
 $ V9           : logi  NA NA NA NA NA NA ...
 - attr(*, ".internal.selfref")=<externalptr>

I had earlier experienced the hanging of the original example in Ubuntu 18.04 using R 3.6.1. I get the same result in either a Terminal hosted R session or an RStudio R session. (It does leave hanging the question of why `read.csv` is failing.)

--
David.
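The thread leaves open why `read.csv` fails in one front end but not the other. One way to narrow that down is to print the settings that influence how base R opens a URL in both sessions and compare them. The sketch below uses only base R calls (`capabilities()`, `libcurlVersion()`, `getOption()`); the helper name `url_settings` is hypothetical, and which of these settings, if any, explains the failure is not established anywhere in the thread.

```r
# Collect settings that can differ between an RStudio session and a
# terminal session. url_settings is a hypothetical helper for illustration.
url_settings <- function() {
  list(
    r_version       = R.version.string,
    libcurl_capable = unname(capabilities("libcurl")),
    libcurl_version = as.character(libcurlVersion()),
    download_method = getOption("download.file.method"),  # may be NULL
    url_method      = getOption("url.method", "default"),
    user_agent      = getOption("HTTPUserAgent"),
    timeout         = getOption("timeout")
  )
}

# Run this once in each front end and compare the two printouts.
str(url_settings())
```

Any value that differs between the two printouts is a candidate to set explicitly with `options()` in the failing session before retrying the original call.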