Hello all, I wish to use read.csv to read a google doc spreadsheet. I try using the following code: data_url <- " http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv " read.csv(data_url) Which results in the following error: Error in file(file, "rt") : cannot open the connection I'm on windows 7. And the code was tried on R 2.12 and 2.13 I remember trying this a few months ago and it worked fine. Any suggestion what might be causing this or how to solve it? Thanks. ----------------Contact Details:------------------------------------------------------- Contact me: Tal.Galili@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) ---------------------------------------------------------------------------------------------- [[alternative HTML version deleted]]
David Winsemius
2011-Apr-29 17:36 UTC
[R] read.csv fails to read a CSV file from google docs
On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:> Hello all, > I wish to use read.csv to read a google doc spreadsheet. > > I try using the following code: > > data_url <- " > http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv > " > read.csv(data_url) > > Which results in the following error: > > Error in file(file, "rt") : cannot open the connection > > > I'm on windows 7. And the code was tried on R 2.12 and 2.13 > > I remember trying this a few months ago and it worked fine.I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it could not possibly have succeeded as presented. But now it's time to admit I do not know why it is not succeeding when I correct those flaws. > closeAllConnections() > data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv ") > read.csv(data_url) Error in open.connection(file, "rt") : cannot open the connection > closeAllConnections() > dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv ")) Error in open.connection(file, "rt") : cannot open the connection So, I guess I'm not reading the help pages for `url` and `read.csv` as well I thought I was.> Any suggestion what might be causing this or how to solve it?-- David Winsemius, MD West Hartford, CT
Philipp Pagel
2011-Apr-29 18:27 UTC
[R] read.csv fails to read a CSV file from google docs
On Fri, Apr 29, 2011 at 06:19:24PM +0300, Tal Galili wrote:> > data_url <- " > http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv > " > read.csv(data_url) > Error in file(file, "rt") : cannot open the connectionI get the same error (R 2.11.1, Debian LINUX) and don't have a solution. But I did some tests and found the origin of the problem I can download the file from google with wget but get some interesting ?information in the process: ------------------------------------------------------------ $ wget -v 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv' --2011-04-29 20:07:40-- http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv Resolving spreadsheets0.google.com... 209.85.148.139, 209.85.148.113, 209.85.148.138, ... Connecting to spreadsheets0.google.com|209.85.148.139|:80... connected. HTTP request sent, awaiting response... 302 Moved Temporarily Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following] --2011-04-29 20:07:41-- https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv Connecting to spreadsheets0.google.com|209.85.148.139|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/plain] Saving to: ?pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1? [ <=> ] 41 --.-K/s in 0s 2011-04-29 20:07:42 (342 KB/s) - ?pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1? saved [41] ------------------------------------------------------------ The message that caught my attention was the http redirection: "302 Moved Temporarily". If you try again with the new url you get this:> read.csv(url("https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&g"))Error in open.connection(file, "rt") : cannot open the connection In addition: Warning message: In open.connection(file, "rt") : unsupported URL scheme ?url told me "Note that ?https://? connections are not supported." Case closed, problem unsolved... Dirty workaround: use system() and wget or whatever command is available on Windows for this. cu Philipp -- Dr. Philipp Pagel Lehrstuhl f?r Genomorientierte Bioinformatik Technische Universit?t M?nchen Wissenschaftszentrum Weihenstephan Maximus-von-Imhof-Forum 3 85354 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/