similar to: RCurl unable to download a particular web page -- what is so special about this web page?

Displaying 20 results from an estimated 2000 matches similar to: "RCurl unable to download a particular web page -- what is so special about this web page?"

2008 Oct 06
3
Extracting text from html code using the RCurl package.
Dear R-help, I want to download the text from a web page; however, what I end up with is the HTML code. Is there some option that I am missing in the RCurl package? Or is there another way to achieve this? This is the code I am using: > library(RCurl) > my.url <- 'https://stat.ethz.ch/mailman/listinfo/r-help' > html.file <- getURI(my.url, ssl.verifyhost = FALSE,
2009 Dec 31
3
XML and RCurl: problem with encoding (htmlTreeParse)
Hi, I'm trying to get data from a web page and modify it in R. I have a problem with encoding: I'm not able to get the encoding right in the htmlTreeParse command. See below: > library(RCurl) > library(XML) > > site <- getURL("http://www.aarresaari.net/jobboard/jobs.html") > txt <- readLines(tc <- textConnection(site)); close(tc) > txt <- htmlTreeParse(txt,
2009 Jan 19
3
download/retain text file structure with RCurl/getURL()
Dear list, I'm trying to download a text file directly from the internet using the RCurl package and the command getURL. Duncan Lang graciously helped me solve the first step in this problem using the following command: ################# txtfile <- getURL('ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/13e19.txt', ftp.use.epsv = FALSE) #################
2011 Jun 06
1
RCurl and kerberos
Dear list, I would like to call a Kerberos-authenticated web-service from within R. Curl can do it: $ curl --negotiate -u : "http://my.web.service/" so I would expect that RCurl also has the capability, but I have not been able to find the correct options to set. listCurlOptions() does not return anything with negotiate, and searching the source of RCurl, the only thing I found was
2009 Sep 17
1
RCurl and Google Scholar's EndNote references
Hi! I've performed a Google Scholar Search using a query, let's say "Frank Harrell", and parsed the links to the EndNote references from the resulting HTML code. Now I'd like to download all the references automatically. For this, I have tried to use RCurl, but I can't seem to get it working: I always get error code "403 Forbidden" from the web server.
2010 Aug 04
2
Finding the right url for RCurl
Hi all, I am using RCurl to try and download data from a website, but I'm having trouble finding out what URL to use. Here is the site: http://www.invescopowershares.com/products/holdings.aspx?ticker=PGX See how in the upper right, above the displayed sheet, there's a link to download the data as a .csv file? When I hit "copy url" and paste into getURL in R, it doesn't
2013 Aug 25
2
RCurl cookiejar
R-helpers, When I use cURL in the Terminal: curl --cookie-jar cookie.txt --url "http://corpusdelespanol.org/x.asp" --user-agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20100101 Firefox/23.0" --location --include a cookie file "cookie.txt" is saved to my working directory. However, when I try what I think is the equivalent command in R with RCurl:
2010 Jul 03
1
XML and RCurl: problem with encoding (htmlTreeParse)
Hi All, First method:- >library(XML) >theurl <- "http://home.sina.com" >download.file(theurl, "tmp.html") >txt <- readLines("tmp.html") >txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes = TRUE) >g <- xpathSApply(txt, "//p", function(x) xmlValue(x)) >head(grep(" ", g, value=T)) [1] " |
2008 Aug 27
1
RCurl: using netrc with curlPerform
Hello, I am having trouble getting the curlPerform function to authenticate using the .netrc file. From the documentation I've read it certainly seems as though this function should be able to authenticate via the .netrc file. The example I am using here comes from the "R as a Web Client- the RCurl package" paper and demonstrates using the .netrc file to access the
2012 Jul 23
1
[RCurl] HTTP 404 Status
I am trying to get the contents of a REST response: getURL("http://localhost/myweb-app/rest-ws") This is a web application (myweb-app) which is providing a REST web service (rest-ws)... Unfortunately, the HTTP status sent back is 404. If I request the URL using Chrome/IE, I get an HTTP status 200 OK. In Opera the request does not succeed either. I am using 2.15.1 (Win7, 64Bit) and just
2009 Jun 02
1
Problem downloading webpages using batchfiles and RCurl from command line in Vista Basic - couldn't connect to host
Dear all, I am having a problem downloading webpages through R when I run it in the DOS window under Windows Vista Basic. I have downloaded the batchfiles from http://code.google.com/p/batchfiles/ and have successfully set the PATH. I open up 'Command Prompt' in Vista and type (after the C:\...> stuff): ### START ### C:\Users\Karen>Rscript -e "library(RCurl);
2010 Nov 14
1
RCurl and cookies in POST requests
Hello. I know that it's usually possible to write cookies to a cookie file by removing the curl handle and doing a gc() call. I can do this with getURL(), but I just can't obtain the same results with postForm(). If I use: curlHandle <- getCurlHandle(cookiefile=FILE, cookiejar=FILE) and then do: getURL("http://example.com/script.cgi", curl=curlHandle) rm(curlHandle) gc() it's
2012 Oct 11
1
Problems with getURL (RCurl) to obtain list files of an ftp directory
Dear all, I have a problem with the command 'getURL' from the RCurl package, which I have been using to obtain an FTP directory list from the MOD16 (ET, DSI) products, and then to download them (part of the script by Tomislav Hengl, spatial-analyst). Instead of the list of files (from FTP), I am getting the complete HTML code. Does anyone know why this might happen? These are the steps I
2009 Aug 19
2
RGoogleDocs/RCurl through proxy
Dear list, I am trying to use RGoogleDocs, but I am connecting through a proxy server. I know RCurl is used for the connection, which should be able to deal with proxies and such. How do I set this up for RCurl? And can I use those settings with RGoogleDocs as well? I have the name of the proxy server and the port number. (Windows XP). thanks, Remko
2008 Oct 01
1
changing 'https' to 'http' when using download.file(), any side effects or just use RCurl?
Dear R-Help, From reading the help file, it is my understanding that the download.file() function does not support HTTPS connections. So therefore, understandably, the following produces an error: ### R Code > url <- "https://stat.ethz.ch/pipermail/r-help/2008-October/thread.html" > destfile <- "//PFO-SBS001/Redirected/tonyb/Desktop/R_web_test/tmp.txt" >
2013 Feb 21
4
Getting htmlParse to work with Hebrew? (on windows)
Hello dear R-help mailing list. Looks like the same issue in Russian: library(RCurl) library(XML) u = " http://www.cian.ru/cat.php?deal_type=2&obl_id=1&room1=1" a = getURL(u) a # Here - the Russian is fine. a2 <- htmlParse(a) a2 # Here it is a mess... None of these seem to fix it: htmlParse(a, encoding = "windows-1251") htmlParse(a, encoding =
2009 Oct 15
1
Removing Embedded Null characters from text/html
Hi, I'm trying to download some data from the web and am running into problems with 'embedded null' characters. These seem to indicate to R that it should stop processing the page so I'd like to remove them. I've been looking around and can't seem to identify exactly what the character is and consequently how to remove it. # THE CODE WORKS ON THIS PAGE library(RCurl)
2011 May 20
4
source and localhost
Dear List, I have problems with the function source() using a URL of the kind: http://localhost:5984/path/fn.R I receive Fehler in file(file, "r", encoding = encoding) : kann Verbindung nicht öffnen Zusätzlich: Warnmeldung: In file(file, "r", encoding = encoding) : öffnen fehlgeschlagen: HTTP Status war '502 cannotconnect' The URL itself is ok since I can: -
2013 Apr 25
2
extracting tables from web pages?
Hello: What tools would you recommend for extracting the table of members of the US House of Representatives from "http://house.gov/representatives/" and "http://en.wikipedia.org/wiki/List_of_current_members_of_the_United_States_House_of_Representatives_by_age"? I started writing something using getURL{RCurl}. However, I'm getting bogged down
2010 Feb 10
3
Using R to format a file using a server (PDB to PQR file)
I am trying to write a program in R that takes a PDB file and converts it to a PQR file. This task is generally simple using the website http://pdb2pqr-1.wustl.edu/pdb2pqr/. How can I use R to submit a PDB file (that I have on hand) to the site's "upload PDB file" input, run the conversion, and get the result back as a PQR file? Thanks for your help.