thr3ads.net - similar to: "How to set cookies in RCurl"

Displaying 20 results from an estimated 500 matches similar to: "How to set cookies in RCurl"

2012 Sep 19

scraping with session cookies

Hi, I am starting coding in R and one of the things that I want to do is to scrape some data from the web. The problem that I am having is that I cannot get past the disclaimer page (which produces a session cookie). I have been able to collect some ideas and combine them in the code below but I don't get past the disclaimer page. I am trying to agree to the disclaimer with the postForm and write

RCurl and cookies in POST requests

2010 Nov 14

Hello. I know that it's usually possible to write cookies to a cookie file by removing the curl handle and doing a gc() call. I can do this with getURL(), but I just can't obtain the same results with postForm(). If I use: curlHandle <- getCurlHandle(cookiefile=FILE, cookiejar=FILE) and then do: getURL("http://example.com/script.cgi", curl=curlHandle); rm(curlHandle); gc() it's

RCurl and Google Scholar's EndNote references

2009 Sep 17

Hi! I've performed a Google Scholar Search using a query, let's say "Frank Harrell", and parsed the links to the EndNote references from the resulting HTML code. Now I'd like to download all the references automatically. For this, I have tried to use RCurl, but I can't seem to get it working: I always get error code "403 Forbidden" from the web server.

read htm table error

2012 Aug 09

Hi, I am using R version 2.15 and I haven't been able to read an HTML table. Following is my code and error message. Error in htmlParse(doc) : error in creating parser for http://en.wikipedia.org/wiki/Brazil_national_football_team theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team" tables <- readHTMLTable(theurl) Regards, Kiung

Rcurl, postForm()

2012 May 28

Dear colleagues, Could I get some assistance using postForm() to scrape the business names and addresses at this website: http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic

Converting scraped data

2010 Oct 06

Dear Colleagues, I used this code to scrape data from the URL contained within. This code should be reproducible. require("XML") library(XML) theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm" tables <- readHTMLTable(theurl) n.rows <- unlist(lapply(tables, function(t) dim(t)[1])) class(tables) test<-data.frame(tables, stringsAsFactors=FALSE)

google login via RCurl

2012 Feb 09

Hi, Has anyone managed to log in to a Google account via RCurl? All info on the web appears to be out of date. (1) Both RGoogleDocs and RGoogleTrends on omegahat appear to have been withdrawn: http://www.omegahat.org/RGoogleDocs/ http://www.omegahat.org/RGoogleTrends/ Does anyone know why? (2) The closest I can get is based on code from

Dependencies of Imports not attached?

2013 May 08

Encountered an error in scripting, which can be reproduced using Rscript as follows: $ Rscript -e "library(httr); handle('http://cran.r-project.org')" Error in getCurlHandle(cookiefile = cookie_path, .defaults = list()) : could not find function "getClass" Calls: handle -> getCurlHandle or by starting R without the methods package attached: $

Create single vector after looping through multiple data frames with GREP

2010 Oct 10

Hello all, I changed the subject line of the e-mail, because the question I'm posing now is different than the first one. I hope that this is proper etiquette. However, the original chain is included below. I've incorporated bits of both Ethan and Brian's code into the script below, but there's one aspect I can't get my head around. I'm totally new to programming

postForm() in RCurl and library RHTMLForms

2010 Nov 04

Hi RUsers, Suppose I want to see the data on the website url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm" for the index "S&P CNX NIFTY" for dates "FromDate"="01-11-2010","ToDate"="02-11-2010" and then read the HTML table from the page using readHTMLTable(). I am using this code webpage <-

RCurl cookiejar

2013 Aug 25

R-helpers, When I use cURL in the Terminal: curl --cookie-jar cookie.txt --url "http://corpusdelespanol.org/x.asp" --user-agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20100101 Firefox/23.0" --location --include a cookie file "cookie.txt" is saved to my working directory. However, when I try what I think is the equivalent command in R with RCurl:

postForm() in RCurl and library RHTMLForms

2012 Dec 02

Checking for monotonic sequence

2011 Nov 16

I am scraping data from a web page using XML (excellent package BTW - that's scraping data the easy way!). So far, I've got the code: tables <- readHTMLTable(theurl) rhf <- tables$tabResHistFull div1 <- rhf[which(rhf$V1=="Div ps"),] div1 which is giving me the result: V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 15

Finding the right url for RCurl

2010 Aug 04

Hi all, I am using RCurl to try and download data from a website, but I'm having trouble finding out what URL to use. Here is the site: http://www.invescopowershares.com/products/holdings.aspx?ticker=PGX See how in the upper right, above the displayed sheet, there's a link to download the data as a .csv file? When I hit "copy url" and paste into getURL in R, it doesn't

XML and RCurl: problem with encoding (htmlTreeParse)

2010 Jul 03

Hi All, First method:- >library(XML) >theurl <- "http://home.sina.com" >download.file(theurl, "tmp.html") >txt <- readLines("tmp.html") >txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes = TRUE) >g <- xpathSApply(txt, "//p", function(x) xmlValue(x)) >head(grep(" ", g, value=T)) [1] " |

Problem with readHTMLTable

2012 May 26

Hello All, I was trying to simply run readHTMLTable on the example published in the package, and on a page I was working on. So running: u = "http://en.wikipedia.org/wiki/List_of_countries_by_population" tables = readHTMLTable(u) returns the following error: Error in tb[["thead"]] : subscript out of bounds Looking up this error on the web didn't give me any hint. Is

readHTMLTable function - unable to find an inherited method ~ for signature "NULL"

2012 Jun 14

Hi R experts, I have been playing with library(XML) recently and found out that readHTMLTable works flawlessly for some websites, but it does give me an error like below ... Error in function (classes, fdef, mtable) : unable to find an inherited method for function "readHTMLTable", for signature "NULL" let's say..for example, this code works fine a

How to suppress errors generated by readHTMLTable?

2009 Nov 26

library(XML) download.file('http://polya.umdnj.edu/polya_db2/gene.php?llid=109079&unigene=&submit=Submit','index.html') tables=readHTMLTable("index.html",error=function(...){}) tables readHTMLTable gives me the following errors. Could somebody let me know how to suppress them? Opening and ending tag mismatch: center and table htmlParseEntityRef: expecting

htmlParse (from XML library) working sporadically in the same code

2013 Mar 20

I am using htmlParse from the XML library on a particular website. Sometimes the code fails, sometimes it works; most of the time it doesn't and I cannot see why. The file I am trying to parse is http://www.londonstockexchange.com/exchange/prices-and-markets/international-markets/indices/home/sp-500.html?page=0 Sometimes the following code works n<-readHTMLTable(htmlParse(url)) But most of the

readHTMLTable (XML package)

2013 Jan 15

Hi, I am using XML::readHTMLTable and getting the below error. Does anyone know why? Does this function not work with https? I didn't see anything in help about that. > library(XML) > wampage<-readHTMLTable('https://hr-workforce-analytics.llnl.gov/wf_pi_pop.html',1) Error in htmlParse(doc) : File https://hr-workforce-analytics.llnl.gov/wf_pi_pop.html does not exist Dan
