Displaying 20 results from an estimated 500 matches similar to: "How to set cookies in RCurl"
2012 Sep 19
1
scraping with session cookies
Hi, I am starting coding in r and one of the things that i want to do is to
scrape some data from the web.
The problem that I am having is that I cannot get passed the disclaimer
page (which produces a session cookie). I have been able to collect some
ideas and combine them in the code below but I dont get passed the
disclaimer page.
I am trying to agree the disclaimer with the postForm and write
2010 Nov 14
1
RCurl and cookies in POST requests
Hello.
I know that it's usually possible to write cookies to a cookie
file by removing the curl handle and doing a gc() call. I can do
this with getURL(), but I just can't obtain the same results with
postForm().
If I use:
curlHandle <- getCurlHandle(cookiefile=FILE, cookiejar=FILE)
and then do:
getURL(http://example.com/script.cgi, curl=curlHandle)
rm(curlHandle)
gc()
it's
2009 Sep 17
1
RCurl and Google Scholar's EndNote references
Hi!
I've performed a Google Scholar Search using a query, let's say "Frank
Harrell", and parsed the links to the EndNote references from the resulting
HTML code. Now I'd like to download all the references automatically. For
this, I have tried to use RCurl, but I can't seem to get it working: I
always get error code "403 Forbidden" from the web server.
2012 Aug 09
2
read htm table error
Hi I am using Version R 2.15 and I haven't been able read html table. Following is my code and error message.
Error in htmlParse(doc) :
error in creating parser for http://en.wikipedia.org/wiki/Brazil_national_football_team
theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team"
tables <- readHTMLTable(theurl)
Regards,
Kiung
[[alternative HTML version
2012 May 28
1
Rcurl, postForm()
Dear colleagues,
Could I get some assistance using postForm() to scrape the business names and addresses at this website:
http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx
I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic
2010 Oct 06
2
Converting scraped data
Dear Colleagues,
I used this code to scrape data from the URL conatined within. This
code should be reproducible.
require("XML")
library(XML)
theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm"
tables <- readHTMLTable(theurl)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
class(tables)
test<-data.frame(tables, stringsAsFactors=FALSE)
2012 Feb 09
0
google login via RCurl
Hi,
Can anyone manage to login to a google account via RCurl? All info on
the web appears to be out of date.
(1) both RGoogleDocs and RGoogleTrends on omegahat appears to be withdrawn:
http://www.omegahat.org/RGoogleDocs/
http://www.omegahat.org/RGoogleTrends/
Does anyone know why?
(2) The closest I can get is based on code from
2013 May 08
1
Dependencies of Imports not attached?
Encountered an error in scripting, which can be reproduced using Rscript as
follows:
$ Rscript -e "library(httr); handle('http://cran.r-project.org')"
Error in getCurlHandle(cookiefile = cookie_path, .defaults = list()) :
could not find function "getClass"
Calls: handle -> getCurlHandle
or by starting R without the methods package attached:
$
2010 Oct 10
1
Create single vector after looping through multiple data frames with GREP
Hello all,
I changed the subject line of the e-mail, because the question I''m posing now is different than the first one. I hope that this is proper etiquette. However, the original chain is included below.
I've incorporated bits of both Ethan and Brian's code into the script below, but there's one aspect I can't get my head around. I'm totally new to programming
2010 Nov 04
3
postForm() in RCurl and library RHTMLForms
Hi RUsers,
Suppose I want to see the data on the website
url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"
for the index "S&P CNX NIFTY" for
dates "FromDate"="01-11-2010","ToDate"="02-11-2010"
then read the html table from the page using readHTMLtable()
I am using this code
webpage <-
2013 Aug 25
2
RCurl cookiejar
R-helpers,
When I use cURL in the Terminal:
curl --cookie-jar cookie.txt --url "http://corpusdelespanol.org/x.asp" --user-agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20100101 Firefox/23.0" --location --include
a cookie file "cookie.txt" is saved to my working directory. However, when I try what I think is the equivalent command R with RCurl:
2012 Dec 02
1
postForm() in RCurl and library RHTMLForms
Hi RUsers,
Suppose I want to see the data on the website
url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"
for the index "S&P CNX NIFTY" for
dates "FromDate"="01-11-2010","ToDate"="02-11-2010"
then read the html table from the page using readHTMLtable()
I am using this code
webpage <-
2011 Nov 16
1
Checking for monotonic sequence
I am scraping data from a web page using XML (excellent package BTW - that's scraping data the easy way!).
So far, I've got the code:
tables <- readHTMLTable(theurl)
rhf <- tables$tabResHistFull
div1 <- rhf[which(rhf$V1=="Div ps"),]
div1
which is giving me the result:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
15
2010 Aug 04
2
Finding the right url for RCurl
Hi all,
I am using RCurl to try and download data from a website, but I'm having
trouble finding out what URL to use. Here is the site:
http://www.invescopowershares.com/products/holdings.aspx?ticker=PGX
See how in the upper right, above the displayed sheet, there's a link to
download the data as a .csv file? When I hit "copy url" and paste into
getURL in R, it doesn't
2010 Jul 03
1
XML and RCurl: problem with encoding (htmlTreeParse)
Hi All,
First method:-
>library(XML)
>theurl <- "http://home.sina.com"
>download.file(theurl, "tmp.html")
>txt <- readLines("tmp.html")
>txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes =
TRUE)
>g <- xpathSApply(txt, "//p", function(x) xmlValue(x))
>head(grep(" ", g, value=T))
[1] " |
2012 May 26
3
Problem with readHTMLTable
Hello All,
i was trying to simply run the readHTMLTable on the example published in the
package. And on a page I was working on. So running:
u = "http://en.wikipedia.org/wiki/List_of_countries_by_population"
tables = readHTMLTable(u)
returns the following error:
Error in tb[["thead"]] : subscript out of bounds
looking up this error on the web, didnt give me any hint. Is
2012 Jun 14
1
readHTMLTable function - unable to find an inherited method ~ for signature "NULL"
Hi R experts,
I have been playing with library(XML) recently and found out that
readHTMLTable workls flawlessly for some website, but it does give me an
error like below
... Error in function (classes, fdef, mtable) :
unable to find an inherited method for function "readHTMLTable", for
signature "NULL"
let's say..for example, this code works fine
a
2009 Nov 26
1
How to suppress errors generated by readHTMLTable?
library(XML)
download.file('http://polya.umdnj.edu/polya_db2/gene.php?llid=109079&unigene=&submit=Submit','index.html')
tables=readHTMLTable("index.html",error=function(...){})
tables
readHTMLTable gives me the following errors. Could somebody let me
know how to suppress them?
Opening and ending tag mismatch: center and table
htmlParseEntityRef: expecting
2013 Mar 20
1
htmlParse (from XML library) working sporadically in the same code
I am using htmlParse from XML library on a paricular website. Sometimes code fails, sometimes it works, most of the time id doesn't and i cannot see why. The file i am trying to parse is
http://www.londonstockexchange.com/exchange/prices-and-markets/international-markets/indices/home/sp-500.html?page=0
Sometimes the following code works
n<-readHTMLTable(htmlParse(url))
But most of the
2013 Jan 15
1
readHTMLTable (XML package)
Hi,
I am using XML::readHTMLTable and getting the below error. Does anyone know why? Does this function not work with https? I didn't see anything in help about that.
> library(XML)
> wampage<-readHTMLTable('https://hr-workforce-analytics.llnl.gov/wf_pi_pop.html',1)
Error in htmlParse(doc) :
File https://hr-workforce-analytics.llnl.gov/wf_pi_pop.html does not exist
Dan