similar to: R as a web scraping tool using RCurl

Displaying 20 results from an estimated 4000 matches similar to: "R as a web scraping tool using RCurl"

2012 May 14
3
Scraping a web page.
Folks, I want to scrape a series of web-page sources for strings like the following: "/en/Ships/A-8605507.html" "/en/Ships/Aalborg-8122830.html" which appear in an href inside an <a> tag inside a <div> tag inside a table. In fact all I want is the (exactly) 7-digit number before ".html". The good news is that as far as I can tell the <a>
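A minimal sketch of one way to pull out those 7-digit numbers with base R regular expressions. The HTML fragment and the helper name extract_ship_ids are made up for illustration, since the real page markup is not shown in the post, and the pattern assumes the number always follows a hyphen, as it does in both quoted examples.

```r
# Hypothetical page source; the real markup is not shown in the post.
html <- c(
  '<table><tr><td><div><a href="/en/Ships/A-8605507.html">A</a></div></td></tr>',
  '<tr><td><div><a href="/en/Ships/Aalborg-8122830.html">Aalborg</a></div></td></tr></table>'
)

# extract_ship_ids() is an illustrative helper, not part of any package.
extract_ship_ids <- function(lines) {
  # Exactly 7 digits, preceded by a hyphen and followed by ".html".
  # The lookbehind/lookahead keep only the digits in the match.
  pattern <- "(?<=-)[0-9]{7}(?=\\.html)"
  hits <- regmatches(lines, gregexpr(pattern, lines, perl = TRUE))
  unlist(hits)
}

ids <- extract_ship_ids(html)
print(ids)
```

For real pages the character vector could come from readLines() on the URL or from RCurl's getURL(); parsing the document with the XML package and selecting the href attributes first would be more robust than running a regex over raw source.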
2012 Sep 19
1
scraping with session cookies
Hi, I am starting coding in R and one of the things that I want to do is to scrape some data from the web. The problem that I am having is that I cannot get past the disclaimer page (which produces a session cookie). I have been able to collect some ideas and combine them in the code below but I don't get past the disclaimer page. I am trying to agree to the disclaimer with the postForm and write
2012 May 28
1
Rcurl, postForm()
Dear colleagues, Could I get some assistance using postForm() to scrape the business names and addresses at this website: http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic
2013 May 08
1
Dependencies of Imports not attached?
Encountered an error in scripting, which can be reproduced using Rscript as follows: $ Rscript -e "library(httr); handle('http://cran.r-project.org')" Error in getCurlHandle(cookiefile = cookie_path, .defaults = list()) : could not find function "getClass" Calls: handle -> getCurlHandle or by starting R without the methods package attached: $
2011 May 06
0
My First Attempt at Screen Scraping with R
Hello Folks, I'm working on trying to scrape my first web site and ran into an issue because I really don't know anything about regular expressions in R. library(XML) library(RCurl) site <- "http://thisorthat.com/leader/month" site.doc <- htmlParse(site, ?, xmlValue) At the ?, I realize that I need to insert a regex command which will decipher the contents of the
2012 Apr 16
1
grep and XML
Hi all: I struggle a lot scraping web data. I still haven't got a handle on the XML package. I'd like to get particular exchange rates from this table: https://raw.github.com/currencybot/open-exchange-rates/master/latest.json This is the code that I'm working with: library(RCurl) library(XML)
2007 Mar 21
2
problem with RCurl install on Unix
I am having trouble getting an install of RCurl to work properly on a Unix server. The steps I have taken are: 1. installed cUrl from source without difficulty 2. installed RCurl from source using the command ~/R_HOME/R-devel/bin/R CMD INSTALL -l ~/R_HOME/R-devel/library ~/RCurl_0.8-0.tar.gz I received no errors during this install 3. when I go back to R and require(RCurl), I get >
2008 Dec 01
1
[BioC] Rcurl 0.8-1 update for bioconductor 2.7
Hi Patrick, Greetings from !(sunny) Pittsburgh. What's the scoop on RCurl on Windows (XP)? I've tried to install RCurl_0.92-0.zip and RCurl_0.9-3.zip, with both R 2.7.2 and R 2.8.0 from the RGUI (utils:::menuInstallLocal), and get the error "Windows binary packages in zipfiles are not supported", which (according to Google's one and only hit) comes from a Perl script.
2006 Jan 27
1
Caching from screen scraping
Hi all, I need to do some screen scraping from my Rails app. Given an Ethernet (MAC) address, I scrape results from an internal web page that returns location and hostname. How can I cache the result from that screen scraping so as to be polite to the scrapee? I would like to expire the results daily. In Perl, I would use Cache::File. Can I use Rails caching for this? What's the best
2008 Jul 25
1
Installation error for RCurl in Redhat Enterprise 5
I am getting the following error while trying to install the RCurl library. I have checked that curl and libcurl.so.3 are already installed in /usr/bin > install.packages("RCurl") --- Please select a CRAN mirror for use in this session --- Loading Tcl/Tk interface ... done trying URL 'http://cran.hostingzero.net/src/contrib/RCurl_0.9-3.tar.gz' Content type
2011 Apr 03
1
problem in install RCurl in R (Ubuntu Linux)
I have some problems running R-cran's Demography package. The hmd.mx function needs RCurl. I tried to install RCurl, but met the following error: ********************************************************************* ... * installing *source* package ‘RCurl’ ... checking for curl-config... no Cannot find curl-config ERROR: configuration failed for package ‘RCurl’ * removing
2011 Jun 06
1
RCurl and kerberos
Dear list, I would like to call a Kerberos-authenticated web-service from within R. Curl can do it: $ curl --negotiate -u : "http://my.web.service/" so I would expect that RCurl also has the capability, but I have not been able to find the correct options to set. listCurlOptions() does not return anything with negotiate, and searching the source of RCurl, the only thing I found was
2014 Jan 02
2
Installing RCurl -
Dear all, I am trying to install RCurl (because I want to install devtools) and to do so I've been informed that I must install one of the packages libcurl4-openssl-dev libcurl4-nss-dev No matter which one I install I get the following error from R: * installing *source* package ‘RCurl’ ... ** package ‘RCurl’ successfully unpacked and MD5 sums checked checking for curl-config...
2012 Jun 07
1
How to set cookies in RCurl
Hi, I am trying to access a website and read its content. The website is a restricted access website that I access through a proxy server (which therefore requires me to enable cookies). I have problems getting RCurl to receive and send cookies. The following lines give me: library(RCurl) library(XML) url <- "http://www.theurl.com" content <- readHTMLTable(url) content
2010 Dec 03
1
Problem installing RCurl
I have 64-bit R 2.12.0 installed on Solaris 10 on Sun SPARC. When I tried to install RCurl, it failed with the following lines, ............... Version has CURLOPT_SSL_SESSIONID_CACHE libcurl version: libcurl 7.19.6 configure: creating ./config.status config.status: creating src/Makevars ** libs cc -xc99 -m64 -xarch=sparcvis2 -I/apps/sparcv9/R-2.12.0/lib/R/include -I/opt/csw/include
2008 Sep 17
2
RCurl compilation error on ubuntu hardy
Dear list members, I encountered this problem and the solution pointed out in a previous thread did not work for me (e.g. install.packages("RCurl", repos = "http://www.omegahat.org/R")). I work with Ubuntu Hardy, and installed R 2.6.2 via apt-get. I really need RCurl in order to use biomaRt ... any help would be greatly appreciated. Best wishes, Emmanuel
2008 Aug 27
1
RCurl: using netrc with curlPerform
Hello, I am having trouble getting the curlPerform function to authenticate using the .netrc file. From the documentation I've read it certainly seems as though this function should be able to authenticate via the .netrc file. The example I am using here comes from the "R as a Web Client- the RCurl package" paper and demonstrates using the .netrc file to access the
2009 Feb 26
2
ftp fetch using RCurl?
Hi everyone, I have to fetch about 300 to 500 zipped archives from a remote ftp server. Each of the archives is about 1 MB. I know I can get it done by using download.file() in R, but I am curious whether there is a faster way to do this using RCurl. For example, are there some parameters that I can set so that the connection does not need to be rebuilt, etc. An even simpler question is, how can I
2007 Oct 16
1
problem with RCurl 0.8-1 installation on Debian Etch
Dear R-Users, I am having some trouble getting an installation of RCurl 0.8-1 to work properly on a Debian (Etch) machine. The command 'R CMD INSTALL RCurl_0.8-1.tar.gz' yields the following error: Installing *source* package 'RCurl' ... checking for curl-config... no Cannot find curl-config ERROR: configuration failed for package 'RCurl' I do know that a file is