Displaying 20 results from an estimated 600 matches similar to: "htmlParse (from XML library) working sporadically in the same code"
2011 Jun 07
1
XML segfault on some architectures
Hi,
I found an architecture-specific segfault problem with the XML package. I originally found the problem using the parseKGML2Graph function in the Bioconductor KEGGgraph package, but as far as I can tell the underlying issue is in the xmlTreeParse function, which is called by parseKGML2Graph.
I'm trying this piece of code, from the xmlTreeParse help page:
library(XML)
fileName <-
2012 May 15
1
KEGGSOAP installation error
Hello,
I'm trying to install KEGGSOAP with Bioconductor, but I'm facing this
problem:
> biocLite("KEGGSOAP")
BioC_mirror: http://bioconductor.org
Using R version 2.15, BiocInstaller version 1.4.4.
Installing package(s) 'KEGGSOAP'
trying URL
'http://www.bioconductor.org/packages/2.10/bioc/src/contrib/KEGGSOAP_1.30.0.tar.gz'
Content type
2010 Nov 04
3
postForm() in RCurl and library RHTMLForms
Hi RUsers,
Suppose I want to see the data on the website
url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"
for the index "S&P CNX NIFTY" for
dates "FromDate"="01-11-2010","ToDate"="02-11-2010"
and then read the HTML table from the page using readHTMLTable().
I am using this code
webpage <-
2011 Aug 29
1
reading tables from multiple HTML pages
Hi, I'm a beginner to R and am having some problems scraping data from tables in
HTML using the XML package. I have included some code below.
I am trying to loop through a series of html pages, each of which contains a
single table from which I want to scrape data. However, some of the pages
are blank, so htmlParse() throws an error when it reaches them.
The loop then closes out and I
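One way to keep such a loop running past blank pages is to wrap the parse step in tryCatch() and skip pages that yield nothing. The following is a minimal sketch, not the poster's code: the helper name `read_first_table` and the inline HTML strings (stand-ins for downloaded pages) are assumptions made for illustration.

```r
library(XML)

# Made-up stand-ins for downloaded pages; the second one is blank.
pages <- c(
  "<html><body><table><tr><th>x</th><th>y</th></tr><tr><td>1</td><td>2</td></tr></table></body></html>",
  "",
  "<html><body><table><tr><th>x</th><th>y</th></tr><tr><td>3</td><td>4</td></tr></table></body></html>"
)

# Hypothetical helper: return the first table of a page, or NULL if the
# page is blank, has no table, or cannot be parsed.
read_first_table <- function(html) {
  if (!nzchar(trimws(html))) return(NULL)          # skip blank pages up front
  tryCatch({
    doc  <- htmlParse(html, asText = TRUE)
    tabs <- readHTMLTable(doc, stringsAsFactors = FALSE)
    if (length(tabs) == 0) NULL else tabs[[1]]
  }, error = function(e) NULL)                     # any parse error: skip page
}

results  <- lapply(pages, read_first_table)
ok       <- Filter(Negate(is.null), results)       # drop the skipped pages
combined <- do.call(rbind, ok)                     # stack the remaining tables
```

Returning NULL rather than stopping lets the caller decide afterwards which pages were skipped, for example with `which(sapply(results, is.null))`.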
2012 May 21
1
htmlParse Error
I am trying to parse a webpage using the htmlParse command in XML package as
follows:
library(XML)
u = "http://en.wikipedia.org/wiki/World_population"
doc = htmlParse(u)
I get the following error:
Error in htmlParse(u) :
error in creating parser for http://en.wikipedia.org/wiki/World_population
I am using R 2.13.1 (32-bit version) on 64-bit Windows. (I tried
installing it in
2012 Aug 09
2
read htm table error
Hi, I am using R version 2.15 and I haven't been able to read an HTML table. My code and the error message follow.
Error in htmlParse(doc) :
error in creating parser for http://en.wikipedia.org/wiki/Brazil_national_football_team
theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team"
tables <- readHTMLTable(theurl)
Regards,
Kiung
2012 Jan 30
1
Getting htmlParse to work with Hebrew? (on windows)
Hello dear R-help mailing list.
I would like htmlParse to work well with Hebrew, but it keeps
scrambling the Hebrew text in pages I feed into it.
For example:
# why can't I parse the Hebrew correctly?
library(RCurl)
library(XML)
u = "http://humus101.com/?p=2737"
a = getURL(u)
a # Here - the hebrew is fine.
a2 <- htmlParse(a)
a2 # Here it is a mess...
None of
2013 Feb 21
4
Getting htmlParse to work with Hebrew? (on windows)
Hello dear R-help mailing list.
Looks like the same issue in Russian:
library(RCurl)
library(XML)
u = "http://www.cian.ru/cat.php?deal_type=2&obl_id=1&room1=1"
a = getURL(u)
a # Here - the Russian is fine.
a2 <- htmlParse(a)
a2 # Here it is a mess...
None of these seem to fix it:
htmlParse(a, encoding = "windows-1251")
htmlParse(a, encoding =
2013 Jan 15
1
readHTMLTable (XML package)
Hi,
I am using XML::readHTMLTable and getting the below error. Does anyone know why? Does this function not work with https? I didn't see anything in help about that.
> library(XML)
> wampage<-readHTMLTable('https://hr-workforce-analytics.llnl.gov/wf_pi_pop.html',1)
Error in htmlParse(doc) :
File https://hr-workforce-analytics.llnl.gov/wf_pi_pop.html does not exist
Dan
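A possible workaround for the HTTPS case above, sketched under assumptions: as far as I know, the downloader built into the XML package (via libxml2) handles http and ftp URLs but not https, so one option is to fetch the page with an HTTPS-capable client such as RCurl's getURL() and pass the text to readHTMLTable(). The helper names `fetch_page` and `tables_from_text` are hypothetical, and the inline HTML is made up to stand in for a downloaded page.

```r
library(XML)

# Hypothetical helper: download a page as text with RCurl, which supports https.
fetch_page <- function(url) {
  if (!requireNamespace("RCurl", quietly = TRUE)) stop("RCurl is required")
  RCurl::getURL(url, followlocation = TRUE)
}

# Hypothetical helper: parse already-downloaded HTML text and read its tables.
tables_from_text <- function(html) {
  doc <- htmlParse(html, asText = TRUE)   # asText: treat input as content, not a path
  readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Offline demonstration with made-up HTML.
sample_html <- "<html><body><table><tr><th>name</th><th>n</th></tr><tr><td>a</td><td>1</td></tr></table></body></html>"
tabs <- tables_from_text(sample_html)

# With network access, the two steps would be combined as:
# tabs <- tables_from_text(fetch_page(url))   # url: the https address of interest
```

Separating the download from the parse also makes it easier to see which of the two steps is failing.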
2010 Mar 18
1
Do colClasses in readHTMLTable (XML Package) work?
Hi,
I can't get the colClasses option to work in the readHTMLTable function
of the XML package. Here's a code fragment:
require("XML")
doc <- "http://www.nber.org/cycles/cyclesmain.html"
# The main table is the second one because it's embedded in the page table.
table <- getNodeSet(htmlParse(doc), "//table")[[2]]
xt
2011 Sep 05
2
htmlParse hangs or crashes
Dear colleagues,
Each time I use htmlParse, R crashes or hangs. The URL I'd like to parse is included below, as are the results of a series of basic commands that describe what I'm experiencing. The results of sessionInfo() are attached at the bottom of the message.
The thing is, htmlTreeParse appears to work just fine, although it doesn't appear to contain the information I need (the
2009 Jun 30
1
How to pass parameters to htmlParse Bank of Canada html pages
To get USDCAD rates from the Bank of Canada, we first go to
url <- "http://banqueducanada.ca/en/rates/exchange-avg.html"
select 12 months under "Rates for the past" and click the "Get Rates" button. Then
the page moves to
address <- "http://banqueducanada.ca/cgi-bin/famecgi_fdps"
and the rates show in the html page.
htmlParse() can read the html document but
2012 Jun 07
1
How to set cookies in RCurl
Hi,
I am trying to access a website and read its content. The website is a
restricted-access website that I reach through a proxy server (which
therefore requires me to enable cookies). I am having trouble getting RCurl
to receive and send cookies.
The following lines give me:
library(RCurl)
library(XML)
url <- "http://www.theurl.com"
content <- readHTMLTable(url)
content
2012 Sep 04
0
get only little part of html with htmlParse
Here is my code.
There are three methods to get text to be parsed by the htmlParse function.
1. A file on my computer
options(encoding="gbk")
library(XML)
xmltext1 <- htmlParse("/home/tiger/Desktop/27174.htm" )
# /home/tiger/Desktop/27174.htm is a copy of http://www.jb51.net/article/27174.htm downloaded to my computer.
2. A URL
options(encoding="gbk")
2012 Oct 17
0
postForm() in RCurl and library RHTMLForms
Hi R Users,
I want to get the data from the given URL for the period 10/09/2012 to 15/10/2012.
I don't know how to pass the parameters.
library(RHTMLForms)
> ff = getHTMLFormDescription("
2013 Feb 28
0
Scraping data from website---Error in htmlParse: error in creating parser
I'm trying to scrape football projections from accuscore.com for the
different positions (right now the projections are set to zeros, but that
will change). I can get the QB projections, but I can't get the
projections for any of the other positions (e.g., RB). How can I get the
RB projections?
I'm not sure what the actual website for the RB and other projections is.
When I go to
2009 Nov 26
1
How to suppress errors generated by readHTMLTable?
library(XML)
download.file('http://polya.umdnj.edu/polya_db2/gene.php?llid=109079&unigene=&submit=Submit','index.html')
tables=readHTMLTable("index.html",error=function(...){})
tables
readHTMLTable gives me the following errors. Could somebody let me
know how to suppress them?
Opening and ending tag mismatch: center and table
htmlParseEntityRef: expecting
2009 May 12
2
import HTML tables
Hello,
I was wondering if there is a function in R that imports tables directly
from an HTML document. I know there are functions (say, getURL() from {RCurl})
that download the entire page source, but here I mean something like
Google Docs' importHTML() function (if you don't know this function, go
check it out, it's very useful). Anyway, if someone knows of something that does this
2012 Sep 14
0
htmlParse pop ups over web pages
Hello All,
I am trying to write a routine that loops over some links and parses those links using htmlParse. The problem is that one of the links may display a pop up window on top of that link's web page. If there is a pop up, the routine bombs and I get an error message that the url doesn't exist.
Does the XML package (or perhaps another package) provide a way to deal with this
2012 Apr 16
1
grep and XML
Hi all:
I struggle a lot scraping web data. I still haven't got a handle on the XML package.
I'd like to get particular exchange rates from this table:
https://raw.github.com/currencybot/open-exchange-rates/master/latest.json
This is the code that I'm working with:
library(RCurl)
library(XML)
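Since the address above ends in .json, the content is JSON rather than HTML or XML, so a JSON parser may be a better fit than grep or the XML package. The sketch below assumes the jsonlite package and uses a made-up inline string in place of the downloaded file; the field names `base` and `rates` and all values are assumptions about the file's layout, not data from the source.

```r
library(jsonlite)

# Made-up sample standing in for the downloaded file (values are not real rates).
json_text <- '{"base": "USD", "rates": {"EUR": 0.5, "GBP": 0.25, "JPY": 100}}'

parsed <- fromJSON(json_text)

# Pick out particular currencies by name; unlist() gives a named numeric vector.
wanted <- unlist(parsed$rates[c("EUR", "GBP")])
```

With network access, the text would first be fetched (for example with RCurl's getURL(), as loaded in the post) and then passed to fromJSON() in the same way.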