Displaying 20 results from an estimated 5000 matches similar to: "Bug with memory allocation when loading Rdata files iteratively?"
2011 May 30
1
Need help reading website info with XML package and XPath
Hi, I'm looking for help extracting some information from the Zillow website.
I'd like to do this for the general case where I manually change the address
by modifying the url (see code below). With the url containing the address,
I'd like to be able to extract the same information each time. The specific
information I'd like to be able to extract includes the homedetails url,
price
2010 Mar 15
1
XML: Slower parsing over time with htmlTreeParse()
Sorry, I listed the wrong package in the header of my previous post!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Dear List,
has any of you experienced a significant increase in the time it takes to
parse a URL via "htmlTreeParse()" when this function is called
2009 Nov 25
2
XML package example code?
I'm interested in parsing an HTML page. I should use XML, right? Could
somebody show me some example code? Is there a tutorial for this
package?
2009 Sep 23
3
retrieve certain part from html
Dear All,
Can someone please show me how to get a certain part of a long HTML
string?
e.g.
"<td><a href='2005-01.html'>2005-01</a></td><td><a
href='2006-01.html'>2006-01</a></td><td><a
href='2007-01.html'>2007-01</a></td><td><a
2009 May 12
2
import HTML tables
Hello,
I was wondering if there is a function in R that imports tables directly
from a HTML document. I know there are functions (say, getURL() from {RCurl}
) that download the entire page source, but here I refer to something like
google document's function importHTML() (if you don't know this function, go
check it out, it's very useful). Anyway, if someone knows of something that does this
2007 Nov 18
4
Read HTML table
You can use htmlTreeParse and xpathApply from the XML library.
something like:
xpathApply(htmlTreeParse("http://blabla", useInternalNodes = TRUE), "//td",
xmlValue)
should do it.
Gamma wrote:
>
> anyone care to explain how to read a html table, it's streaming data
> (updated every second) and i am looking for a suitable function.
>
> The imported html
2012 May 28
1
Rcurl, postForm()
Dear colleagues,
Could I get some assistance using postForm() to scrape the business names and addresses at this website:
http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx
I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic
2011 May 28
1
newbie xml parsing question
I am trying to read some data off the zillow site. Newbie to xml, html,
parsing and the xml package. I've been able to load the web page I'm
interested in with the following code but I'm not sure of the next step to get
the information I'm interested in into R :
library(XML)
url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb"
doc <-
2009 Dec 31
3
XML and RCurl: problem with encoding (htmlTreeParse)
Hi,
I'm trying to get data from a web page and modify it in R. I have a
problem with encoding. I'm not able to get
the encoding right in the htmlTreeParse command. See below:
> library(RCurl)
> library(XML)
>
> site <- getURL("http://www.aarresaari.net/jobboard/jobs.html")
> txt <- readLines(tc <- textConnection(site)); close(tc)
> txt <- htmlTreeParse(txt,
2008 Nov 04
2
How to suppress errors from htmlTreeParse() function in XML package?
Dear R-help,
The following code downloads an html document into variable 'doc' and
then stores an internal representation into variable 'html.tree'. Even
if the html code is malformed, this still works which is fantastic.
However, as in the example below, I do get some output from R in the
console which I would like to suppress somehow, so I can keep my
window a bit cleaner.
I
2011 Jul 05
2
Stuck ...can't get sapply and xmlTreeParse working
Can't seem to get the code below working. It gets stuck on line 24 inside the
function hm; comments show the line in question. The function hm is called
by sapply and is at the bottom of the code. Other stuff above line 24 works
correctly including the first couple of lines of the function hm. Should I
be using a different apply function or am I doing something wrong with
xmlTreeParse ?
2010 Aug 30
4
getNodeSet - what am I doing wrong?
Hi,
Why is the following returning a node set of length 0:
> library(XML)
> test <- xmlTreeParse(
> "http://www.unimod.org/xml/unimod_tables.xml",useInternalNodes=TRUE)
> getNodeSet(test,"//modifications_row")
Thanks for any hint.
Joh
2008 Oct 06
3
Extracting text from html code using the RCurl package.
Dear R-help,
I want to download the text from a web page, however what i end up
with is the html code. Is there some option that i am missing in the
RCurl package? Or is there another way to achieve this? This is the
code i am using:
> library(RCurl)
> my.url <- 'https://stat.ethz.ch/mailman/listinfo/r-help'
> html.file <- getURI(my.url, ssl.verifyhost = FALSE,
2011 Aug 25
1
R hangs after htmlTreeParse
Dear colleagues,
I'm trying to parse the html content from this webpage:
2011 Apr 06
1
Treatment of xml-stylesheet processing instructions in XML module
Hello again,
Another stumble here that is defeating me.
I try:
a<-readLines(url("http://feeds.feedburner.com/grokin"))
t<-XML::xmlTreeParse(a, ignoreBlanks=TRUE, replaceEntities=FALSE,
asText=TRUE)
elem<- XML::getNodeSet(XML::xmlRoot(t),"/rss/channel/item")[[1]]
And I get:
Start tag expected, '<' not found
Error: 1: Start tag expected, '<' not
2010 Jan 15
1
Can an object reference itself?
Dear List,
I am not really familiar with any other language than R, but I've heard that
in other languages there is something called "self referencing".
Here's what I'm trying to get an answer for:
Suppose there is a function that takes as its input a value of a slot of an
S4 object. The function itself is stored in another slot of the SAME S4
object. Is it then possible to have the function
2009 Jul 01
3
is there a way to extract data from web pages through some R function?
I deal with a huge amount of Biology data stored in different databases.
The databases belonging to the Bioconductor organization can be accessed through Bioconductor packages.
Unfortunately, some useful data is stored in databases such as miRDB and miRecords, which offer just an
interactive HTML interface. See for instance
http://mirdb.org/cgi-bin/search.cgi,
2011 Aug 30
1
Why does loading saved/cached objects add significantly to RAM consumption?
Dear list,
I make use of cached objects extensively for time consuming computations
and yesterday I happened to notice some very strange behavior in that
respect:
When I execute a given computation whose result I'd like to cache (tried
both saving it as '.Rdata' and via package 'R.cache' which uses its own
filetype '.Rcache'), my R session consumes about 200 MB of
2009 Oct 15
1
Removing Embedded Null characters from text/html
Hi,
I'm trying to download some data from the web and am running into
problems with 'embedded null' characters. These seem to indicate to R
that it should stop processing the page so I'd like to remove them.
I've been looking around and can't seem to identify exactly what the
character is and consequently how to remove it.
# THE CODE WORKS ON THIS PAGE
library(RCurl)
2009 Sep 24
2
Downloading data from the internet
Hi all,
I want to download data from those two different sources, directly into R :
http://www.rateinflation.com/consumer-price-index/usa-cpi.php
http://eaindustry.nic.in/asp2/list_d.asp
The first one is the CPI of the US and the second one is the WPI of India. Can anyone please give
me a clue how to download them directly into R? I want to turn them into zoo
objects for further analysis.
Thanks,