thr3ads.net - similar to: "Bug with memory allocation when loading Rdata files iteratively?"

Displaying 20 results from an estimated 5000 matches similar to: "Bug with memory allocation when loading Rdata files iteratively?"

Need help reading website info with XML package and XPath

2011 May 30

Hi, I'm looking for help extracting some information from the Zillow website. I'd like to do this for the general case where I manually change the address by modifying the URL (see code below). With the URL containing the address, I'd like to be able to extract the same information each time. The specific information I'd like to extract includes the homedetails URL, price

XML: Slower parsing over time with htmlTreeParse()

2010 Mar 15

Sorry, I listed the wrong package in the header of my previous post! Dear List, have any of you experienced a significant increase in the time it takes to parse a URL via "htmlTreeParse()" when this function is called

XML package example code?

2009 Nov 25

I'm interested in parsing an HTML page. I should use XML, right? Could somebody show me some example code? Is there a tutorial for this package?

retrieve certain part from html

2009 Sep 23

Dear All, Can someone please show me how to extract a certain part from a long HTML string? e.g. "<td><a href='2005-01.html'>2005-01</a></td><td><a href='2006-01.html'>2006-01</a></td><td><a href='2007-01.html'>2007-01</a></td><td><a

import HTML tables

2009 May 12

Hello, I was wondering if there is a function in R that imports tables directly from an HTML document. I know there are functions (say, getURL() from {RCurl}) that download the entire page source, but here I am referring to something like Google Docs' importHTML() function (if you don't know this function, go check it out, it's very useful). Anyway, if someone knows of something that does this

Read HTML table

2007 Nov 18

You can use htmlTreeParse and xpathApply from the XML package. Something like: xpathApply(htmlTreeParse("http://blabla", useInternalNodes = TRUE), "//td", function(x) xmlValue(x)) should do it.

Gamma wrote:
> Anyone care to explain how to read an HTML table? It's streaming data
> (updated every second) and I am looking for a suitable function.
>
> The imported html

RCurl, postForm()

2012 May 28

Dear colleagues, Could I get some assistance using postForm() to scrape the business names and addresses at this website: http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic

newbie xml parsing question

2011 May 28

I am trying to read some data off the Zillow site. I'm a newbie to XML, HTML, parsing and the XML package. I've been able to load the web page I'm interested in with the following code, but I'm not sure of the next step to get the information I want into R: library(XML) url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb" doc <-doc <-

XML and RCurl: problem with encoding (htmlTreeParse)

2009 Dec 31

Hi, I'm trying to get data from a web page and modify it in R. I have a problem with encoding: I'm not able to get the encoding right in the htmlTreeParse command. See below:
> library(RCurl)
> library(XML)
>
> site <- getURL("http://www.aarresaari.net/jobboard/jobs.html")
> txt <- readLines(tc <- textConnection(site)); close(tc)
> txt <- htmlTreeParse(txt,

How to suppress errors from htmlTreeParse() function in XML package?

2008 Nov 04

Dear R-help, The following code downloads an HTML document into variable 'doc' and then stores an internal representation in variable 'html.tree'. Even if the HTML code is malformed, this still works, which is fantastic. However, as in the example below, I do get some output from R in the console which I would like to suppress somehow, so I can keep my window a bit cleaner. I

Stuck ...can't get sapply and xmlTreeParse working

2011 Jul 05

I can't seem to get the code below working. It gets stuck on line 24 inside the function hm; comments show the line in question. The function hm is called by sapply and is at the bottom of the code. Everything above line 24 works correctly, including the first couple of lines of the function hm. Should I be using a different apply function, or am I doing something wrong with xmlTreeParse?

getNodeSet - what am I doing wrong?

2010 Aug 30

Hi, Why is the following returning a node set of length 0:
> library(XML)
> test <- xmlTreeParse(
> "http://www.unimod.org/xml/unimod_tables.xml",useInternalNodes=TRUE)
> getNodeSet(test,"//modifications_row")
Thanks for any hint. Joh

Extracting text from html code using the RCurl package.

2008 Oct 06

Dear R-help, I want to download the text from a web page, but what I end up with is the HTML code. Is there some option that I am missing in the RCurl package? Or is there another way to achieve this? This is the code I am using:
> library(RCurl)
> my.url <- 'https://stat.ethz.ch/mailman/listinfo/r-help'
> html.file <- getURI(my.url, ssl.verifyhost = FALSE,

R hangs after htmlTreeParse

2011 Aug 25

Dear colleagues, I'm trying to parse the html content from this webpage:

Treatment of xml-stylesheet processing instructions in XML module

2011 Apr 06

Hello again, Another stumble here that is defeating me. I try:
a<-readLines(url("http://feeds.feedburner.com/grokin"))
t<-XML::xmlTreeParse(a, ignoreBlanks=TRUE, replaceEntities=FALSE, asText=TRUE)
elem<- XML::getNodeSet(XML::xmlRoot(t),"/rss/channel/item")[[1]]
And I get:
Start tag expected, '<' not found
Error: 1: Start tag expected, '<' not

Can an object reference itself?

2010 Jan 15

Dear List, I am not really familiar with any other language than R, but I've heard that in other languages there is something called "self referencing". Here's what I'm trying to get an answer for: Suppose there is a function that takes as its input a value of a slot of an S4 object. The function itself is stored in another slot of the SAME S4 object. Is it then possible to have the function

is there a way to extract data from web pages through some R function?

2009 Jul 01

I deal with a huge amount of biology data stored in different databases. The databases belonging to the Bioconductor organization can be accessed through Bioconductor packages. Unfortunately, some useful data is stored in databases such as miRDB, miRecords, etc., which offer only an interactive HTML interface. See for instance http://mirdb.org/cgi-bin/search.cgi,

Why does loading saved/cached objects add significantly to RAM consumption?

2011 Aug 30

Dear list, I make extensive use of cached objects for time-consuming computations, and yesterday I happened to notice some very strange behavior in that respect: When I execute a given computation whose result I'd like to cache (I tried both saving it as '.Rdata' and via the package 'R.cache', which uses its own file type '.Rcache'), my R session consumes about 200 MB of

Removing Embedded Null characters from text/html

2009 Oct 15

Hi, I'm trying to download some data from the web and am running into problems with 'embedded null' characters. These seem to indicate to R that it should stop processing the page so I'd like to remove them. I've been looking around and can't seem to identify exactly what the character is and consequently how to remove it.
# THE CODE WORKS ON THIS PAGE
library(RCurl)

Downloading data from the internet

2009 Sep 24

Hi all, I want to download data from these two sources directly into R: http://www.rateinflation.com/consumer-price-index/usa-cpi.php http://eaindustry.nic.in/asp2/list_d.asp The first is the CPI of the US and the second is the WPI of India. Can anyone please give me a clue how to download them directly into R? I want to turn them into zoo objects for further analysis. Thanks,
