thr3ads.net - search: "xpathapply"

Displaying 20 results from an estimated 27 matches for "xpathapply".

Package XML: Parse Garmin *.tcx file problems

2011 Mar 30

...nodes, but an attempt at value of a single node comes up empty (even though there is data there). One question: Has anybody succeeded parsing Garmin .tcx (xml) files? Thanks! Michael _______________________ >doc2 = xmlRoot(xmlTreeParse("HR.reduced3.tcx",useInternalNodes = TRUE)) >xpathApply(doc2, "//*", xmlName) [[1]] [1] "TrainingCenterDatabase" [[2]] [1] "Activities" [[3]] [1] "Activity" [[4]] [1] "Id" [[5]] [1] "Lap" [[6]] [1] "TotalTimeSeconds" > xpathApply(doc2, "//TotalTimeSeconds", xmlValu...
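The empty result in this snippet is consistent with a namespace issue: TCX files usually declare a default namespace on the root element, and XPath 1.0 treats an unprefixed name such as TotalTimeSeconds as belonging to no namespace. A minimal sketch, assuming that is the cause here. The inline document and the prefix g are made up for illustration, and the URI is the one Garmin TCX files commonly declare, so check the root element of your own file:

```r
library(XML)

# Hypothetical, heavily reduced TCX-like document (not the poster's file)
tcx <- '<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
  <Activities><Activity><Id>x</Id>
    <Lap><TotalTimeSeconds>1800.5</TotalTimeSeconds></Lap>
  </Activity></Activities>
</TrainingCenterDatabase>'

doc <- xmlParse(tcx, asText = TRUE)

# Unprefixed name: matches nothing, because the element lives in a namespace
no_ns <- getNodeSet(doc, "//TotalTimeSeconds")

# Bind an arbitrary prefix ("g") to the default namespace URI and use it
ns <- c(g = "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2")
with_ns <- xpathSApply(doc, "//g:TotalTimeSeconds", xmlValue, namespaces = ns)
```

A query like "//*" still lists every element because the wildcard matches names in any namespace, which would explain why xmlName works in the snippet while the named lookup does not.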

Read HTML table

2007 Nov 18

You can use htmlTreeParse and xpathApply from the XML library. Something like: xpathApply( htmlTreeParse("http://blabla", useInt=T), "//td", function(x) xmlValue(x)) should do it. Gamma wrote: > > anyone care to explain how to read a html table, it's streaming data > (updated every second) and i am l...
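A minimal sketch of the suggestion above, assuming the page is plain static HTML. The inline table stands in for the real URL, which the snippet does not give, and the two-column layout is hypothetical:

```r
library(XML)

# Hypothetical two-column table standing in for the real page
html <- "<html><body><table>
  <tr><td>A</td><td>1.5</td></tr>
  <tr><td>B</td><td>2.5</td></tr>
</table></body></html>"

# htmlParse builds an internal (C-level) tree, which XPath queries require
doc <- htmlParse(html, asText = TRUE)

# Pull the text of every cell, in document order
cells <- xpathSApply(doc, "//td", xmlValue)

# Reshape the flat vector into rows, assuming two cells per row
tab <- matrix(cells, ncol = 2, byrow = TRUE)
```

For data that changes every second, the same steps would have to be repeated on each fetch, since the parsed tree is a snapshot of the page at download time.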

Using xpathapply or getnodeset to get text between two distinct tags

2012 May 11

Hello: The following code extracts the links to the daily transcripts of Canada's House Of Commons. 'links' is a matrix of URLs (ncol=1), each of which points to one day's transcripts. If you inspect the code for scrape(links[1]), you will find that periodically there appears an italics tag after a paragraph tag (<p some text ><i>Translation</i></p>).

Analyzing Publications from Pubmed via XML

2007 Dec 14

I would like to track in which journals articles about a particular disease are being published. Creating a pubmed search is trivial. The search provides data but obviously not as an R dataframe. I can get the search to export the data as an xml feed and the xml package seems to be able to read it. xmlTreeParse("

How to parse XML

2008 May 02

I would like to learn how to parse a mixed text/xml document I downloaded from the sec.gov website (see example below). I would like to parse this to get the value for each xml tag and then access it within R, but I don't know much about xml so I don't even know where to start debugging the errors I am getting in this example code. Can anyone help me get started? Thanks, Roger ftp

Scrap java scripts and styles from an html document

2011 Mar 29

Hi, I am working on developing a web crawler in R and I needed some help with regard to removal of javascripts and style sheets from the html document of a web page. I tried using the xml package, hence the function xpathApply library(XML) txt = xpathApply(html,"//body//text()[not(ancestor::script)][not(ancestor::style)]", xmlValue) The output comes out as text lines, without any html tags. I want the html tags to remain intact and scrap only the javascript and styles from it. Any help would be highly apprec...
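One way to keep the markup while dropping scripts and styles is to delete those nodes from the parsed tree and serialize what remains, rather than extracting text nodes. A minimal sketch under that assumption; the inline page is made up for illustration:

```r
library(XML)

# Hypothetical page containing one style block and one script block
html <- "<html><head><style>p {color: red;}</style></head>
<body><p>Keep <b>this</b> markup</p><script>alert('x');</script></body></html>"

doc <- htmlParse(html, asText = TRUE)

# Select every script and style element and detach it from the tree
invisible(removeNodes(getNodeSet(doc, "//script | //style")))

# Serialize the remaining document; the other tags are left intact
cleaned <- saveXML(doc)
```

Inline event handlers (for example onclick attributes) and style attributes are not touched by this query and would need their own handling.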

encoding problem using xml package

2009 Sep 03

...des=T) > show(doc) <?xml version="1.0" encoding="ISO-8859-1"?> <Daten> <ITEM> <Messdaten> <MESSUNG> <BEZEICHNUNG>Länge</BEZEICHNUNG> </MESSUNG> </Messdaten> </ITEM> </Daten> > xpathApply(doc,"//MESSUNG/BEZEICHNUNG", xmlValue) [[1]] [1] "LÃ¤nge" > doc <- xmlTreeParse(fname2,useInternalNodes=T) > show(doc) <?xml version="1.0" encoding="utf-8"?> <Daten> <ITEM> <Messdaten> <MESSUNG> <B...

Memory allocation failed: Copying Node

2008 Jun 25

...return(NULL) } #xValHelperSpecial and xValHelper are pretty similar hence avoiding code for xValHelper xValHelperSpecial <- function(node, xtag) { nobs <- xmlSize(node) out<-NULL if(xtag == "tagName1") { for (n in seq(1:nobs)) { temp <- xpathApply(node[[n]], "//" %+% xtag, xmlValue) if(length(temp) > 0) { if (n==1) assign("out",gsub('(^ +)|( +$)','',gsub('\n','',temp[[1]]))) else assign("out",rbind(out,gsub('(^ +)|( +$)','&...

Parse XML

2008 Jun 10

Could someone provide a link or examples of parsing an XML document in R? A few specific questions below: For instance I can retrieve specific nodes using this: node <- xpathApply(xml, "//" %+% xtag, xmlValue) 1) I want to be able to retrieve the parent node for this node, how can I do this? getParentNode() does not seem to cut it. 2) How can I retrieve the children nodes for a particular node? 3) How can I create an iterator to iterate through the whole tree? Thank...
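The three questions map onto functions in the XML package: xmlParent, xmlChildren, and an XPath query over all elements. A minimal sketch with a made-up document; %+% in the snippet is not a base R operator and is presumably the poster's own string-concatenation helper, so paste0 is used here instead:

```r
library(XML)

# Hypothetical document; no whitespace between tags, so there are no text-only children
doc <- xmlParse("<root><item><name>a</name><val>1</val></item></root>", asText = TRUE)

xtag <- "name"
node <- getNodeSet(doc, paste0("//", xtag))[[1]]

# 1) Parent of a node (requires internal nodes, as produced by xmlParse)
parent <- xmlParent(node)
parent_name <- xmlName(parent)

# 2) Children of a node
child_names <- unname(sapply(xmlChildren(parent), xmlName))

# 3) Visit every element in document order
all_names <- xpathSApply(doc, "//*", xmlName)
```

Note that xpathApply with xmlValue, as in the snippet, returns strings rather than nodes, so there is nothing left to take the parent of; getNodeSet keeps the node objects.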

Parsing large XML documents in R - how to optimize the speed?

2012 Aug 10

...ious attributes - a list of 'aa' objects, each of which consists of a couple of attributes. Here is the basic structure of the code: xml.doc <- xmlTreeParse("file", getDTD=FALSE, useInternalNodes=TRUE) result <- new('S4_result_class') result@proteins <- xpathApply(xml.doc, "//model/protein", function(protein.node) { protein <- new('S4_protein_class') ## fill in a couple of attributes of the protein object using xmlValue and xmlAttrs(protein.node) protein@peptides <- xpathApply(protein.node, "./peptide", function(peptid...

XML parameters to Column Headers for importing into a dataset

2008 Jun 12

Dear List, Do you know any way I can convert XML parameters into column headers? My data is in a csv file with each row containing an XML form of data, with multiple parameters ( <param1> data_val1 </param1> , <param2> data_val2 </param2> ). I want to convert it so each row caters to one record and each parameter becomes a different column. param1

Extracting text from html code using the RCurl package.

2008 Oct 06

Dear R-help, I want to download the text from a web page, however what I end up with is the html code. Is there some option that I am missing in the RCurl package? Or is there another way to achieve this? This is the code I am using: > library(RCurl) > my.url <- 'https://stat.ethz.ch/mailman/listinfo/r-help' > html.file <- getURI(my.url, ssl.verifyhost = FALSE,

Extracting data from a website

2016 Jan 18

...weeks in which the player has not scored (in the example, the second week). For now I am obtaining it as follows: url_jugador<-"http://localhost:8080/jugadores/Luis" txt_jugador <- getURL(url_jugador) doc<-htmlTreeParse(txt_jugador, useInternalNodes = TRUE) puntos_nodo<- xpathApply(doc, "//table[@class='points']/tr") puntos_nodo [[1]] <tr> <td class="semana">1</td> <td class="neg"/> <td> <div class="bar">6</div> </td> </tr> [[2]] <tr> <td class="...

Importing huge XML-Files

2007 Sep 01

Dear all, for my diploma thesis I have to import huge XML files into R for statistical processing - huge means a size of about 33 MB. I'm using the XML package, version 1.9. Since reading the complete file into R via xmlTreeParse doesn't work or is too slow, I'm trying to use xmlEventParse, but I got completely stuck. I have many different types of nodes + <configuration>

Using a FOR LOOP to name objects

2012 Feb 29

...he following example (which doesn't work quite well) my_list<-c("A","B","C","D","E","F") for(i in c(1:length(my_list))){ url<- "http://finance.yahoo.com" doc = htmlTreeParse(url, useInternalNodes = T) tab_nodes = xpathApply(doc, "//table[@cellpadding = '3']") *my_list[i]*=lapply(tab_nodes, readHTMLTable) #problem is in this line names(*my_list[i]*)=c("Ins","outs") } The problem is that in iteration #1, I need the info to be stored at an object called "A"; At...
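A common alternative to creating objects named "A", "B", and so on is to store each result in a named list and index it by name. A minimal sketch in base R; the body of the loop is a placeholder for the real scraping step, and the Ins/outs values are invented:

```r
my_names <- c("A", "B", "C")

# One list holds all results, keyed by the names that would otherwise
# have become separate objects
results <- list()

for (nm in my_names) {
  # Placeholder for the real work, e.g. lapply(tab_nodes, readHTMLTable)
  value <- list(paste0(nm, "_in"), paste0(nm, "_out"))
  names(value) <- c("Ins", "outs")
  results[[nm]] <- value
}

# Access by name instead of by a separate variable
first_ins <- results[["A"]][["Ins"]]
```

If separate top-level objects are really required, assign(nm, value) inside the loop creates them, but the list form is usually easier to iterate over afterwards.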

using XML package to read RSS

2012 May 17

...have an issue. The RSS feed I would like to use is: > URL <- "http://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=&company=&dateb=&owner=include&start=0&count=40&output=atom" > library(XML) > doc <- xmlTreeParse(URL) > src <- xpathApply(xmlRoot(doc), "//entry") I get an empty list rather than a list of each of the "entry": > src list() attr(,"class") [1] "XMLNodeSet" I'm not sure how to fix this. Any suggestions? Do I need to provide a namespace, or is the RSS malformed? Thanks,...
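The poster's guess about a namespace is plausible: the URL requests output=atom, and Atom feeds put their elements in the namespace http://www.w3.org/2005/Atom, so the unprefixed "//entry" would match nothing. A minimal sketch with a made-up inline feed, showing both a prefixed query and a namespace-agnostic one using local-name(); the prefix a is arbitrary:

```r
library(XML)

# Hypothetical two-entry Atom feed (not the SEC feed itself)
atom <- '<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>First filing</title></entry>
  <entry><title>Second filing</title></entry>
</feed>'

# xmlParse yields an internal document, which XPath queries operate on
doc <- xmlParse(atom, asText = TRUE)

# Unprefixed query: no matches because of the default namespace
entries_plain <- getNodeSet(doc, "//entry")

# Option 1: bind a prefix to the Atom namespace
entries_ns <- getNodeSet(doc, "//a:entry",
                         namespaces = c(a = "http://www.w3.org/2005/Atom"))

# Option 2: ignore namespaces by matching on the local name only
titles <- xpathSApply(doc, "//*[local-name()='entry']/*[local-name()='title']",
                      xmlValue)
```

The snippet also calls xmlTreeParse without useInternalNodes = TRUE; using xmlParse, or passing that argument, avoids running XPath against an R-level tree.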

Try Giving Invalid Argument Type Error

2012 May 19

...sion(NULL)) URL<-paste("http://www.advfn.com/p.php?pid=financials&btn=istart_date&mode=quarterly_reports&symbol=", exh,"%3A",tic,"&istart_date=0", sep = "") if( !is( try( doc <- htmlParse(URL) ,"try-error") ) ) { qtrstop <- xpathApply(doc, "count(//select/option)")-5 } Error in !silent : invalid argument type Any help would be most appreciated. --John Sparks
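In the snippet, "try-error" sits inside the parentheses of try(), so it is passed positionally as the silent argument, and try then evaluates !silent on a string, which produces the reported error. The class check belongs outside the try call. A minimal sketch in base R; risky_parse is a hypothetical stand-in for htmlParse(URL):

```r
# Hypothetical stand-in for a parse step that may fail
risky_parse <- function(ok) {
  if (ok) "parsed" else stop("cannot parse")
}

# Capture the result of try, then test its class separately
res <- try(risky_parse(FALSE), silent = TRUE)
failed <- inherits(res, "try-error")

# Equivalent pattern with tryCatch: return NULL on error
res2 <- tryCatch(risky_parse(FALSE), error = function(e) NULL)

# On success the value comes back unchanged
res3 <- try(risky_parse(TRUE), silent = TRUE)
```

With this shape the original condition would read if (!inherits(doc, "try-error")) { ... } after doc has been assigned from try.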

Chinese characters encoding problem with XML

2008 Dec 31

XML is a good tool for reading data from the web within R. But I wonder how I could get the encoding right. library(XML) url <- 'http://www.szitic.com/docc/jz-lmzq.html' xml <- htmlTreeParse(url, useInternal=TRUE) q <- "//tbody/tr/td" dat <- unlist(xpathApply(xml, q, xmlValue)) df <- as.data.frame(t(matrix(dat, 4))) dt<-as.character(df[15,1]) The first column of df is dates in Chinese. dt is one of the Chinese dates. When I copied the content of dt into the email, it became the following: > dt [1] "2008å?G????㐀&#...

Extracting data from a website

2016 Jan 19

...week). For >> now I am obtaining it as follows: >> >> url_jugador<-"http://localhost:8080/jugadores/Luis" >> txt_jugador <- getURL(url_jugador) >> doc<-htmlTreeParse(txt_jugador, useInternalNodes = TRUE) >> puntos_nodo<- xpathApply(doc, "//table[@class='points']/tr") >> puntos_nodo >> [[1]] >> <tr> >> <td class="semana">1</td> >> <td class="neg"/> >> <td> >> <div class="bar">6</div> &g...

XML - get node by name

2008 Sep 07

Hi there, I try to rewrite some Java-code with R. It deals with reading XML files. I started with the XML package. In Java, I had a very useful method which gave me a node by using: name of the node index of appearance start point: global (false) / local (true) So, I could do something like this. setCurrentChildNode("data", 0); getValueOfElement("val",1,true); -->
