search for: xpathapply

Displaying 20 results from an estimated 27 matches for "xpathapply".

2011 Mar 30
1
Package XML: Parse Garmin *.tcx file problems
...nodes, but an attempt to get the value of a single node comes up empty (even though there is data there). One question: Has anybody succeeded in parsing Garmin .tcx (xml) files? Thanks! Michael _______________________ >doc2 = xmlRoot(xmlTreeParse("HR.reduced3.tcx",useInternalNodes = TRUE)) >xpathApply(doc2, "//*", xmlName) [[1]] [1] "TrainingCenterDatabase" [[2]] [1] "Activities" [[3]] [1] "Activity" [[4]] [1] "Id" [[5]] [1] "Lap" [[6]] [1] "TotalTimeSeconds" > xpathApply(doc2, "//TotalTimeSeconds", xmlValu...
2007 Nov 18
4
Read HTML table
You can use htmlTreeParse and xpathApply from the XML package. Something like: xpathApply( htmlTreeParse("http://blabla", useInt=T), "//td", function(x) xmlValue(x)) should do it. Gamma wrote: > > anyone care to explain how to read an HTML table, it's streaming data > (updated every second) and I am l...
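The suggestion in this reply can be tried without a live URL. The following is a minimal sketch, assuming the XML package is installed; the inline HTML string and the variable names (html, cells, tab) are made up for illustration and are not from the original thread.

```r
library(XML)

# Hypothetical inline page standing in for the URL used in the reply.
html <- "<html><body><table>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table></body></html>"

# asText = TRUE tells the parser the input is markup, not a file name or URL.
doc <- htmlTreeParse(html, asText = TRUE, useInternalNodes = TRUE)

# xpathApply returns a list with one element per matching <td> node.
cells <- unlist(xpathApply(doc, "//td", xmlValue))

# Reshape the flat vector into rows of two cells each.
tab <- matrix(cells, ncol = 2, byrow = TRUE)
print(tab)
```

For a page that updates every second, the same call would have to be repeated on each refresh; the sketch only covers a single parse.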
2012 May 11
0
Using xpathapply or getnodeset to get text between two distinct tags
Hello: The following code extracts the links to the daily transcripts of Canada's House of Commons. 'links' is a matrix of URLs (ncol=1), each of which points to one day's transcripts. If you inspect the code for scrape(links[1]), you will find that periodically there appears an italics tag after a paragraph tag (<p some text ><i>Translation</i></p>).
2007 Dec 14
6
Analyzing Publications from Pubmed via XML
I would like to track in which journals articles about a particular disease are being published. Creating a pubmed search is trivial. The search provides data but obviously not as an R dataframe. I can get the search to export the data as an xml feed and the xml package seems to be able to read it. xmlTreeParse("
2008 May 02
1
How to parse XML
I would like to learn how to parse a mixed text/xml document I downloaded from the sec.gov website (see example below). I would like to parse this to get the value for each xml tag and then access it within R, but I don't know much about xml so I don't even know where to start debugging the errors I am getting in this example code. Can anyone help me get started? Thanks, Roger ftp
2011 Mar 29
2
Scrap java scripts and styles from an html document
Hi, I am working on developing a web crawler in R and need some help removing JavaScript and style sheets from the HTML document of a web page. I tried using the XML package, specifically the function xpathApply: library(XML) txt = xpathApply(html,"//body//text()[not(ancestor::script)][not(ancestor::style)]", xmlValue) The output comes out as text lines, without any HTML tags. I want the HTML tags to remain intact and scrap only the JavaScript and styles from it. Any help would be highly apprec...
2009 Sep 03
1
encoding problem using xml package
...des=T) > show(doc) <?xml version="1.0" encoding="ISO-8859-1"?> <Daten> <ITEM> <Messdaten> <MESSUNG> <BEZEICHNUNG>Länge</BEZEICHNUNG> </MESSUNG> </Messdaten> </ITEM> </Daten> > xpathApply(doc,"//MESSUNG/BEZEICHNUNG", xmlValue) [[1]] [1] "Länge" > doc <- xmlTreeParse(fname2,useInternalNodes=T) > show(doc) <?xml version="1.0" encoding="utf-8"?> <Daten> <ITEM> <Messdaten> <MESSUNG> <B...
2008 Jun 25
0
Memory allocation failed: Copying Node
...return(NULL) } #xValHelperSpecial and xValHelper are pretty similar hence avoiding code for xValHelper xValHelperSpecial <- function(node, xtag) { nobs <- xmlSize(node) out<-NULL if(xtag == "tagName1") { for (n in seq(1:nobs)) { temp <- xpathApply(node[[n]], "//" %+% xtag, xmlValue) if(length(temp) > 0) { if (n==1) assign("out",gsub('(^ +)|( +$)','',gsub('\n','',temp[[1]]))) else assign("out",rbind(out,gsub('(^ +)|( +$)','&...
2008 Jun 10
1
Parse XML
Could someone provide a link or examples of parsing an XML document in R? A few specific questions below: For instance, I can retrieve specific nodes using this: node <- xpathApply(xml, "//" %+% xtag, xmlValue) 1) I want to be able to retrieve the parent node for this node, how can I do this? getParentNode() does not seem to cut it. 2) How can I retrieve the child nodes of a particular node? 3) How can I create an iterator to iterate through the whole tree? Thank...
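For questions 1 and 2, the XML package provides xmlParent() and xmlChildren(), both of which operate on node objects. Passing xmlValue to xpathApply returns strings rather than nodes, so the sketch below fetches the node itself with getNodeSet() first. It assumes the XML package is installed; the inline document and the names node, parent and kids are hypothetical.

```r
library(XML)

# Hypothetical document used only for illustration.
doc <- xmlParse("<root><item><name>x</name><val>1</val></item></root>",
                asText = TRUE)

# Keep the node object itself instead of extracting its text.
node <- getNodeSet(doc, "//name")[[1]]

# Question 1: the parent of a node.
parent <- xmlParent(node)
print(xmlName(parent))

# Question 2: the children of a node, here reduced to their element names.
kids <- unname(sapply(xmlChildren(parent), xmlName))
print(kids)
```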
2012 Aug 10
3
Parsing large XML documents in R - how to optimize the speed?
...ious attributes -a list of 'aa' objects, each of which consisting of a couple of attributes. Here is the basic structure of the code: xml.doc <- xmlTreeParse("file", getDTD=FALSE, useInternalNodes=TRUE) result <- new('S4_result_class') result@proteins <- xpathApply(xml.doc, "//model/protein", function(protein.node) { protein <- new('S4_protein_class') ## fill in a couple of attributes of the protein object using xmlValue and xmlAttrs(protein.node) protein@peptides <- xpathApply(protein.node, "./peptide", function(peptid...
2008 Jun 12
1
XML parameters to Column Headers for importing into a dataset
Dear List, Do you know any way I can convert XML parameters into column headers? My data is in a CSV file in which each row contains data in XML form with multiple parameters ( <param1> data_val1 </param1> , <param2> data_val2 </param2> ). I want to convert it so that each row corresponds to one record and each parameter becomes a separate column. param1
2008 Oct 06
3
Extracting text from html code using the RCurl package.
Dear R-help, I want to download the text from a web page; however, what I end up with is the HTML code. Is there some option that I am missing in the RCurl package? Or is there another way to achieve this? This is the code I am using: > library(RCurl) > my.url <- 'https://stat.ethz.ch/mailman/listinfo/r-help' > html.file <- getURI(my.url, ssl.verifyhost = FALSE,
2016 Jan 18
3
Extracting data from a website
...weeks without a score (in the example, the second week). For now I am obtaining it as follows: url_jugador<-"http://localhost:8080/jugadores/Luis" txt_jugador <- getURL(url_jugador) doc<-htmlTreeParse(txt_jugador, useInternalNodes = TRUE) puntos_nodo<- xpathApply(doc, "//table[@class='points']/tr") puntos_nodo [[1]] <tr> <td class="semana">1</td> <td class="neg"/> <td> <div class="bar">6</div> </td> </tr> [[2]] <tr> <td class="...
2007 Sep 01
2
Importing huge XML-Files
Dear all, for my diploma thesis I have to import huge XML files into R for statistical processing - huge means a size of about 33 MB. I'm using the XML package version 1.9. Since reading the complete file into R via xmlTreeParse doesn't work or is too slow, I'm trying to use xmlEventParse, but I got completely stuck. I have many different types of nodes + <configuration>
2012 Feb 29
2
Using a FOR LOOP to name objects
...he following example (which doesn't work quite well) my_list<-c("A","B","C","D","E","F") for(i in c(1:length(my_list))){ url<- "http://finance.yahoo.com" doc = htmlTreeParse(url, useInternalNodes = T) tab_nodes = xpathApply(doc, "//table[@cellpadding = '3']") *my_list[i]*=lapply(tab_nodes, readHTMLTable) #problem is in this line names(*my_list[i]*)=c("Ins","outs") } The problem is that in iteration #1, I need the info to be stored in an object called "A"; At...
2012 May 17
1
using XML package to read RSS
...have an issue. The RSS feed I would like to use is: > URL <- "http://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=&company=&dateb=&owner=include&start=0&count=40&output=atom" > library(XML) > doc <- xmlTreeParse(URL) > src <- xpathApply(xmlRoot(doc), "//entry") I get an empty list rather than a list of each of the "entry": > src list() attr(,"class") [1] "XMLNodeSet" I'm not sure how to fix this. Any suggestions? Do I need to provide a namespace, or is the RSS malformed? Thanks,...
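On the namespace question: in XPath 1.0 an unprefixed name such as entry only matches elements in no namespace, so a feed that declares a default namespace needs a prefix bound to that namespace URI. The sketch below assumes the XML package is installed and that the feed uses the standard Atom namespace; the inline feed and the prefix "a" are hypothetical stand-ins, not the SEC response.

```r
library(XML)

# Hypothetical two-entry feed; the default namespace is the standard Atom one.
feed <- '<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>first</title></entry>
  <entry><title>second</title></entry>
</feed>'

# xmlParse builds an internal (C-level) document, which XPath queries need.
doc <- xmlParse(feed, asText = TRUE)

# Bind an arbitrary prefix ("a") to the namespace URI and use it in the path.
ns <- c(a = "http://www.w3.org/2005/Atom")
titles <- unlist(xpathApply(doc, "//a:entry/a:title", xmlValue,
                            namespaces = ns))
print(titles)
```

The prefix is local to the query and does not have to match anything in the feed; only the URI matters.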
2012 May 19
1
Try Giving Invalid Argument Type Error
...sion(NULL)) URL<-paste("http://www.advfn.com/p.php?pid=financials&btn=istart_date&mode=quarterly_reports&symbol=", exh,"%3A",tic,"&istart_date=0", sep = "") if( !is( try( doc <- htmlParse(URL) ,"try-error") ) ) { qtrstop <- xpathApply(doc, "count(//select/option)")-5 } Error in !silent : invalid argument type Any help would be most appreciated. --John Sparks
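The error message is consistent with the closing parenthesis of try() sitting after "try-error": the string is then passed to try() as its second argument (silent) instead of to is(), and !silent fails on a character value. A minimal sketch of the usual pattern follows, using base R only; risky() is a hypothetical stand-in for a call that may fail, such as htmlParse(URL).

```r
# Hypothetical stand-in for a call that may fail, such as htmlParse(URL).
risky <- function(ok) {
  if (!ok) stop("could not fetch page")
  "parsed document"
}

# try() returns the value on success, or an object of class "try-error".
res_bad  <- try(risky(FALSE), silent = TRUE)
res_good <- try(risky(TRUE),  silent = TRUE)

# inherits() is applied to the result of try(), outside its parentheses.
failed_bad  <- inherits(res_bad,  "try-error")
failed_good <- inherits(res_good, "try-error")

print(c(failed_bad, failed_good))
```

In the posted code this would mean testing !inherits(try(doc <- htmlParse(URL), silent = TRUE), "try-error") before using doc.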
2008 Dec 31
1
Chinese characters encoding problem with XML
XML is a good tool for reading data from the web within R, but I wonder how to get the encoding right. library(XML) url <- 'http://www.szitic.com/docc/jz-lmzq.html' xml <- htmlTreeParse(url, useInternal=TRUE) q <- "//tbody/tr/td" dat <- unlist(xpathApply(xml, q, xmlValue)) df <- as.data.frame(t(matrix(dat, 4))) dt<-as.character(df[15,1]) The first column of df holds dates in Chinese. dt is one of the Chinese dates. When I copied the content of dt into the email, it became the following: > dt [1] "2008&#229;?G????&#13312;&#...
2016 Jan 19
2
Extracting data from a website
...week). For >> now I am obtaining it as follows: >> >> url_jugador<-"http://localhost:8080/jugadores/Luis" >> txt_jugador <- getURL(url_jugador) >> doc<-htmlTreeParse(txt_jugador, useInternalNodes = TRUE) >> puntos_nodo<- xpathApply(doc, "//table[@class='points']/tr") >> puntos_nodo >> [[1]] >> <tr> >> <td class="semana">1</td> >> <td class="neg"/> >> <td> >> <div class="bar">6</div> &g...
2008 Sep 07
4
XML - get node by name
Hi there, I am trying to rewrite some Java code in R. It deals with reading XML files. I started with the XML package. In Java, I had a very useful method which gave me a node by using: the name of the node, the index of appearance, and the start point: global (false) / local (true). So, I could do something like this. setCurrentChildNode("data", 0); getValueOfElement("val",1,true); -->