thr3ads.net - search: "getnodeset"

Displaying 20 results from an estimated 42 matches for "getnodeset".

2010 Aug 30

getNodeSet - what am I doing wrong?

Hi, Why is the following returning a nodeset of length 0: > library(XML) > test <- xmlTreeParse( > "http://www.unimod.org/xml/unimod_tables.xml",useInternalNodes=TRUE) > getNodeSet(test,"//modifications_row") Thanks for any hint. Joh
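
One common reason for an empty node set is a default namespace declared on the document root; whether that is the cause for this particular file is an assumption. A minimal sketch with an inline stand-in document (the namespace URI `urn:example:tables` and the prefix `u` are made up for illustration):

```r
library(XML)

# Inline stand-in for a document whose root declares a default namespace.
txt <- '<tables xmlns="urn:example:tables">
  <modifications_row id="1"/>
  <modifications_row id="2"/>
</tables>'
doc <- xmlParse(txt, asText = TRUE)

# Bind an arbitrary prefix to the namespace URI and use it in the XPath query.
prefixed <- getNodeSet(doc, "//u:modifications_row",
                       namespaces = c(u = "urn:example:tables"))

length(prefixed)
```

The prefix only has to match between the query and the `namespaces` argument; it does not need to appear in the document itself.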

2011 May 30

Need help reading website info with XML package and XPath

...f "doc" that shows and highlights all the information I'm interested in (note that either url that's highlighted in doc is fine). http://r.789695.n4.nabble.com/file/n3561075/relevant-section-of-doc.pdf relevant-section-of-doc.pdf I'm guessing my xpath statements are wrong or getNodeSet needs something else to get to information contained in a bubble on a webpage. Any suggestions or ideas would be GREATLY appreciated. library(XML) url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb" doc <- htmlTreeParse(url, useInternalNode=TRUE, isURL=TRUE...

2012 May 11

Using xpathapply or getnodeset to get text between two distinct tags

...;Ses=1') #Scrape the page with the links doc<-scrape(url=hansard, parse=TRUE, follow=TRUE) #Not sure what exactly this does, but it is necessary doc<-doc[[1]] #Get the xmlRoot directory doc<- xmlRoot(doc) #Get nodes that contain only the links to each day's transcripts links<- getNodeSet(doc, "//a[@class='PublicationCalendarLink']/@href") links<-matrix(links) #Paste those href links to the root URL links<-apply(links, 1, function(x) paste('http://www.parl.gc.ca', x, sep='')) #Inspect links[1] #Scrape text from first URL in 'links' one...

2010 Sep 08

XML getNodeSet syntax for PUBMED XML export

I am looking for the syntax to capture XML tags marked with /DescriptorName MajorTopicYN="Y"/, but the combination of the internal space (between "Name" and "Major") and the embedded quote marks is defeating me. I can get all the "DescriptorName" tags, but these include both the MajorTopicYN = "Y" and "N" variants. Any suggestions?
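
A minimal sketch of one way to select on an attribute value, using an XPath predicate. The inline document is a made-up stand-in for a PubMed export, not real data:

```r
library(XML)

# Made-up stand-in for a PubMed export fragment.
txt <- '<MeshHeadingList>
  <MeshHeading><DescriptorName MajorTopicYN="Y">Neoplasms</DescriptorName></MeshHeading>
  <MeshHeading><DescriptorName MajorTopicYN="N">Humans</DescriptorName></MeshHeading>
</MeshHeadingList>'
doc <- xmlParse(txt, asText = TRUE)

# The predicate [@attr='value'] filters on the attribute; single quotes inside
# the XPath avoid clashing with R's double-quoted string.
major <- getNodeSet(doc, "//DescriptorName[@MajorTopicYN='Y']")
major_terms <- sapply(major, xmlValue)
```

The space in the tag is just the separator between the element name and its attribute, so it never appears in the XPath expression.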

2012 May 28

Rcurl, postForm()

...ML) library(RCurl) library(scrapeR) library(RHTMLForms) #Set URL bus<-c('http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx') #Scrape URL orig<-getURLContent(url=bus) #Parse doc doc<-htmlParse(orig[[1]], asText=TRUE) #Get The forms forms<-getNodeSet(doc, "//form") forms[[1]] #These are the input nodes getNodeSet(forms[[1]], ".//input") #These are the select nodes getNodeSet(forms[[1]], ".//select") ********************************* Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Stree...

2011 Jul 05

Stuck ...can't get sapply and xmlTreeParse working

...d=X1-ZWz1bup03e49vv_5kvb6&address=",x, sep="") ############## problem line is next ################################# zdoc <-xmlTreeParse(url.zill, useInternalNode=TRUE, isURL=TRUE) ############# problem line above ################################## f$zpid <- sapply(getNodeSet(zdoc, "//result/zpid"), xmlValue) f$zest.low <-sapply(getNodeSet(zdoc, "//valuationRange/low"), xmlValue) f$zest <- sapply(getNodeSet(zdoc, "//zestimate/amount"), xmlValue) rm(zdoc) return(f) } j <-sapply(new.add, FUN=hm) print(zest) -- View this mess...

2008 Sep 08

another XML package question

Hi there, does anybody know how to return the xmlPath from a node? For example, at several locations in the XML file, I have nodes with the same name and I'd like to process only the nodes from a certain path. Any idea? Antje

2011 Apr 06

Treatment of xml-stylesheet processing instructions in XML module

Hello again, Another stumble here that is defeating me. I try: a<-readLines(url("http://feeds.feedburner.com/grokin")) t<-XML::xmlTreeParse(a, ignoreBlanks=TRUE, replaceEntities=FALSE, asText=TRUE) elem<- XML::getNodeSet(XML::xmlRoot(t),"/rss/channel/item")[[1]] And I get: Start tag expected, '<' not found Error: 1: Start tag expected, '<' not found When I modify the second line in "a" to remove the following (just leaving the <rss> tag with its attributes), I do no...

2012 Dec 28

How to apply XPath query on XML nodes separately?

...e whole document, *not* just those of the currently queried parent. I know, this is because I prefix my XPath query with // and apparently any given XMLNode "knows" of its whole document, but I seem unable to find a proper solution. So, my question is: How do I restrict a call of getNodeSet to just an XMLNode and not the whole document it was retrieved from? I use the XML and RCurl packages. The document I speak of is downloaded from uniprot.org, a protein knowledge server well known to biologists. The lamentably somewhat lengthy code follows: library(XML) library(RCurl) getEntries...
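
A minimal sketch of one way to keep a query inside a single node: start the XPath with `.` so it is evaluated relative to the node passed in. The inline document and its element names are simplified stand-ins, not the actual UniProt format:

```r
library(XML)

# Simplified stand-in document; element names are assumptions.
txt <- '<uniprot>
  <entry><accession>P1</accession><accession>P2</accession></entry>
  <entry><accession>Q1</accession></entry>
</uniprot>'
doc <- xmlParse(txt, asText = TRUE)
entries <- getNodeSet(doc, "//entry")

# "//accession" starts from the document root even when given a node;
# ".//accession" searches only below the current node.
per_entry <- lapply(entries, function(e) {
  sapply(getNodeSet(e, ".//accession"), xmlValue)
})
```

Each element of `per_entry` then holds only the values found under its own `entry` node.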

2012 Aug 10

Parsing large XML documents in R - how to optimize the speed?

...f the XML package when parsing the xml tree; -vectorizing the parsing (i.e., replacing loops like "for(node in group.of.nodes) {...}" by "sapply(group.of.nodes, function(node){...})") I gained another 5 seconds by making small changes to the functions used (like replacing 'getNodeSet' by 'xmlElementsByTagName' when I don't need to navigate to the children nodes). Now I am blocked at around 35 seconds and I would still like to cut this time by 5x, but I have no clue what to do to achieve this gain. I'll try to expose as briefly as possible the relevant stru...

2011 Jul 10

Help with tryCatch

Having a hard time understanding the help files for tryCatch. Looking for a little help with the following statement, which sits inside a for loop: zest[i] <- tryCatch(sapply(getNodeSet(zdoc, "//zestimate/amount"), xmlValue), error=function() zest[i] <-"NA") zest is a numeric vector. If the sapply statement evaluates to an error, I'd like to set the value of zest[i] to NA and continue with the loop. Suggestions? -- View this message in context: http:...
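
A minimal sketch of the pattern being asked about, in base R only. The function `get_amount` is a hypothetical stand-in for the `sapply(getNodeSet(...), xmlValue)` call, made to fail on some inputs:

```r
# Hypothetical stand-in for sapply(getNodeSet(zdoc, ...), xmlValue):
# fails for even i, returns a character value otherwise.
get_amount <- function(i) {
  if (i %% 2 == 0) stop("no zestimate for record ", i)
  as.character(100 * i)
}

zest <- numeric(4)
for (i in seq_along(zest)) {
  # The handler must accept the condition object (here `e`), and tryCatch()
  # returns the handler's value, so the assignment happens once, outside it.
  zest[i] <- tryCatch(as.numeric(get_amount(i)),
                      error = function(e) NA_real_)
}
```

Returning `NA_real_` rather than the string "NA" keeps `zest` numeric, and the loop carries on after a failed iteration.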

2010 Mar 18

Do colClasses in readHTMLTable (XML Package) work?

Hi, I can't get the colClasses option to work in the readHTMLTable function of the XML package. Here's a code fragment: require("XML") doc <- "http://www.nber.org/cycles/cyclesmain.html" table <- getNodeSet(htmlParse(doc),"//table") [[2]] # The main table is the second one because it's embedded in the page table. xt <- readHTMLTable( table, header = c("peak","trough","contraction","expansion","tr...

2012 Apr 16

grep and XML

...le: https://raw.github.com/currencybot/open-exchange-rates/master/latest.json This is the code that I'm working with: library(RCurl) library(XML) txt<-getURL("https://raw.github.com/currencybot/open-exchange-rates/master/latest.json") txt<-htmlParse(txt, asText=TRUE) txt<- getNodeSet(txt, '//p') So, I can get the node properly, but then, if I try something like this: grep(c('USD'), txt) I get: integer(0) Can anyone suggest a way forward? Yours, Simon Kiss ********************************* Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 G...

2012 Jun 06

Process XML files

...um of users! I was successful in processing files using R's XML library. Thank you, Rxperts! I know there are libraries like XML and SPXML available in S-Plus. Could anyone please share examples of reading an xml file and saving the contents in a data frame? Are there Splus equivalents of "getNodeSet", "xmlSApply" and "xmlValue"? Thanks so much! Santosh [[alternative HTML version deleted]]

2012 May 17

using XML package to read RSS

Hi, I'm trying to use the XML package to read an RSS feed. To get started, I was trying to use this post as an example: http://www.r-bloggers.com/how-to-build-a-dataset-in-r-using-an-rss-feed-or-web-page/ I can replicate the beginning section of the post, but when I try to use another RSS feed I have an issue. The RSS feed I would like to use is: > URL <-

2012 Feb 10

Bug with memory allocation when loading Rdata files iteratively?

...ventually). It just seems like removing the object via |rm()| and firing |gc()| do not have any effect, so the memory consumption of each loaded R object cumulates until there's no more memory left :-/ Possibly, this is also related to XML package functionality (mainly |htmlTreeParse| and |getNodeSet|), but I also experience the described behavior when simply iteratively loading and removing Rdata files. I've put together a little example that illustrates the memory ballooning mentioned above which you can find here: http://stackoverflow.com/questions/9220849/significant-memory-issue-in...

2011 Feb 24

Objects must be passed as an argument or generated in the function, right?

...? What am I missing here? Thanks Zheng Jk parseXmlEntryNodeSet <- function(psimi25file, psimi25source, verbose=TRUE) { psimi25Doc <- xmlTreeParse(psimi25file, useInternalNodes = TRUE) psimi25NS <- getDefaultNamespace(psimi25Doc) namespaces <- c(ns = psimi25NS) entry <- getNodeSet(psimi25Doc, "/ns:entrySet/ns:entry", namespaces) if(verbose) statusDisplay(paste(length(entry),"Entries found\n",sep=" ")) entryCount <- length(nodes) entryList <- list() for(i in 1:entryCount) { entryList[[i]] <- parseXmlEntryNode(doc=psimi2...

2011 Aug 29

reading tables from multiple HTML pages

...go about keeping the loop running so I can parse the rest? **************************************************** library(XML) url_root<-"http://www.szrd.gov.cn/viewcommondbfc.do?id=" for(i in 700:750){ url = paste(url_root, i, sep="") doc = htmlParse(url) tableNodes = getNodeSet(doc, "//table") tbl = readHTMLTable(tableNodes[[3]]) } **************************************************** Steve Oliver Department of Political Science University of California at San Diego 9500 Gilman Dr. La Jolla, CA 92092 -- View this message in context: http://r.789695.n4.nabble.c...

2008 Jul 02

Removing or overwriting an XML node

...nd now imagine I want to change <first>Duncan</first> into e.g. <initials>D.</initials>. How to do that? I am able to add my node: library(XML) x <- xmlTreeParse("duncan.xml", useInternalNodes = TRUE) # find parent, add as last child: name <- getNodeSet(x, "//name")[[1]] newXMLNode("initials", "D.", parent=name) first <- getNodeSet(x, "//first")[[1]] # wanted: # deleteXMLNode(name) # or # replaceXMLNode("initials", "D.", replace=first) cat(saveXML(x)) free(x) As...
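
A minimal sketch of one way to swap a node in place, using `replaceNodes()` from the XML package. The inline document is an assumed stand-in for the contents of duncan.xml, which the post does not show:

```r
library(XML)

# Assumed stand-in for duncan.xml; the <last> element is invented.
doc <- xmlParse("<name><first>Duncan</first><last>Lang</last></name>",
                asText = TRUE)
first <- getNodeSet(doc, "//first")[[1]]

# Build the replacement node, then swap it in where <first> was.
initials <- newXMLNode("initials", "D.")
replaceNodes(first, initials)

out <- saveXML(doc)
```

If the node should simply be dropped rather than replaced, `removeNodes(first)` from the same package is the corresponding call.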

2008 Nov 05

How to extract following data

Hi everyone, I have this kind of raw dataset : - <Temp diffgr:id="Temp14" msdata:rowOrder="13"> <Date>2005-01-17T00:00:00+05:30</Date> <SecurityID>10149</SecurityID> <PriceClose>1288.40002</PriceClose> </Temp> - <Temp diffgr:id="Temp15" msdata:rowOrder="14">
