search for: useinternalnodes

Displaying 20 results from an estimated 36 matches for "useinternalnodes".

2010 Aug 30
4
getNodeSet - what am I doing wrong?
Hi, Why is the following returning a nodeset of length 0: > library(XML) > test <- xmlTreeParse( > "http://www.unimod.org/xml/unimod_tables.xml",useInternalNodes=TRUE) > getNodeSet(test,"//modifications_row") Thanks for any hint. Joh
2008 Nov 04
2
How to suppress errors from htmlTreeParse() function in XML package?
...know that the HTML code is malformed, but for my purposes I can ignore that output. Is there a way to achieve this? ### Example: library(RCurl); library(XML) doc <- getURL('http://www.google.co.uk/search?q=%22R%20Project %22&as_qdr=d1&num=100') html.tree <- htmlTreeParse(doc, useInternalNodes = TRUE) ### Output - this is what I would like to suppress Tag nobr invalid htmlParseEntityRef: expecting ';' htmlParseEntityRef: expecting ';' ### etc. I attempted to use try(expr, silent=TRUE) but that didn't work for me: > try(htmlTreeParse(doc, useInternalNodes = TRUE)...
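A minimal sketch of one way to silence these parser messages, borrowing the `error=function(...){}` idiom that appears in other results in this listing. Whether this was the answer given in that thread is not shown here; the malformed HTML string and the variable names are invented for illustration.

```r
library(XML)

# Hypothetical malformed HTML standing in for the page fetched with getURL():
# it contains a non-standard <nobr> tag and an entity without its closing ';'.
bad_html <- "<html><body><nobr>R Project</nobr><p>a &amp b</p></body></html>"

# Passing an error handler that does nothing discards the parser's messages
# instead of printing them; the parser still recovers and builds the tree.
html.tree <- htmlTreeParse(bad_html, asText = TRUE,
                           error = function(...) {},
                           useInternalNodes = TRUE)

# The recovered tree can be queried as usual.
paragraphs <- xpathSApply(html.tree, "//p", xmlValue)
```

The messages come from the parser's own error callback rather than from an R condition, which would explain why wrapping the call in `try(..., silent=TRUE)` has no effect on them.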
2007 Dec 14
6
Analyzing Publications from Pubmed via XML
I would like to track in which journals articles about a particular disease are being published. Creating a pubmed search is trivial. The search provides data but obviously not as an R dataframe. I can get the search to export the data as an xml feed and the xml package seems to be able to read it. xmlTreeParse("
2012 Aug 10
3
Parsing large XML documents in R - how to optimize the speed?
...ry large, I mean going up to 5-10Gb, although I am using a 'small' 40M file to test my code.) My first attempt at parsing the 40M file, using the XML package, took more than 2200 seconds and left me quite disappointed. I managed to cut that down to around 40 seconds by: -using the 'useInternalNodes' option of the XML package when parsing the xml tree; -vectorizing the parsing (i.e., replacing loops like "for(node in group.of.nodes) {...}" by "sapply(group.of.node, function(node){...}") I gained another 5 seconds by making small changes to the functions used (like r...
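The two changes described in this snippet, parsing with `useInternalNodes` and replacing an explicit loop with a vectorized call, might look roughly like the sketch below. The inline document, the element names `rec` and `val`, and the variable names are invented for illustration; the poster's actual file and code are not shown in this listing.

```r
library(XML)

# Hypothetical small document standing in for the poster's 40M test file.
xml_text <- paste0(
  "<records>",
  "<rec id='1'><val>10</val></rec>",
  "<rec id='2'><val>20</val></rec>",
  "</records>"
)

# useInternalNodes = TRUE keeps the tree as libxml2 (C-level) nodes
# rather than converting every node into nested R lists.
doc <- xmlTreeParse(xml_text, asText = TRUE, useInternalNodes = TRUE)

# Vectorized extraction: one XPath call per field instead of a
# for(node in group.of.nodes) loop that grows a result step by step.
ids  <- xpathSApply(doc, "//rec", xmlGetAttr, "id")
vals <- as.numeric(xpathSApply(doc, "//rec/val", xmlValue))

result <- data.frame(id = ids, val = vals, stringsAsFactors = FALSE)
```

For files in the multi-gigabyte range mentioned in the post, holding the whole tree in memory may still be impractical, so this sketch only illustrates the two optimizations the poster names.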
2009 Dec 31
3
XML and RCurl: problem with encoding (htmlTreeParse)
...encoding right in htmlTreeParse command. See below > library(RCurl) > library(XML) > > site <- getURL("http://www.aarresaari.net/jobboard/jobs.html") > txt <- readLines(tc <- textConnection(site)); close(tc) > txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes = TRUE) > > g <- xpathSApply(txt, "//p", function(x) xmlValue(x)) > head(grep(" ", g, value=T)) [1] "????PART-TIME EXPORT SALES ASSOCIATES (ALSO SUMMER WORK) ? Valuatum Oy ??Helsinki ??Ilmoitus lis??tty: 31.12.2009. Viimeinen hakup??iv??: 28.02.2010" [2] ...
2009 Sep 03
1
encoding problem using xml package
... The following code shows that I cannot extract the value of "Länge" correctly. Any help is very welcome. ---------------------------------------- Code Start ------------ > fname1 <- "test1.xml" > fname2 <- "test2.xml" > doc <- xmlTreeParse(fname1,useInternalNodes=T) > show(doc) <?xml version="1.0" encoding="ISO-8859-1"?> <Daten> <ITEM> <Messdaten> <MESSUNG> <BEZEICHNUNG>Länge</BEZEICHNUNG> </MESSUNG> </Messdaten> </ITEM> </Daten> > xpa...
2016 Jan 18
3
Extracting data from a website
...bearing in mind that there may be weeks in which the player has not scored (in the example, the second week). For now I am obtaining it as follows: url_jugador<-"http://localhost:8080/jugadores/Luis" txt_jugador <- getURL(url_jugador) doc<-htmlTreeParse(txt_jugador, useInternalNodes = TRUE) puntos_nodo<- xpathApply(doc, "//table[@class='points']/tr") puntos_nodo [[1]] <tr> <td class="semana">1</td> <td class="neg"/> <td> <div class="bar">6</div> </td> </tr> [...
2011 Mar 30
1
Package XML: Parse Garmin *.tcx file problems
...output below shows I can get nodes, but an attempt at the value of a single node comes up empty (even though there is data there). One question: Has anybody succeeded in parsing Garmin .tcx (xml) files? Thanks! Michael _______________________ >doc2 = xmlRoot(xmlTreeParse("HR.reduced3.tcx",useInternalNodes = TRUE)) >xpathApply(doc2, "//*", xmlName) [[1]] [1] "TrainingCenterDatabase" [[2]] [1] "Activities" [[3]] [1] "Activity" [[4]] [1] "Id" [[5]] [1] "Lap" [[6]] [1] "TotalTimeSeconds" > xpathApply(doc2, "//TotalTi...
2012 Feb 29
2
Using a FOR LOOP to name objects
...objects in each iteration. As in the following example (which doesn't work quite well) my_list<-c("A","B","C","D","E","F") for(i in c(1:length(my_list))){ url<- "http://finance.yahoo.com" doc = htmlTreeParse(url, useInternalNodes = T) tab_nodes = xpathApply(doc, "//table[@cellpadding = '3']") *my_list[i]*=lapply(tab_nodes, readHTMLTable) #problem is in this line names(*my_list[i]*)=c("Ins","outs") } The problem is that in iteration #1, I need the info to be stored at an ob...
2012 Dec 11
1
query multiple terms in PubMed abstract
...Ocular. I am using the following code: url= "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" search = paste(url, "db=pubmed&term=COL4A1+AND+Ocular[abstract]&retmax=300", sep="") docId <- xmlTreeParse(getURL(paste(url, search, sep="")), useInternalNodes=TRUE) I want to get the reply where BOTH the terms exist in abstract. But it is not doing that now. Any idea? Thanks john [[alternative HTML version deleted]]
2012 Dec 27
1
Conjunction and disjunction in pubmed query
...search = paste(url, "db=pubmed&term=", queryTerm1, "+AND+", queryTerm2,"+OR+",queryTerm3, "+OR+", queryTerm4, "[abstract]&retmax=100&usehistory=y", sep="") docId <- xmlTreeParse(getURL(paste(url, search, sep="")), useInternalNodes=TRUE) I want to fetch abstracts containing queryTerm1 AND queryTerm2 Or queryTerm3 OR queryTerm4. The code runs without error, but from the result I find that conjunction and disjunction is not working. Can anyone suggest a correct syntax for doing AND and OR pubmed query? Thanks John [[altern...
2008 Dec 17
1
Extract Data from a Webpage
Hi All: I would like to extract the provider name, address, and phone number from multiple webpages like this: http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490 Based on searching R-help archives, it seems like the XML package might have something useful for this task. I can load the XML package and supply the url as an argument to
2011 Feb 24
1
Objects must be passed as an argument or generated in the function, right?
...The "nodes" object is neither an argument nor generated in the function. How can R find the "nodes" object? What am I missing here? Thanks Zheng Jk parseXmlEntryNodeSet <- function(psimi25file, psimi25source, verbose=TRUE) { psimi25Doc <- xmlTreeParse(psimi25file, useInternalNodes = TRUE) psimi25NS <- getDefaultNamespace(psimi25Doc) namespaces <- c(ns = psimi25NS) entry <- getNodeSet(psimi25Doc, "/ns:entrySet/ns:entry", namespaces) if(verbose) statusDisplay(paste(length(entry),"Entries found\n",sep=" ")) entryCount <...
2013 Mar 20
1
htmlParse (from XML library) working sporadically in the same code
...in htmlParse code:     ans <- .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), as.logical(addAttributeNamespaces), as.logical(useInternalNodes), as.logical(isHTML), as.logical(isSchema), as.logical(fullNamespaceInfo), as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, as.integer(options), PACKAGE = "XML") By the way, readHTMLTable(htmlParse(url)) works fine on other pages, so the problem is someho...
2010 Jul 03
1
XML and RCurl: problem with encoding (htmlTreeParse)
Hi All, First method:- >library(XML) >theurl <- "http://home.sina.com" >download.file(theurl, "tmp.html") >txt <- readLines("tmp.html") >txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes = TRUE) >g <- xpathSApply(txt, "//p", function(x) xmlValue(x)) >head(grep(" ", g, value=T)) [1] " | | ENGLISH" " " [3] " ()" " " [5] " "...
2011 Jul 05
2
Stuck ...can't get sapply and xmlTreeParse working
Can't seem to get the code below working. It gets stuck on line 24 inside the function hm; comments show the line in question. The function hm is called by sapply and is at the bottom of the code. Other stuff above line 24 works correctly including the first couple of lines of the function hm. Should I be using a different apply function or am I doing something wrong with xmlTreeParse ?
2007 Sep 01
2
Importing huge XML-Files
Dear all, for my diploma thesis I have to import huge XML files into R for statistical processing - huge means a size of about 33 MB. I'm using the XML package version 1.9. Since reading the complete file into R via xmlTreeParse doesn't work or is too slow, I'm trying to use xmlEventParse, but I got completely stuck. I have many different types of nodes + <configuration>
2011 May 30
1
Need help reading website info with XML package and XPath
Hi, I'm looking for help extracting some information from the Zillow website. I'd like to do this for the general case where I manually change the address by modifying the url (see code below). With the url containing the address, I'd like to be able to extract the same information each time. The specific information I'd like to be able to extract includes the homedetails url, price
2016 Jan 19
2
Extracting data from a website
...in which the player has not scored (in the example, the second week). >> For now I am obtaining it as follows: >> >> url_jugador<-"http://localhost:8080/jugadores/Luis" >> txt_jugador <- getURL(url_jugador) >> doc<-htmlTreeParse(txt_jugador, useInternalNodes = TRUE) >> puntos_nodo<- xpathApply(doc, "//table[@class='points']/tr") >> puntos_nodo >> [[1]] >> <tr> >> <td class="semana">1</td> >> <td class="neg"/> >> <td> >> ...
2011 Jun 07
1
XML segfault on some architectures
... Traceback: 1: .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), as.logical(addAttributeNamespaces), as.logical(useInternalNodes), FALSE, as.logical(isSchema), as.logical(fullNamespaceInfo), as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, PACKAGE = "XML") 2: xmlTreeParse(fileName) Possible actions: 1: abort (with core dump, if enabled) 2: normal R exit 3: exit R without...