similar to: Parse XML

Displaying 20 results from an estimated 300 matches similar to: "Parse XML"

2007 Sep 01
2
Importing huge XML-Files
Dear all, for my diploma thesis I have to import huge XML files into R for statistical processing; huge here means a size of about 33 MB. I'm using the XML package version 1.9. Since reading the complete file into R via xmlTreeParse doesn't work or is too slow, I'm trying to use xmlEventParse, but I got completely stuck. I have many different types of nodes + <configuration>
2012 Oct 26
1
Parsing very large xml datafiles with SAX: How to profile <anonymous> functions?
Hello everyone, I'm trying to parse a very large XML file using SAX with the XML package (i.e., mainly the xmlEventParse function). This function takes as an argument a list of other functions (handlers) that will be called to handle particular XML nodes. When I use Rprof(), all the handler functions are lumped together under the <anonymous> label, and I get something like this:
2005 Sep 21
5
SAX Parser best practise
Dear All, I have a question regarding best practice in setting up an XML parser within R. Because I have files with more than 100 MB and I'm only interested in some values, I think a SAX-like parser using xmlEventParse() will be the best solution. Unfortunately the values I'm looking for, to construct some higher "mass spectrum", are distributed over different lines: as
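For a question like this one, a set of handlers built inside a closure is one way to carry state between SAX events when the wanted values are spread over several nodes. The following is only a minimal sketch: the `<scan>`/`<mz>` element names, the inline sample document, and the helper `make_mz_collector` are hypothetical and not taken from the original thread.

```r
library(XML)

# Hypothetical sample document; the element names are assumptions.
xml_text <- paste0(
  '<run>',
  '<scan id="1"><mz>100.5</mz></scan>',
  '<scan id="2"><mz>200.25</mz></scan>',
  '</run>'
)

# Hypothetical helper: returns SAX handlers plus an accessor for the result.
make_mz_collector <- function() {
  values <- character()  # text of every <mz> element seen so far
  in_mz <- FALSE         # TRUE while the parser is inside an <mz> element
  buffer <- ""           # accumulates text, which may arrive in pieces

  handlers <- list(
    startElement = function(name, attrs, ...) {
      if (name == "mz") {
        in_mz <<- TRUE
        buffer <<- ""
      }
    },
    text = function(x, ...) {
      if (in_mz) buffer <<- paste0(buffer, x)
    },
    endElement = function(name, ...) {
      if (name == "mz") {
        values <<- c(values, buffer)
        in_mz <<- FALSE
      }
    }
  )

  list(handlers = handlers, result = function() as.numeric(values))
}

collector <- make_mz_collector()
invisible(xmlEventParse(xml_text, handlers = collector$handlers, asText = TRUE))
mz_values <- collector$result()
```

Keeping the accessor outside the handler list avoids any chance of it being mistaken for an element handler; for a real file, the file name would be passed instead of `xml_text` and `asText = TRUE` dropped.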
2008 Jun 25
0
Memory allocation failed: Copying Node
The following code fails with a "Memory allocation failed: Copying Node" error after parsing n thousand files. I have included the main code (below) and the functions (after the main code). I am not sure which lines are causing the node copying that results in the memory failure. Please advise. #Beginning of Code for(i in 1:nrow(newFile)) { if(i%%3000 == 0) gc()
2007 Dec 14
6
Analyzing Publications from Pubmed via XML
I would like to track in which journals articles about a particular disease are being published. Creating a pubmed search is trivial. The search provides data but obviously not as an R dataframe. I can get the search to export the data as an xml feed and the xml package seems to be able to read it. xmlTreeParse("
2011 Mar 30
1
Package XML: Parse Garmin *.tcx file problems
I'm struggling with package XML to parse a Garmin file (named *.tcx). I wonder if its form is incomplete, but I am understandably reluctant to paste even a shortened version. The output below shows I can get nodes, but an attempt to get the value of a single node comes up empty (even though there is data there). One question: Has anybody succeeded in parsing Garmin .tcx (xml) files? Thanks! Michael
2007 Nov 18
4
Read HTML table
You can use htmlTreeParse and xpathApply from the XML library. Something like: xpathApply( htmlTreeParse("http://blabla", useInternalNodes=TRUE), "//td", function(x) xmlValue(x)) should do it. Gamma wrote: > > Would anyone care to explain how to read an HTML table? It's streaming data > (updated every second) and I am looking for a suitable function. > > The imported html
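The suggestion in this reply can be tried offline on a small inline document. This is a sketch under assumptions: the HTML string and the two-column layout are made up for illustration, and `xpathSApply` is used in place of `xpathApply` only to get a character vector back directly.

```r
library(XML)

# Hypothetical two-column table standing in for the real page.
html_text <- paste0(
  '<html><body><table>',
  '<tr><td>a</td><td>1</td></tr>',
  '<tr><td>b</td><td>2</td></tr>',
  '</table></body></html>'
)

# Parse the string as HTML and keep internal nodes so XPath can be used.
doc <- htmlTreeParse(html_text, asText = TRUE, useInternalNodes = TRUE)

# Collect the text of every <td> cell in document order.
cells <- xpathSApply(doc, "//td", xmlValue)

# Reshape into rows, assuming every row has exactly two cells.
tab <- matrix(cells, ncol = 2, byrow = TRUE)
```

For a live page the URL would replace `html_text` and `asText = TRUE` would be dropped; the fixed `ncol = 2` only holds if every row has the same number of cells.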
2007 Sep 04
1
SOLVED: importing huge XML-Files -- new problem: special characters
Hi all, thanks to the people who replied to my question! I finally solved the issue by writing my own handlers and using xmlEventParse, which leads to the following problem, which is so odd that it's probably a bug. I use several special characters in my XML file, e.g. umlauts or ? or ? - but no matter how I encode my XML (UTF or ISO) or whether I escape these characters, xmlEventParse always stops
2008 May 02
1
How to parse XML
I would like to learn how to parse a mixed text/xml document I downloaded from the sec.gov website (see example below). I would like to parse this to get the value for each xml tag and then access it within R, but I don't know much about xml so I don't even know where to start debugging the errors I am getting in this example code. Can anyone help me get started? Thanks, Roger ftp
2011 Mar 29
2
Scrap java scripts and styles from an html document
Hi, I am working on developing a web crawler in R and I need some help with removing JavaScript and style sheets from the HTML document of a web page. I tried using the XML package, specifically the function xpathApply: library(XML) txt = xpathApply(html,"//body//text()[not(ancestor::script)][not(ancestor::style)]", xmlValue) The output comes out as text lines, without any html
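The XPath expression quoted in this post can be checked on a small inline page. This is a hedged sketch: the HTML string is invented for illustration, and it assumes the `html` object in the post was a document parsed with internal nodes, which is what `htmlParse` returns.

```r
library(XML)

# Hypothetical page containing a style sheet, a script, and two paragraphs.
html_text <- paste0(
  '<html><head><style>p { color: red; }</style></head><body>',
  '<p>Hello</p><script>var x = 1;</script><p>World</p>',
  '</body></html>'
)

# htmlParse returns an internal document, which xpathSApply requires.
doc <- htmlParse(html_text, asText = TRUE)

# Keep only text nodes under <body> that are not inside <script> or <style>.
txt <- xpathSApply(
  doc,
  "//body//text()[not(ancestor::script)][not(ancestor::style)]",
  xmlValue
)
```

On real pages the result usually also contains whitespace-only text nodes, which may need to be filtered out afterwards.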
2009 Sep 03
1
encoding problem using xml package
Dear list, I tried to read an xml file using the XML package. Unfortunately, some encoding problems occur. E.g. German umlauts are not read correctly. I assume that this occurs due to (internal?) conversion to UTF-8. To illustrate the problem, I have written two xml files. File Test 1 ----------- <?xml version="1.0" encoding="ISO-8859-1"?> <Daten> <ITEM>
2012 Mar 21
1
Trouble installing the XML package
Hello everyone, I am probably not the only one having trouble with this package but here goes. I want to install XML on Ubuntu. I installed libxml2-dev and everything works out fine until I get the following: Error in reconcilePropertiesAndPrototype(name, slots, prototype, superClasses, : No definition was found for superclass "namedList" in the specification of class
2005 May 10
0
[Fwd: Extract just some fields from XML]
Duncan, you are a king! Thanks a lot for this cookie. It really helped me. Thanks for the code as well as detailed explanation at the end. >Hi Gregor. > >Here is a function that will collect all of the nodes in the >XML document whose names are in the vector elementNames > >getElements = >function(elementNames) >{ > els = list() > > startElement = function(node,
2008 Jun 12
1
XML parameters to Column Headers for importing into a dataset
Dear List, Do you know any way I can convert XML parameters into column headers? My data is in a csv file with each row containing an XML form of the data and multiple parameters ( <param1> data_val1 </param1> , <param2> data_val2 </param2> ). I want to convert it so each row corresponds to one record and each parameter becomes a different column. param1
2012 Aug 10
3
Parsing large XML documents in R - how to optimize the speed?
Hello everyone, I would like to parse very large xml files from MS/MS experiments and create R objects from their content. (By very large, I mean going up to 5-10Gb, although I am using a 'small' 40M file to test my code.) My first attempt at parsing the 40M file, using the XML package, took more than 2200 seconds and left me quite disappointed. I managed to cut that down to around 40
2008 Oct 06
3
Extracting text from html code using the RCurl package.
Dear R-help, I want to download the text from a web page; however, what I end up with is the HTML code. Is there some option that I am missing in the RCurl package? Or is there another way to achieve this? This is the code I am using: > library(RCurl) > my.url <- 'https://stat.ethz.ch/mailman/listinfo/r-help' > html.file <- getURI(my.url, ssl.verifyhost = FALSE,
2005 May 08
2
Extract just some fields from XML
Hello! I am trying to get specific fields from an XML document and I am totally puzzled. I hope someone can help me. # URL URL<-"http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=11877539,11822933,11871444&retmode=xml&rettype=citation" # download a XML file tmp <- xmlTreeParse(URL, isURL = TRUE) tmp <- xmlRoot(tmp) Now I want to extract only
2016 Jan 18
3
Extracting data from a website
Good afternoon, I want to extract data from a website that relates each week to the score obtained by a player. Right now I can get the node in which the week is related to the score obtained, but I am not able to extract that information into a two-column table (week, score), taking into account that there may be weeks in which the player did not score (in the example,
2003 Oct 09
2
building XML-0.95-1 on MacOS
I am trying to build the XML package on MacOS. I am using the fink installation of libxml-1.8.17. The configuration information is: Configuration information: Libxml settings libxml include directory: /sw/include/gnome-xml libxml library directory: -L/sw/lib -lxml -lz -lz -lxml libxml 2: no Compilation flags: -I/sw/include/gnome-xml -I/sw/include/gnome-xml/libxml
2010 Nov 10
2
odfWeave/XML Windows issue
I am getting the following error when using odfWeave Error in xmlEventParse(infile, handlers = handlers, trim = FALSE, state = state) : File content_1.xml does not exist This appears to be the same issue detailed in http://markmail.org/message/qsrqdtozizlngbrt#query:+page:1+mid:qsrqdtozizlngbrt+state:results however the link to XML_1.93-2.3.zip appears to be dead. Is there either a better