stefan.duke at gmail.com
2009-Jul-14 21:40 UTC
[R] (simple) xml into data.frame and reverse
Hello,
I am trying to convert a simple data.frame (it will always be a few
equally long variables) into XML (a format I don't understand
too well but need as input for another program) and to reverse the
operation (from XML back into a data.frame).
I found some code which does the first step, and it works well enough for me
(see below). Is there an easy way to reverse the operation? My XML
files are nothing fancy (no child nodes or anything, at least as far
as I can see).
library(XML)  # provides xmlTree() and saveXML()
### data.frame
data <- as.data.frame(cbind(c(0, 1), c(500, 300), c(200, 400)))
names(data) <- c("age", "0", "1")
### converts data.frame into XML
xml <- xmlTree()
xml$addTag("populationsize", close = FALSE)
for (i in 1:nrow(data)) {
  xml$addTag("size", close = FALSE)  # one <size> node per row
  for (j in names(data)) {
    xml$addTag(j, data[i, j])        # one child node per column
  }
  xml$closeTag()                     # closes <size>
}
xml$closeTag()                       # closes <populationsize>
# view the result
cat(saveXML(xml))
Below I have also included an example of what my data looks like.
Thanks for any advice!
Best and have a great day,
Stefan
APPENDIX
XML-file
------------------
<populationsize>
  <size>
    <age>0</age>
    <sex>0</sex>
    <number>500</number>
  </size>
  <size>
    <age>0</age>
    <sex>1</sex>
    <number>300</number>
  </size>
  <size>
    <age>1</age>
    <sex>0</sex>
    <number>200</number>
  </size>
  <size>
    <age>1</age>
    <sex>1</sex>
    <number>400</number>
  </size>
</populationsize>
---------
DATAFRAME
age 0 1
0 500 300
1 200 400
stefan.duke at gmail.com wrote:
> Hello,
> I am trying to convert a simple data.frame (it will always be a few
> equally long variables) into the XML format (which I don't understand
> too well but need as input for another program) and reverse the
> operation (from XML back into data.frame).
> I found some code which does the first and it works good enough for me
> (see below). Is there an easy way to reverse the operation?
> My XML
> files are nothing fancy (no child nodes or anything, at least as far
> as I can see.

Just for the record, there are child nodes. You have a top-level node
<populationsize>. This has several <size> children, and each of these
has <age>, <sex> and <number> as children. You don't have sub-nodes of
these, so the hierarchy is relatively flat and does correspond to a
data frame, with each <size> node as an observation and <age>, <sex>
and <number> as variables/columns.

I wrote some relatively general, but hastily written, functions to
read this sort of data. You can find them attached or at
http://www.omegahat.org/RSXML/xmlToDataFrame.xml

You can use these as

xmlToDataFrame("size.xml")

It handles homogeneous and non-homogeneous nodes (i.e. with the same
number and names of sub-nodes or not) and also allows one to specify
colClasses, somewhat similar to that in read.table() (but not
completely implemented yet).

These functions will most likely be in the next release of the XML
package. Let me know if they don't work for your data.

D.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
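
A minimal round-trip sketch of the conversion discussed in this thread, in R. It assumes an installed version of the XML package that ships xmlToDataFrame(). The object names (long, tree, xml_text, back, wide) are illustrative only and do not come from the thread. The sketch uses the long-format columns age, sex and number from the appendix, because column names such as "0" and "1" are not valid XML element names. The final reshape() step is one possible way to get back to the wide layout shown under DATAFRAME.

```r
library(XML)

# Long-format data: one row per <size> record in the appendix.
long <- data.frame(age    = c(0, 0, 1, 1),
                   sex    = c(0, 1, 0, 1),
                   number = c(500, 300, 200, 400))

# data.frame -> XML, following the xmlTree() approach from the question.
tree <- xmlTree()
tree$addTag("populationsize", close = FALSE)
for (i in seq_len(nrow(long))) {
  tree$addTag("size", close = FALSE)          # one <size> node per row
  for (j in names(long)) {
    tree$addTag(j, as.character(long[i, j]))  # one child node per column
  }
  tree$closeTag()                             # closes <size>
}
tree$closeTag()                               # closes <populationsize>
xml_text <- saveXML(tree)

# XML -> data.frame: each <size> node becomes a row, its children columns.
doc  <- xmlParse(xml_text, asText = TRUE)
back <- xmlToDataFrame(doc, stringsAsFactors = FALSE)

# Node contents are read as text, so convert the columns explicitly.
back[] <- lapply(back, as.numeric)

# Optional: long -> wide, one column per value of sex.
wide <- reshape(back, idvar = "age", timevar = "sex", direction = "wide")
names(wide) <- c("age", "0", "1")
print(wide)
```

Converting the columns by hand avoids relying on colClasses, which the reply describes as not completely implemented.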