stefan.duke at gmail.com
2009-Jul-14 21:40 UTC
[R] (simple) xml into data.frame and reverse
Hello,

I am trying to convert a simple data.frame (it will always be a few equally long variables) into XML format (which I don't understand too well but need as input for another program) and to reverse the operation (from XML back into a data.frame).

I found some code which does the first step, and it works well enough for me (see below). Is there an easy way to reverse the operation? My XML files are nothing fancy (no child nodes or anything, at least as far as I can see).

library(XML)  # provides xmlTree() and saveXML()

### data.frame
data <- as.data.frame(cbind(c(0, 1), c(500, 300), c(200, 400)))
names(data) <- c("age", "0", "1")

### converts data.frame into XML
xml <- xmlTree()
xml$addTag("populationsize", close = FALSE)
for (i in 1:nrow(data)) {
  xml$addTag("size", close = FALSE)
  for (j in names(data)) {
    xml$addTag(j, data[i, j])
  }
  xml$closeTag()
}
xml$closeTag()

# view the result
cat(saveXML(xml))

Below I have also put an example of what my data looks like.

Thanks for any advice!
Best and have a great day,
Stefan

APPENDIX

XML-file
------------------

<populationsize>
  <size>
    <age>0</age>
    <sex>0</sex>
    <number>500</number>
  </size>
  <size>
    <age>0</age>
    <sex>1</sex>
    <number>300</number>
  </size>
  <size>
    <age>1</age>
    <sex>0</sex>
    <number>200</number>
  </size>
  <size>
    <age>1</age>
    <sex>1</sex>
    <number>400</number>
  </size>
</populationsize>

---------
DATAFRAME

age    0    1
  0  500  300
  1  200  400
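A note on the two layouts in the appendix: the XML is in long form (one <size> node per age/sex pair, with <age>, <sex> and <number> children), while the data frame is wide, with columns named 0 and 1. Names that start with a digit are not valid XML element names, so the wide table cannot be written out column by column as it stands. Below is a minimal sketch of reshaping to long form first and then running the same loop. It assumes the XML package is installed and takes its values from the DATAFRAME table in the appendix; the names wide, long and sexes are made up for illustration.

```r
library(XML)

# Wide table, values copied from the DATAFRAME block in the appendix:
# one row per age, one column per sex code ("0", "1").
wide <- data.frame(age = c(0, 1),
                   "0" = c(500, 200),
                   "1" = c(300, 400),
                   check.names = FALSE)

# Long form: one row per (age, sex) pair, in the order used by the appendix XML.
sexes <- setdiff(names(wide), "age")
long <- data.frame(
  age    = rep(wide$age, each = length(sexes)),
  sex    = rep(as.numeric(sexes), times = nrow(wide)),
  number = as.vector(t(as.matrix(wide[sexes])))
)

# Same loop as in the post, applied to the long table, so every tag
# name (age, sex, number) is a valid XML element name.
xml <- xmlTree()
xml$addTag("populationsize", close = FALSE)
for (i in seq_len(nrow(long))) {
  xml$addTag("size", close = FALSE)
  for (j in names(long)) {
    xml$addTag(j, long[i, j])
  }
  xml$closeTag()
}
xml$closeTag()

# view the result
cat(saveXML(xml))
```

The reshaping step uses only base R, so it can be checked on its own by printing long before any XML is built.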
stefan.duke at gmail.com wrote:
> Hello,
> I am trying to convert a simple data.frame (it will always be a few
> equally long variables) into the XML format (which I don't understand
> too well but need as input for another program) and reverse the
> operation (from XML back into data.frame).
> I found some code which does the first and it works good enough for me
> (see below). Is there an easy way to reverse the operation?
> My XML files are nothing fancy (no child nodes or anything, at least as far
> as I can see.

Just for the record, there are child nodes. You have a top-level node <populationsize>. This has several <size> children, and each of these has <age>, <sex> and <number> as children. You don't have sub-nodes of these, so the hierarchy is relatively flat and does correspond to a data frame, with each <size> node as an observation and <age>, <sex> and <number> as variables/columns.

I wrote some relatively general, but hastily written, functions to read this sort of data. You can find them attached or at

  http://www.omegahat.org/RSXML/xmlToDataFrame.xml

You can use these as

  xmlToDataFrame("size.xml")

It handles homogeneous and non-homogeneous nodes (i.e. with the same number and names of sub-nodes or not) and also allows one to specify colClasses, somewhat similar to that in read.table() (but not completely implemented yet).

These functions will most likely be in the next release of the XML package.

Let me know if they don't work for your data.

D.
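To illustrate the reverse direction described in the reply, here is a minimal sketch. It assumes an installed version of the XML package that exports xmlToDataFrame() (the reply only says the function will most likely be in the next release). It parses the appendix XML from a string rather than from size.xml, and it uses base R's reshape() to get back to the wide layout, which is an extra step not mentioned in the reply. The names txt, doc, long and wide are illustrative.

```r
library(XML)

# The appendix XML as a string; asText = TRUE parses it without a file.
txt <- "<populationsize>
  <size><age>0</age><sex>0</sex><number>500</number></size>
  <size><age>0</age><sex>1</sex><number>300</number></size>
  <size><age>1</age><sex>0</sex><number>200</number></size>
  <size><age>1</age><sex>1</sex><number>400</number></size>
</populationsize>"
doc <- xmlParse(txt, asText = TRUE)

# One row per <size> node, one column per child element.
# The values arrive as text, so convert them explicitly.
long <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
long[] <- lapply(long, as.numeric)

# Back to the wide layout from the post: one row per age,
# one column per sex code.
wide <- reshape(long, idvar = "age", timevar = "sex", direction = "wide")
names(wide) <- sub("^number\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

For a file on disk, xmlToDataFrame("size.xml") as shown in the reply replaces the xmlParse() step; the numeric conversion and the reshape are only needed if the wide table is the desired result.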