ajay ohri
2008-Jun-12 09:11 UTC
[R] XML parameters to Column Headers for importing into a dataset
Dear List,

Do you know any way I can convert XML parameters into column headers. My data is in a csv file with each row containing a xml form of data , and multiple parameters (

<param1> data_val1 </param2> , <param2> data_val2 </param2> )

I want to convert it so each row caters to one record and each parameter becomes a different column.

        param1     param2
Row1    data_val1  data_val2

What is the most efficient way for doing this. Apologize for the duplicate email , but this is an emergency with loads of files for me !!!

Regards,

Ajay

www.decisionstats.com
Martin Morgan
2008-Jun-12 16:07 UTC
[R] XML parameters to Column Headers for importing into a dataset
Hi Ajay --

"ajay ohri" <ohri2007 at gmail.com> writes:

> Dear List,
>
> Do you know any way I can convert XML parameters into column headers. My

In R, the XML package will help you...

> data is in a csv file with each row containing a xml form of data , and
> multiple parameters (
>
> <param1> data_val1 </param2> , <param2> data_val2 </param2> )

I guess that first closing tag is param1...

> I want to convert it so each row caters to one record and each parameter
> becomes a different column.
>
>         param1     param2
> Row1    data_val1  data_val2
>
> What is the most efficient way for doing this. Apologize for the duplicate

Personally I like to use the xpath query language; the following relies a little on your data being regular (e.g., all rows having entries for all column values), but for some file 'fl' (perhaps accessible via a url)

    library(XML)
    xml = xmlTreeParse(fl, useInternalNodes=TRUE)
    data.frame(
        param1 = unlist(xpathApply(xml, "//param1", xmlValue)),
        param2 = unlist(xpathApply(xml, "//param2", xmlValue)))

does the trick. These are string values; you can convert them to numeric in the usual R way (as.numeric(unlist...)) or at the xpath level (along the lines of xpathApply(xml, "number(//param1)")). xpath help is available at http://www.w3.org/TR/xpath, especially http://www.w3.org/TR/xpath#path-abbrev

The above is with R 2.7.0 and XML 1.95-2

Martin

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
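
The approach above assumes every row has every parameter. If some rows are missing a parameter, the per-column vectors come out with different lengths and data.frame() fails. A minimal sketch of one way around that is to parse each row's XML fragment separately and fill the gaps with NA. The names `rows`, `params` and `parse_row` are made up for illustration, the sample values are invented, and `xmlParse()` and `trimws()` assume a reasonably recent XML package and R, not the versions mentioned above. Conversion to numeric is done with as.numeric() per column, since in XPath 1.0 number() applied to a node-set only converts its first node.

```r
library(XML)

# Hypothetical sample: each element stands for the XML fragment found in
# one row of the csv file. The second row has no <param2>.
rows <- c(
  "<param1> 1.5 </param1><param2> 10 </param2>",
  "<param1> 2.5 </param1>"
)

# Column names we expect to find (an assumption for this sketch).
params <- c("param1", "param2")

# Parse one fragment and return a named character vector, using NA for
# any parameter that is absent from that row.
parse_row <- function(fragment) {
  # Wrap the fragment in a dummy root so it is a well-formed document.
  doc <- xmlParse(paste0("<row>", fragment, "</row>"), asText = TRUE)
  vapply(params, function(p) {
    val <- xpathSApply(doc, paste0("/row/", p), xmlValue)
    if (length(val) == 0) NA_character_ else trimws(val[[1]])
  }, character(1))
}

# One record per row, one column per parameter.
records <- as.data.frame(do.call(rbind, lapply(rows, parse_row)),
                         stringsAsFactors = FALSE)

# Convert the string values to numeric, column by column.
records[] <- lapply(records, as.numeric)
```

With real data, `rows` would come from the relevant column of the csv file (for example via read.csv() or readLines()), and the as.numeric() step would be limited to the columns that actually hold numbers.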