David Kahn
2010-Oct-31 23:09 UTC
Re: Loading xml document using Nokogiri and retrieving CDATA element SOLVED
SOLVED: Guess posting this got me more curious and I figured it out:

If I ask for
my_file.xpath("//EMBEDDED_FILE/DOCUMENT").text

Nokogiri automatically takes the content within the CDATA section within the DOCUMENT node and returns it to me without the CDATA markers. Nice. So just a case of making things harder for myself.

On Sun, Oct 31, 2010 at 4:59 PM, David Kahn <dk-rfEMNHKVqOwNic7Bib+Ti1W1rNmOCjRP@public.gmane.org> wrote:
> This is an extension of my last post (problems with REXML) which has me
> looking to Nokogiri again. The reason I am not using Nokogiri is I can not
> seem to find a way to get CDATA out of a Nokogiri document.
>
> First, can you tell me if I am loading my document correctly, because when
> I call my_document.to_xml, I only get one line back:
>
> (rdb:1) test_file = Nokogiri::XML(mismo_xml_file)
> #<Nokogiri::XML::Document:0x5dd22 name="document">
> (rdb:1) test_file.to_xml
> "<?xml version=\"1.0\"?>\n"
>
> So maybe this is the first step and if I get the full doc to load, my cdata
> will be there?!!
>
> Alternatively, if you have a code snippet of loading a doc and successfully
> getting CDATA out, that would be a great help!
>
> Thanks in advance,
>
> David

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
Chris
2010-Oct-31 23:29 UTC
Re: Loading xml document using Nokogiri and retrieving CDATA element SOLVED
David,

Just to add some context, what you experienced pretty much applies to all languages that offer access to XML, and the reason is the W3C XML specification requires this behavior. When any XML parser reads the XML, CDATA sections are not preserved. The text property returns the text node of an element and if the element happens to have a CDATA section, then the text part of it is returned along with any other text content of the element.
David Kahn
2010-Nov-01 16:55 UTC
Re: Re: Loading xml document using Nokogiri and retrieving CDATA element SOLVED
Thanks Chris, that is helpful to know, so I won't make that same mistake with another XML parser.