Hello all,
I was recently pointed towards the Nokogiri gem to find all
HTML elements with a particular class, rather than attempting my own
regular expression. (Thanks John-John Tedro and Hassan Schroeder!!!!)
It works perfectly on my local machine (OS X Lion and Passenger), but
when I deployed it to my server (CentOS 5.5 and Passenger), Nokogiri
does not seem to grab all the elements of the HTML file.
Here is my method:
========================
def find_editable
  # source_code() returns the location of a file within the app
  code = Nokogiri::HTML(open(source_code(FtpAction::DOWNLOAD)))
  Rails.logger.info "===== code ===="
  Rails.logger.info code.inspect
  elements = []
  num = 0
  code.css('.my-class').each do |el|
    # I tried using xpath, but it only grabbed elements with
    # class="my-class", not ones with class="my-class icon"
    attrs = []
    el.attributes.each { |attr| attrs << { attr[0] => attr[1].value } }
    elements << { :element => el.name, :attributes => attrs,
                  :content => el.content, :count => num += 1 }
  end
  Rails.logger.info "==== elements ===="
  Rails.logger.info elements.inspect
  # Returns the array of hashes built above, each containing:
  #   :element    - name (html tag)
  #   :attributes - classes and ids and anything else
  #   :content    - actual text within the element, e.g. "Hello World!"
  elements
end
=========================
Below is the output from the log files. You don't really need to go
through all of the local output. There, the whole .html file is being parsed.
If anyone has any idea why Nokogiri is not working on my server but
works locally, I would appreciate any help you can provide.
Also, I did check that the full .html file is there on the server:
reading it with File.open do |file| printed out the full file for me.
Thank you in advance.
Output from Rails.logger.info:
----------------------------------
Server (Centos 5.5)
===== code ====
#<Nokogiri::HTML::Document:0x5d14890 name="document"
children=[#<Nokogiri::XML::DTD:0x5d14624 name="html">]>
==== elements ====
[]
Locally (Lion OS X)
===== code ====
#<Nokogiri::HTML::Document:0x81f508d0 name="document"
children=[#<Nokogiri::XML::DTD:0x81f4fd2c name="html">,
#<Nokogiri::XML::Element:0x81f4fcb4 name="html"
attributes=[#<Nokogiri::XML::Attr:0x81f4b5b0 name="lang"
value="en">]
children=[#<Nokogiri::XML::Element:0x81f4aa84 name="head"
children=[#<Nokogiri::XML::Element:0x81f4a3b8 name="meta"
attributes=[#<Nokogiri::XML::Attr:0x81f4a0fc name="charset"
value="utf-8">]>, #<Nokogiri::XML::Element:0x81f4a264
name="meta"
attributes=[#<Nokogiri::XML::Attr:0x81f49918 name="http-equiv"
value="X-UA-Compatible">, #<Nokogiri::XML::Attr:0x81f49904
name="content" value="IE=edge,chrome=1">]>,
#<Nokogiri::XML::Element:0x81f499a4 name="title"
children=[#<Nokogiri::XML::Text:0x81f47e4c "Mut8 Test
Site">]>,
#<Nokogiri::XML::Element:0x81f47cd0 name="meta"
attributes=[#<Nokogiri::XML::Attr:0x81f47bf4 name="name"
value="description">, #<Nokogiri::XML::Attr:0x81f47be0
name="content">]>, #<Nokogiri::XML::Element:0x81f47c80
name="meta"
attributes=[#<Nokogiri::XML::Attr:0x81f47320 name="name"
value="author">, #<Nokogiri::XML::Attr:0x81f4730c
name="content">]>]>,
#<Nokogiri::XML::Element:0x81f4677c name="body"
children=[#<Nokogiri::XML::Text:0x81f46470 "\n \t">,
#<Nokogiri::XML::Element:0x81f46420 name="header"
children=[#<Nokogiri::XML::Element:0x81f46128 name="hgroup"
children=[#<Nokogiri::XML::Element:0x81f45ea8 name="h1"
children=[#<Nokogiri::XML::Text:0x81f45c14 "Mut8 Testing">]>,
#<Nokogiri::XML::Text:0x81f45aac "\n \t ">,
#<Nokogiri::XML::Element:0x81f45a5c name="h2"
children=[#<Nokogiri::XML::Text:0x81f45764 "In
Progress...">]>,
#<Nokogiri::XML::Text:0x81f455fc "\n \t ">]>]>,
#<Nokogiri::XML::Element:0x81f453cc name="div"
attributes=[#<Nokogiri::XML::Attr:0x81f4514c name="id"
value="content">]
children=[#<Nokogiri::XML::Text:0x81f4307c "\n \t ">,
#<Nokogiri::XML::Element:0x81f4302c name="h3"
children=[#<Nokogiri::XML::Text:0x81f42d34 "My Page Title
Here!">]>,
#<Nokogiri::XML::Text:0x81f42bcc "\n \t ">,
#<Nokogiri::XML::Element:0x81f42b7c name="p"
attributes=[#<Nokogiri::XML::Attr:0x81f42a28 name="id"
value="villa">,
#<Nokogiri::XML::Attr:0x81f42a14 name="class" value="bob
my-class">]
children=[#<Nokogiri::XML::Text:0x81f42410 "Lorem ipsum dolor sit amet,
consectetur adipiscing elit. Nulla eu ipsum urna, et molestie mi. \n \t
\t Aliquam adipiscing, massa et fermentum ullamcorper, neque nunc
consectetur enim, imperdiet \n \t \t porta lacus est non turpis. Nam
id nisi vitae enim scelerisque ullamcorper vel nec magna. \n \t \t
Morbi erat augue, mattis non imperdiet ac, dignissim in velit.">]>,
#<Nokogiri::XML::Text:0x81f422a8 "\n\n\t ">,
#<Nokogiri::XML::Element:0x81f42258 name="p"
attributes=[#<Nokogiri::XML::Attr:0x81f42104 name="class"
value="my-class icon">]
children=[#<Nokogiri::XML::Text:0x81f41cb8
"Pellentesque dapibus, nisl non venenatis vehicula, quam tortor placerat
lacus, hendrerit \n\t commodo nunc nisl non jusdddto. Donec erat
nulla, facilisis fringilla vestibulum et, iaculis \n\t eu metus. Sed
aliquet ultrices nunc quis pulvinar. Quisque facilisis dolor sed mauris
\n\t sagittis blandit. Quisque tortor libero, vestibulum quis semper
a, gravida quis nisl. \n\t Maecenas quam eros, blandit malesuada
imperdiet quis, volutpat sit amet nisl.">]>,
#<Nokogiri::XML::Text:0x81f41b50 "\n \t">]>,
#<Nokogiri::XML::Text:0x81f419e8 "\n \t">,
#<Nokogiri::XML::Element:0x81f41998 name="footer"
children=[#<Nokogiri::XML::Element:0x81f41664 name="p"
attributes=[#<Nokogiri::XML::Attr:0x81f41574 name="class"
value="my-class">] children=[#<Nokogiri::XML::Text:0x81f40ee4
"This is
my footer info">]>, #<Nokogiri::XML::Text:0x81f40d7c "\n
\t">]>]>]>]>
==== elements ====
[{:content=>"Lorem ipsum dolor sit amet, consectetur
adipiscing elit.
Nulla eu ipsum urna, et molestie mi. Aliquam adipiscing, massa et
fermentum ullamcorper, neque nunc consectetur enim, imperdiet porta
lacus est non turpis. Nam id nisi vitae enim scelerisque ullamcorper vel
nec magna. Morbi erat augue, mattis non imperdiet ac, dignissim in
velit.", :attributes=>[{"class"=>"bob my-class"},
{"id"=>"villa"}],
:count=>1, :element=>"p"}, {:content=>"Pellentesque
dapibus, nisl non
venenatis vehicula, quam tortor placerat lacus, hendrerit commodo nunc
nisl non jusdddto. Donec erat nulla, facilisis fringilla vestibulum et,
iaculis eu metus. Sed aliquet ultrices nunc quis pulvinar. Quisque
facilisis dolor sed mauris sagittis blandit. Quisque tortor libero,
vestibulum quis semper a, gravida quis nisl. Maecenas quam eros, blandit
malesuada imperdiet quis, volutpat sit amet nisl.",
:attributes=>[{"class"=>"my-class icon"}],
:count=>2, :element=>"p"},
{:content=>"This is my footer info",
:attributes=>[{"class"=>"my-class"}], :count=>3,
:element=>"p"}]