On Apr 14, 2011, at 5:41 AM, Ralph Shnelvar wrote:
> We have twenty-or-so MS Word 2000 documents that we want to display on
> our website.
>
> What we did was convert the MS Word documents to Compact HTML. We then
> display a document via an
> <object data="/doc/somedoc.htm" height='100%' id='xyz' width='100%'>
>
> This all works great except for a bit of a fly in the ointment.
>
> Doing an SEO (Search Engine Optimization) analysis shows that at least
> one analyzer does not analyze the contents of "/doc/somedoc.htm".
>
> I guess it is reasonable not to count the contents of the document
> pointed to because it might not even be owned by the displaying page.
>
> - - - -
>
> But these MS Word 2000 documents are _ours_. So does anyone know of a
> way to automatically convert the htm file produced so that I can render
> it inline rather than refer to the document via object/data?
If these documents are all alike in internal structure, you could
write a little script using Nokogiri to capture only the id="whatever"
node containing the page content, and then write that back out as a
sort of partial. OR suck it into an ActiveRecord object and persist it
in your database.
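require 'rubygems'
require 'nokogiri'
require 'fileutils'
fp = '/path/to/your/file'
# if your starting document is well-formed, parse it as XML
doc = Nokogiri::XML(File.read(fp))
# otherwise, use the more forgiving HTML parser
# doc = Nokogiri::HTML(File.read(fp))
# grab the node that wraps the page content (at_css returns nil if absent)
div = doc.at_css('#someDiv')
# serialize that node alone; this string is what you write out or persist
output = div.to_xhtml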
Walter
>
> --
> Posted via http://www.ruby-forum.com/.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Ruby on Rails: Talk" group.
> To post to this group, send email to
> rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To unsubscribe from this group, send email to
> rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
> For more options, visit this group at
> http://groups.google.com/group/rubyonrails-talk?hl=en.
>