thr3ads.net - Rails - [Rails] PDF to HTML converter for Ruby? [Jul 2006]

Bryan Duxbury

2006-Jul-30 02:32 UTC

[Rails] PDF to HTML converter for Ruby?

Does anyone know of a good package that can convert a PDF into HTML? 
Cross-platform compatible is a plus, but I can live with Linux-only if 
it comes to that.

-- 
Posted via http://www.ruby-forum.com/.

Jeff Everett

2006-Jul-30 16:40 UTC

[Rails] PDF to HTML converter for Ruby?

I've never been able to find a reliable, open source solution to this
problem. If anyone knows of one, I'd really like to know about it as
well.

Here are some options that I know of:

If you just have a few PDFs, you can save them as HTML from Acrobat (not
Reader), or with Adobe's online conversion tool at:

http://www.adobe.com/products/acrobat/access_onlinetools.html

So if a commercial, non-Ruby solution is OK for you, Adobe obviously can do
what you want, and the capabilities needed to convert many documents are
probably available in their server products. Or you might be able to get at
what you want through the Acrobat SDK.

There is a commercial product called PDFLib (http://www.pdflib.org). It
works with almost every major programming language, including Ruby, and has
a ton of features. No direct conversion to HTML, but you can extract text
with PDFLib TET and then mark it up with Ruby.

The only totally open option I know of is PDFBox (http://www.pdfbox.org).
It's a Java library of PDF functions, including the ability to extract text
similar to PDFLib TET, but again you're on your own to mark
it up as HTML.
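
For the "mark it up with Ruby" step, here is a minimal sketch, assuming you
already have the extracted plain text in a string (from TET, PDFBox, or
anything else). The helper name text_to_html is made up for illustration and
is not part of either library. It only treats blank lines as paragraph
breaks and escapes HTML special characters, so headings, tables, and layout
are not recovered:

```ruby
require "cgi"

# Hypothetical helper (not from PDFLib or PDFBox): wraps plain text,
# as produced by some PDF text extractor, in simple <p> elements.
def text_to_html(text)
  # Blank lines (possibly containing whitespace) separate paragraphs.
  paragraphs = text.split(/\n\s*\n/)
  # Join hard-wrapped lines inside a paragraph into a single line.
  paragraphs.map! { |para| para.strip.gsub(/\s*\n\s*/, " ") }
  paragraphs.reject!(&:empty?)
  # Escape &, <, > and quotes so the text is safe to embed in HTML.
  paragraphs.map { |para| "<p>#{CGI.escapeHTML(para)}</p>" }.join("\n")
end

sample = "Title & intro\nwraps here.\n\nSecond <paragraph>."
puts text_to_html(sample)
```

Anything smarter (detecting headings, lists, columns) would have to be built
on top of whatever position or font information the extractor gives you.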

HTH,
Jeff

On 7/29/06, Bryan Duxbury <bryan.duxbury@gmail.com> wrote:
> Does anyone know of a good package that can convert a PDF into HTML?
> Cross-platform compatible is a plus, but I can live with Linux-only if
> it comes to that.
