Anyone got a preferred program or package for this? I'd like a *good* one, and Word or OO.o's save as html in no way qualifies as other than amateur crap.

So far, with a little googling, I've found the wv package. wvHtml works, but I don't like the output - it insists on <div>, and on &rhquo instead of plain, simple ".

        mark "what, ask for an opinion in this shy, diffident group?"
On Fri, 22 Jun 2012, m.roth at 5-cent.us wrote:

> To: CentOS mailing list <centos at centos.org>
> From: m.roth at 5-cent.us
> Subject: [CentOS] converting .doc to html
>
> Anyone got a preferred program or package for this? I'd like a *good* one,
> and Word or OO.o's save as html in no way qualifies as other than amateur
> crap.
>
> So far, with a little googling, I've found the wv package. wvHtml works,
> but I don't like the output - it insists on <div>, and on &rhquo instead
> of plain, simple ".

I think Abiword can read and write those formats.

[root at karsites ~]# rpm -qv abiword
abiword-2.6.6-1.el5.rf

HTH

Keith

-----------------------------------------------------------
Websites:
http://www.karsites.net
http://www.php-debuggers.net
http://www.raised-from-the-dead.org.uk

All email addresses are challenge-response protected with TMDA [http://tmda.net]
-----------------------------------------------------------
On Fri, Jun 22, 2012 at 9:40 AM, <m.roth at 5-cent.us> wrote:

> Anyone got a preferred program or package for this? I'd like a *good* one,
> and Word or OO.o's save as html in no way qualifies as other than amateur
> crap.
>
> So far, with a little googling, I've found the wv package. wvHtml works,
> but I don't like the output - it insists on <div>, and on &rhquo instead
> of plain, simple ".

Mail it to yourself on a gmail account, then 'view' the attachment instead of downloading the original. It is still going to have <div>'s though.

--
Les Mikesell
lesmikesell at gmail.com
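Since the converters mentioned so far all leave <div> wrappers and typographic quote entities in the output, one option is to post-process the HTML afterwards. What follows is only a minimal sketch in Python, not anything the tools above provide: the function name `simplify` and the sample string are made up for illustration, and it assumes the curly quotes appear only in body text (a plain " substituted inside an attribute value would break the markup).

```python
import re

# Named entities for typographic quotes, mapped to plain ASCII quotes.
ENTITY_TO_PLAIN = {
    "&ldquo;": '"',
    "&rdquo;": '"',
    "&lsquo;": "'",
    "&rsquo;": "'",
}

# The same quotes when they appear as literal Unicode characters.
CHAR_TO_PLAIN = str.maketrans({
    "\u201c": '"',
    "\u201d": '"',
    "\u2018": "'",
    "\u2019": "'",
})

# Opening or closing <div> tags, with or without attributes.
DIV_TAG = re.compile(r"</?div\b[^>]*>", re.IGNORECASE)

# Matches exactly the four entity names listed in ENTITY_TO_PLAIN.
QUOTE_ENTITY = re.compile(r"&[lr][ds]quo;")


def simplify(markup):
    """Remove <div> tags (keeping their contents) and flatten curly quotes."""
    text = DIV_TAG.sub("", markup)
    text = QUOTE_ENTITY.sub(lambda m: ENTITY_TO_PLAIN[m.group(0)], text)
    return text.translate(CHAR_TO_PLAIN)


if __name__ == "__main__":
    sample = '<div class="x"><p>He said &ldquo;hi&rdquo;</p></div>'
    print(simplify(sample))
```

Other entities such as &amp; and &lt; are deliberately left alone, because decoding those would change the meaning of the markup.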
On 6/22/2012 8:40 AM, m.roth at 5-cent.us wrote:

> wvHtml works,
> but I don't like the output - it insists on <div>, and on &rhquo instead
> of plain, simple ".

You mean &rdquo;? What's wrong with that? You wanted HTML, and *any* browser will understand that HTML entity, even Lynx.

If you wanted "HTML I can read like an e-book", I'd say you should be converting to Markdown instead.

One path from Word to Markdown would be unrtf (https://www.gnu.org/software/unrtf/) to HTML, then HTML to Markdown via Pandoc (http://johnmacfarlane.net/pandoc/).
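The two-step path above could be scripted roughly as follows. This is a sketch under a few assumptions rather than a tested recipe: unrtf reads RTF, so the Word document is assumed to have been saved as RTF first; both unrtf and pandoc are assumed to be on the PATH; and the function names and the file name `report.rtf` are hypothetical.

```python
import subprocess
from pathlib import Path


def unrtf_command(rtf_path):
    """Command line that asks unrtf for HTML output (written to stdout)."""
    return ["unrtf", "--html", str(rtf_path)]


def pandoc_command(html_path, md_path):
    """Command line that asks pandoc to turn an HTML file into Markdown."""
    return ["pandoc", "-f", "html", "-t", "markdown",
            "-o", str(md_path), str(html_path)]


def rtf_to_markdown(rtf_path):
    """Run unrtf, then pandoc, and return the path of the Markdown file."""
    rtf_path = Path(rtf_path)
    html_path = rtf_path.with_suffix(".html")
    md_path = rtf_path.with_suffix(".md")

    # unrtf prints the converted document, so capture stdout into a file.
    with open(html_path, "wb") as out:
        subprocess.run(unrtf_command(rtf_path), stdout=out, check=True)

    # pandoc reads the intermediate HTML and writes the Markdown file.
    subprocess.run(pandoc_command(html_path, md_path), check=True)
    return md_path


if __name__ == "__main__":
    # Hypothetical input file; replace with a real RTF export.
    print(rtf_to_markdown("report.rtf"))
```

Keeping the intermediate HTML file around makes it easy to check which of the two steps is responsible if the Markdown comes out wrong.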