Hello!
I had the same problem here, and solved it by adding some RAM to my server.
Maybe external calls can't be made properly when memory is short, and omindex
crashes when launching programs such as antiword, pdftotext and so on...
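
If you want to test that guess, here is a rough sketch in Python (not part of Omega; the names probe_command and report are made up for this example). It tries to launch a converter on each file and reports whether the launch itself failed, which is what I would expect if the box is short on memory, or whether the converter ran and returned an error. It assumes antiword is on the PATH and takes the file name as its only argument, as in your manual test.

```python
import subprocess
import sys


def probe_command(argv, timeout=30):
    """Try to run an external program; return (ok, detail)."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,  # the converted text is not needed here
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except FileNotFoundError:
        # The program is not installed or not on the PATH.
        return False, "not found: " + argv[0]
    except OSError as exc:
        # The process could not be started at all, for example
        # "Cannot allocate memory" when the system is short on RAM.
        return False, "could not launch: " + str(exc)
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if result.returncode != 0:
        return False, "exit status %d" % result.returncode
    return True, "ok"


def report(paths, converter="antiword"):
    """Probe the converter on each path; return one line per file."""
    lines = []
    for path in paths:
        ok, detail = probe_command([converter, path])
        lines.append("%s -> %s" % (path, detail))
    return lines
```

Calling report() on the skipped files while omindex is running, and as the same user, should show whether launching the converter is the step that fails.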
Hope this helps...
Regards,
C.
On Mon, 06 Apr 2009 12:45:40 +0200
"Eric Voisard" <eric.voisard at atisuher.ch> wrote:
> Hi all,
>
> I'm having a recurrent problem with Omega's indexing.
> When I run omindex, it sometimes fails to recognize the extension of
> some files (.doc, .pdf) and skips them. In the same run, omindex is
> otherwise perfectly able to index other files with same extensions. The
> reason is not clear but it should occur before it selects a content
> converter since for example, if I manually run antiword on a .doc file
> that failed, it works...
>
> Running omindex:
> Unknown extension: "/srv/xapian/targets/dir/subdir/file name.doc" - skipping
>
> Manual conversion:
> host:/srv # antiword "/srv/xapian/targets/dir/subdir/file name.doc"
> <..plain text content of the file...>
> host:/srv #
>
> Note that the target directory is a CIFS mount of a remote Windows
> shared directory. Charset is UTF-8.
> I don't think it has to do with the whitespace in the file name since
> other .doc filenames with whitespaces work.
>
> Any idea?...
>
> Thanks in advance, Eric
> ATIS Uher S.A.
> CH 2046 Fontaines
>
> _______________________________________________
> Xapian-discuss mailing list
> Xapian-discuss at lists.xapian.org
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
--
Cédric Jeanneret | System Administrator
021 619 10 32 | Camptocamp SA
cedric.jeanneret at camptocamp.com | PSE-A / EPFL