Thank you for the answer, but the actual issue is that the data sent
by the decoder (specified in the conf file) is properly collected by
dovecot core, but /not/ sent to the plugin: the plugin receives the
original data.
This is not linked to a particular plugin (xapian, solr, squat, etc.)
but seems to be a general issue in dovecot core.
On 2021-02-08 01:03, John Fawcett wrote:
> On 07/02/2021 18:51, Joan Moreau wrote:
>
> more info: the function fts_parser_script_more in
> plugins/fts/fts-parser.c properly reads the output of the script
>
> still, the data is not sent to the FTS plugins (xapian or any other)
>
> On 2021-02-07 17:37, Joan Moreau wrote:
>
> more info : I am running dovecot git version
>
> On 2021-02-07 17:15, Joan Moreau wrote:
>
> a bit more on this: after adding logging to decode2text.sh, I can see
> that pdftotext outputs the right data, but that data is /not/
> transmitted to the fts plugin for indexing (only the original PDF code is)
>
> On 2021-02-07 17:00, Joan Moreau wrote:
>
> Hello,
>
> I am trying to deal properly with email attachments in the fts-xapian
> plugin.
>
> I tried the default script with a PDF file.
>
> The data I receive in the fts plugin part ("xxx_build_more") is the
> original document, not the output of pdftotext
>
> Is there anything I am missing ?
>
> Here is my config:
>
> plugin {
> plugin = fts_xapian managesieve sieve
>
> fts = xapian
> fts_xapian = partial=2 full=20 verbose=1 attachments=1
>
> fts_autoindex = yes
> fts_enforced = yes
> fts_autoindex_exclude = \Trash
> fts_autoindex_exclude2 = \Drafts
>
> fts_decoder = decode2text
>
> sieve = /data/mail/%d/%n/local.sieve
> sieve_after = /data/mail/after.sieve
> sieve_before = /data/mail/before.sieve
> sieve_dir = /data/mail/%d/%n/sieve
> sieve_global_dir = /data/mail
> sieve_global_path = /data/mail/global.sieve
> }
>
> ...
>
> service decode2text {
> executable = script /usr/libexec/dovecot/decode2text.sh
> user = dovecot
> unix_listener decode2text {
> mode = 0666
> }
> }
>
> Thank you
> Joan
I'm not sure I can be much use for xapian, but looking at your
configuration I did notice some differences with the documentation. I
don't know if they are relevant to the issue you're seeing.
First of all I don't see
mail_plugins = fts
plugin = fts
settings which are both mentioned in the xapian documentation.
Also, the documentation states that attachments=1 can only index text
attachments. Maybe you should be using attachments=0 and let fts_decoder
handle the attachments.
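To make that concrete, here is a minimal sketch of the changes I mean. It assumes the setting names as I read them in the xapian documentation and reuses the values from your own config; I have not tested it, so treat it as a starting point rather than a known-good setup.

```
# Load the generic fts plugin alongside fts_xapian (per the xapian documentation)
mail_plugins = $mail_plugins fts fts_xapian

plugin {
  # Same as your current line, with fts added
  plugin = fts fts_xapian managesieve sieve

  fts = xapian
  # attachments=0: leave attachment handling to the decoder script
  fts_xapian = partial=2 full=20 verbose=1 attachments=0

  # Unchanged: name of the decode2text service defined further down
  fts_decoder = decode2text
}
```

The rest of your plugin block and the decode2text service definition would stay as they are.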
Failing that, I can only advise turning on some debugging and seeing
what that brings.
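As a rough sketch of what I mean by debugging, and assuming nothing else in your setup overrides it, the following should make the mail processes log more detail:

```
# Enable debug logging for mail processes, including plugin loading
mail_debug = yes
```

After reloading dovecot, re-indexing a single test mailbox (for example with `doveadm -D index -u <user> INBOX`, where `<user>` is a placeholder for one of your accounts) may show whether the fts plugins are loaded and whether the decoder is being called.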
best regards
John