Hi Philon,

Thanks a lot for your thoughts!

Can I ask whether using Solr improved things for you? I have a mailbox
with 15 years of e-mail, and searching it takes a long time.

On 04.02.2020 09:39, Philon wrote:
> Hi Francis,
>
> Next to fts-solr there was fts-lucene, but the Lucene there seems
> heavily outdated, which is why the Dovecot docs also suggest using
> Solr. Elasticsearch is probably similar to Solr, but the latter is
> maintained by the Dovecot team.
>
> I started by downloading the Solr binary distribution to Debian with
> a JRE preinstalled, and things were running after about 10 minutes.
> Yes, it's a bit more complicated to find the schema and edit things
> like header size (in the tips section). It has been running quite
> nicely since then and needs zero maintenance.

I will try again. I kept getting some weird errors, so I don't know if
that's why I wasn't seeing much of an improvement.

> As the FTS indexes are separate, in an external Solr instance, I'd
> guess that it won't interfere with dsync. What I don't know is whether
> dsyncing would trigger indexing. This makes me wonder how one could
> actually replicate the Solr instance!?

Good question. What I thought about doing was to install FTS on my
backup instance and, if things go fine, then install an FTS instance on
my production server. That is, if one doesn't interfere with the other.

I will give Solr another shot. My main worry is whether Solr is
supported on ARM (my prod instance runs on ARM); I know Elasticsearch
has an ARM build.

I thought about the Xapian engine, but since it requires Dovecot 2.3, I
will have to wait.

Best,

Francis

> Philon
>
>> On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay
>> <r_f at med-lo.eu> wrote:
>>
>> Hi there,
>>
>> I successfully replicated my mail server to another Dovecot install
>> using dsync, mainly for redundancy, and it works great.
>>
>> I want to try installing FTS, as some of the mailboxes have tens of
>> thousands of messages, and it takes minutes to get results when
>> searching via IMAP from a Roundcube interface.
>>
>> I want to experiment with fts-solr first, and initially on my
>> redundant server, i.e., not on my main Dovecot install. Is it OK to
>> do this? I ask because I am afraid of how all this reindexing on the
>> redundant install will affect the production server.
>>
>> Also, any tips on something other than fts-solr? I tried it once, but
>> it was so hard to get right (so much configuration, Java, etc.) that
>> I'd rather try something else. I could also try fts-elastic or
>> something like that, but, again, maintaining an Elasticsearch install
>> might use more resources than I think it is worth. Any thoughts on
>> that?
>>
>> Best,
>>
>> --
>> Francis
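Editor's note: when search does not seem any faster after enabling
fts-solr, one thing worth ruling out is an empty index. The following
is a minimal sketch in Python, not a definitive tool. It assumes Solr
listens on localhost:8983 and that the core is named "dovecot"; both
values are assumptions and do not come from this thread. It uses Solr's
standard select handler with rows=0 and reads numFound from the JSON
response.

```python
import json
import urllib.parse
import urllib.request

# Assumed values (not from the thread); adjust to your own Solr install.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "dovecot"


def count_url(base=SOLR_BASE, core=CORE, query="*:*"):
    """Build a select URL that returns only the match count (rows=0)."""
    params = urllib.parse.urlencode({"q": query, "rows": 0, "wt": "json"})
    return f"{base}/{core}/select?{params}"


def parse_count(body):
    """Extract numFound from a Solr JSON select response."""
    return json.loads(body)["response"]["numFound"]


def indexed_documents(base=SOLR_BASE, core=CORE):
    """Ask a running Solr instance how many documents the core holds."""
    with urllib.request.urlopen(count_url(base, core), timeout=10) as resp:
        return parse_count(resp.read().decode("utf-8"))
```

Calling indexed_documents() on the backup server after a body search
has run gives a quick sanity check: a count of 0 would suggest that
Dovecot is not reaching Solr at all.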
On February 4, 2020 11:46:31 AM GMT+01:00, Francis Augusto Medeiros-Logeay <r_f at med-lo.eu> wrote:
> Can I ask whether using Solr improved things for you? I have a mailbox
> with 15 years of e-mail, and searching it takes a long time.

It's a vast improvement, with more or less instant results.

-- 
Christian Kivalo
On 04.02.20 at 11:46, Francis Augusto Medeiros-Logeay wrote:
> Can I ask whether using Solr improved things for you? I have a mailbox
> with 15 years of e-mail, and searching it takes a long time.

Here, Solr itself searches a quarter of a million mails in a split
second and returns very good results. That is on an average machine
with little memory.

If you don't mind departing from the standard schema, you can change it
so that headers (From, To) get indexed in the body text. That can help
narrow down results.

The only problem is searching through, e.g., nested folders from IMAP.
Something like ESEARCH would be nice:
https://tools.ietf.org/html/rfc6237

Peter
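Editor's note: without multi-mailbox search on the server, a client can
loop over the folders itself and run one search per folder. The sketch
below uses Python's standard imaplib to illustrate that workaround. The
folder list is supplied by the caller, the host and credentials in the
usage comment are hypothetical, and mailbox names and search terms are
assumed to be plain ASCII.

```python
import imaplib


def imap_quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def search_folders(conn, folders, term):
    """Run a BODY search in each folder and collect message numbers.

    conn is an authenticated imaplib connection; folders is a list of
    mailbox names chosen by the caller.
    """
    hits = {}
    for folder in folders:
        typ, _ = conn.select(imap_quote(folder), readonly=True)
        if typ != "OK":
            continue  # skip folders that cannot be opened
        typ, data = conn.search(None, "BODY", imap_quote(term))
        if typ == "OK" and data and data[0]:
            hits[folder] = data[0].split()
    return hits


# Usage (hypothetical host, credentials, and folder names):
#   conn = imaplib.IMAP4_SSL("mail.example.org")
#   conn.login("user", "password")
#   print(search_folders(conn, ["INBOX", "Archive/2019"], "invoice"))
#   conn.logout()
```

The folders are opened read-only so that the search does not change any
message flags. The returned numbers are message sequence numbers that
are only meaningful within their own folder.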
On 04.02.20 at 12:37, Peter Chiochetti wrote:
> Here, Solr itself searches a quarter of a million mails in a split
> second and returns very good results. That is on an average machine
> with little memory.

Looking at the facts, it is closer to half a million mails in a 160 GB
Maildir, with lots of trash too, but no one to sort it out. The Solr
index is 1.2 GB on disk. A tremendous ratio, IMO.

In Dovecot terms this is likely considered a small installation. We are
a small team too :) and quite happy with the generous gift of Dovecot,
and Thunderbird, BTW.

> The only problem is searching through, e.g., nested folders from IMAP.
> Something like ESEARCH would be nice:
> https://tools.ietf.org/html/rfc6237

PS: Some MUAs have powerful client-side search, yet sometimes
server-side search comes in handy.

-- 
peter
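Editor's note: the ratio mentioned above can be worked out from the
reported figures. This is only a back-of-the-envelope sketch in Python.
It assumes decimal gigabytes and takes "closer to half a million" as
exactly 500,000 mails, so the per-mail number is a rough approximation.

```python
maildir_gb = 160.0   # Maildir size reported above
index_gb = 1.2       # Solr index size reported above
mails = 500_000      # "closer to half a million", so an approximation

# Share of the Maildir size that the index takes up.
ratio = index_gb / maildir_gb

# Average index footprint per mail, using 1 GB = 1e9 bytes (assumption).
bytes_per_mail = index_gb * 1e9 / mails

print(f"index is {ratio:.2%} of the Maildir")
print(f"about {bytes_per_mail:.0f} bytes of index per mail")
```

Under these assumptions the index comes to well under one percent of
the Maildir size, and to a few kilobytes per mail.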
I updated fts-xapian to make it compatible with Dovecot 2.2.

On 2020-02-04 12:37, Peter Chiochetti wrote:
> On 04.02.20 at 11:46, Francis Augusto Medeiros-Logeay wrote:
>> I thought about the Xapian engine, but since it requires Dovecot 2.3,
>> I will have to wait.
Hi Francis,

My Solr instance has 1 GB of memory available but uses less than
512 MB. You might need to adjust the Java VM memory usage, but it's
possible.

I have only my own e-mail, but that is also 10-15 years of history, and
search results covering headers and body are instant. Things are on an
SSD, but I still think the search index fits into memory.

Philon

On 04.02.2020 at 11:46, Francis Augusto Medeiros-Logeay wrote:
> Can I ask whether using Solr improved things for you? I have a mailbox
> with 15 years of e-mail, and searching it takes a long time.
>
> I will give Solr another shot. My main worry is whether Solr is
> supported on ARM (my prod instance runs on ARM); I know Elasticsearch
> has an ARM build.