Rich Wales
2016-Mar-11 17:29 UTC
Looking for way to monitor dsync, confirm it is or isn't running
I am syncing two Dovecot sites using the dsync function. I would like to be able to run some sort of periodic health check to confirm that dsync is (or is not) running properly between the two sites, and alert me if dsync is failing or lagging excessively. Does anyone know of a tool to do this? (If possible, something I can set up to run periodically in Nagios?) Thanks for any suggestions. Rich Wales richw at richw.org
Michael Grimm
2016-Mar-11 18:09 UTC
Looking for way to monitor dsync, confirm it is or isn't running
Rich Wales <richw at richw.org> wrote:

> I am syncing two Dovecot sites using the dsync function.

If you are referring to replication ...

> I would like to be able to run some sort of periodic health check to
> confirm that dsync is (or is not) running properly between the two
> sites, and alert me if dsync is failing or lagging excessively.
>
> Does anyone know of a tool to do this?

No replication running:

| mail> doveadm replicator status
| Fatal: net_connect_unix(/var/run/dovecot/replicator-doveadm) failed: No such file or directory

Replication running:

| mail> doveadm replicator status
| Queued 'sync' requests 0
| Queued 'high' requests 0
| Queued 'low' requests 0
| Queued 'failed' requests 0
| Queued 'full resync' requests 0
| Waiting 'failed' requests 0

If those numbers tend to become significantly larger than 0, then replication has issues. I do not use that for health checking by something like ...

> (If possible, something I can set up to run periodically in Nagios?)

... but used it once in a while when suspecting issues with replication.

HTH,
Michael
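For the Nagios use case raised above, the counters printed by "doveadm replicator status" could be turned into a plugin result roughly as follows. This is only a minimal sketch, not something posted in the thread: the function names (parse_status, evaluate, run_check) and the warn/crit thresholds are hypothetical, and it assumes the output keeps the line format quoted above and that doveadm exits non-zero in the "Fatal: net_connect_unix(...)" case. The exit codes 0/1/2/3 are the usual Nagios plugin convention (OK/WARNING/CRITICAL/UNKNOWN).

```python
import re
import subprocess

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches lines such as "Queued 'sync' requests 0", with or without the
# "| " prefix used when the output was quoted in the mail above.
LINE_RE = re.compile(r"^\s*\|?\s*(Queued|Waiting) '([^']+)' requests\s+(\d+)\s*$")


def parse_status(text):
    """Return a dict like {'queued sync': 0, 'waiting failed': 9}."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            kind, name, value = match.groups()
            counts[f"{kind.lower()} {name}"] = int(value)
    return counts


def evaluate(counts, warn=10, crit=100):
    """Map the largest counter to a Nagios state (thresholds are assumptions)."""
    if not counts:
        return UNKNOWN, "UNKNOWN: no replicator counters found"
    worst = max(counts.values())
    detail = ", ".join(f"{key}={value}" for key, value in sorted(counts.items()))
    if worst >= crit:
        return CRITICAL, f"CRITICAL: {detail}"
    if worst >= warn:
        return WARNING, f"WARNING: {detail}"
    return OK, f"OK: {detail}"


def run_check(warn=10, crit=100):
    """Run doveadm and return (exit_code, message); needs doveadm on PATH."""
    try:
        proc = subprocess.run(
            ["doveadm", "replicator", "status"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, f"UNKNOWN: could not run doveadm: {exc}"
    if proc.returncode != 0:
        # Assumed to cover the "replicator not running" case shown above.
        return CRITICAL, f"CRITICAL: {proc.stderr.strip() or 'doveadm failed'}"
    return evaluate(parse_status(proc.stdout), warn, crit)
```

A plugin wrapper would call run_check(), print the message, and exit with the returned code. Since the only guidance given is "significantly larger than 0", the thresholds would need tuning per site.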
Rich Wales
2016-Mar-14 04:55 UTC
Looking for way to monitor dsync, confirm it is or isn't running
Earlier, I asked:

>> I would like to be able to run some sort of periodic health check to
>> confirm that dsync is (or is not) running properly between the two
>> sites, and alert me if dsync is failing or lagging excessively. Does
>> anyone know of a tool to do this?

and Michael Grimm replied:

> doveadm replicator status
>
> If those numbers tend to become significantly larger than 0, then
> replication has issues. I do not use that for health checking . . .
> but used it once in a while when suspecting issues with replication.

Thanks. As a followup question: If "doveadm replicator status" shows problems, are there any commands available to pinpoint exactly which request(s) is/are causing the problem(s)?

One of the sites I am administering, for example, has been reporting 1 "queued 'full resync' requests" and 9 "waiting 'failed' requests" for the past couple of days. But I have no idea how to resolve the issue. Suggestions welcome.

Rich Wales
richw at richw.org