More dsync issues. We were running 2.1.7 and we updated to 2.1.9. Same problem with both versions.

I'm getting an error 75 on about 40 boxes out of 1800. It is the same list of boxes every time we use 'dsync backup' to back up the server. dsync seems to stop communicating with the backup box (over ssh). strace just shows it sitting at an epoll_wait. Once the program quits (times out?), a 'du' shows the destination is smaller (200kbyte in one case).

Has anyone else seen an exit code of 75? Nothing in the documentation mentions what exit code 75 could mean. What can I do to help the developers locate the bug?

...Jeff
Hi Jeff,

Jeff Gustafson wrote:
> More dsync issues. We were running 2.1.7 and we updated to 2.1.9. Same
> problem with both versions.
> I'm getting an error 75 on about 40 boxes out of 1800. It is the same
> list of boxes every time we use 'dsync backup' to backup the server.
> dsync seems to stop communicating to the backup box (over ssh). strace
> just shows it sitting at a epoll_wait. Once the program quits (times
> out?), a 'du' shows the destination is smaller (200kbyte in one case).
> What can I do to help the developers locate the bug?

Please start by following the instructions at http://dovecot.org/bugreport.html and post your 'doveconf -n' output in order to provide possibly important information about your system and configs.

Regards
Daniel
--
https://plus.google.com/103021802792276734820
I ran rsync on the mailboxes that I was having issues with. I re-ran rsync until I had a full sync with no further updates. Then I ran dsync, and it was able to run without issue. If I wipe out the target directory and re-run dsync, I'm back to dsync getting stuck. Running rsync on mdbox files is not optimal. What else can I do to track down the issue?

I've contacted Timo's company about paid support so we can get a fix for this issue. I hope to hear from them soon.

...Jeff
On 11.8.2012, at 0.54, Jeff Gustafson wrote:

> More dsync issues. We were running 2.1.7 and we updated to 2.1.9. Same
> problem with both versions.
> I'm getting an error 75 on about 40 boxes out of 1800. It is the same
> list of boxes every time we use 'dsync backup' to backup the server.
> dsync seems to stop communicating to the backup box (over ssh). strace
> just shows it sitting at a epoll_wait.

So you can easily reproduce this by running dsync for a specific user?

> Once the program quits (times
> out?), a 'du' shows the destination is smaller (200kbyte in one case).

As in, some of the mails didn't get synced? (doveadm fetch could be used to do a better comparison, file sizes don't necessarily mean anything.)

> Has anyone else seen an exit code of 75? Nothing in the documentation
> mentions what exit code 75 could mean.

"temporary failure".

> What can I do to help the developers locate the bug?

Those hangs are a little bit annoying to debug, and the whole code has been rewritten for v2.2 already in a way that should make the hangs pretty much impossible. Annoyingly v2.2 isn't ready yet..
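
For anyone scripting around this in the meantime: exit code 75 matches EX_TEMPFAIL ("temporary failure") from sysexits.h, so a backup wrapper can treat it differently from a hard error. Below is a minimal sketch of that idea, not anything shipped with Dovecot. The helper names (classify_exit, run_with_retry) and the retry policy are assumptions made for illustration, and the command to run is left to the caller because the exact dsync invocation is site-specific. Note that retrying may not help with the hang described in this thread, since the same mailboxes fail every time; the main use is to log which users ended in a temporary failure or a timeout.

```python
import os
import subprocess

# sysexits.h defines EX_TEMPFAIL as 75. Python exposes it as os.EX_TEMPFAIL
# on Unix; fall back to the literal value elsewhere.
EX_TEMPFAIL = getattr(os, "EX_TEMPFAIL", 75)


def classify_exit(code):
    """Map a process exit status to a coarse label (hypothetical helper)."""
    if code == 0:
        return "ok"
    if code == EX_TEMPFAIL:
        return "tempfail"
    return "error"


def run_with_retry(cmd, attempts=3, timeout=None):
    """Run cmd (a list of arguments), retrying only on a temporary failure.

    Returns (label, tries). If the process exceeds `timeout` seconds it is
    killed and reported as "timeout" without further retries, which keeps a
    stuck run from blocking the rest of a backup job.
    """
    label, tries = "error", 0
    for tries in range(1, attempts + 1):
        try:
            code = subprocess.run(cmd, timeout=timeout).returncode
        except subprocess.TimeoutExpired:
            return "timeout", tries
        label = classify_exit(code)
        if label != "tempfail":
            break
    return label, tries
```

Here cmd would be whatever 'dsync backup' command line is already in use for a single user, passed as a list of arguments. Collecting the users that come back as "tempfail" or "timeout" gives a reproducible list to attach to a bug report.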