Delian Krustev
2019-Nov-02 02:05 UTC
Sieve redirect does not collect the sendmail child process correctly (Dovecot 2.3.4.1, Pigeonhole 0.5.4)
Hi, I got this simple test Sieve config: ############################################# # Sieve Filter # Generated by Ingo (http://www.horde.org/apps/ingo/) (11/01/2019, 10:46:06 PM) # Forwards if true { redirect "krustev at krustev.net"; } # Forward Keep Action if true { keep; stop; } ############################################# Which used to work fine with: # 2.2.27 (c0f36b0): /etc/dovecot/dovecot.conf # Pigeonhole version 0.4.16 (fed8554) # OS: Linux 4.9.0-8-amd64 x86_64 Debian 9.11 Then I've tried migrating to: # 2.3.4.1 (f79e8e7e4): /etc/dovecot/dovecot.conf # Pigeonhole version 0.5.4 () # OS: Linux 4.19.0-6-amd64 x86_64 Debian 10.1 And started getting the following errors: # cat .dovecot.sieve.log sieve: info: started log at Nov 01 22:55:50. error: msgid=<4221584.AlbrL4BjWX at home>: failed to redirect message to <krustev at krustev.net>: Failed to execute sendmail (temporary failure). A google search lead me to: https://bugs.archlinux.org/task/56933 which blames systemd service changes: NoNewPrivileges=true , however I'm not running dovecot via systemd but use DJB's daemontools for service supervision . The page above suggested to change from: sendmail_path = /usr/sbin/sendmail to submission_host = localhost in "conf.d/15-lda.conf", which worked fine. But since I don't have other problems with "sendmail" execution I've decided to debug this further. I have dovecot integrated with Qmail: # apt-cache policy qmail qmail: Installed: 1.06-6+b1 Candidate: 1.06-6+b1 Version table: *** 1.06-6+b1 900 900 http://deb.debian.org/debian buster/main amd64 Packages so sendmail's implementation comes from Qmail: # ls -al `which sendmail` lrwxrwxrwx 1 root root 14 Feb 23 2015 /usr/sbin/sendmail -> qmail-sendmail I've strace-d the qmail-lspawn process which executes dovecot's LDA, which processes the Sieve rules and executes "sendmail". The sendmail process executes and returns successfully, the mail is delivered, however the Sieve code does not wait for the child process correctly and tries to kill the already exited process after 30 seconds timeout. Here are the problematic Sieve syscalls: 22:55:15.283482 execve("/usr/lib/dovecot/deliver", ["/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x5620b4330c40 /* 61 vars */) = 0 ... 22:55:15.286845 pipe([5, 6]) = 0 22:55:15.286870 fcntl(5, F_GETFL) = 0 (flags O_RDONLY) 22:55:15.286889 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 22:55:15.286909 fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY) 22:55:15.286928 fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 22:55:15.286947 fcntl(5, F_GETFD) = 0 22:55:15.286966 fcntl(5, F_SETFD, FD_CLOEXEC) = 0 22:55:15.286985 fcntl(6, F_GETFD) = 0 22:55:15.287003 fcntl(6, F_SETFD, FD_CLOEXEC) = 0 22:55:15.287024 epoll_create(128) = 7 22:55:15.287043 fcntl(7, F_GETFD) = 0 22:55:15.287062 fcntl(7, F_SETFD, FD_CLOEXEC) = 0 22:55:15.287084 epoll_ctl(7, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47201248, u64=94699781110752}}) = 0 ... 22:55:15.294625 epoll_ctl(7, EPOLL_CTL_DEL, 5, 0x7ffd59156b8c) = 0 22:55:15.294647 epoll_create(128) = 12 22:55:15.294668 fcntl(12, F_GETFD) = 0 22:55:15.294687 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 22:55:15.294708 epoll_ctl(12, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47455536, u64=94699781365040}}) = 0 Dovecot/Sieve will wait for this pipe FD 5 later, but no event will arrive. 22:55:15.294740 rt_sigaction(SIGCHLD, {sa_handler=0x7f15404b3640, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f1540229840}, NULL, 8) = 0 22:55:15.294763 pipe([13, 14]) = 0 22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929 The sendmail child process with PID 6929 is forked here. 22:55:15.294928 close(13) = 0 22:55:15.294969 fcntl(14, F_GETFL) = 0x1 (flags O_WRONLY) 22:55:15.295006 fcntl(14, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 22:55:15.295039 lseek(14, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) 22:55:15.295065 getsockname(14, 0x7ffd59156a40, [28]) = -1 ENOTSOCK (Socket operation on non-socket) 22:55:15.295109 write(14, "EDITTED LONG MESSAGE DATA\n", 1461) = 1461 22:55:15.295150 close(14) = 0 22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0 22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0 Here's where the 30 second's timeout is requested and waited. 22:55:45.325580 kill(6929, SIGTERM) = 0 22:55:45.325769 epoll_wait(12, [], 1, 5000) = 0 22:55:50.331108 kill(6929, SIGKILL) = 0 22:55:50.332867 wait4(6929, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6929 22:55:50.333020 epoll_ctl(12, EPOLL_CTL_DEL, 5, 0x7ffd59156b6c) = 0 22:55:50.333060 epoll_ctl(7, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47201248, u64=94699781110752}}) = 0 22:55:50.333097 close(12) = 0 22:55:50.333240 sendto(9, "<19>Nov 1 22:55:50 dovecot: lda(krustev)<6926><IlpXEdO3vF0OGwAA4EGeoA>: Error: sieve: msgid=<4221584.AlbrL4BjWX at home>: failed to redirect message to <krustev at krustev.net>: Failed to execute sendmail (temporary failure)", 220, MSG_NOSIGNAL, NULL, 0) = 220 22:55:50.333321 openat(AT_FDCWD, "/home/krustev/mail/domains/krustev.net/tester/.dovecot.sieve.log", O_WRONLY|O_CREAT|O_APPEND, 0600) = 12 22:55:50.333557 fstat(12, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 22:55:50.333613 lseek(12, 0, SEEK_CUR) = 0 22:55:50.333643 fstat(12, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 22:55:50.333678 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 22:55:50.333726 write(12, "sieve: info: started log at Nov 01 22:55:50.\n", 45) = 45 22:55:50.333788 write(12, "error: msgid=<4221584.AlbrL4BjWX at home>: failed to redirect message to <krustev at krustev.net>: Failed to execute sendmail (temporary failure).\n", 141) = 141 22:55:50.333857 sendto(9, "<19>Nov 1 22:55:50 dovecot: lda(krustev)<6926><IlpXEdO3vF0OGwAA4EGeoA>: Error: sieve: Execution of script /home/krustev/mail/domains/krustev.net/tester/.dovecot.sieve was aborted due to temporary failure (user logfile /home/krustev/mail/domains/krustev.net/tester/.dovecot.sieve.log may reveal additional details)", 317, MSG_NOSIGNAL, NULL, 0) = 317 22:55:50.333941 close(10) = 0 22:55:50.333996 close(12) = 0 22:55:50.334057 lstat("/home/krustev/mail/domains/krustev.net/tester/.dovecot.lda-dupes.lock", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 22:55:50.334095 unlink("/home/krustev/mail/domains/krustev.net/tester/.dovecot.lda-dupes.lock") = 0 22:55:50.334188 close(11) = 0 22:55:50.334245 write(2, "Execution of Sieve filters was aborted due to temporary failure\n", 64) = 64 22:55:50.334418 munmap(0x7f15401bc000, 103680) = 0 22:55:50.334481 munmap(0x7f15403b3000, 32800) = 0 22:55:50.334516 munmap(0x7f15400fb000, 788352) = 0 22:55:50.334610 epoll_ctl(7, EPOLL_CTL_DEL, 8, 0x7ffd5915706c) = 0 22:55:50.334640 close(8) = 0 22:55:50.334685 epoll_ctl(7, EPOLL_CTL_DEL, 5, 0x7ffd5915706c) = 0 22:55:50.334738 close(5) = 0 22:55:50.334785 close(6) = 0 22:55:50.334813 close(7) = 0 22:55:50.334841 close(4) = 0 22:55:50.334881 exit_group(75) = ? 22:55:50.335077 +++ exited with 75 +++ Here's the interesting part of sendmail's (PID 6929) execution with timestamps: .. 22:55:15.295268 execve("/usr/sbin/sendmail", ["/usr/sbin/sendmail", "-i", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], NULL) = 0 ... 22:55:15.296024 chdir("/var/lib/qmail") = 0 22:55:15.296055 execve("bin/qmail-inject", ["bin/qmail-inject", "-a", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], 0x55741b1890e0 /* 2 vars */) = 0 ... 22:55:15.297264 vfork() = 6930 .. 22:55:15.297930 wait4(6930, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6930 22:55:15.305391 exit_group(0) = ? 22:55:15.305472 +++ exited with 0 +++ The PID 6930 is a call to the setuid qmail-queue, which also executes and returns normally. With all the above stated - It seems that the problem is in the "epoll" calls. Either it is not set-up correctly or dovecot-LDA does not receive the information it waits for. I've tried to make my way trough Dovecot's source code but have concluded that someone more familiar should be able to spot the problem much quicker than me. Let me know if you need some additional information. Sorry for the long mail. Dovecot config follows: # 2.3.4.1 (f79e8e7e4): /etc/dovecot/dovecot.conf # Pigeonhole version 0.5.4 () # OS: Linux 4.19.0-6-amd64 x86_64 Debian 10.1 # Hostname: pinko.livehostserver.com auth_mechanisms = plain login disable_plaintext_auth = no lda_mailbox_autocreate = yes lda_mailbox_autosubscribe = yes log_path = /dev/stderr log_timestamp mail_plugins = " quota" managesieve_notify_capability = mailto managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext namespace inbox { inbox = yes location mailbox Archive { special_use = \Archive } mailbox Archives { special_use = \Archive } mailbox "Deleted Items" { special_use = \Trash } mailbox "Deleted Messages" { special_use = \Trash } mailbox Drafts { auto = subscribe special_use = \Drafts } mailbox Junk { auto = subscribe special_use = \Junk } mailbox "Junk E-mail" { special_use = \Junk } mailbox Sent { auto = subscribe special_use = \Sent } mailbox "Sent Items" { special_use = \Sent } mailbox "Sent Mail" { special_use = \Sent } mailbox "Sent Messages" { special_use = \Sent } mailbox Spam { special_use = \Junk } mailbox Trash { auto = subscribe special_use = \Trash } mailbox sent-mail { special_use = \Sent } prefix } passdb { args = /usr/local/sbin/vchkpw-dovecot driver = checkpassword } plugin { antispam_allow_append_to_spam = yes antispam_backend = pipe antispam_pipe_program = /bin/sh antispam_pipe_program_notspam_args = -c;HOME=$PWD exec nice sa-learn --ham antispam_pipe_program_spam_args = -c;HOME=$PWD exec nice sa-learn --spam antispam_pipe_tmpdir = /tmp antispam_spam_pattern_ignorecase = spam;junk;junk * antispam_trash_pattern_ignorecase = trash;Deleted * quota = maildir:User quota sieve = ~/.dovecot.sieve sieve_dir = ~/sieve sieve_redirect_envelope_from = recipient } protocols = " imap sieve pop3" service anvil { client_limit = 1536 } service auth { client_limit = 1536 } service imap-login { process_limit = 512 } service managesieve-login { inet_listener sieve { port = 4190 } process_min_avail = 1 service_count = 1 } service managesieve { process_limit = 50 } service pop3-login { process_limit = 512 } service stats { client_limit = 1024 unix_listener stats-writer { mode = 0666 } } ssl_ca = </scm/deb-global/etc/ssl/intermediate-CAs.crt ssl_cert = </etc/ssl/default-fullchain.pem ssl_client_ca_dir = /etc/ssl/certs ssl_dh = # hidden, use -P to show it ssl_key = # hidden, use -P to show it userdb { driver = prefetch } protocol lda { info_log_path log_path mail_plugins = " quota sieve" } protocol imap { mail_max_userip_connections = 64 mail_plugins = " quota imap_quota antispam" } protocol sieve { mail_max_userip_connections = 10 managesieve_implementation_string = Dovecot Pigeonhole managesieve_max_line_length = 65536 } Best regards -- Delian
Stephan Bosch
2019-Nov-02 21:40 UTC
Sieve redirect does not collect the sendmail child process correctly (Dovecot 2.3.4.1, Pigeonhole 0.5.4)
On 02/11/2019 03:05, Delian Krustev via dovecot wrote:> Hi, > > I got this simple test Sieve config: > > ############################################# > # Sieve Filter > # Generated by Ingo (http://www.horde.org/apps/ingo/) (11/01/2019, 10:46:06 PM) > > # Forwards > if true { > redirect "krustev at krustev.net"; > } > # Forward Keep Action > if true { > keep; > stop; > } > ############################################# > > > Which used to work fine with: > > # 2.2.27 (c0f36b0): /etc/dovecot/dovecot.conf > # Pigeonhole version 0.4.16 (fed8554) > # OS: Linux 4.9.0-8-amd64 x86_64 Debian 9.11 > > Then I've tried migrating to: > > # 2.3.4.1 (f79e8e7e4): /etc/dovecot/dovecot.conf > # Pigeonhole version 0.5.4 () > # OS: Linux 4.19.0-6-amd64 x86_64 Debian 10.1 > > And started getting the following errors: > > # cat .dovecot.sieve.log > sieve: info: started log at Nov 01 22:55:50. > error: msgid=<4221584.AlbrL4BjWX at home>: failed to redirect message to <krustev at krustev.net>: Failed to execute sendmail (temporary failure).I think we already know about this issue (tracked as DOP-942). It apparently happens because qmail masks the SIGCHLD signal while Dovecot doesn't unmask it before waiting for children. Regards, Stephan.> > A google search lead me to: > > https://bugs.archlinux.org/task/56933 > > which blames systemd service changes: > > NoNewPrivileges=true > > , however I'm not running dovecot via systemd but use DJB's daemontools for > service supervision . > > The page above suggested to change from: > > sendmail_path = /usr/sbin/sendmail > > to > > submission_host = localhost > > in "conf.d/15-lda.conf", which worked fine. > But since I don't have other problems with "sendmail" execution I've decided > to debug this further. > > I have dovecot integrated with Qmail: > > # apt-cache policy qmail > qmail: > Installed: 1.06-6+b1 > Candidate: 1.06-6+b1 > Version table: > *** 1.06-6+b1 900 > 900 http://deb.debian.org/debian buster/main amd64 Packages > > so sendmail's implementation comes from Qmail: > > # ls -al `which sendmail` > lrwxrwxrwx 1 root root 14 Feb 23 2015 /usr/sbin/sendmail -> qmail-sendmail > > > I've strace-d the qmail-lspawn process which executes dovecot's LDA, which > processes the Sieve rules and executes "sendmail". > The sendmail process executes and returns successfully, the mail is delivered, > however the Sieve code does not wait for the child process correctly and tries > to kill the already exited process after 30 seconds timeout. > > Here are the problematic Sieve syscalls: > > 22:55:15.283482 execve("/usr/lib/dovecot/deliver", ["/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x5620b4330c40 /* 61 vars */) = 0 > ... > 22:55:15.286845 pipe([5, 6]) = 0 > 22:55:15.286870 fcntl(5, F_GETFL) = 0 (flags O_RDONLY) > 22:55:15.286889 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 > 22:55:15.286909 fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY) > 22:55:15.286928 fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 > 22:55:15.286947 fcntl(5, F_GETFD) = 0 > 22:55:15.286966 fcntl(5, F_SETFD, FD_CLOEXEC) = 0 > 22:55:15.286985 fcntl(6, F_GETFD) = 0 > 22:55:15.287003 fcntl(6, F_SETFD, FD_CLOEXEC) = 0 > 22:55:15.287024 epoll_create(128) = 7 > 22:55:15.287043 fcntl(7, F_GETFD) = 0 > 22:55:15.287062 fcntl(7, F_SETFD, FD_CLOEXEC) = 0 > 22:55:15.287084 epoll_ctl(7, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47201248, u64=94699781110752}}) = 0 > ... > 22:55:15.294625 epoll_ctl(7, EPOLL_CTL_DEL, 5, 0x7ffd59156b8c) = 0 > 22:55:15.294647 epoll_create(128) = 12 > 22:55:15.294668 fcntl(12, F_GETFD) = 0 > 22:55:15.294687 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 > 22:55:15.294708 epoll_ctl(12, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47455536, u64=94699781365040}}) = 0 > > Dovecot/Sieve will wait for this pipe FD 5 later, but no event will arrive. > > 22:55:15.294740 rt_sigaction(SIGCHLD, {sa_handler=0x7f15404b3640, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f1540229840}, NULL, 8) = 0 > 22:55:15.294763 pipe([13, 14]) = 0 > 22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929 > > > The sendmail child process with PID 6929 is forked here. > > > 22:55:15.294928 close(13) = 0 > 22:55:15.294969 fcntl(14, F_GETFL) = 0x1 (flags O_WRONLY) > 22:55:15.295006 fcntl(14, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 > 22:55:15.295039 lseek(14, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) > 22:55:15.295065 getsockname(14, 0x7ffd59156a40, [28]) = -1 ENOTSOCK (Socket operation on non-socket) > 22:55:15.295109 write(14, "EDITTED LONG MESSAGE DATA\n", 1461) = 1461 > 22:55:15.295150 close(14) = 0 > 22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0 > 22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0 > > > Here's where the 30 second's timeout is requested and waited. > > > 22:55:45.325580 kill(6929, SIGTERM) = 0 > 22:55:45.325769 epoll_wait(12, [], 1, 5000) = 0 > 22:55:50.331108 kill(6929, SIGKILL) = 0 > 22:55:50.332867 wait4(6929, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6929 > 22:55:50.333020 epoll_ctl(12, EPOLL_CTL_DEL, 5, 0x7ffd59156b6c) = 0 > 22:55:50.333060 epoll_ctl(7, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47201248, u64=94699781110752}}) = 0 > 22:55:50.333097 close(12) = 0 > 22:55:50.333240 sendto(9, "<19>Nov 1 22:55:50 dovecot: lda(krustev)<6926><IlpXEdO3vF0OGwAA4EGeoA>: Error: sieve: msgid=<4221584.AlbrL4BjWX at home>: failed to redirect message to <krustev at krustev.net>: Failed to execute sendmail (temporary failure)", 220, MSG_NOSIGNAL, NULL, 0) = 220 > 22:55:50.333321 openat(AT_FDCWD, "/home/krustev/mail/domains/krustev.net/tester/.dovecot.sieve.log", O_WRONLY|O_CREAT|O_APPEND, 0600) = 12 > 22:55:50.333557 fstat(12, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 > 22:55:50.333613 lseek(12, 0, SEEK_CUR) = 0 > 22:55:50.333643 fstat(12, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 > 22:55:50.333678 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0 > 22:55:50.333726 write(12, "sieve: info: started log at Nov 01 22:55:50.\n", 45) = 45 > 22:55:50.333788 write(12, "error: msgid=<4221584.AlbrL4BjWX at home>: failed to redirect message to <krustev at krustev.net>: Failed to execute sendmail (temporary failure).\n", 141) = 141 > 22:55:50.333857 sendto(9, "<19>Nov 1 22:55:50 dovecot: lda(krustev)<6926><IlpXEdO3vF0OGwAA4EGeoA>: Error: sieve: Execution of script /home/krustev/mail/domains/krustev.net/tester/.dovecot.sieve was aborted due to temporary failure (user logfile /home/krustev/mail/domains/krustev.net/tester/.dovecot.sieve.log may reveal additional details)", 317, MSG_NOSIGNAL, NULL, 0) = 317 > 22:55:50.333941 close(10) = 0 > 22:55:50.333996 close(12) = 0 > 22:55:50.334057 lstat("/home/krustev/mail/domains/krustev.net/tester/.dovecot.lda-dupes.lock", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0 > 22:55:50.334095 unlink("/home/krustev/mail/domains/krustev.net/tester/.dovecot.lda-dupes.lock") = 0 > 22:55:50.334188 close(11) = 0 > 22:55:50.334245 write(2, "Execution of Sieve filters was aborted due to temporary failure\n", 64) = 64 > 22:55:50.334418 munmap(0x7f15401bc000, 103680) = 0 > 22:55:50.334481 munmap(0x7f15403b3000, 32800) = 0 > 22:55:50.334516 munmap(0x7f15400fb000, 788352) = 0 > 22:55:50.334610 epoll_ctl(7, EPOLL_CTL_DEL, 8, 0x7ffd5915706c) = 0 > 22:55:50.334640 close(8) = 0 > 22:55:50.334685 epoll_ctl(7, EPOLL_CTL_DEL, 5, 0x7ffd5915706c) = 0 > 22:55:50.334738 close(5) = 0 > 22:55:50.334785 close(6) = 0 > 22:55:50.334813 close(7) = 0 > 22:55:50.334841 close(4) = 0 > 22:55:50.334881 exit_group(75) = ? > 22:55:50.335077 +++ exited with 75 +++ > > > Here's the interesting part of sendmail's (PID 6929) execution with timestamps: > .. > 22:55:15.295268 execve("/usr/sbin/sendmail", ["/usr/sbin/sendmail", "-i", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], NULL) = 0 > ... > 22:55:15.296024 chdir("/var/lib/qmail") = 0 > 22:55:15.296055 execve("bin/qmail-inject", ["bin/qmail-inject", "-a", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], 0x55741b1890e0 /* 2 vars */) = 0 > ... > 22:55:15.297264 vfork() = 6930 > .. > 22:55:15.297930 wait4(6930, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6930 > 22:55:15.305391 exit_group(0) = ? > 22:55:15.305472 +++ exited with 0 +++ > > The PID 6930 is a call to the setuid qmail-queue, which also executes and returns normally. > > > With all the above stated - It seems that the problem is in the "epoll" calls. > Either it is not set-up correctly or dovecot-LDA does not receive the > information it waits for. > > I've tried to make my way trough Dovecot's source code but have concluded that > someone more familiar should be able to spot the problem much quicker than me. > > > Let me know if you need some additional information. > Sorry for the long mail. > > > > Dovecot config follows: > > # 2.3.4.1 (f79e8e7e4): /etc/dovecot/dovecot.conf > # Pigeonhole version 0.5.4 () > # OS: Linux 4.19.0-6-amd64 x86_64 Debian 10.1 > # Hostname: pinko.livehostserver.com > auth_mechanisms = plain login > disable_plaintext_auth = no > lda_mailbox_autocreate = yes > lda_mailbox_autosubscribe = yes > log_path = /dev/stderr > log_timestamp > mail_plugins = " quota" > managesieve_notify_capability = mailto > managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext > namespace inbox { > inbox = yes > location > mailbox Archive { > special_use = \Archive > } > mailbox Archives { > special_use = \Archive > } > mailbox "Deleted Items" { > special_use = \Trash > } > mailbox "Deleted Messages" { > special_use = \Trash > } > mailbox Drafts { > auto = subscribe > special_use = \Drafts > } > mailbox Junk { > auto = subscribe > special_use = \Junk > } > mailbox "Junk E-mail" { > special_use = \Junk > } > mailbox Sent { > auto = subscribe > special_use = \Sent > } > mailbox "Sent Items" { > special_use = \Sent > } > mailbox "Sent Mail" { > special_use = \Sent > } > mailbox "Sent Messages" { > special_use = \Sent > } > mailbox Spam { > special_use = \Junk > } > mailbox Trash { > auto = subscribe > special_use = \Trash > } > mailbox sent-mail { > special_use = \Sent > } > prefix > } > passdb { > args = /usr/local/sbin/vchkpw-dovecot > driver = checkpassword > } > plugin { > antispam_allow_append_to_spam = yes > antispam_backend = pipe > antispam_pipe_program = /bin/sh > antispam_pipe_program_notspam_args = -c;HOME=$PWD exec nice sa-learn --ham > antispam_pipe_program_spam_args = -c;HOME=$PWD exec nice sa-learn --spam > antispam_pipe_tmpdir = /tmp > antispam_spam_pattern_ignorecase = spam;junk;junk * > antispam_trash_pattern_ignorecase = trash;Deleted * > quota = maildir:User quota > sieve = ~/.dovecot.sieve > sieve_dir = ~/sieve > sieve_redirect_envelope_from = recipient > } > protocols = " imap sieve pop3" > service anvil { > client_limit = 1536 > } > service auth { > client_limit = 1536 > } > service imap-login { > process_limit = 512 > } > service managesieve-login { > inet_listener sieve { > port = 4190 > } > process_min_avail = 1 > service_count = 1 > } > service managesieve { > process_limit = 50 > } > service pop3-login { > process_limit = 512 > } > service stats { > client_limit = 1024 > unix_listener stats-writer { > mode = 0666 > } > } > ssl_ca = </scm/deb-global/etc/ssl/intermediate-CAs.crt > ssl_cert = </etc/ssl/default-fullchain.pem > ssl_client_ca_dir = /etc/ssl/certs > ssl_dh = # hidden, use -P to show it > ssl_key = # hidden, use -P to show it > userdb { > driver = prefetch > } > protocol lda { > info_log_path > log_path > mail_plugins = " quota sieve" > } > protocol imap { > mail_max_userip_connections = 64 > mail_plugins = " quota imap_quota antispam" > } > protocol sieve { > mail_max_userip_connections = 10 > managesieve_implementation_string = Dovecot Pigeonhole > managesieve_max_line_length = 65536 > } > > > Best regards > -- > Delian
Delian Krustev
2019-Nov-03 17:01 UTC
Sieve redirect does not collect the sendmail child process correctly (Dovecot 2.3.4.1, Pigeonhole 0.5.4)
On Saturday, November 2, 2019 10:40:45 PM EET Stephan Bosch <stephan at rename-it.nl> wrote:> I think we already know about this issue (tracked as DOP-942). It > apparently happens because qmail masks the SIGCHLD signal while Dovecot > doesn't unmask it before waiting for children.You're right. Isn't this a simple fix - a call to sigprocmask ? This should not be a problem in qmail-lspawn/qmail-local as their code has not been changed for years. So there should have been a breaking change in Dovecot/Pigeonhole. What follows is some more debugging from me. The chain of processes that lead to dovecot-lda and sendmail execution look like this: PID 32457 qmail-lspawn 6923 qmail-local 6924 sh -c vlelivermail 6925 sh -c preline /usr/local/bin/vpop-lda 6926 vpop-lda(perl script) -> deliver -> doveconf -> deliver 6927 grep 6928 pwd 6929 sendmail -> qmail-inject 6930 qmail-queue Just before executing sendmail, dovecot-lda is registering a signal handler for SIGCHLD: 22:55:15.294647 epoll_create(128) = 12 22:55:15.294668 fcntl(12, F_GETFD) = 0 22:55:15.294687 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 22:55:15.294708 epoll_ctl(12, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=47455536, u64=94699781365040}}) = 0 22:55:15.294740 rt_sigaction(SIGCHLD, {sa_handler=0x7f15404b3640, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f1540229840}, NULL, 8) = 0 22:55:15.294763 pipe([13, 14]) = 0 22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929 Later dovecot-lda checks non-blockingly whether sendmail has exited and waits for epoll IO event: 22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0 22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0 22:55:45.325580 kill(6929, SIGTERM) = 0 22:55:45.325769 epoll_wait(12, [], 1, 5000) = 0 22:55:50.331108 kill(6929, SIGKILL) = 0 I'm not sure why it is waiting for IO event on FD 5, which is a PIPE FD initialized much earlier, with no actual reads/writes on it: 22:55:15.286845 pipe([5, 6]) = 0 22:55:15.286870 fcntl(5, F_GETFL) = 0 (flags O_RDONLY) 22:55:15.286889 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 22:55:15.286909 fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY) 22:55:15.286928 fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 The pipe used for communication with the sendmail child is this: 22:55:15.294763 pipe([13, 14]) = 0 22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929 22:55:15.294928 close(13) = 0 22:55:15.294969 fcntl(14, F_GETFL) = 0x1 (flags O_WRONLY) 22:55:15.295006 fcntl(14, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 22:55:15.295039 lseek(14, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) 22:55:15.295065 getsockname(14, 0x7ffd59156a40, [28]) = -1 ENOTSOCK (Socket operation on non-socket) 22:55:15.295109 write(14, "X-Sieve: Pigeonhole Sieve 0.5.4 ()\nX-Sieve-Redirected-From: ...", 1461) = 1461 22:55:15.295150 close(14) = 0 22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0 22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0 22:55:45.325580 kill(6929, SIGTERM) = 0 This is the process creation/waiting and signal mask debugging: qmail-lspawn (PID 32457) 22:55:15.227742 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 ... 22:55:15.227970 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f8a117de7d0) = 6923 ... 22:55:15.228237 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 22:55:15.228267 select(4, [0 3], NULL, NULL, NULL) = 1 (in [3]) 22:55:50.334325 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 22:55:50.334367 read(3, "Execution of Sieve filters was aborted due to temporary failure\n", 128) = 64 22:55:50.334406 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 22:55:50.334440 select(4, [0 3], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted if no handler) 22:55:50.335625 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6923, si_uid=1002, si_status=111, si_utime=0, si_stime=0} --- 22:55:50.335649 wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 111}], WNOHANG, NULL) = 6923 22:55:50.335683 close(4) = 0 22:55:50.335710 wait4(-1, 0x7fff05f68d8c, WNOHANG, NULL) = -1 ECHILD (No child processes) 22:55:50.335742 rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call) 22:55:50.335773 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 22:55:50.335801 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 22:55:50.335827 select(4, [0 3], NULL, NULL, NULL) = 1 (in [3]) 22:55:50.335874 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0 22:55:50.335899 read(3, "", 128) = 0 22:55:50.335925 write(1, "\0ZExecution of Sieve filters was aborted due to temporary failure\n\0", 67) = 67 22:55:50.335974 close(3) = 0 22:55:50.336006 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0 22:55:50.336034 select(1, [0], NULL, NULL, NULL <detached ...> qmail-local (PID 6923) 22:55:15.229053 execve("bin/qmail-local", ["bin/qmail-local", "--", "krustev.net", "/home/krustev/mail/domains/krustev.net", "krustev.net-tester", "-", "tester", "krustev.net", "krustev at krustev.net", "./Maildir/"], 0x7fff05f69560 /* 1 var */) = 0 ... 22:55:15.230313 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fde0346a7d0) = 6924 22:55:15.230420 wait4(6924, [{WIFEXITED(s) && WEXITSTATUS(s) == 111}], 0, NULL) = 6924 22:55:50.335462 exit_group(111) = ? 22:55:50.335572 +++ exited with 111 +++ sh -> vlelivermail (PID 6924) 22:55:15.230511 execve("/bin/sh", ["/bin/sh", "-c", " /var/lib/vpopmail/bin/vdelivermail '' bounce-no-mailbox"], 0x564c0be3ac30 /* 19 vars */) = 0 .. 22:55:15.233866 rt_sigprocmask(SIG_BLOCK, NULL, [CHLD], 8) = 0 ... 22:55:15.233997 execve("/var/lib/vpopmail/bin/vdelivermail", ["/var/lib/vpopmail/bin/vdelivermail", "", "bounce-no-mailbox"], 0x55e1e61bcde0 /* 22 vars */) = 0 ... 22:55:15.237461 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7bea310a10) = 6925 22:55:15.237581 wait4(6925, [{WIFEXITED(s) && WEXITSTATUS(s) == 75}], 0, NULL) = 6925 22:55:50.335255 lseek(0, -1337, SEEK_CUR) = 0 22:55:50.335283 exit_group(111) = ? 22:55:50.335385 +++ exited with 111 +++ sh -> preline (PID 6925) 22:55:15.237676 execve("/bin/sh", ["/bin/sh", "-c", "preline /usr/local/bin/vpop-lda"], 0x20a8b30 /* 23 vars */) = 0 ... 22:55:15.240248 rt_sigprocmask(SIG_BLOCK, NULL, [CHLD], 8) = 0 ... 22:55:15.241235 execve("/var/lib/qmail/bin/preline", ["preline", "/usr/local/bin/vpop-lda"], 0x564b8bbbbc50 /* 23 vars */) = 0 ... 22:55:15.242411 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f68e199c7d0) = 6926 ... 22:55:15.243005 wait4(6926, [{WIFEXITED(s) && WEXITSTATUS(s) == 75}], 0, NULL) = 6926 22:55:50.335131 exit_group(75) = ? 22:55:50.335203 +++ exited with 75 +++ vpop-lda->deliver->doveconf->deliver (PID 6926 22:55:15.242860 execve("/usr/local/bin/vpop-lda", ["/usr/local/bin/vpop-lda"], 0x7ffe55a89810 /* 23 vars */) = 0 ... 22:55:15.257985 rt_sigprocmask(SIG_BLOCK, [CHLD], [CHLD], 8) = 0 22:55:15.258010 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd5e8098490) = 6927 ... 22:55:15.258305 wait4(6927, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 6927 ... 22:55:15.260774 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0 ... 22:55:15.261905 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd5e8098490) = 6928 ... 22:55:15.263580 wait4(6928, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6928 ... 22:55:15.263650 execve("/usr/lib/dovecot/deliver", ["/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x563ae02171e0 /* 23 vars */) = 0 ... 22:55:15.267914 execve("/usr/bin/doveconf", ["/usr/bin/doveconf", "-f", "service=lda", "-c", "/etc/dovecot/dovecot.conf", "-m", "lda", "-e", "/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x55ad578460b0 /* 25 vars */) = 0 ... 22:55:15.283482 execve("/usr/lib/dovecot/deliver", ["/usr/lib/dovecot/deliver", "-e", "-a", "tester at krustev.net"], 0x5620b4330c40 /* 61 vars */) = 0 ... 22:55:15.294788 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f15401ec2d0) = 6929 ... 22:55:15.295175 wait4(6929, 0x562102d4fee4, WNOHANG, NULL) = 0 22:55:15.295211 epoll_wait(12, [], 1, 30000) = 0 22:55:45.325580 kill(6929, SIGTERM) = 0 22:55:45.325769 epoll_wait(12, [], 1, 5000) = 0 22:55:50.331108 kill(6929, SIGKILL) = 0 22:55:50.332867 wait4(6929, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6929 grep (PID 6927) 22:55:15.258651 execve("/bin/grep", ["grep", "-ql", "Spam", ".dovecot.sieve"], 0x563ae02171e0 /* 23 vars */) = 0 ... 22:55:15.260681 exit_group(1) = ? 22:55:15.260751 +++ exited with 1 +++ pwd (PID 6928) 22:55:15.262492 execve("/bin/pwd", ["pwd"], 0x563ae02171e0 /* 23 vars */) = 0 ... 22:55:15.263532 exit_group(0) = ? 22:55:15.263587 +++ exited with 0 +++ 22:55:15.295268 execve("/usr/sbin/sendmail", ["/usr/sbin/sendmail", "-i", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], NULL) = 0 ... 22:55:15.296055 execve("bin/qmail-inject", ["bin/qmail-inject", "-a", "-f", "tester at krustev.net", "--", "krustev at krustev.net"], 0x55741b1890e0 /* 2 vars */) = 0 ... 22:55:15.297264 vfork() = 6930 ... 22:55:15.297930 wait4(6930, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6930 22:55:15.305391 exit_group(0) = ? 22:55:15.305472 +++ exited with 0 +++ Best regards -- Delian
Maybe Matching Threads
- Sieve redirect does not collect the sendmail child process correctly (Dovecot 2.3.4.1, Pigeonhole 0.5.4)
- Sieve redirect does not collect the sendmail child process correctly (Dovecot 2.3.4.1, Pigeonhole 0.5.4)
- Auth process sometimes stop responding after upgrade
- Auth process sometimes stop responding after upgrade
- dovecot auth using 100% CPU