Harlan Stenn
2021-Jul-02 00:56 UTC
dsync replication fails with No space left on device / Out of memory
Inodes? df -i

On 7/1/2021 5:07 PM, Steven Varco wrote:
> Hi All
>
> Since I configured dsync replication I get strange errors in the maillog on my two Dovecot mail nodes:
>
> PRIMARY:
> Jul 2 01:21:42 mx01.example.com dovecot: doveadm: Error: read(mx02.example.com) failed: read(size=3148) failed: Connection reset by peer (last sent=mail, last recv=mail (EOL))
>
> The secondary is more interesting:
>
> SECONDARY
> Jul 2 01:21:42 mx02 dovecot: doveadm: Error: close(-1[istream-seekable.c:237]) failed: No space left on device
> Jul 2 01:21:43 mx02 dovecot: doveadm: Fatal: pool_system_realloc(268435456): Out of memory
> Jul 2 01:21:43 mx02 dovecot: doveadm: Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(+0xa192e) [0x7f2e9be4c92e] -> /usr/lib64/dovecot/libdovecot.so.0(+0xa1a0e) [0x7f2e9be4ca0e] -> /usr/lib64/dovecot/libdovecot.so.0(i_error+0) [0x7f2e9bddc3d3] -> /usr/lib64/dovecot/libdo
> Jul 2 01:21:43 mx02 dovecot: doveadm: Fatal: master: service(doveadm): child 2876 returned error 83 (Out of memory (service doveadm { vsz_limit=256 MB }, you may need to increase it) - set CORE_OUTOFMEM=1 environment to get core dump)
> Jul 2 01:21:51 mx02 dovecot: dsync-local(user@example.com): Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0(+0xa192e) [0x7fd56e17e92e] -> /usr/lib64/dovecot/libdovecot.so.0(+0xa1a0e) [0x7fd56e17ea0e] -> /usr/lib64/dovecot/libdovecot.so.0(i_error+0) [0x7fd56e10e3d3] -> /us
> Jul 2 01:21:51 mx02 dovecot: dsync-local(user@example.com): Fatal: master: service(doveadm): child 2882 returned error 83 (Out of memory (service doveadm { vsz_limit=256 MB }, you may need to increase it) - set CORE_OUTOFMEM=1 environment to get core dump)
>
> The error messages state that disk space and/or memory is a problem, but enough disk space and memory are available:
>
> mx02 [~] # df -h /srv/mail/
> Filesystem               Size  Used Avail Use% Mounted on
> /dev/mapper/system-mail   10G  5.7G  4.3G  58% /srv/mail
>
> mx02 [~] # free -m
>               total        used        free      shared  buff/cache   available
> Mem:           3789        1602        1088         199        1097        1759
> Swap:           471          93         378
>
> I also tried to increase vsz_limit from 256 MB to 512 MB, which did not help.
>
> And for the sake of completeness, the connection to the doveadm port also works well from both nodes:
>
> mx01-prod [~] # telnet mx02 14310
> Trying 172.20.19.225...
> Connected to mx02.
> Escape character is '^]'.
> ^]
>
> mx02 [~] # telnet mx01 14310
> Trying 172.20.19.251...
> Connected to mx01.
> Escape character is '^]'.
> ^]
>
> Although mail replication seems to be working properly and mails are in sync on both nodes (as far as I could see), I would like to find the cause of these messages, as this definitely doesn't look normal.
>
> I'm grateful for any help, since I'm really struggling now.
>
> Steven
>
> Here's my config
> --------------------------------------------------------------------------------
> # doveconf -n
> # 2.2.36 (1f10bfa63): /etc/dovecot/dovecot.conf
> # Pigeonhole version 0.4.24 (124e06aa)
> # OS: Linux 3.10.0-1160.31.1.el7.x86_64 x86_64 CentOS Linux release 7.9.2009 (Core)
> # Hostname: mx01.example.com
> auth_mechanisms = plain login
> auth_verbose = yes
> dict {
>   sqlquota = mysql:/etc/dovecot/dict-sqlquota.conf.ext
> }
> doveadm_password = # hidden, use -P to show it
> doveadm_port = 14310
> first_valid_uid = 1000
> mail_plugins = quota notify replication
> managesieve_notify_capability = mailto
> managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date index ihave duplicate mime foreverypart extracttext
> mbox_write_locks = fcntl
> namespace inbox {
>   inbox = yes
>   location =
>   mailbox Drafts {
>     special_use = \Drafts
>   }
>   mailbox Junk {
>     special_use = \Junk
>   }
>   mailbox Sent {
>     special_use = \Sent
>   }
>   mailbox "Sent Messages" {
>     special_use = \Sent
>   }
>   mailbox Trash {
>     special_use = \Trash
>   }
>   prefix =
>   separator = /
>   type = private
> }
> passdb {
>   args = /etc/dovecot/dovecot-sql.conf.ext
>   driver = sql
> }
> plugin {
>   mail_replica = tcp:mx02.example.com
>   quota = maildir:User quota
>   quota_exceeded_message = Quota exceeded, please go to http://www.example.com/over_quota_help for instructions on how to fix this.
>   quota_rule2 = INBOX.Trash:storage=+100M
>   quota_status_nouser = DUNNO
>   quota_status_overquota = 552 5.2.2 Mailbox is full / Mailbox ist voll
>   quota_status_success = DUNNO
>   quota_warning = storage=90%% quota-warning 90 %u
>   quota_warning2 = -storage=90%% quota-warning below %u
>   sieve = file:~/sieve;active=~/.dovecot.sieve
> }
> postmaster_address = postmaster@example.com
> protocols = imap pop3 lmtp sieve
> replication_dsync_parameters = -d -l 30 -U
> service aggregator {
>   fifo_listener replication-notify-fifo {
>     user = vmail
>   }
>   unix_listener replication-notify {
>     user = vmail
>   }
> }
> service auth {
>   unix_listener /var/spool/postfix/private/auth {
>     group = postfix
>     mode = 0660
>     user = postfix
>   }
>   unix_listener auth-userdb {
>     user = vmail
>   }
> }
> service dict {
>   unix_listener dict {
>     user = vmail
>   }
> }
> service doveadm {
>   inet_listener {
>     port = 14310
>     ssl = no
>   }
> }
> service managesieve-login {
>   inet_listener sieve {
>     port = 4190
>   }
> }
> service quota-status {
>   client_limit = 1
>   executable = quota-status -p postfix
>   inet_listener {
>     port = 14340
>   }
> }
> service quota-warning {
>   executable = script /usr/local/libexec/dovecot/quota-warning.sh
>   unix_listener quota-warning {
>     user = vmail
>   }
>   user = vmail
> }
> service replicator {
>   process_min_avail = 1
>   unix_listener replicator-doveadm {
>     mode = 0600
>     user = vmail
>   }
> }
> ssl = required
> ssl_cert = </etc/ssl/acme/certs/mail.example.com.chain.crt
> ssl_key = # hidden, use -P to show it
> userdb {
>   args = /etc/dovecot/dovecot-sql.conf.ext
>   driver = sql
> }
> verbose_proctitle = yes
> protocol lmtp {
>   mail_plugins = quota notify replication sieve
> }
> protocol lda {
>   mail_plugins = quota notify replication sieve
> }
> protocol imap {
>   mail_max_userip_connections = 20
>   mail_plugins = quota notify replication imap_quota
> }
> --------------------------------------------------------------------------------
>
> mx02.example.com has exactly the same config, except for:
> --------------------------------------------------------------------------------
> plugin {
>   mail_replica = tcp:mx01.example.com
> --------------------------------------------------------------------------------
>
> --
> https://steven.varco.ch/
> https://www.tech-island.com/
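
To spell out the `df -i` suggestion a little: "No space left on device" can also mean the filesystem has run out of inodes while `df -h` still shows free blocks. Below is a rough sketch of that check, not a confirmed diagnosis. The helper name `inode_use_pct` is made up for illustration. `/srv/mail` is the path from the `df -h` output in the original post; checking `/tmp` as well is an assumption, based on Dovecot's `mail_temp_dir` defaulting to `/tmp` as far as I know (the posted `doveconf -n` does not show it overridden).

```shell
#!/bin/sh
# Hypothetical helper (not part of Dovecot): print the inode usage
# percentage, without the % sign, of the filesystem holding the given path.
inode_use_pct() {
    # -P forces one line per filesystem, -i reports inodes instead of blocks.
    # Column 5 of that output is the inode usage percentage (IUse%).
    df -Pi "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# /srv/mail comes from the original post; /tmp is an assumed temp location.
for dir in /srv/mail /tmp; do
    if [ -d "$dir" ]; then
        printf '%s: %s%% of inodes used\n' "$dir" "$(inode_use_pct "$dir")"
    fi
done
```

A value near 100 for either path would explain the error despite the free space shown by `df -h`. Some filesystems do not report inode counts, in which case `df` prints `-` in that column.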
Jörg Faudin Schulz
2021-Jul-02 05:43 UTC
dsync replication fails with No space left on device / Out of memory
Hi,

the memory issue has already been reported and is not resolved yet:
https://www.mail-archive.com/dovecot@dovecot.org/msg83763.html

The disk-space issue is something different.

Increasing the memory parameters doesn't help; the sync only crashes later.

Here, everything seems to be synced fine nevertheless.

On 02.07.21 at 02:56, Harlan Stenn wrote:
> Inodes? df -i
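
Since the two errors may have different causes, it can help to count them separately and see whether they always appear together. This is just a sketch under assumptions: the function name `count_dsync_errors` is invented here, and the log path in the usage note is a guess at where the maillog lives on a CentOS 7 system. The two search strings are copied from the log lines quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count log lines containing each of the two
# failure messages quoted in this thread.
count_dsync_errors() {
    log=$1
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps that from aborting under "set -e".
    nospace=$(grep -c 'No space left on device' "$log" || true)
    oom=$(grep -c 'Out of memory' "$log" || true)
    printf 'no_space=%s out_of_memory=%s\n' "$nospace" "$oom"
}

# Run against a log file only when one is given on the command line.
if [ "$#" -gt 0 ]; then
    count_dsync_errors "$1"
fi
```

Saved as a script, it would be called with the log file as its argument, for example `/var/log/maillog` (assumed path). Note that the counts are per line, so a line that mentions "Out of memory" twice is counted once.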