On 19.06.2017 20:36, antonello.cioffi at uniparthenope.it wrote:
>
> Hi everybody
>
> I'm seeing a lot of errors like this on my server:
>
> Jun 19 14:22:45 posta2 kernel: [885017.412902] BUG: soft lockup - CPU#2
stuck for 22s! [dovecot-lda:11955]
> Jun 19 14:22:45 posta2 kernel: [885017.412906] Modules linked in: ocfs2(E)
jbd2 quota_tree dm_service_time dm_multipath ocfs2_dlmfs(E) ocfs2_stack_o2cb(E)
ocfs2_dlm(
> E) ocfs2_nodemanager(E) ocfs2_stackglue(E) configfs cpufreq_conservative
cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse nls_iso8859_1
nls_cp437 vfat fat
> loop shpchp iTCO_wdt bnx2 pci_hotplug rtc_cmos ipv6 ipv6_lib cdc_ether
usbnet ioatdma dca sg tpm_tis i2c_i801 serio_raw mii i7core_edac edac_core
pcspkr iTCO_vendor
> _support tpm mptctl tpm_bios button ext3 jbd mbcache dm_mirror
dm_region_hash dm_log linear ttm drm_kms_helper drm i2c_algo_bit sysimgblt
sysfillrect i2c_core syscop
> yarea uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor
thermal_sys hwmon scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh
dm_snapshot dm_m
> od mptsas mptscsih mptbase scsi_transport_sas scsi_mod
> Jun 19 14:22:45 posta2 kernel: [885017.412971] Supported: Yes
> Jun 19 14:22:45 posta2 kernel: [885017.412973] CPU 2
> Jun 19 14:22:45 posta2 kernel: [885017.412975] Modules linked in: ocfs2(E)
jbd2 quota_tree dm_service_time dm_multipath ocfs2_dlmfs(E) ocfs2_stack_o2cb(E)
ocfs2_dlm(
> E) ocfs2_nodemanager(E) ocfs2_stackglue(E) configfs cpufreq_conservative
cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse nls_iso8859_1
nls_cp437 vfat fat
> loop shpchp iTCO_wdt bnx2 pci_hotplug rtc_cmos ipv6 ipv6_lib cdc_ether
usbnet ioatdma dca sg tpm_tis i2c_i801 serio_raw mii i7core_edac edac_core
pcspkr iTCO_vendor
> _support tpm mptctl tpm_bios button ext3 jbd mbcache dm_mirror
dm_region_hash dm_log linear ttm drm_kms_helper drm i2c_algo_bit sysimgblt
sysfillrect i2c_core syscop
> yarea uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor
thermal_sys hwmon scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh
dm_snapshot dm_m
> od mptsas mptscsih mptbase scsi_transport_sas scsi_mod
> Jun 19 14:22:45 posta2 kernel: [885017.413027] Supported: Yes
> Jun 19 14:22:45 posta2 kernel: [885017.413029]
> Jun 19 14:22:45 posta2 kernel: [885017.413032] Pid: 11955, comm:
dovecot-lda Tainted: G E 3.0.101-0.46-default #1 IBM BladeCenter
HS22 -[7870H5G]-/68Y813
> 8
> Jun 19 14:22:45 posta2 kernel: [885017.413037] RIP:
0010:[<ffffffff81257b07>] [<ffffffff81257b07>]
find_next_zero_bit+0x67/0xc0
> Jun 19 14:22:45 posta2 kernel: [885017.413046] RSP: 0018:ffff88006cc855f0
EFLAGS: 00000287
> Jun 19 14:22:45 posta2 kernel: [885017.413049] RAX: 0000000000006f30 RBX:
ffffffff8118b348 RCX: 0000000000000010
> Jun 19 14:22:45 posta2 kernel: [885017.413051] RDX: 0001000000000000 RSI:
0000000000006f30 RDI: 0000000000006f00
> Jun 19 14:22:45 posta2 kernel: [885017.413054] RBP: 000000000000000f R08:
0000000000000f00 R09: ffff88006cc857e8
> Jun 19 14:22:45 posta2 kernel: [885017.413057] R10: 000000000000000b R11:
ffff88063c1fa678 R12: ffffffff8146d1ee
> Jun 19 14:22:45 posta2 kernel: [885017.413059] R13: 0000000000000000 R14:
ffff880655a918c0 R15: 0000000001f9f800
> Jun 19 14:22:45 posta2 kernel: [885017.413063] FS: 00007f7387952700(0000)
GS:ffff88067f240000(0000) knlGS:0000000000000000
> Jun 19 14:22:45 posta2 kernel: [885017.413066] CS: 0010 DS: 0000 ES: 0000
CR0: 0000000080050033
> Jun 19 14:22:45 posta2 kernel: [885017.413068] CR2: 00007f7386fad9b0 CR3:
000000011f6cc000 CR4: 00000000000007e0
> Jun 19 14:22:45 posta2 kernel: [885017.413071] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
> Jun 19 14:22:45 posta2 kernel: [885017.413074] DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
> Jun 19 14:22:45 posta2 kernel: [885017.413077] Process dovecot-lda (pid:
11955, threadinfo ffff88006cc84000, task ffff88003f3284c0)
> Jun 19 14:22:45 posta2 kernel: [885017.413079] Stack:
> Jun 19 14:22:45 posta2 kernel: [885017.413085] ffffffffa052f86e
0000000000000000 ffff88006cc857e8 00007e000000000b
> Jun 19 14:22:45 posta2 kernel: [885017.413090] 6843000000006a00
ffff88063c1fa67a 0000000000000000 000000001f400000
> Jun 19 14:22:45 posta2 kernel: [885017.413095] 0000000000007e00
0000000000007e00 ffff880653aaccb8 ffff880656f7f000
> Jun 19 14:22:45 posta2 kernel: [885017.413100] Call Trace:
> Jun 19 14:22:45 posta2 kernel: [885017.413143] [<ffffffffa052f86e>]
ocfs2_block_group_find_clear_bits+0x6e/0x180 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413224] [<ffffffffa052fa23>]
ocfs2_cluster_group_search+0xa3/0x1f0 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413297] [<ffffffffa0530369>]
ocfs2_search_chain+0x139/0x730 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413367] [<ffffffffa0531ce8>]
ocfs2_claim_suballoc_bits+0x398/0x520 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413447] [<ffffffffa0531f0f>]
__ocfs2_claim_clusters+0x9f/0x340 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413519] [<ffffffffa050f8ff>]
ocfs2_local_alloc_new_window+0x1cf/0x320 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413581] [<ffffffffa050fdd0>]
ocfs2_local_alloc_slide_window+0x380/0x5e0 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413642] [<ffffffffa05101d3>]
ocfs2_reserve_local_alloc_bits+0x1a3/0x330 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413706] [<ffffffffa053396b>]
ocfs2_reserve_clusters_with_limit+0xab/0x330 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413780] [<ffffffffa0534d15>]
ocfs2_lock_allocators+0xc5/0x290 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413847] [<ffffffffa04e30fd>]
ocfs2_write_begin_nolock+0x89d/0x11d0 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413883] [<ffffffffa04e3b47>]
ocfs2_write_begin+0x117/0x240 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413900] [<ffffffff810fa1a2>]
generic_perform_write+0xc2/0x1c0
> Jun 19 14:22:45 posta2 kernel: [885017.413907] [<ffffffff810fa301>]
generic_file_buffered_write+0x61/0xa0
> Jun 19 14:22:45 posta2 kernel: [885017.413936] [<ffffffffa050055f>]
ocfs2_file_aio_write+0x92f/0x960 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.413966] [<ffffffff8115d478>]
do_sync_write+0xc8/0x110
> Jun 19 14:22:45 posta2 kernel: [885017.413972] [<ffffffff8115daae>]
vfs_write+0xce/0x140
> Jun 19 14:22:45 posta2 kernel: [885017.413977] [<ffffffff8115dc23>]
sys_write+0x53/0xa0
> Jun 19 14:22:45 posta2 kernel: [885017.413983] [<ffffffff8146c812>]
system_call_fastpath+0x16/0x1b
> Jun 19 14:22:45 posta2 kernel: [885017.413992] [<00007f7386c54300>]
0x7f7386c542ff
> Jun 19 14:22:45 posta2 kernel: [885017.413994] Code: 3f 77 61 48 c7 c0 ff
ff ff ff 44 89 c1 4a 8d 34 07 48 d3 e0 48 09 c2 48 83 fa ff 74 0b 48 f7 d2 48 0f
bc c2 48 8
> d 34 38 48 89 f0 <c3> 0f 1f 84 00 00 00 00 00 48 8b 10 48 83 fa ff
75 e0 48 83 c0
> Jun 19 14:22:45 posta2 kernel: [885017.414037] Call Trace:
> Jun 19 14:22:45 posta2 kernel: [885017.414070] [<ffffffffa052f86e>]
ocfs2_block_group_find_clear_bits+0x6e/0x180 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414141] [<ffffffffa052fa23>]
ocfs2_cluster_group_search+0xa3/0x1f0 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414213] [<ffffffffa0530369>]
ocfs2_search_chain+0x139/0x730 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414286] [<ffffffffa0531ce8>]
ocfs2_claim_suballoc_bits+0x398/0x520 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414359] [<ffffffffa0531f0f>]
__ocfs2_claim_clusters+0x9f/0x340 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414430] [<ffffffffa050f8ff>]
ocfs2_local_alloc_new_window+0x1cf/0x320 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414492] [<ffffffffa050fdd0>]
ocfs2_local_alloc_slide_window+0x380/0x5e0 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414553] [<ffffffffa05101d3>]
ocfs2_reserve_local_alloc_bits+0x1a3/0x330 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414617] [<ffffffffa053396b>]
ocfs2_reserve_clusters_with_limit+0xab/0x330 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414691] [<ffffffffa0534d15>]
ocfs2_lock_allocators+0xc5/0x290 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414758] [<ffffffffa04e30fd>]
ocfs2_write_begin_nolock+0x89d/0x11d0 [ocfs2]
> Jun 19 14:22:45 posta2 kernel: [885017.414793] [<ffffffffa04e3b47>]
ocfs2_write_begin+0x117/0x240 [ocfs2]
> Jun 19 14:22:46 posta2 kernel: [885017.414809] [<ffffffff810fa1a2>]
generic_perform_write+0xc2/0x1c0
> Jun 19 14:22:46 posta2 kernel: [885017.414815] [<ffffffff810fa301>]
generic_file_buffered_write+0x61/0xa0
> Jun 19 14:22:46 posta2 kernel: [885017.414844] [<ffffffffa050055f>]
ocfs2_file_aio_write+0x92f/0x960 [ocfs2]
> Jun 19 14:22:46 posta2 kernel: [885017.414872] [<ffffffff8115d478>]
do_sync_write+0xc8/0x110
> Jun 19 14:22:46 posta2 kernel: [885017.414878] [<ffffffff8115daae>]
vfs_write+0xce/0x140
> Jun 19 14:22:46 posta2 kernel: [885017.414883] [<ffffffff8115dc23>]
sys_write+0x53/0xa0
> Jun 19 14:22:46 posta2 kernel: [885017.414888] [<ffffffff8146c812>]
system_call_fastpath+0x16/0x1b
> Jun 19 14:22:46 posta2 kernel: [885017.414895] [<00007f7386c54300>]
0x7f7386c542ff
>
> The machine runs SUSE Linux Enterprise Server 11 SP3.
>
> posta2:/var/core # dovecot -n
> # 2.2.30.2 (c0c463e): /usr/local/etc/dovecot/dovecot.conf
> # Pigeonhole version 0.4.18 (29cc74d)
> # OS: Linux 3.0.101-0.46-default x86_64 SUSE Linux Enterprise Server 11
(x86_64)
> auth_mechanisms = plain login
> auth_username_format = %Ln
> auth_verbose = yes
> default_internal_user = vmail
> default_login_user = nobody
> disable_plaintext_auth = no
> first_valid_uid = 100
> hostname = mail.xxxx.it
> lda_mailbox_autocreate = yes
> lda_mailbox_autosubscribe = yes
> mail_debug = yes
> mail_gid = 100
> mail_location = maildir:%h
> mail_plugins = " quota"
> mail_uid = 1002
> managesieve_notify_capability = mailto
> managesieve_sieve_capability = fileinto reject envelope encoded-character
vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy
include variables body enotify environment mailbox date index ihave duplicate
mime foreverypart extracttext spamtest spamtestplus
> passdb {
> args = /usr/local/etc/dovecot/dovecot-people-ldap.conf.ext
> driver = ldap
> }
> plugin {
> last_login_dict = redis:host=127.0.0.1:port=6379
> mail_log_events = delete undelete expunge copy mailbox_delete
mailbox_rename
> mail_log_fields = uid box msgid size
> quota = maildir:User quota
> quota_warning = storage=95%% quota-warning 95 %u
> quota_warning2 = storage=80%% quota-warning 80 %u
> sieve = file:~/sieve;active=~/.dovecot.sieve
> sieve_before = /usr/local/etc/dovecot/sieve/
> sieve_dir = ~/.sieve
> sieve_extensions = +spamtest +spamtestplus +relational
+comparator-i;ascii-numeric
> }
> postmaster_address = postmaster at uniparthenope.it
> protocols = imap pop3 lmtp sieve
> service auth {
> unix_listener /var/spool/postfix/private/auth {
> group = postfix
> mode = 0666
> user = postfix
> }
> }
> service imap-login {
> inet_listener imap {
> port = 143
> }
> inet_listener imaps {
> port = 993
> ssl = yes
> }
> service_count = 0
> vsz_limit = 256 M
> }
> service managesieve-login {
> inet_listener sieve {
> port = 4190
> }
> service_count = 1
> vsz_limit = 64 M
> }
> service pop3-login {
> inet_listener pop3 {
> port = 110
> }
> inet_listener pop3s {
> port = 995
> ssl = yes
> }
> service_count = 0
> vsz_limit = 256 M
> }
> service quota-status {
> client_limit = 1
> executable = quota-status -p postfix
> inet_listener {
> port = 12340
> }
> }
> service quota-warning {
> executable = script /usr/local/bin/quota-warning.sh
> unix_listener quota-warning {
> user = vmail
> }
> user = vmail
> }
> ssl_ca = </etc/postfix/ssl/chain-8204-mail.uniparthenope.it.pem
> ssl_cert = </etc/postfix/ssl/cert-8204-mail.uniparthenope.it.pem
> ssl_key = # hidden, use -P to show it
> submission_host = 127.0.0.1
> userdb {
> args = /usr/local/etc/dovecot/dovecot-people-ldap.conf.ext
> driver = ldap
> }
> protocol lmtp {
> mail_plugins = " quota"
> }
> protocol lda {
> mail_plugins = " quota quota sieve"
> }
> protocol imap {
> imap_client_workarounds = delay-newmail tb-extra-mailbox-sep
tb-lsub-flags
> mail_max_userip_connections = 100
> mail_plugins = " quota imap_quota last_login"
> }
> protocol pop3 {
> mail_max_userip_connections = 100
> mail_plugins = " quota last_login"
> pop3_logout_format = top=%t/%p, retr=%r/%b, del=%d/%m, size=%s, %u
> pop3_save_uidl = no
> pop3_uidl_format = %08Xu%08Xv
> }
>
> I've upgraded Dovecot from 2.2.18 to 2.2.30.2, but the errors are still
present.
>
> posta2:/var/core # dovecot --version
> 2.2.30.2 (c0c463e)
>
> The errors appear only with the dovecot-lda process, so I tend to rule out a
disk or OCFS2 failure.
>
> Is there someone who can help me?
Hello,
To us it looks like the problem is happening inside your operating system's
kernel: the soft lockup trace runs through OCFS2 allocation code. The kernel
would also be the primary place to look for a resolution. There might not be
anything that can be done on the Dovecot side.
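
To check whether the lockups really are reported only for dovecot-lda, here is a
minimal Python sketch that tallies soft lockup reports per process name. It
assumes each kernel message sits on one line in the log file, as in the excerpt
quoted above before mail wrapping. The function name count_lockups is made up
for illustration, and the log location depends on the syslog setup.

```python
import re
from collections import Counter

# Matches the kernel's soft lockup report, e.g.
# "BUG: soft lockup - CPU#2 stuck for 22s! [dovecot-lda:11955]".
# The greedy (.+) keeps process names that themselves contain ':' intact.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def count_lockups(lines):
    """Return a Counter mapping process name -> number of soft lockup reports."""
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts


# Demo on the first line of the quoted log (joined back onto one line).
sample = [
    "Jun 19 14:22:45 posta2 kernel: [885017.412902] BUG: soft lockup - "
    "CPU#2 stuck for 22s! [dovecot-lda:11955]",
    "Jun 19 14:22:45 posta2 kernel: [885017.412971] Supported: Yes",
]
for name, n in count_lockups(sample).most_common():
    print(n, name)
```

To run it against a real log, pass an open file object to count_lockups()
instead of the sample list. If other processes that write to the same OCFS2
volume show up as well, that would point further away from Dovecot.
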
br,
Teemu
>
> Best regards
>
> Thanks