Alex
2024-Apr-01 15:52 UTC
[Samba] task "smbd" blocked for more than x seconds (followed by call trace)
Hi,

I have a samba 4.15.13 file server running on Ubuntu 20.04.6, fully updated/patched. The samba shared folders are stored on an XFS formatted drive.

Intermittently, usually after 1 or 2 weeks of normal operation, browsing the shared drive will freeze Windows Explorer, and when I look at the file server console, I see the messages below over and over again:

[2955729.823684] INFO: task smbd:2888777 blocked for more than 120 seconds.
[2955729.824540] Not tainted 5.4.0-172-generic #190-Ubuntu
[2955729.824989] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2955729.825549] smbd D 0 2888777 1737 0x00000080
[2955729.825565] Call Trace:
[2955729.825674] __schedule+0x2e3/0x740
[2955729.825689] schedule+0x42/0xb0
[2955729.825698] rwsem_down_read_slowpath+0x16c/0x4a0
[2955729.825760] __down_read+0x6b/0x80
[2955729.825769] __percpu_down_read+0x54/0x80
[2955729.825802] __sb_start_write+0x79/0x80
[2955729.825848] mnt_want_write+0x24/0x60
[2955729.825866] do_last+0x8ea/0x900
[2955729.825877] ? __alloc_file+0x94/0x110
[2955729.825890] path_openat+0x8d/0x290
[2955729.825904] do_filp_open+0x91/0x100
[2955729.825923] ? __alloc_fd+0x46/0x150
[2955729.825954] do_sys_open+0x17e/0x290
[2955729.825991] ? __audit_syscall_exit+0x233/0x290
[2955729.826005] __x64_sys_openat+0x20/0x30
[2955729.826036] do_syscall_64+0x57/0x190
[2955729.826055] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[2955729.826085] RIP: 0033:0x7fac5d210163
[2955729.826098] Code: Bad RIP value.
[2955729.826103] RSP: 002b:00007ffec466ae50 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[2955729.826119] RAX: ffffffffffffffda RBX: 0000555ed448f100 RCX: 00007fac5d210163
[2955729.826124] RDX: 00000000000208c2 RSI: 0000555ed441fd20 RDI: 00000000ffffff9c
[2955729.826128] RBP: 0000555ed441fd20 R08: 0000000000000000 R09: 00000000000001e4
[2955729.826133] R10: 00000000000001e4 R11: 0000000000000293 R12: 00000000000208c2
[2955729.826138] R13: 00000000000001e4 R14: 0000555ed43c7b40 R15: 0000555ed43c7b40
[2955850.658135] INFO: task smbd:2888777 blocked for more than 241 seconds.
[2955850.659195] Not tainted 5.4.0-172-generic #190-Ubuntu
[2955850.659800] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2955850.660549] smbd D 0 2888777 1737 0x00000080
[2955850.660560] Call Trace:
[2955850.660586] __schedule+0x2e3/0x740
[2955850.660604] schedule+0x42/0xb0
[2955850.660615] rwsem_down_read_slowpath+0x16c/0x4a0
[2955850.660640] __down_read+0x6b/0x80
[2955850.660651] __percpu_down_read+0x54/0x80
[2955850.660665] __sb_start_write+0x79/0x80
[2955850.660678] mnt_want_write+0x24/0x60
[2955850.660690] do_last+0x8ea/0x900
[2955850.660703] ? __alloc_file+0x94/0x110
[2955850.660719] path_openat+0x8d/0x290
[2955850.660736] do_filp_open+0x91/0x100
[2955850.660756] ? __alloc_fd+0x46/0x150
[2955850.660773] do_sys_open+0x17e/0x290
[2955850.660785] ? __audit_syscall_exit+0x233/0x290
[2955850.660802] __x64_sys_openat+0x20/0x30
[2955850.660814] do_syscall_64+0x57/0x190
[2955850.660827] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[2955850.660836] RIP: 0033:0x7fac5d210163
[2955850.660845] Code: 89 7c 24 18 44 89 54 24 0c e8 99 64 f8 ff 44 8b 54 24 0c 8b 54 24 1c 41 89 c0 48 8b 74 24 10 8b 7c 24 18 b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 44 89 c7 89 44 24 0c e8 c9 64 f8 ff 8b 44
[2955850.660851] RSP: 002b:00007ffec466ae50 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[2955850.660861] RAX: ffffffffffffffda RBX: 0000555ed448f100 RCX: 00007fac5d210163
[2955850.660867] RDX: 00000000000208c2 RSI: 0000555ed441fd20 RDI: 00000000ffffff9c
[2955850.660873] RBP: 0000555ed441fd20 R08: 0000000000000000 R09: 00000000000001e4
[2955850.660879] R10: 00000000000001e4 R11: 0000000000000293 R12: 00000000000208c2
[2955850.660885] R13: 00000000000001e4 R14: 0000555ed43c7b40 R15: 0000555ed43c7b40
[2955971.492313] INFO: task smbd:2888777 blocked for more than 362 seconds.
[2955971.493227] Not tainted 5.4.0-172-generic #190-Ubuntu
[2955971.493694] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2955971.494231] smbd D 0 2888777 1737 0x00000080
[2955971.494253] Call Trace:
[2955971.494592] __schedule+0x2e3/0x740
[2955971.494609] schedule+0x42/0xb0
[2955971.494618] rwsem_down_read_slowpath+0x16c/0x4a0
[2955971.494637] __down_read+0x6b/0x80
[2955971.494645] __percpu_down_read+0x54/0x80
[2955971.494656] __sb_start_write+0x79/0x80
[2955971.494666] mnt_want_write+0x24/0x60

systemctl restart smbd hangs. Rebooting the server solves the issue, until it eventually occurs again a week or two later.

Any advice on troubleshooting this?

Thanks!
Peter
Jeremy Allison
2024-Apr-01 16:59 UTC
[Samba] task "smbd" blocked for more than x seconds (followed by call trace)
On Mon, Apr 01, 2024 at 08:52:54AM -0700, Alex via samba wrote:
>Hi,
>
>I have a samba 4.15.13 file server running on Ubuntu 20.04.6, fully
>updated/patched. The samba shared folders are stored on an XFS formatted
>drive.
>Intermittently, usually after 1 or 2 weeks of normal operation, browsing
>the shared drive will freeze Windows Explorer, and when I look at the file
>server console, I see the messages below over and over again:
>
>[2955729.823684] INFO: task smbd:2888777 blocked for more than 120 seconds.
>[2955729.824540] Not tainted 5.4.0-172-generic #190-Ubuntu
>[2955729.824989] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>disables this message.
>[2955729.825549] smbd D 0 2888777 1737 0x00000080
>[2955729.825565] Call Trace:
>[2955729.825674] __schedule+0x2e3/0x740
>[2955729.825689] schedule+0x42/0xb0
>[2955729.825698] rwsem_down_read_slowpath+0x16c/0x4a0
>[2955729.825760] __down_read+0x6b/0x80
>[2955729.825769] __percpu_down_read+0x54/0x80
>[2955729.825802] __sb_start_write+0x79/0x80
>[2955729.825848] mnt_want_write+0x24/0x60
>[2955729.825866] do_last+0x8ea/0x900
>[2955729.825877] ? __alloc_file+0x94/0x110
>[2955729.825890] path_openat+0x8d/0x290
>[2955729.825904] do_filp_open+0x91/0x100
>[2955729.825923] ? __alloc_fd+0x46/0x150
>[2955729.825954] do_sys_open+0x17e/0x290
>[2955729.825991] ? __audit_syscall_exit+0x233/0x290
>[2955729.826005] __x64_sys_openat+0x20/0x30
>[2955729.826036] do_syscall_64+0x57/0x190
>[2955729.826055] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>[2955729.826085] RIP: 0033:0x7fac5d210163
>[2955729.826098] Code: Bad RIP value.
>[2955729.826103] RSP: 002b:00007ffec466ae50 EFLAGS: 00000293 ORIG_RAX:
>0000000000000101
>[2955729.826119] RAX: ffffffffffffffda RBX: 0000555ed448f100 RCX:
>00007fac5d210163
>[2955729.826124] RDX: 00000000000208c2 RSI: 0000555ed441fd20 RDI:
>00000000ffffff9c
>[2955729.826128] RBP: 0000555ed441fd20 R08: 0000000000000000 R09:
>00000000000001e4
>[2955729.826133] R10: 00000000000001e4 R11: 0000000000000293 R12:
>00000000000208c2
>[2955729.826138] R13: 00000000000001e4 R14: 0000555ed43c7b40 R15:
>0000555ed43c7b40
>[2955850.658135] INFO: task smbd:2888777 blocked for more than 241 seconds.
>[2955850.659195] Not tainted 5.4.0-172-generic #190-Ubuntu
>[2955850.659800] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>disables this message.

You have a bad or corrupted disk. Nothing smbd does causes kernel hangs.
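
For anyone triaging a similar report, here is a minimal sketch (not from the thread itself) that pulls the hung-task entries out of saved kernel log text, so you can see which process is blocked and for how long. The function name `parse_hung_tasks` and the dictionary keys are illustrative choices, and the regular expression assumes the message format shown in the log above; other kernel versions may word it differently.

```python
import re

# Matches the hung-task detector line as it appears in the log above, e.g.
# "[2955729.823684] INFO: task smbd:2888777 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text."""
    reports = []
    for m in HUNG_TASK_RE.finditer(log_text):
        reports.append({
            "timestamp": float(m.group("ts")),   # seconds since boot
            "comm": m.group("comm"),             # process name
            "pid": int(m.group("pid")),
            "blocked_secs": int(m.group("secs")),
        })
    return reports


# Two report lines copied from the original post.
sample = (
    "[2955729.823684] INFO: task smbd:2888777 blocked for more than 120 seconds.\n"
    "[2955850.658135] INFO: task smbd:2888777 blocked for more than 241 seconds.\n"
)

for report in parse_hung_tasks(sample):
    print(report["comm"], report["pid"], report["blocked_secs"])
```

Repeated reports for the same PID with a growing blocked time, as in the post, indicate a single task that never leaves uninterruptible sleep rather than several separate stalls.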