thr3ads.net - CentOS - [CentOS] systemd automount of cifs share hangs [Oct 2018]

Elliott Balsley

2018-Oct-19 20:33 UTC

[CentOS] systemd automount of cifs share hangs

> But if I start the automount unit and ls the mount point, the shell hangs
> and eventually, a long time later (I haven't timed it, maybe an hour), I
> eventually get a prompt again. Control-C won't interrupt it. I can still
> ssh in and get another session so it's just the process that's accessing
> the mount point that hangs.

I don't have a solution, but I wanted to point out this same hang happened
to me recently with a Myricom 10Gb card.  Apparently Myricom drivers do not
support CentOS 7 smb connections, although HTTP traffic works fine.  I
solved it by switching to a different NIC.

Kenneth Porter

2018-Oct-26 17:40 UTC

--On Friday, October 19, 2018 2:33 PM -0700 Elliott Balsley <elliott at altsystems.com> wrote:

> I don't have a solution, but I wanted to point out this same hang happened
> to me recently with a Myricom 10Gb card.  Apparently Myricom drivers do
> not support CentOS 7 smb connections, although HTTP traffic works fine.  I
> solved it by switching to a different NIC.

The mount works fine for me. It's only the automount that hangs, and only 
since a few months ago.

I had it happen again today when my LetsEncrypt cert renewed and the
dovecot (IMAP) server restarted. Dovecot checks all the mountpoints (in
case any have mail folders on them) and hung on restart. I shelled in and
ran df and it also hung. I logged in with yet another session and tried to
ls the mountpoint, and that hung while tab-completing the directory name.
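
For anyone who wants to check a mount point without losing their shell to the same hang, here is a minimal sketch in Python. It is not something used in this thread; the function name probe_path and the 5-second default are made up for illustration. It runs the stat() in a child process and gives up waiting after a timeout, so only the child can get stuck.

```python
import subprocess
import sys


def probe_path(path, timeout=5.0):
    """Report whether stat() on `path` completes within `timeout` seconds.

    Returns "ok", "error" (stat failed, e.g. the path is missing) or
    "timeout". The name and return values are illustrative assumptions.
    """
    # Run the stat in a separate interpreter so a blocked syscall only
    # stalls the child, never the calling process.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Deliberately no kill() followed by wait(): a task blocked inside
        # the kernel may not act on signals, and waiting for it to exit
        # would hang the caller as well.
        return "timeout"
    return "ok" if child.returncode == 0 else "error"
```

A "timeout" result only says the stat did not return in time. The stuck child is left behind on purpose and will go away once the kernel releases it.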

Here's what I see in /var/log/messages when dovecot hangs and I manually 
mount the shares from another shell session. SELinux is in permissive mode.

Oct 26 09:11:39 saruman systemd: Mounting NAS1 share 1...
Oct 26 09:11:39 saruman systemd: Failed to expire automount, ignoring: No such device
Oct 26 09:11:39 saruman systemd: Mounted NAS1 share 1.
Oct 26 09:11:45 saruman kernel: INFO: task dovecot:831 blocked for more than 120 seconds.
Oct 26 09:11:45 saruman kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 26 09:11:45 saruman kernel: dovecot         D ffff9994adfa3f40     0 831      1 0x00000084
Oct 26 09:11:45 saruman kernel: Call Trace:
Oct 26 09:11:45 saruman kernel: [<ffffffff85f1890c>] ? __schedule+0x41c/0xa20
Oct 26 09:11:45 saruman kernel: [<ffffffff85f18f39>] schedule+0x29/0x70
Oct 26 09:11:45 saruman kernel: [<ffffffff85f168a9>] schedule_timeout+0x239/0x2c0
Oct 26 09:11:45 saruman kernel: [<ffffffff858beb96>] ? finish_wait+0x56/0x70
Oct 26 09:11:45 saruman kernel: [<ffffffff85f16ff2>] ? mutex_lock+0x12/0x2f
Oct 26 09:11:45 saruman kernel: [<ffffffff85ab4e00>] ? autofs4_wait+0x420/0x910
Oct 26 09:11:45 saruman kernel: [<ffffffff859faf82>] ? kmem_cache_alloc+0x1c2/0x1f0
Oct 26 09:11:45 saruman kernel: [<ffffffff85f192ed>] wait_for_completion+0xfd/0x140
Oct 26 09:11:45 saruman kernel: [<ffffffff858d2010>] ? wake_up_state+0x20/0x20
Oct 26 09:11:45 saruman kernel: [<ffffffff85ab603b>] autofs4_expire_wait+0xab/0x160
Oct 26 09:11:45 saruman kernel: [<ffffffff85ab2fc0>] do_expire_wait+0x1e0/0x210
Oct 26 09:11:45 saruman kernel: [<ffffffff85ab31fe>] autofs4_d_manage+0x7e/0x1d0
Oct 26 09:11:45 saruman kernel: [<ffffffff85a2a37a>] follow_managed+0xba/0x310
Oct 26 09:11:45 saruman kernel: [<ffffffff85a2b32d>] lookup_fast+0x12d/0x230
Oct 26 09:11:45 saruman kernel: [<ffffffff85a2e0dd>] path_lookupat+0x16d/0x8b0
Oct 26 09:11:45 saruman kernel: [<ffffffff85f127ba>] ? avc_alloc_node+0x24/0x123
Oct 26 09:11:45 saruman kernel: [<ffffffff859fadf5>] ? kmem_cache_alloc+0x35/0x1f0
Oct 26 09:11:45 saruman kernel: [<ffffffff85a30aef>] ? getname_flags+0x4f/0x1a0
Oct 26 09:11:45 saruman kernel: [<ffffffff85a2e84b>] filename_lookup+0x2b/0xc0
Oct 26 09:11:45 saruman kernel: [<ffffffff85a31c87>] user_path_at_empty+0x67/0xc0
Oct 26 09:11:45 saruman kernel: [<ffffffff85927b72>] ? from_kgid_munged+0x12/0x20
Oct 26 09:11:45 saruman kernel: [<ffffffff85a251df>] ? cp_new_stat+0x14f/0x180
Oct 26 09:11:45 saruman kernel: [<ffffffff85a31cf1>] user_path_at+0x11/0x20
Oct 26 09:11:45 saruman kernel: [<ffffffff85a24cd3>] vfs_fstatat+0x63/0xc0
Oct 26 09:11:45 saruman kernel: [<ffffffff85a2523e>] SYSC_newstat+0x2e/0x60
Oct 26 09:11:45 saruman kernel: [<ffffffff859326b6>] ? __audit_syscall_exit+0x1e6/0x280
Oct 26 09:11:45 saruman kernel: [<ffffffff85a2551e>] SyS_newstat+0xe/0x10
Oct 26 09:11:45 saruman kernel: [<ffffffff85f2579b>] system_call_fastpath+0x22/0x27
Oct 26 09:11:50 saruman systemd: Unmounting NAS1 share 1...
Oct 26 09:11:50 saruman systemd: Unmounted NAS1 share 1.
Oct 26 09:12:41 saruman systemd: dovecot.service stop-final-sigterm timed out. Killing.
Oct 26 09:13:19 saruman systemd: Mounting NAS1 share 2...
Oct 26 09:13:19 saruman systemd: Failed to expire automount, ignoring: No such device
Oct 26 09:13:19 saruman systemd: dovecot.service: main process exited, code=killed, status=9/KILL
Oct 26 09:13:19 saruman systemd: Unit dovecot.service entered failed state.
Oct 26 09:13:19 saruman systemd: dovecot.service failed.
Oct 26 09:13:19 saruman systemd: Starting Dovecot IMAP/POP3 email server...
Oct 26 09:13:19 saruman systemd: Mounted NAS1 share 2.
Oct 26 09:13:19 saruman systemd: Started Dovecot IMAP/POP3 email server.
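
The kernel's "blocked for more than N seconds" report in the log above is the easiest line to search for when working out which processes got stuck and when. A rough Python sketch for pulling those reports out of a syslog-style file follows. The regular expression is modelled only on the one message shown here and assumes each log entry sits on a single line; the names HUNG_TASK and find_hung_tasks are invented for this example.

```python
import re

# Pattern modelled on the single hung-task message quoted in this thread;
# other kernel versions may word the report differently.
HUNG_TASK = re.compile(
    r"kernel: INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than "
    r"(?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Yield (task_name, pid, seconds) for every hung-task report in `lines`.

    `lines` can be any iterable of strings, for example an open log file.
    """
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            yield (
                match.group("name"),
                int(match.group("pid")),
                int(match.group("secs")),
            )
```

Passing it an open handle on /var/log/messages (which normally needs root to read) would list every task the kernel reported as blocked.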

mark

2018-Oct-26 19:25 UTC

Kenneth Porter wrote:
> --On Friday, October 19, 2018 2:33 PM -0700 Elliott Balsley
> <elliott at altsystems.com> wrote:
>
>> I don't have a solution, but I wanted to point out this same hang
>> happened to me recently with a Myricom 10Gb card.  Apparently Myricom
>> drivers do not support CentOS 7 smb connections, although HTTP traffic
>> works fine.  I solved it by switching to a different NIC.
>
> The mount works fine for me. It's only the automount that hangs, and only
> since a few months ago.
>
> I had it happen again today when my LetsEncrypt cert renewed and the
> dovecot (IMAP) server restarted. Dovecot checks all the mountpoints (in
> case any have mail folders on them) and hung on restart. I shelled in and
> ran df and it also hung. I logged in yet another session and tried to ls
> the mountpoint and that hung completing the directory name.
>
> Here's what I see in /var/log/messages when dovecot hangs and I manually
> mount the shares from another shell session. SELinux is in permissive
> mode.
>
> Oct 26 09:11:39 saruman systemd: Mounting NAS1 share 1...
> Oct 26 09:11:39 saruman systemd: Failed to expire automount, ignoring: No such device
> Oct 26 09:11:39 saruman systemd: Mounted NAS1 share 1.
> Oct 26 09:11:45 saruman kernel: INFO: task dovecot:831 blocked for more than 120 seconds.
> Oct 26 09:11:45 saruman kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Oct 26 09:11:45 saruman kernel: dovecot         D ffff9994adfa3f40     0 831      1 0x00000084
> Oct 26 09:11:45 saruman kernel: Call Trace:
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f1890c>] ? __schedule+0x41c/0xa20
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f18f39>] schedule+0x29/0x70
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f168a9>] schedule_timeout+0x239/0x2c0
> Oct 26 09:11:45 saruman kernel: [<ffffffff858beb96>] ? finish_wait+0x56/0x70
> Oct 26 09:11:45 saruman kernel: [<ffffffff85f16ff2>] ? mutex_lock+0x12/0x2f
> Oct 26 09:11:45 saruman kernel: [<ffffffff85ab4e00>] ?
<snip>

Wait a minute: are you running IPv6? What we see is that NFSv4 goes
preferentially for IPv6, and if a system doesn't get its IPv6 address, or
has one and loses it, NFSv4 will *NOT* fall back to IPv4, but hangs.

      mark
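
One quick way to see whether IPv6 is even in play for the file server is to look at which address families its name resolves to. The observation above is about NFSv4, so whether it carries over to cifs is an open question. The sketch below is only a starting point, with a made-up function name (resolved_families) and port 445 assumed because that is the usual SMB port.

```python
import socket


def resolved_families(host, port=445):
    """Report whether `host` resolves to IPv4 and/or IPv6 addresses.

    Port 445 is an assumption (the usual SMB port); the port does not
    affect which address families come back.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    families = {info[0] for info in infos}
    return {
        "ipv4": socket.AF_INET in families,
        "ipv6": socket.AF_INET6 in families,
    }
```

If the NAS hostname comes back with "ipv6" set to True, it may be worth testing the mount against the server's IPv4 address directly to rule this out.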
