thr3ads.net - CentOS - [CentOS] how does autofs deal with stuck NFS mounts and suspending to RAM? [May 2020]

2020-May-18 11:13 UTC

[CentOS] how does autofs deal with stuck NFS mounts and suspending to RAM?

Hi,

after trying sshfs to mount a remote file system on a server with the result 
that sshfs will sooner or later get stuck and require a reboot of the client, 
I'm fed up with it and am looking for alternatives.

So next I would like to use NFS over a VPN connection instead.  To minimize 
the instances of the NFS mount getting stuck, it might be helpful to use 
autofs.

What happens when the mount is stuck because the connection is down and autofs 
figures the idle timeout has expired and tries to unmount the remote file 
system?

What happens when I put the client to sleep by suspending to RAM?  Will autofs 
automatically unmount first, or will the server have to deal with a client 
that has apparently gone away and might re-appear later in unexpected ways?

Is there a way to tell NFS to retry an operation _now_ after the connection 
went down and came back, rather than having to wait for a possibly rather long 
time?

Is there a better alternative for mounting remote file systems over unreliable 
connections?

Warren Young

2020-May-18 23:36 UTC

On May 18, 2020, at 5:13 AM, hw <hw at gc-24.de> wrote:
> 
> Is there a better alternative for mounting remote file systems over
> unreliable connections?

I don't have a good answer for you, because if you'd asked me without all this
backstory whether NFS or SSHFS is more tolerant of bad connections, I'd have
told you SSHFS.

NFS comes out of the "Unix lab" world, where all of the computers are
hard-wired to nearby servers.  It gets really annoyed when packet loss starts
happening, and since it's down in the kernel, that can mean the whole box locks
up until NFS gets happy again.

NFS is that way on purpose: it's often used to provide critical file service
(e.g. root-on-NFS) so if file I/O stops happening it *must* block and wait out
the failure, else all I/O dependent on NFS starts failing.

Some of this affects SSHFS as well.  To some extent, the solution to the broader
problem is "Dropbox" et al.  That is, a solution that was designed around the
idea that connectivity might not be constant.

This is also why DVCSes like Git have become popular.

2020-May-19 10:19 UTC

On Tuesday, May 19, 2020 1:36:03 AM CEST Warren Young wrote:
> On May 18, 2020, at 5:13 AM, hw <hw at gc-24.de> wrote:
> > Is there a better alternative for mounting remote file systems over
> > unreliable connections?
> 
> I don't have a good answer for you, because if you'd asked me without all
> this backstory whether NFS or SSHFS is more tolerant of bad connections,
> I'd have told you SSHFS.

That's what I thought.  Should I make a bug report?  Sshfs is clearly intended
to reconnect automatically when mounted like that, and it doesn't do that.

> NFS comes out of the "Unix lab" world, where all of the computers are
> hard-wired to nearby servers.  It gets really annoyed when packet loss
> starts happening, and since it's down in the kernel, that can mean the
> whole box locks up until NFS gets happy again.

It's intended to do that, which is fine.  Sshfs is intended to do that as
well.  Both are supposed to reconnect when the connection is back.  So far,
sshfs has failed to do that to the extent that it is unusable.  So far, NFS
with autofs hasn't caused issues, yet the testing continues.  It's also a lot
faster, even though I used compression with sshfs.

> NFS is that way on purpose: it's often used to provide critical file service
> (e.g. root-on-NFS) so if file I/O stops happening it *must* block and wait
> out the failure, else all I/O dependent on NFS starts failing.
> 
> Some of this affects SSHFS as well.  To some extent, the solution to the
> broader problem is "Dropbox" et al.  That is, a solution that was designed
> around the idea that connectivity might not be constant.

Well, I need the file system accessible like a file system, without storing
files somewhere else and downloading them somewhere else, or manually syncing
some files between servers and clients once in a while.  How am I supposed to
work remotely when I don't have access to the files involved?

> This is also why DVCSes like Git have become popular.

Are you sure that's the reason?

Jonathan Billings

2020-May-19 12:22 UTC

On Mon, May 18, 2020 at 05:36:03PM -0600, Warren Young wrote:
> On May 18, 2020, at 5:13 AM, hw <hw at gc-24.de> wrote:
> > 
> > Is there a better alternative for mounting remote file systems
> > over unreliable connections?
> 
> I don't have a good answer for you, because if you'd asked me
> without all this backstory whether NFS or SSHFS is more tolerant of
> bad connections, I'd have told you SSHFS.

On the other hand, NFS is a fully-featured filesystem that supports
fancy features like locking and a full ACL system.  SSHFS is a FUSE
filesystem that will break a lot of software if you try to use it for
anything more complex than 'ls' and 'cp'.

For what it's worth, Samba with SMBv3 and the POSIX extension[1] is a
lot more tolerant of bad connections, and presents itself as a real
filesystem under Linux.

1. https://wiki.samba.org/index.php/SMB3-Linux

-- 
Jonathan Billings <billings at negate.org>

Orion Poplawski

2020-May-21 02:55 UTC

On 5/18/20 5:13 AM, hw wrote:
> Hi,
> 
> after trying sshfs to mount a remote file system on a server with the result
> that sshfs will sooner or later get stuck and require a reboot of the client,
> I'm fed up with it and am looking for alternatives.
> 
> So next I would like to use NFS over a VPN connection instead.  To minimize
> the instances of the NFS mount getting stuck, it might be helpful to use
> autofs.
> 
> What happens when the mount is stuck because the connection is down and autofs
> figures the idle timeout has expired and tries to unmount the remote file
> system?

Nothing good, and bad things happen before this.

> What happens when I put the client to sleep by suspending to RAM?  Will autofs
> automatically unmount first, or will the server have to deal with a client
> that has apparently gone away and might re-appear later in unexpected ways?

This is the mechanism that I use to try to mitigate this on our systems:

This triggers on suspend type events:

# cat /etc/systemd/system/suspend.target.wants/offnet.service
[Unit]
Description=Unmount all NFS mounts before disconnecting from network
Before=systemd-hibernate.service
Before=systemd-shutdown.service
Before=systemd-suspend.service

[Service]
ExecStart=/usr/local/sbin/offnet
Type=oneshot

[Install]
WantedBy=hibernate.target
WantedBy=shutdown.target
WantedBy=suspend.target

----

This triggers when you bring down a vpn connection with NetworkManager:

# cat /etc/NetworkManager/dispatcher.d/pre-down.d/autofs
#!/bin/bash

if [ -x /usr/bin/logger ]; then
   LOGGER="/usr/bin/logger -s -p user.notice -t $0"
else
   LOGGER=echo
fi

[ -z "${DEVICE_IP_IFACE}" ] && exit

# Unmount NFS and shut down autofs if we are shutting down the last
# ethernet device or exiting the VPN
if [ "$(/usr/bin/nmcli --terse --fields 'device,type' c show --active |
        grep -v "^${DEVICE_IP_IFACE}:" | grep -c :802-)" -eq 0 -o \
      "${DEVICE_IP_IFACE}" = tun0 ]; then
   $LOGGER "Unmounting NFS/CIFS directories"
   /usr/local/sbin/offnet
   $LOGGER "Performing autofs pre-down stop"
   systemctl stop autofs.service
fi
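
The counting part of the condition above can be tried out on its own. The following is only a minimal sketch, assuming the same `device:type` lines that the nmcli call in the script prints; the function name `count_other_links` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original dispatcher script).
# Reads "device:type" lines on stdin, in the format printed by
#   nmcli --terse --fields 'device,type' c show --active
# and prints how many active 802-* (wired or wireless) connections
# remain once the interface given as $1 is excluded.
count_other_links() {
    local iface=$1
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -v "^${iface}:" | grep -c ':802-' || true
}
```

A result of 0 corresponds to the case in which the dispatcher script unmounts everything, because the interface going down is the last wired or wireless link.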

----

# cat /usr/local/sbin/offnet
#!/bin/bash
. /etc/init.d/functions

# __umount_loop awk_program fstab_file first_msg retry_msg retry_umount_args
# awk_program should process fstab_file and return a list of fstab-encoded
# paths; it doesn't have to handle comments in fstab_file.
__umount_loop() {
         local remaining sig
         local retry=3 count

         remaining=$(LC_ALL=C awk "/^#/ {next} $1" "$2" | sort -r)
         while [ -n "$remaining" -a "$retry" -gt 0 ]; do
                 if [ "$retry" -eq 3 ]; then
                         action "$3" umount $remaining
                 else
                         action "$4" umount $5 $remaining
                 fi
                 count=4
                 remaining=$(LC_ALL=C awk "/^#/ {next} $1" "$2" | sort -r)
                 while [ "$count" -gt 0 ]; do
                         [ -z "$remaining" ] && break
                         count=$(($count-1))
                         usleep 500000
                         remaining=$(LC_ALL=C awk "/^#/ {next} $1" "$2" | sort -r)
                 done
                 [ -z "$remaining" ] && break
                 kill $sig $(/sbin/fuser -m $remaining 2>/dev/null |
                         sed -e "s/\b$$\b//g") > /dev/null
                 sleep 3
                 retry=$(($retry -1))
                 sig=-9
         done
}

__umount_loop '$3 ~ /^nfs/ && $3 != "nfsd" && $2 != "/" {print $2}' \
     /proc/mounts \
     $"Unmounting NFS filesystems: " \
     $"Unmounting NFS filesystems (retry): " \
     "-f -l"

__umount_loop '$3 ~ /^cifs/ && $2 != "/" {print $2}' \
     /proc/mounts \
     $"Unmounting CIFS filesystems: " \
     $"Unmounting CIFS filesystems (retry): " \
     "-f -l"
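
To preview which mount points the two awk filters above would select, without unmounting anything, something like the following could be used. It is only a sketch: the function name `list_net_mounts` is made up for illustration, the awk pattern is a merged version of the NFS and CIFS filters from the script, and `LC_ALL=C` on `sort` is an addition to make the ordering predictable.

```shell
#!/bin/bash
# Hypothetical dry-run helper (not part of offnet).
# Prints the mount points of all NFS and CIFS filesystems listed in a
# mounts-format file (default: /proc/mounts), reverse-sorted so that
# nested mounts are listed before their parents.
list_net_mounts() {
    local mounts_file=${1:-/proc/mounts}
    LC_ALL=C awk '$3 ~ /^(nfs|cifs)/ && $3 != "nfsd" && $2 != "/" {print $2}' \
        "$mounts_file" | LC_ALL=C sort -r
}
```

Calling `list_net_mounts` with no argument reads `/proc/mounts`; passing a file name makes it possible to test the filter against a saved copy.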

> Is there a way to tell NFS to retry an operation _now_ after the connection
> went down and came back, rather than having to wait for a possibly rather
> long time?

Not that I'm aware of.

> Is there a better alternative for mounting remote file systems over
> unreliable connections?

I would second the recommendation for SMBv3/CIFS for a fault-tolerant
remote file system.

-- 
Orion Poplawski
Manager of NWRA Technical Systems          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                 https://www.nwra.com/
