Whit Blauvelt
2012-Aug-30 01:31 UTC
[Gluster-users] Samba > NFS > Gluster leads to runaway lock problem after many stable months
Hi,

I have a couple of Gluster 3.1.4 shares on the LAN, NFS mounted at 192.168.1.242 to a couple of other systems. One of those systems in turn has access via Samba that includes those shares. This has been a stable system for a year.

Today it went crazy. The most immediate bad effect was an immense swelling of the logs on the system that has both the Gluster shares mounted by NFS and Samba access from other systems. That swelling was in the neighborhood of 5000 lines per second, repeating approximately this:

Aug 29 18:47:01 system2 smbd[7118]: [2012/08/29 18:47:01, 0] locking/posix.c:posix_fcntl_getlock(244)
Aug 29 18:47:01 system2 smbd[7118]: on 32 bit NFS mounted file systems.
Aug 29 18:47:01 system2 smbd[7118]: [2012/08/29 18:47:01, 0] locking/posix.c:posix_fcntl_getlock(252)
Aug 29 18:47:01 system2 smbd[7118]: Offset greater than 31 bits. Returning success.
Aug 29 18:47:01 system2 smbd[7118]: [2012/08/29 18:47:01, 0] locking/posix.c:posix_fcntl_getlock(242)
Aug 29 18:47:01 system2 smbd[7118]: posix_fcntl_getlock: WARNING: lock request at offset 2886282524, length 1 returned
Aug 29 18:47:01 system2 smbd[7118]: [2012/08/29 18:47:01, 0] locking/posix.c:posix_fcntl_getlock(243)
Aug 29 18:47:01 system2 smbd[7118]: an No locks available error. This can happen when using 64 bit lock offsets
Aug 29 18:47:01 system2 smbd[7118]: [2012/08/29 18:47:01, 0] locking/posix.c:posix_fcntl_getlock(244)
Aug 29 18:47:01 system2 kernel: [14164811.340891] lockd: couldn't create RPC handle for 192.168.1.242

Not good. The /var partition filled totally, fast. Turned Samba off. Cleared logs out of the way. Restarted Samba, and everything's happy and not complaining. But obviously I need to avoid whatever set that off like the devil. It's an older Samba on that system, Version 3.0.28a, if that makes a difference.

What are my options for avoiding this, both the core problem, and a repeat of anything throwing 5000 lines into the logs per second? The log level's set low in Samba.

I suppose I could set up something to shut down syslog if /var gets to 99% full again, as a stopgap. There's 3X the space there that the logs normally fill.

Thanks,
Whit
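
The stopgap floated above (shutting down syslog when /var reaches 99% full) could be sketched roughly as follows. This is only an illustration, not something from the thread: the function names are hypothetical, and the stop command is an assumption, since the syslog service name and the way to stop it differ between distributions. It would need to run periodically, for example from cron, as root.

```python
import shutil
import subprocess

# Assumption: the command that stops the local syslog daemon. The service
# name (syslog, rsyslog, sysklogd, ...) depends on the distribution.
STOP_COMMAND = ["service", "syslog", "stop"]


def percent_used(path):
    """Return how full the filesystem holding `path` is, as a percentage.

    Computed as used / (used + free), where `free` is the space available
    to unprivileged processes, so root-reserved blocks are left out.
    """
    usage = shutil.disk_usage(path)
    denominator = usage.used + usage.free
    if denominator == 0:
        return 0.0
    return 100.0 * usage.used / denominator


def should_stop_logging(path="/var", threshold=99.0):
    """Return True when the filesystem is at or above `threshold` percent."""
    return percent_used(path) >= threshold


def guard(path="/var", threshold=99.0, stop_command=STOP_COMMAND):
    """Run `stop_command` if `path` is too full; report whether it was run."""
    if should_stop_logging(path, threshold):
        subprocess.call(stop_command)
        return True
    return False
```

Calling `guard()` once a minute would cap the damage from a repeat, at the cost of losing all logging until syslog is restarted by hand, so it treats the symptom rather than the runaway lock messages themselves.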
Whit Blauvelt
2012-Aug-30 12:22 UTC
[Gluster-users] Samba > NFS > Gluster leads to runaway lock problem after many stable months
Taking the message from Samba to be:

> No locks available error. This can happen when using 64 bit lock offsets
> on 32 bit NFS mounted file systems.

Is Gluster 3.1.4's NFS itself 32-bit or 64-bit? From here:

http://europe.gluster.org/community/documentation/index.php/Gluster_3.1:_NFS_Frequently_Asked_Questions#Application_fails_with_.22Invalid_argument.22_or_.22Value_too_large_for_defined_data_type.22_error.

it looks like 3.1 is 64-bit NFS. The file systems in the 2 Gluster mounts are ext4 for one and xfs for the other. All the systems are themselves, and have always been, 64-bit.

For the ext4 one, I had at some point added "posix locking = no" to smb.conf to avoid a problem that I don't recall clearly, but it wasn't the present one. The Samba docs advise that that setting should never be required, and that it's about coordinating the smb lock table with the posix one. But there are old reports on the Samba mailing list of it curing various things. I've added that for the xfs-based Gluster share now.

Hoping someone has a clearer view of this. Are there respects in which Gluster's NFS is 32-bit rather than 64? Or that xfs on a 64-bit system is? Or ext4? I haven't been able to work out just which operation exhausted the locks, but it's more likely to have been on the xfs Gluster NFS mount, as more was going on there at the time things went bad.

Whit
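
One way to narrow down the 32-bit versus 64-bit question is to check the logged offset and then try a byte-range lock at that offset directly on each mount. The sketch below is an assumption-laden illustration, not what smbd does internally: smbd's posix_fcntl_getlock queries existing locks, whereas this simply tries to take and release a one-byte lock with Python's `fcntl.lockf`. The function names are hypothetical, and the directory passed in should be a writable path on the NFS mount in question.

```python
import fcntl
import os
import tempfile

# Offset taken from the smbd warning in the log.
OFFSET = 2886282524


def needs_more_than_31_bits(offset):
    """True if `offset` does not fit in a signed 32-bit value."""
    return offset > 2**31 - 1


def try_byte_lock(path, offset=OFFSET, length=1):
    """Try a non-blocking exclusive lock on `length` bytes at `offset`.

    Returns None if the lock was taken (and then released), otherwise the
    errno reported by the operating system.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                        length, offset, os.SEEK_SET)
        except OSError as exc:
            return exc.errno
        fcntl.lockf(fd, fcntl.LOCK_UN, length, offset, os.SEEK_SET)
        return None
    finally:
        os.close(fd)


def probe_directory(directory):
    """Create a scratch file in `directory`, probe it, and clean up."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        return try_byte_lock(path)
    finally:
        os.remove(path)
```

The logged offset is larger than 2**31 - 1 but smaller than 2**32, which matches the "Offset greater than 31 bits" line. If `probe_directory` returns `errno.ENOLCK` on one of the mounts but `None` on a local directory, that would point at the NFS locking path (lockd) rather than at Samba.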
Whit Blauvelt
2012-Aug-30 16:08 UTC
[Gluster-users] Samba "posix locking" = yes or no for Gluster?
On Thu, Aug 30, 2012 at 10:42:01AM -0500, John Jolet wrote:

> i had to turn off posix locking in order to get windows machines to
> be able to write to the shares at all.

Haven't seen that problem. Looking for background I found this presentation on SMB2 - the next version of Samba basically, incorporating Windows 8 features - according to which posix locking will be about the last thing to be implemented there:

http://www.samba.org/~sfrench/presentations/smf-linux-collab-summmit-future-of-file-protocols-smb2.2.pdf

Makes it sound as if posix locking's incredibly hard to implement right. Which could explain why there are many reports of situations where people find it necessary to turn it off in the current Samba, despite the assurance in the Samba docs that there should never be reason to do so.

It also argues somewhat reasonably (even allowing for the presenter's bias) that going forward SMB2 is likely to progress much faster than NFS4 - and here we are with Gluster still at NFS3. Wonder if Gluster has/should have plans for including SMB2 support?

Whit