A Ghoshal
2015-Feb-23 21:27 UTC
[Gluster-users] NFS File handles change during upgrade from glusterfs version 3.4.2 to 3.5.3
Hello,

After upgrading from 3.4.2 to 3.5.3, I noticed that all my NFS clients (mounted over tcp) on remote nodes turned stale. I investigated this and got the following recurrent logs from the NFS server:

[2015-02-23 20:40:25.834071] W [nfs3-helpers.c:3401:nfs3_log_common_res] 0-nfs-nfsv3: XID: 7ead95cb, GETATTR: NFS: 10001(Illegal NFS file handle), POSIX: 14(Bad address)
[2015-02-23 20:40:25.834167] E [nfs3.c:301:__nfs3_get_volume_id] (-->/usr/lib64/glusterfs/3.5.3/xlator/nfs/server.so(nfs3_getattr+0x4cb) [0x7fc728b3631a] (-->/usr/lib64/glusterfs/3.5.3/xlator/nfs/server.so(nfs3_getattr_reply+0x37) [0x7fc728b35873] (-->/usr/lib64/glusterfs/3.5.3/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0xb0) [0x7fc728b357ba]))) 0-nfs-nfsv3: invalid argument: xl
[2015-02-23 20:40:25.834801] E [nfs3.c:840:nfs3_getattr] 0-nfs-nfsv3: Bad Handle

Upon investigation, it seems to me that the trouble is with procedure nfs3_fh_validate(). In 3.4.2, the validation is against the following identifiers:

#define GF_NFSFH_IDENT0 ':'
#define GF_NFSFH_IDENT1 'O'
#define GF_NFSFH_IDENT_SIZE (sizeof(char) * 2)
#define GF_NFSFH_STATIC_SIZE (GF_NFSFH_IDENT_SIZE + (2*sizeof (uuid_t)))

While, on 3.5.3, this has expanded to

#define GF_NFSFH_IDENT0 ':'
#define GF_NFSFH_IDENT1 'O'
#define GF_NFSFH_IDENT2 'G'
#define GF_NFSFH_IDENT3 'L'
#define GF_NFSFH_IDENT_SIZE (sizeof(char) * 4)
#define GF_NFSFH_STATIC_SIZE (GF_NFSFH_IDENT_SIZE + (2*sizeof (uuid_t)))

Due to this, I have to unmount+mount all my nfs clients to get them back into service. Could somebody help me understand why this change was introduced (maybe a reference of the bug ID)? Also, any chance that there's a way to get around this?

Thank you in advance for your suggestions.

Anirban
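To illustrate why a handle issued under the two-byte identifier would fail a four-byte check, here is a minimal standalone C sketch. It reuses only the GF_NFSFH_IDENT* values quoted above. Everything else (the demo_fh_old and demo_fh_new structs, the demo_validate_* functions, and the 16-byte arrays standing in for uuid_t) is hypothetical and assumed for illustration; it is not the actual struct nfs3_fh or nfs3_fh_validate() from the GlusterFS source.

```c
#include <stddef.h>
#include <string.h>

/* Identifier bytes exactly as quoted in the message above. */
#define GF_NFSFH_IDENT0 ':'
#define GF_NFSFH_IDENT1 'O'
#define GF_NFSFH_IDENT2 'G'
#define GF_NFSFH_IDENT3 'L'

/* Hypothetical stand-ins for the two handle layouts (assumption:
 * identifier bytes come first, followed by two 16-byte UUIDs). */
struct demo_fh_old {
    char ident[2];
    unsigned char exportid[16];
    unsigned char gfid[16];
};

struct demo_fh_new {
    char ident[4];
    unsigned char exportid[16];
    unsigned char gfid[16];
};

/* Two-byte identifier check, in the style described for 3.4.2. */
int demo_validate_old(const struct demo_fh_old *fh)
{
    if (fh == NULL)
        return 0;
    return fh->ident[0] == GF_NFSFH_IDENT0 &&
           fh->ident[1] == GF_NFSFH_IDENT1;
}

/* Four-byte identifier check, in the style described for 3.5.3. */
int demo_validate_new(const struct demo_fh_new *fh)
{
    if (fh == NULL)
        return 0;
    return fh->ident[0] == GF_NFSFH_IDENT0 &&
           fh->ident[1] == GF_NFSFH_IDENT1 &&
           fh->ident[2] == GF_NFSFH_IDENT2 &&
           fh->ident[3] == GF_NFSFH_IDENT3;
}

/* Build a handle in the old layout, then interpret its bytes with the
 * new layout, as an upgraded server would when a client presents a
 * handle it cached before the upgrade. Returns the new check's verdict. */
int demo_old_handle_survives_upgrade(void)
{
    struct demo_fh_old old_fh;
    struct demo_fh_new seen_by_new;

    memset(&old_fh, 0, sizeof old_fh);
    old_fh.ident[0] = GF_NFSFH_IDENT0;
    old_fh.ident[1] = GF_NFSFH_IDENT1;
    memset(old_fh.exportid, 0xAB, sizeof old_fh.exportid);
    memset(old_fh.gfid, 0xCD, sizeof old_fh.gfid);

    /* The old handle is shorter, so the copy fits in the new struct. */
    memset(&seen_by_new, 0, sizeof seen_by_new);
    memcpy(&seen_by_new, &old_fh, sizeof old_fh);

    /* Bytes 2 and 3 now hold UUID data, not 'G' and 'L'. */
    return demo_validate_new(&seen_by_new);
}
```

In this sketch, demo_old_handle_survives_upgrade() returns 0: the first two bytes match, but the positions where the new check expects 'G' and 'L' contain the start of the export UUID. That is consistent with the "Bad Handle" errors in the log, and with handles cached by clients before the upgrade being rejected until the clients remount.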
Vijay Bellur
2015-Feb-24 07:56 UTC
[Gluster-users] [Gluster-devel] NFS File handles change during upgrade from glusterfs version 3.4.2 to 3.5.3
On 02/24/2015 02:57 AM, A Ghoshal wrote:
> Hello,
>
> After upgrading from 3.4.2 to 3.5.3, I noticed that all my NFS clients
> (mounted over tcp) on remote nodes turned stale. I investigated this and
> got the following recurrent logs from the NFS server:
>
> [2015-02-23 20:40:25.834071] W [nfs3-helpers.c:3401:nfs3_log_common_res]
> 0-nfs-nfsv3: XID: 7ead95cb, GETATTR: NFS: 10001(Illegal NFS file
> handle), POSIX: 14(Bad address)
> [2015-02-23 20:40:25.834167] E [nfs3.c:301:__nfs3_get_volume_id]
> (-->/usr/lib64/glusterfs/3.5.3/xlator/nfs/server.so(nfs3_getattr+0x4cb)
> [0x7fc728b3631a]
> (-->/usr/lib64/glusterfs/3.5.3/xlator/nfs/server.so(nfs3_getattr_reply+0x37)
> [0x7fc728b35873]
> (-->/usr/lib64/glusterfs/3.5.3/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0xb0)
> [0x7fc728b357ba]))) 0-nfs-nfsv3: invalid argument: xl
> [2015-02-23 20:40:25.834801] E [nfs3.c:840:nfs3_getattr] 0-nfs-nfsv3:
> Bad Handle
>
> Upon investigation, it seems to me that the trouble is with procedure
> nfs3_fh_validate(). In 3.4.2, the validation is against the following
> identifiers:
>
> #define GF_NFSFH_IDENT0 ':'
> #define GF_NFSFH_IDENT1 'O'
> #define GF_NFSFH_IDENT_SIZE (sizeof(char) * 2)
> #define GF_NFSFH_STATIC_SIZE (GF_NFSFH_IDENT_SIZE + (2*sizeof (uuid_t)))
>
> While, on 3.5.3, this has expanded to
>
> #define GF_NFSFH_IDENT0 ':'
> #define GF_NFSFH_IDENT1 'O'
> #define GF_NFSFH_IDENT2 'G'
> #define GF_NFSFH_IDENT3 'L'
> #define GF_NFSFH_IDENT_SIZE (sizeof(char) * 4)
> #define GF_NFSFH_STATIC_SIZE (GF_NFSFH_IDENT_SIZE + (2*sizeof (uuid_t)))
>
> Due to this, I have to unmount+mount all my nfs clients to get them back
> into service. Could somebody help me understand why this change was
> introduced (maybe a reference of the bug ID)? Also, any chance that
> there's a way to get around this?
>

git blame indicates that this change has been introduced as part of [1]. Not sure if there is an easy way to get around this without re-mounting.

-Vijay

[1] http://review.gluster.org/4918