This message seems spurious: there have been no network changes and the system has been in operation for many months.

I restarted the volume; it worked for a while and then crashed in the same way.

[2014-04-14 01:17:59.291103] E [nlm4.c:968:nlm4_establish_callback] 0-nfs-NLM: Unable to get NLM port of the client. Is the firewall running on client?

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-04-14 01:18:29
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.1
/lib64/libc.so.6(+0x329a0)[0x7f43750899a0]
/lib64/libc.so.6(inet_ntop+0x88)[0x7f437514fc88]
/usr/lib64/glusterfs/3.4.1/xlator/nfs/server.so(nlm4_establish_callback+0x397)[0x7f436b540467]
/usr/lib64/glusterfs/3.4.1/xlator/nfs/server.so(nlm4svc_send_granted+0x308)[0x7f436b540808]
/usr/lib64/glusterfs/3.4.1/xlator/nfs/server.so(nlm4svc_lock_cbk+0x1cb)[0x7f436b540a1b]
/usr/lib64/glusterfs/3.4.1/xlator/nfs/server.so(nfs_fop_lk_cbk+0x6e)[0x7f436b51b77e]
/usr/lib64/glusterfs/3.4.1/xlator/debug/io-stats.so(io_stats_lk_cbk+0xf6)[0x7f436b7672c6]
/usr/lib64/glusterfs/3.4.1/xlator/cluster/distribute.so(dht_lk_cbk+0xdb)[0x7f436bbafd0b]
/usr/lib64/glusterfs/3.4.1/xlator/protocol/client.so(client3_3_lk_cbk+0x1ad)[0x7f436bddec4d]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x7f4375c0a6f5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x11f)[0x7f4375c0bc6f]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x7f4375c074e8]
/usr/lib64/glusterfs/3.4.1/rpc-transport/socket.so(+0x91d6)[0x7f4371e9c1d6]
/usr/lib64/glusterfs/3.4.1/rpc-transport/socket.so(+0xabfd)[0x7f4371e9dbfd]
/usr/lib64/libglusterfs.so.0(+0x5e207)[0x7f4375e70207]
/usr/sbin/glusterfs(main+0x5e8)[0x406818]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f4375075d1d]
/usr/sbin/glusterfs[0x404659]
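For anyone reading the backtrace above, here is a minimal sketch of how the frames could be split into module, symbol and offset to see the call chain at a glance. It assumes each frame has the shape path(symbol+offset)[address] as in the dump; the helper names parse_frame and named_symbols are hypothetical and not part of GlusterFS.

```python
import re

# Assumed frame layout, taken from the crash dump in this thread:
#   /path/to/object(symbol+0xOFFSET)[0xADDRESS]
# The "(symbol+offset)" part may have an empty symbol or be missing entirely.
FRAME_RE = re.compile(
    r"^(?P<path>\S+?)"
    r"(?:\((?P<symbol>\w*)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if it does not match."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    frame = match.groupdict()
    # Normalise an empty or absent symbol name to None.
    frame["symbol"] = frame["symbol"] or None
    # Keep just the file name of the shared object or binary.
    frame["module"] = frame["path"].rsplit("/", 1)[-1]
    return frame


def named_symbols(lines):
    """List the symbol names of all frames that carry one, innermost first."""
    frames = (parse_frame(line) for line in lines)
    return [f["symbol"] for f in frames if f and f["symbol"]]
```

Fed the frames from the dump, the first two named symbols are inet_ntop and nlm4_establish_callback, i.e. the segfault happens inside inet_ntop when called from the NLM callback setup.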
Just discovered that restarting glusterd seems to fix the problem for a while. The glusterd log has lots of errors:

[2014-04-14 02:17:06.493855] E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)
[2014-04-14 02:17:09.494224] E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)
[2014-04-14 02:17:12.494573] E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)
[2014-04-14 02:17:15.494939] E [socket.c:2788:socket_connect] 0-management: connection attempt failed (Connection refused)

On Mon, 2014-04-14 at 09:43 +0800, Franco Broi wrote:
> This message seems spurious, there have been no network changes and the
> system has been in operation for many months.
>
> I restarted the volume and it worked for a while and crashed in the same
> way.
>
> [2014-04-14 01:17:59.291103] E [nlm4.c:968:nlm4_establish_callback] 0-nfs-NLM: Unable to get NLM port of the client. Is the firewall running on client?
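To see how regularly the "connection attempt failed" errors repeat, the glusterd log lines can be parsed and the gaps between them measured. This is only a sketch, assuming the line layout shown in this thread ([timestamp] LEVEL [file:line:function] domain: message); parse_log_line and retry_intervals are hypothetical helper names, not GlusterFS tooling.

```python
import re
from datetime import datetime

# Assumed log layout, based on the lines quoted in this thread:
#   [YYYY-MM-DD HH:MM:SS.ffffff] E [file.c:123:function] domain: message
LOG_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<source>[^\]]+)\] "
    r"(?P<domain>[^:]+): "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S.%f")
    return record


def retry_intervals(lines, needle="connection attempt failed"):
    """Seconds between consecutive log entries whose message contains needle."""
    records = (parse_log_line(line) for line in lines)
    times = [r["time"] for r in records if r and needle in r["message"]]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Running retry_intervals over the four socket_connect lines above gives gaps of roughly three seconds each, which suggests a fixed reconnect timer rather than sporadic failures.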
On 04/13/2014 06:43 PM, Franco Broi wrote:
> This message seems spurious, there have been no network changes and the
> system has been in operation for many months.
>
> I restarted the volume and it worked for a while and crashed in the same
> way.
>
> [2014-04-14 01:17:59.291103] E [nlm4.c:968:nlm4_establish_callback] 0-nfs-NLM: Unable to get NLM port of the client. Is the firewall running on client?
>
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash: 2014-04-14 01:18:29
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.4.1
> /lib64/libc.so.6(+0x329a0)[0x7f43750899a0]
> /lib64/libc.so.6(inet_ntop+0x88)[0x7f437514fc88]
> /usr/lib64/glusterfs/3.4.1/xlator/nfs/server.so(nlm4_establish_callback+0x397)[0x7f436b540467]
> /usr/lib64/glusterfs/3.4.1/xlator/nfs/server.so(nlm4svc_send_granted+0x308)[0x7f436b540808]

This seems to be fixed through: http://review.gluster.org/#/c/5452/

Thanks,
Vijay