Hi,

I have a cluster with 3 nodes in pre-production. Yesterday, one node went down. The error that I have seen is this:

[2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin] 0-management: Unable to acquire lock for cfe-gv1
The message "I [MSGID: 106006] [glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 5 times between [2015-05-28 19:04:09.346088] and [2015-05-28 19:04:24.349191]
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-05-28 19:04:27
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.1
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7fd86e2f1232]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7fd86e30871d]
/usr/lib64/libc.so.6(+0x35640)[0x7fd86d30c640]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_remove_pending_entry+0x2c)[0x7fd85f52450c]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(+0x5ae28)[0x7fd85f511e28]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x237)[0x7fd85f50f027]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(__glusterd_brick_op_cbk+0x2fe)[0x7fd85f53be5e]
/usr/lib64/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x4c)[0x7fd85f53d48c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7fd86e0c50b0]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x171)[0x7fd86e0c5321]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fd86e0c1273]
/usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0x8530)[0x7fd85d17d530]
/usr/lib64/glusterfs/3.6.1/rpc-transport/socket.so(+0xace4)[0x7fd85d17fce4]
/usr/lib64/libglusterfs.so.0(+0x76322)[0x7fd86e346322]
/usr/sbin/glusterd(main+0x502)[0x7fd86e79afb2]
/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd86d2f8af5]
/usr/sbin/glusterd(+0x6351)[0x7fd86e79b351]
---------
Is this a problem with the software? Is it a bug?

Thanks.
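For anyone triaging the same symptom: the key line in the trace above is "signal received: 11" (SIGSEGV), which means glusterd itself segfaulted rather than merely timing out on the volume lock. A minimal sketch of spotting that signature in the glusterd log; the real log typically lives at /var/log/glusterfs/etc-glusterfs-glusterd.vol.log (path varies by distribution), so an inline sample of the lines from this report is used here to keep the snippet self-contained:

```shell
# Self-contained demo: write a sample of the reported log lines to a file,
# then grep for the crash signature. On a real node, grep the actual
# glusterd log instead of this sample.
cat > /tmp/glusterd-sample.log <<'EOF'
[2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin] 0-management: Unable to acquire lock for cfe-gv1
pending frames:
signal received: 11
time of crash:
2015-05-28 19:04:27
EOF

# "signal received: 11" is SIGSEGV: glusterd crashed, so the lock error is a
# consequence, not the root cause. -n prints the matching line number.
grep -n "signal received" /tmp/glusterd-sample.log
# → 3:signal received: 11
```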
On 29 May 2015 13:29, "Félix de Lelelis" <felix.delelisdd at gmail.com> wrote:
> Hi,
>
> I have a cluster with 3 nodes in pre-production. Yesterday, one node went down. The error that I have seen is this:
>
> [2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin] 0-management: Unable to acquire lock for cfe-gv1
> [crash log and backtrace snipped]
>
> Is this a problem with the software? Is it a bug?

This indicates that glusterd crashed. Could you raise a bug, capturing the sosreport with the -a option, and attach it to Bugzilla?

> Thanks.

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
On 05/29/2015 01:29 PM, Félix de Lelelis wrote:
> Hi,
>
> I have a cluster with 3 nodes in pre-production. Yesterday, one node went down. The error that I have seen is this:
>
> [2015-05-28 19:04:27.305560] E [glusterd-syncop.c:1578:gd_sync_task_begin] 0-management: Unable to acquire lock for cfe-gv1
> [crash log and backtrace snipped]
>
> Is this a problem with the software? Is it a bug?

The problem I see here is that concurrent volume status transactions were run at a given point in time (from the cmd log history in BZ 1226254). 3.6.1 is missing some fixes that take care of the issues identified along these lines. If you upgrade your cluster to 3.6.3 the problem will go away. However, 3.6.3 still misses one more fix, http://review.gluster.org/#/c/10023/, which will be released in 3.6.4. I would request you to upgrade your cluster to 3.6.3, if not 3.7.

> Thanks.

--
~Atin
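Since the trigger described above is two volume status transactions racing, one stopgap until the upgrade (not a fix for the underlying bug) is to serialize monitoring-driven status probes on each node. A sketch using flock(1); the lock path is arbitrary, and the echo stands in for the real probe (e.g. gluster volume status cfe-gv1):

```shell
# Serialize status probes on this host with flock(1) so overlapping cron or
# monitoring jobs never issue concurrent glusterd transactions from the same
# node. This only narrows the race locally; probes from other peers can
# still overlap, so the 3.6.3+ upgrade remains the real fix.
LOCK=/tmp/gluster-status.lock

# -w 30: wait up to 30 s for the lock; -c runs the command under it.
# Replace the echo with the actual probe in a monitoring script.
flock -w 30 "$LOCK" -c 'echo gluster volume status would run here'
```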