Hi, all,

I used Infiniband to connect all the GlusterFS nodes and the clients. Previously I ran IP over IB and everything was OK. Now I have switched to the rdma transport mode instead and ran the same traffic. After a while, the glusterfs process exited because of a segmentation fault.

Here are the messages from the crash:

pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(WRITE)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-11-01 11:11:23
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.11.0
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x78)[0x7f95bc54e618]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7f95bc557834]
/lib64/libc.so.6(+0x32510)[0x7f95bace2510]

The client OS was CentOS 7.3 and the server OS was CentOS 6.5. The GlusterFS version was 3.11.0 on both the clients and the servers. The Infiniband cards were Mellanox, and the Mellanox IB driver version was v4.1-1.0.2 (27 Jun 2017) on both clients and servers.

Is the rdma transport code stable in GlusterFS? Do I need to upgrade the IB driver or apply a patch?

Thanks!
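As background on the setup being described: a volume's transport can be checked with the gluster CLI, and a volume can be created with both transports so clients can fall back to TCP while the RDMA path is being debugged. This is a minimal sketch; the volume name, server names, brick paths, and mount point below are illustrative, not taken from this report:

    # Show which transport(s) the volume was created with
    gluster volume info data | grep -i transport

    # A volume created with both transports lets clients fall back to TCP
    # if the RDMA path misbehaves
    gluster volume create data transport tcp,rdma \
        server1:/bricks/b1 server2:/bricks/b2

    # Mount the RDMA transport explicitly from a client
    mount -t glusterfs -o transport=rdma server1:/data /mnt/data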
Ben Turner
2017-Nov-04 19:00 UTC
[Gluster-users] glusterfs segmentation fault in rdma mode
This looks like there could be some problem requesting / leaking / whatever memory, but without looking at the core it's tough to tell for sure. Note:

/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x78)[0x7f95bc54e618]

Can you open up a Bugzilla and get us the core file to review?

-b

----- Original Message -----
> From: "???" <21291285 at qq.com>
> To: "gluster-users" <gluster-users at gluster.org>
> Sent: Saturday, November 4, 2017 5:27:50 AM
> Subject: [Gluster-users] glusterfs segmentation fault in rdma mode
>
> [...]
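Since the core file is what is being asked for here, the following is a minimal sketch of how one might capture it and pull a full backtrace to attach to the Bugzilla. The core location, core_pattern value, and debuginfo package name are assumptions that vary by distribution:

    # Allow the glusterfs client process to dump core on the next crash
    ulimit -c unlimited

    # Optionally write cores to a predictable location (path is illustrative)
    sysctl -w kernel.core_pattern=/var/crash/core.%e.%p

    # Install debug symbols so the backtrace shows function names
    yum install -y glusterfs-debuginfo

    # Open the core against the matching binary and record every thread's stack
    gdb /usr/sbin/glusterfs /var/crash/core.glusterfs.<pid>
    (gdb) thread apply all bt full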
acfreeman
2017-Nov-06 06:21 UTC
[Gluster-users] Re: glusterfs segmentation fault in rdma mode
Hi, all,

We found a strange problem. Some clients worked normally while some clients couldn't access specific files. For example, client A couldn't create the directory xxx, but client B could. If client B created the directory, client A could access it and even delete it, but client A still couldn't create that same directory afterwards. If I changed the directory name, client A worked without problems. It looked as if particular bricks were misbehaving for particular clients, yet all the bricks were online.

I saw this in the GlusterFS client log after a directory creation failed:

[2017-11-06 11:55:18.420610] W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-data-dht: no subvolume for hash (value) = 4148753024
[2017-11-06 11:55:18.457744] W [fuse-bridge.c:521:fuse_entry_cbk] 0-glusterfs-fuse: 488: MKDIR() /xxx => -1 (Input/output error)
The message "W [MSGID: 109011] [dht-layout.c:186:dht_layout_search] 0-data-dht: no subvolume for hash (value) = 4148753024" repeated 3 times between [2017-11-06 11:55:18.420610] and [2017-11-06 11:55:18.457731]

------------------ Original Message ------------------
From: "Ben Turner" <bturner at redhat.com>
Date: Sunday, November 5, 2017, 3:00 AM
To: "acfreeman" <21291285 at qq.com>
Cc: "gluster-users" <gluster-users at gluster.org>
Subject: Re: [Gluster-users] glusterfs segmentation fault in rdma mode

This looks like there could be some problem requesting / leaking / whatever memory, but without looking at the core it's tough to tell for sure. Note:

/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x78)[0x7f95bc54e618]

Can you open up a Bugzilla and get us the core file to review?

-b

> [...]
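The "no subvolume for hash" warning from 0-data-dht usually indicates a gap (an unassigned hash range) in the parent directory's DHT layout as that client sees it, which would match the symptom of one client failing MKDIR while another succeeds. A minimal sketch of how one might inspect and repair the layout, assuming the volume is named "data" (as the log prefix suggests) and using illustrative brick and mount paths:

    # On each server, dump the DHT layout xattr of the directory's parent
    # (the brick root for a top-level directory); a missing or zeroed
    # trusted.glusterfs.dht entry on one brick leaves an uncovered hash range
    getfattr -d -m . -e hex /bricks/b1

    # Recompute and rewrite complete layouts without migrating any data
    gluster volume rebalance data fix-layout start
    gluster volume rebalance data status

    # Remount the affected client so it picks up the repaired layout
    umount /mnt/data
    mount -t glusterfs -o transport=rdma server1:/data /mnt/data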