Hello,

I'm having some major problems with Gluster and oVirt, and I've been pulling my hair out over this, so if anybody can provide insight, that would be fantastic. I've tried both transports, TCP and RDMA, and both are proving unstable.

The first thing I'm running into, intermittently and on one specific node, is that it gets spammed with the following message:

"[2016-08-08 00:42:50.837992] E [rpc-clnt.c:357:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fb728b0f293] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7fb7288d73d1] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb7288d74ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fb7288d8d0e] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fb7288d9528] ))))) 0-vmdata1-client-0: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-08 00:42:43.620710 (xid=0x6800b)"

Then the InfiniBand device gets bounced and VMs get stuck.

Another problem I'm seeing: once a day, or every two days, an oVirt node will hang on the Gluster mounts. Issuing a df to check the mounts just stalls; this occurs hourly if RDMA is used. Most of the time I can log into the hypervisor and remount the Gluster volumes.

This is on Fedora 23 with Gluster 3.8.1-1. The InfiniBand gear is 40Gb/s QDR QLogic, using the ib_qib module; this configuration was working with our old InfiniHost III. I couldn't get OFED to compile, so all the InfiniBand modules are the ones Fedora ships.

A volume looks like the following (please tell me if there is anything I need to adjust; the settings were pulled from several examples):

Volume Name: vmdata_ha
Type: Replicate
Volume ID: 325a5fda-a491-4c40-8502-f89776a3c642
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp,rdma
Bricks:
Brick1: deadpool.ib.runlevelone.lan:/gluster/vmdata_ha
Brick2: spidey.ib.runlevelone.lan:/gluster/vmdata_ha
Brick3: groot.ib.runlevelone.lan:/gluster/vmdata_ha (arbiter)
Options Reconfigured:
performance.least-prio-threads: 4
performance.low-prio-threads: 16
performance.normal-prio-threads: 24
performance.high-prio-threads: 24
cluster.self-heal-window-size: 32
cluster.self-heal-daemon: on
performance.md-cache-timeout: 1
performance.cache-max-file-size: 2MB
performance.io-thread-count: 32
network.ping-timeout: 5
performance.write-behind-window-size: 4MB
performance.cache-size: 256MB
performance.cache-refresh-timeout: 10
server.allow-insecure: on
network.remote-dio: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.owner-gid: 36
storage.owner-uid: 36
performance.readdir-ahead: on
nfs.disable: on
config.transport: tcp,rdma
performance.stat-prefetch: off
cluster.eager-lock: enable

Volume Name: vmdata1
Type: Distribute
Volume ID: 3afefcb3-887c-4315-b9dc-f4e890f786eb
Status: Started
Number of Bricks: 2
Transport-type: tcp,rdma
Bricks:
Brick1: spidey.ib.runlevelone.lan:/gluster/vmdata1
Brick2: deadpool.ib.runlevelone.lan:/gluster/vmdata1
Options Reconfigured:
config.transport: tcp,rdma
network.remote-dio: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
nfs.disable: on
storage.owner-gid: 36
storage.owner-uid: 36
performance.readdir-ahead: on
server.allow-insecure: on
performance.stat-prefetch: off
performance.cache-refresh-timeout: 10
performance.cache-size: 256MB
performance.write-behind-window-size: 4MB
network.ping-timeout: 5
performance.io-thread-count: 32
performance.cache-max-file-size: 2MB
performance.md-cache-timeout: 1
performance.high-prio-threads: 24
performance.normal-prio-threads: 24
performance.low-prio-threads: 16
performance.least-prio-threads: 4
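Most of those options were cherry-picked from blog posts rather than taken from the stock "virt" option group, so they may well be part of the problem. If that group is the recommended baseline for VM images, I assume I could apply it and then diff against what I have, roughly like this (vmdata_ha is my volume, and I haven't verified that the groups file actually ships in the Fedora packages):

  # apply the predefined option group for VM workloads
  # (should live in /var/lib/glusterd/groups/virt)
  gluster volume set vmdata_ha group virt

  # then see what actually took effect
  gluster volume get vmdata_ha all | grep -E 'ping-timeout|remote-dio|eager-lock|stat-prefetch'

Is that profile sane for a replica 3 arbiter volume backing oVirt, or should I keep the hand-tuned thread/cache settings above?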
/etc/glusterfs/glusterd.vol:

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,tcp
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option ping-timeout 0
    option event-threads 1
    # option rpc-auth-allow-insecure on
    option transport.socket.bind-address 0.0.0.0
    # option transport.address-family inet6
    # option base-port 49152
end-volume

I think that's a good start; thank you so much for taking the time to look at this. You can find me on freenode, nick side_control, if you want to chat. I'm GMT-5.

Cheers,

Dan
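P.S. When a node hangs, I've been poking at the mounts by hand from the hypervisor roughly like this; /mnt/scratch is just a throwaway directory of mine, and I'm going off the upstream docs for the RDMA mount syntax, so tell me if I've got that part wrong:

  # check the IB port state first (infiniband-diags / libibverbs-utils)
  ibstat
  ibv_devinfo | grep -E 'state|active_mtu'

  # manual FUSE mount over TCP
  mount -t glusterfs deadpool.ib.runlevelone.lan:/vmdata_ha /mnt/scratch

  # same volume over RDMA; the docs show either of these forms for a tcp,rdma volume
  mount -t glusterfs -o transport=rdma deadpool.ib.runlevelone.lan:/vmdata_ha /mnt/scratch
  mount -t glusterfs deadpool.ib.runlevelone.lan:/vmdata_ha.rdma /mnt/scratch

oVirt/VDSM does its own mounting of the storage domains, so this is only what I use for testing when things look stuck.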
Pranith Kumar Karampuri
2016-Aug-11 06:53 UTC
[Gluster-users] Gluster Infiniband/RDMA Help
Added Rafi, Raghavendra who work on RDMA.

On Mon, Aug 8, 2016 at 7:58 AM, Dan Lavu <dan at redhat.com> wrote:

> Hello,
>
> I'm having some major problems with Gluster and oVirt, and I've been
> pulling my hair out over this, so if anybody can provide insight, that
> would be fantastic. I've tried both transports, TCP and RDMA, and both
> are proving unstable.
>
> The first thing I'm running into, intermittently and on one specific
> node, is that it gets spammed with the following message:
>
> "[2016-08-08 00:42:50.837992] E [rpc-clnt.c:357:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fb728b0f293] (-->
> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7fb7288d73d1] (-->
> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb7288d74ee] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fb7288d8d0e]
> (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fb7288d9528] )))))
> 0-vmdata1-client-0: forced unwinding frame type(GlusterFS 3.3)
> op(WRITE(13)) called at 2016-08-08 00:42:43.620710 (xid=0x6800b)"
>
> Then the InfiniBand device gets bounced and VMs get stuck.
>
> Another problem I'm seeing: once a day, or every two days, an oVirt node
> will hang on the Gluster mounts. Issuing a df to check the mounts just
> stalls; this occurs hourly if RDMA is used. Most of the time I can log
> into the hypervisor and remount the Gluster volumes.
>
> This is on Fedora 23 with Gluster 3.8.1-1. The InfiniBand gear is 40Gb/s
> QDR QLogic, using the ib_qib module; this configuration was working with
> our old InfiniHost III. I couldn't get OFED to compile, so all the
> InfiniBand modules are the ones Fedora ships.
>
> A volume looks like the following (please tell me if there is anything I
> need to adjust; the settings were pulled from several examples):
>
> Volume Name: vmdata_ha
> Type: Replicate
> Volume ID: 325a5fda-a491-4c40-8502-f89776a3c642
> Status: Started
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp,rdma
> Bricks:
> Brick1: deadpool.ib.runlevelone.lan:/gluster/vmdata_ha
> Brick2: spidey.ib.runlevelone.lan:/gluster/vmdata_ha
> Brick3: groot.ib.runlevelone.lan:/gluster/vmdata_ha (arbiter)
> Options Reconfigured:
> performance.least-prio-threads: 4
> performance.low-prio-threads: 16
> performance.normal-prio-threads: 24
> performance.high-prio-threads: 24
> cluster.self-heal-window-size: 32
> cluster.self-heal-daemon: on
> performance.md-cache-timeout: 1
> performance.cache-max-file-size: 2MB
> performance.io-thread-count: 32
> network.ping-timeout: 5
> performance.write-behind-window-size: 4MB
> performance.cache-size: 256MB
> performance.cache-refresh-timeout: 10
> server.allow-insecure: on
> network.remote-dio: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
> nfs.disable: on
> config.transport: tcp,rdma
> performance.stat-prefetch: off
> cluster.eager-lock: enable
>
> Volume Name: vmdata1
> Type: Distribute
> Volume ID: 3afefcb3-887c-4315-b9dc-f4e890f786eb
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp,rdma
> Bricks:
> Brick1: spidey.ib.runlevelone.lan:/gluster/vmdata1
> Brick2: deadpool.ib.runlevelone.lan:/gluster/vmdata1
> Options Reconfigured:
> config.transport: tcp,rdma
> network.remote-dio: enable
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> nfs.disable: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
> server.allow-insecure: on
> performance.stat-prefetch: off
> performance.cache-refresh-timeout: 10
> performance.cache-size: 256MB
> performance.write-behind-window-size: 4MB
> network.ping-timeout: 5
> performance.io-thread-count: 32
> performance.cache-max-file-size: 2MB
> performance.md-cache-timeout: 1
> performance.high-prio-threads: 24
> performance.normal-prio-threads: 24
> performance.low-prio-threads: 16
> performance.least-prio-threads: 4
>
>
> /etc/glusterfs/glusterd.vol:
>
> volume management
>     type mgmt/glusterd
>     option working-directory /var/lib/glusterd
>     option transport-type socket,tcp
>     option transport.socket.keepalive-time 10
>     option transport.socket.keepalive-interval 2
>     option transport.socket.read-fail-log off
>     option ping-timeout 0
>     option event-threads 1
>     # option rpc-auth-allow-insecure on
>     option transport.socket.bind-address 0.0.0.0
>     # option transport.address-family inet6
>     # option base-port 49152
> end-volume
>
> I think that's a good start; thank you so much for taking the time to
> look at this. You can find me on freenode, nick side_control, if you
> want to chat. I'm GMT-5.
>
> Cheers,
>
> Dan
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Pranith