Geoffrey Letessier
2014-Oct-07 22:45 UTC
[Gluster-users] RDMA connectivity not available with GlusterFS 3.5.2
Dears,

I have an HPC cluster composed of 4 storage nodes (8x 24TB RAID6 bricks, 2 per node) and 62 compute nodes, interconnected via InfiniBand QDR technology.

NB: each brick provides around 1.2-1.5TBs write performance.

My main volume is defined as below:

Volume Name: vol_home
Type: Distributed-Replicate
Volume ID: f6ebcfc1-b735-4a0e-b1d7-47ed2d2e7af6
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp,rdma
Bricks:
Brick1: ib-storage1:/export/brick_home/brick1
Brick2: ib-storage2:/export/brick_home/brick1
Brick3: ib-storage3:/export/brick_home/brick1
Brick4: ib-storage4:/export/brick_home/brick1
Brick5: ib-storage1:/export/brick_home/brick2
Brick6: ib-storage2:/export/brick_home/brick2
Brick7: ib-storage3:/export/brick_home/brick2
Brick8: ib-storage4:/export/brick_home/brick2
Options Reconfigured:
features.quota: on
diagnostics.brick-log-level: CRITICAL
auth.allow: localhost,127.0.0.1,10.*
nfs.disable: on
performance.cache-size: 64MB
performance.write-behind-window-size: 1MB
performance.quick-read: on
performance.io-cache: on
performance.io-thread-count: 64
features.default-soft-limit: 90%

But, on the cluster, when I try to mount my volume specifying the RDMA transport type, I notice that all my communication goes through the TCP stack (all network packets are visible on the ib0 network interface with the ifstat shell command), not through RDMA:

[root@lucifer ~]# mount -t glusterfs -o transport=rdma,direct-io-mode=disable localhost:vol_home /home
[root@lucifer ~]# mount|grep vol_home.rdma
localhost:vol_home.rdma on /home type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
[root@lucifer ~]# ifstat -i ib0
       ib0
 KB/s in  KB/s out
25313.60   6776.44
26258.96   9064.92
28272.97  10034.15
23495.09   8504.84
21842.41   7161.69
^C

So, my best observed throughput is around 400MB/s, and typically only around 200-250MB/s, although from what I can read on the net I should be able to achieve around 800-900MB/s -sometimes more- with the RDMA transport type.

Can anyone help me to make it work?

In addition, do my volume settings look optimal?

Thanks in advance,
Geoffrey
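A quick way to double-check which transport the FUSE client actually negotiated is to look at the client volfile it fetched and at the IPoIB traffic while doing I/O. The sketch below assumes a stock RPM layout (/var/log/glusterfs, /var/lib/glusterd) and the volume and mount-point names above; adjust the paths for your installation.

# On the client: the mount log (named after the mount point, /home -> home.log)
# should record the fetched volfile, including its transport options.
grep -i "transport" /var/log/glusterfs/home.log | tail

# On a storage node: for a tcp,rdma volume, glusterd keeps separate client
# volfiles per transport; check which one declares transport-type rdma.
grep "transport-type" /var/lib/glusterd/vols/vol_home/*fuse*.vol

# While writing to the mount: if the rdma transport is really in use, IPoIB
# traffic on ib0 should stay close to zero, because RDMA bypasses the ib0 netdev.
ifstat -i ib0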
Mohammed Rafi K C
2014-Oct-08 04:58 UTC
[Gluster-users] RDMA connectivity not available with GlusterFS 3.5.2
On 10/08/2014 04:15 AM, Geoffrey Letessier wrote:
> Dears,
>
> I have an HPC cluster composed of 4 storage nodes (8x 24TB RAID6 bricks, 2 per node) and 62 compute nodes, interconnected via InfiniBand QDR technology.
>
> [...]
>
> But, on the cluster, when I try to mount my volume specifying the RDMA transport type, I notice that all my communication goes through the TCP stack (all network packets are visible on the ib0 network interface with the ifstat shell command), not through RDMA.
>
> [...]
>
> So, my best observed throughput is around 400MB/s, and typically only around 200-250MB/s, although from what I can read on the net I should be able to achieve around 800-900MB/s -sometimes more- with the RDMA transport type.
>
> Can anyone help me to make it work?

There is a known issue in rdma where a volume with transport-type tcp,rdma will mount as tcp; the fix for this is under review. You can pull the patch with:

git fetch https://review.gluster.org/glusterfs refs/changes/98/8498/7 && git format-patch -1 FETCH_HEAD

Applying the patch will let the tcp,rdma volume mount as rdma.

If you are mounting a tcp,rdma volume as an RDMA fuse mount, you can also append .rdma to the volume name instead of using the -o option.

Let me know your result. If possible, I would also like to know the version of gluster you are currently using.

Rafi KC

> In addition, do my volume settings look optimal?
>
> Thanks in advance,
> Geoffrey
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
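As a concrete illustration of the .rdma suffix mentioned above, a minimal sketch using the volume and mount-point names from this thread (unmount the existing mount first; adjust the names for your setup):

umount /home
mount -t glusterfs -o direct-io-mode=disable localhost:vol_home.rdma /home

# Check what was mounted:
mount | grep vol_home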