Ewen Chan
2019-Aug-21 02:09 UTC
[Gluster-users] Does the GlusterFS FUSE client communicate only via TCP or how does it work?
To Whom It May Concern:

My four compute nodes are all dual Xeon nodes with 128 GB of RAM each and a Mellanox ConnectX-4 100 Gbps 4X EDR InfiniBand NIC each, connected to a Mellanox 4X EDR IB switch.

The GlusterFS volume was created using this command:

# gluster volume create gv0 transport=rdma node1:/bricks/brick1/gv0 node2:/bricks/brick1/gv0 node3:/bricks/brick1/gv0 node4:/bricks/brick1/gv0
volume create: gv0: success: please start the volume to access data

[root@node1 ewen]# gluster volume start gv0
volume start: gv0: success

[root@node1 ewen]# gluster volume info

Volume Name: gv0
Type: Distribute
Volume ID: 105dc537-b5bc-4561-b460-fc022f1f8033
Status: Started
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: node1:/bricks/brick1/gv0
Brick2: node2:/bricks/brick1/gv0
Brick3: node3:/bricks/brick1/gv0
Brick4: node4:/bricks/brick1/gv0
Options Reconfigured:
nfs.disable: on

[root@node1 ewen]# gluster volume status
Status of volume: gv0
Gluster process                      TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/bricks/brick1/gv0       49152     49153      Y       16554
Brick node2:/bricks/brick1/gv0       49152     49153      Y       16116
Brick node3:/bricks/brick1/gv0       49152     49153      Y       16319
Brick node4:/bricks/brick1/gv0       49152     49153      Y       16403

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

I was trying to mount the GlusterFS volume using this command:

# mount -t glusterfs -o transport=rdma node1:/gv0 /mnt/gv0

and it failed to mount.
The error messages in the log (/var/log/glusterfs/gv0.log) were:

[2019-08-21 00:27:35.796350] W [MSGID: 103071] [rdma.c:1277:gf_rdma_cm_event_handler] 0-gv0-client-0: cma event RDMA_CM_EVENT_REJECTED, error 28 (me:10.0.1.1:49151 peer:10.0.1.1:24008)
[2019-08-21 00:27:35.848728] W [MSGID: 103071] [rdma.c:1277:gf_rdma_cm_event_handler] 0-gv0-client-2: cma event RDMA_CM_EVENT_REJECTED, error 28 (me:10.0.1.1:49149 peer:10.0.1.3:24008)
[2019-08-21 00:27:35.861258] W [MSGID: 103071] [rdma.c:1277:gf_rdma_cm_event_handler] 0-gv0-client-3: cma event RDMA_CM_EVENT_REJECTED, error 28 (me:10.0.1.1:49148 peer:10.0.1.4:24008)
[2019-08-21 00:27:35.883548] W [MSGID: 103071] [rdma.c:1277:gf_rdma_cm_event_handler] 0-gv0-client-1: cma event RDMA_CM_EVENT_REJECTED, error 28 (me:10.0.1.1:49150 peer:10.0.1.2:24008)
[2019-08-21 00:27:35.887479] I [fuse-bridge.c:5142:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2019-08-21 00:27:35.887519] I [fuse-bridge.c:5753:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-08-21 00:27:35.888059] I [dict.c:560:dict_get] (-->/usr/lib64/glusterfs/6.5/xlator/protocol/client.so(+0xa940) [0x7fe7acf2a940] -->/usr/lib64/glusterfs/6.5/xlator/cluster/distribute.so(+0x45348) [0x7fe7accac348] -->/lib64/libglusterfs.so.0(dict_get+0x94) [0x7fe7bbade1b4] ) 0-dict: !this || key=trusted.glusterfs.dht.mds [Invalid argument]
[2019-08-21 00:27:35.888167] E [fuse-bridge.c:5211:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-08-21 00:27:35.888387] I [dict.c:560:dict_get] (-->/usr/lib64/glusterfs/6.5/xlator/protocol/client.so(+0xa940) [0x7fe7acf2a940] -->/usr/lib64/glusterfs/6.5/xlator/cluster/distribute.so(+0x45348) [0x7fe7accac348] -->/lib64/libglusterfs.so.0(dict_get+0x94) [0x7fe7bbade1b4] ) 0-dict: !this || key=trusted.glusterfs.dht.mds [Invalid argument]

I looked up the RDMA error and disabled SELinux entirely:

# cat /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=minimum

It still wouldn't mount.

So I deleted the GlusterFS volume and re-created it using the same bricks, but added the TCP protocol to it:

# gluster volume create gv0 transport=tcp,rdma node1:/bricks/brick1/gv0 node2:/bricks/brick1/gv0 node3:/bricks/brick1/gv0 node4:/bricks/brick1/gv0
volume create: gv0: success: please start the volume to access data

[root@node1 ewen]# gluster volume start gv0
volume start: gv0: success

and then mounted the GlusterFS volume:

# mount -t glusterfs -o transport=rdma node1:/gv0 /mnt/gv0

and that worked.

Does the GlusterFS FUSE client communicate only via TCP?

I would greatly appreciate any help in understanding why the volume mounts when both the TCP and RDMA protocols are enabled, but does not mount when I specify only the RDMA protocol.

Thank you.

Sincerely,
Ewen
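
For anyone comparing against their own client log, the rejected RDMA connection attempts quoted in the log excerpt can be tallied per peer with a small shell helper. This is only a sketch: the function name rdma_rejected_peers is hypothetical, and it assumes log lines in the same format as the excerpt (an RDMA_CM_EVENT_REJECTED event followed by a "(me:... peer:...)" pair).

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS): reads a client log on stdin
# and prints "count peer-address:port" for every rejected RDMA CM event.
# Assumes the "(me:ADDR:PORT peer:ADDR:PORT)" suffix seen in the excerpt.
rdma_rejected_peers() {
    grep 'RDMA_CM_EVENT_REJECTED' |
        sed -n 's/.*peer:\([^)]*\)).*/\1/p' |   # keep only the peer ADDR:PORT
        sort | uniq -c |                          # count attempts per peer
        awk '{print $1, $2}'                      # normalize uniq's padding
}
```

It could be run as `rdma_rejected_peers < /var/log/glusterfs/gv0.log`. On the excerpt in this message it would list each of the four peers once, all on port 24008.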