thr3ads.net - Gluster users - [Gluster-users] RDMA inline threshold? [May 2018]

Stefan Solbrig

2018-May-29 21:20 UTC

[Gluster-users] RDMA inline threshold?

Dear all,

I ran into a problem with a glusterfs volume (pure distributed, _not_ dispersed)
over RDMA transport.  One user had a directory with a large number of files
(50,000 files), and simply running "ls" in this directory yields a
"Transport endpoint not connected" error. As a result, "ls" shows only
some of the files, not all of them.

The respective log file shows this error message:

[2018-05-20 20:38:25.114978] W [MSGID: 114031] [client-rpc-fops.c:2578:client3_3_readdirp_cbk] 0-glurch-client-0: remote operation failed [Transport endpoint is not connected]
[2018-05-20 20:38:27.732796] W [MSGID: 103046] [rdma.c:4089:gf_rdma_process_recv] 0-rpc-transport/rdma: peer (10.100.245.18:49153), couldn't encode or decode the msg properly or write chunks were not provided for replies that were bigger than RDMA_INLINE_THRESHOLD (2048)
[2018-05-20 20:38:27.732844] W [MSGID: 114031] [client-rpc-fops.c:2578:client3_3_readdirp_cbk] 0-glurch-client-3: remote operation failed [Transport endpoint is not connected]
[2018-05-20 20:38:27.733181] W [fuse-bridge.c:2897:fuse_readdirp_cbk] 0-glusterfs-fuse: 72882828: READDIRP => -1 (Transport endpoint is not connected)
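
To see which peers trigger the RDMA_INLINE_THRESHOLD warning and how often, a filter along these lines could be run over the client log. This is a minimal sketch: the function name rdma_inline_peers is made up for illustration, and it assumes each log entry sits on a single line, as it does in the raw log file rather than in a wrapped e-mail.

```shell
# Print each peer address that appears in an MSGID 103046 warning,
# followed by the number of times it appears. Reads log text on stdin.
# Assumption: one log entry per line, with the peer written as "peer (ip:port)".
rdma_inline_peers() {
    grep 'MSGID: 103046' \
        | sed -n 's/.*peer (\([^)]*\)).*/\1/p' \
        | sort | uniq -c \
        | awk '{ print $2, $1 }'
}
```

It would be fed the mount log, for example `rdma_inline_peers < /var/log/glusterfs/<mount-log>.log`, where the path is an assumption based on the default log directory. If only some bricks show up, that might help narrow down which directories are affected.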

I already set the memlock limit for glusterd to unlimited, but the problem
persists.
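
One way to confirm that the memlock limit actually took effect for a running process is to read its /proc/<pid>/limits file. The helper below is a sketch with a hypothetical name (memlock_limits); it only assumes the standard Linux layout of that file.

```shell
# Print the soft and hard "Max locked memory" limits from a
# /proc/<pid>/limits listing supplied on stdin.
memlock_limits() {
    awk '/^Max locked memory/ { print $4, $5 }'
}

# Example use (commented out; needs a running gluster process):
#   for pid in $(pidof glusterd glusterfsd); do
#       memlock_limits < "/proc/$pid/limits"
#   done
```

Both values should read "unlimited" if the setting was applied. It may be worth checking the brick processes (glusterfsd) as well as glusterd, since a limit set for one does not necessarily carry over to the other.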

Only switching from RDMA transport to TCP transport solved the problem.  (I'm
now running the volume in mixed mode, config.transport=tcp,rdma.)  Mounting with
transport=rdma shows this error; mounting with transport=tcp is fine.
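
For reference, the mixed-mode setup and the two mounts could look roughly like the commands printed by this dry-run sketch. The volume name glurch is inferred from the log prefix 0-glurch-client-0; server1 and /mnt/glurch are placeholders, and the function name is hypothetical. The stop/start pair reflects the usual advice to change the transport only on a stopped volume with clients unmounted, which is an assumption about the setup here.

```shell
# Dry run: print (do not execute) the commands for switching a volume to
# mixed transport and mounting it over the chosen transport.
VOL=glurch          # inferred from the log prefix "0-glurch-client-0"
SERVER=server1      # placeholder host name
MNT=/mnt/glurch     # placeholder mount point

print_transport_cmds() {
    transport=$1    # tcp or rdma
    echo "gluster volume stop $VOL"
    echo "gluster volume set $VOL config.transport tcp,rdma"
    echo "gluster volume start $VOL"
    echo "mount -t glusterfs -o transport=$transport $SERVER:/$VOL $MNT"
}
```

Calling `print_transport_cmds tcp` and `print_transport_cmds rdma` prints the two variants, which differ only in the mount option.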

However, this problem does not arise on all large directories, only on some. I
haven't recognized a pattern yet.

I'm using glusterfs v3.12.6 on the servers, with QDR InfiniBand HCAs.

Is this a known issue with RDMA transport?

best wishes,
Stefan

Dan Lavu

2018-May-30 00:47 UTC

Stefan,

Sounds like a brick process is not running. I have noticed some strangeness
in my lab when using RDMA: I often have to forcibly restart the brick
process, and by often I mean every single time I do a major operation
(adding a new volume, removing a volume, stopping a volume, etc.).

gluster volume status <vol>

Do any of the self-heal daemons show N/A? If so, try forcing a restart of
the volume.

gluster volume start <vol> force

This would also explain why your volumes aren't being replicated properly.
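
The check for processes that are not online can be scripted. The sketch below filters the text of "gluster volume status" for entries whose Online column reads N. The function name offline_entries is hypothetical, and it assumes each entry fits on one line with the four trailing columns TCP Port, RDMA Port, Online, Pid; entries that the CLI wraps onto a second line would be missed.

```shell
# List `gluster volume status` entries whose Online column is "N".
# Reads the status text on stdin.
# Assumption: each entry is on one line and ends with the columns
# TCP Port, RDMA Port, Online, Pid.
offline_entries() {
    awk '/^(Brick|Self-heal Daemon)/ && NF > 4 && $(NF-1) == "N" {
        # Drop the four trailing columns to leave just the process name.
        n = NF - 4
        name = $1
        for (i = 2; i <= n; i++) name = name " " $i
        print name
    }'
}
```

Typical use would be `gluster volume status "$VOL" | offline_entries`, with VOL set to the volume name; any output suggests a candidate for "gluster volume start <vol> force".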

Dan Lavu

2018-May-30 01:00 UTC

I forgot to mention: sometimes I have to force-start other volumes as
well; it's hard to determine from the logs which brick process is locked up.


Status of volume: rhev_vms_primary
Gluster process                                                    TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick spidey.ib.runlevelone.lan:/gluster/brick/rhev_vms_primary    0         49157      Y       15666
Brick deadpool.ib.runlevelone.lan:/gluster/brick/rhev_vms_primary  0         49156      Y       2542
Brick groot.ib.runlevelone.lan:/gluster/brick/rhev_vms_primary     0         49156      Y       2180
Self-heal Daemon on localhost                                      N/A       N/A        N       N/A  << Brick process is not running on any node.
Self-heal Daemon on spidey.ib.runlevelone.lan                      N/A       N/A        N       N/A
Self-heal Daemon on groot.ib.runlevelone.lan                       N/A       N/A        N       N/A

Task Status of Volume rhev_vms_primary
------------------------------------------------------------------------------
There are no active volume tasks


gluster volume start rhev_vms_noshards force
gluster volume status
gluster volume start rhev_vms_primary force
gluster volume status
gluster volume start rhev_vms_primary rhev_vms
gluster volume start rhev_vms_primary rhev_vms force

Status of volume: rhev_vms_primary
Gluster process                                                    TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick spidey.ib.runlevelone.lan:/gluster/brick/rhev_vms_primary    0         49157      Y       15666
Brick deadpool.ib.runlevelone.lan:/gluster/brick/rhev_vms_primary  0         49156      Y       2542
Brick groot.ib.runlevelone.lan:/gluster/brick/rhev_vms_primary     0         49156      Y       2180
Self-heal Daemon on localhost                                      N/A       N/A        Y       8343
Self-heal Daemon on spidey.ib.runlevelone.lan                      N/A       N/A        Y       22381
Self-heal Daemon on groot.ib.runlevelone.lan                       N/A       N/A        Y       20633

Finally..

Dan



