Geoffrey Letessier
2015-Jul-21 21:20 UTC
[Gluster-users] Change transport-type on volume from tcp to rdma, tcp
Hello Soumya, Hello everybody,
network.ping-timeout was set to 42 seconds. I set it to 0 but it made no
difference. The problem was that, after having re-set the transport-type to
rdma,tcp, some bricks went down after a few minutes. Despite restarting the
volumes, after a few minutes some other/different bricks went down again.
Now, after re-creating my volume, the bricks stay alive but, oddly, I'm not
able to write on my volume. In addition, I defined a distributed volume with 2
servers and 4 bricks of 250GB each, yet my final volume appears to be only
500GB in size. It's surprising..
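For reference, the supported way to switch an existing volume's transport is
the config.transport option (a sketch, not the exact commands used here; the
volume must be stopped first and all clients remounted afterwards):

```shell
# Stop the volume before changing its transport type
gluster volume stop vol_workdir_amd

# config.transport accepts tcp, rdma, or tcp,rdma
gluster volume set vol_workdir_amd config.transport rdma,tcp

# Restart the volume; clients must remount to pick up the change
gluster volume start vol_workdir_amd
```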
Here you can find some information:
# gluster volume status vol_workdir_amd
Status of volume: vol_workdir_amd
Gluster process                                      TCP Port  RDMA Port  Online  Pid
-------------------------------------------------------------------------------------
Brick ib-storage1:/export/brick_workdir/brick1/data     49185      49186       Y  23098
Brick ib-storage3:/export/brick_workdir/brick1/data     49158      49159       Y  3886
Brick ib-storage1:/export/brick_workdir/brick2/data     49187      49188       Y  23117
Brick ib-storage3:/export/brick_workdir/brick2/data     49160      49161       Y  3905
# gluster volume info vol_workdir_amd
Volume Name: vol_workdir_amd
Type: Distribute
Volume ID: 087d26ea-c6df-4cbe-94af-ecd87b59aedb
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: ib-storage1:/export/brick_workdir/brick1/data
Brick2: ib-storage3:/export/brick_workdir/brick1/data
Brick3: ib-storage1:/export/brick_workdir/brick2/data
Brick4: ib-storage3:/export/brick_workdir/brick2/data
Options Reconfigured:
performance.readdir-ahead: on
# pdsh -w storage[1,3] df -h /export/brick_workdir/brick{1,2}
storage3: Filesystem                            Size  Used  Avail  Use%  Mounted on
storage3: /dev/mapper/st--block1-blk1--workdir  250G   34M   250G    1%  /export/brick_workdir/brick1
storage3: /dev/mapper/st--block2-blk2--workdir  250G   34M   250G    1%  /export/brick_workdir/brick2
storage1: Filesystem                            Size  Used  Avail  Use%  Mounted on
storage1: /dev/mapper/st--block1-blk1--workdir  250G   33M   250G    1%  /export/brick_workdir/brick1
storage1: /dev/mapper/st--block2-blk2--workdir  250G   33M   250G    1%  /export/brick_workdir/brick2
# df -h /workdir/
Filesystem                      Size  Used  Avail  Use%  Mounted on
localhost:vol_workdir_amd.rdma  500G   67M   500G    1%  /workdir
# touch /workdir/test
touch: impossible de faire un touch « /workdir/test »: Aucun fichier ou dossier de ce type
(in English: "cannot touch '/workdir/test': No such file or directory")
# tail -30l /var/log/glusterfs/workdir.log
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:33.927673] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:37.877231] I [rpc-clnt.c:1819:rpc_clnt_reconfig]
0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:37.880556] I [rpc-clnt.c:1819:rpc_clnt_reconfig]
0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:37.914661] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:37.923535] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:41.883925] I [rpc-clnt.c:1819:rpc_clnt_reconfig]
0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:41.887085] I [rpc-clnt.c:1819:rpc_clnt_reconfig]
0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:41.919394] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:41.932622] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:44.682636] W [dht-layout.c:189:dht_layout_search]
0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.682947] W [dht-layout.c:189:dht_layout_search]
0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683240] W [dht-layout.c:189:dht_layout_search]
0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683472] W [dht-diskusage.c:48:dht_du_info_cbk]
0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-0
[2015-07-21 21:10:44.683506] W [dht-diskusage.c:48:dht_du_info_cbk]
0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-2
[2015-07-21 21:10:44.683532] W [dht-layout.c:189:dht_layout_search]
0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683551] W [fuse-bridge.c:1970:fuse_create_cbk]
0-glusterfs-fuse: 18: /test => -1 (Aucun fichier ou dossier de ce type)
[2015-07-21 21:10:44.683619] W [dht-layout.c:189:dht_layout_search]
0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683846] W [dht-layout.c:189:dht_layout_search]
0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:45.886807] I [rpc-clnt.c:1819:rpc_clnt_reconfig]
0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:45.893059] I [rpc-clnt.c:1819:rpc_clnt_reconfig]
0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:45.920434] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:45.925292] W [rdma.c:1263:gf_rdma_cm_event_handler]
0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8
(me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
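Given the repeated RDMA_CM_EVENT_REJECTED events and the "Host Unreachable,
Check your connection with IPoIB" hints, basic IPoIB reachability is worth
verifying before digging into Gluster itself. A rough checklist (the `ib0`
interface name and the peer address `10.0.4.3` for the other storage node
are assumptions; only `10.0.4.1` appears in the log):

```shell
# Verify the IPoIB interface is up and carries the expected address
ip addr show ib0

# Check the InfiniBand port state; it should report "Active"
ibstat

# Confirm plain IP connectivity over IPoIB to the other storage node
ping -c 3 10.0.4.3
```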
I have been using GlusterFS in production for around 3 years without any
blocking problem, but the situation has been dire for more than 3 weeks.
Indeed, our production has been down for roughly 3.5 weeks (with many
different problems with GlusterFS v3.5.3 and now with 3.7.2-3) and I need to
restart it.
Thanks in advance,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
On 21 Jul 2015, at 19:36, Soumya Koduri <skoduri at redhat.com> wrote:
> From the following errors,
>
> [2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify]
> 0-vol_shared-client-0: parent translators are ready, attempting connect on
> transport
> [2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive] 0-socket:
> failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocole non disponible
> [2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect]
> 0-vol_shared-client-0: Failed to set keep-alive: Protocole non disponible
>
> it looks like setting the TCP_USER_TIMEOUT value to 0 on the socket failed
> with the error (IIUC) "Protocol not available".
> Could you check whether 'network.ping-timeout' is set to zero for that
> volume using 'gluster volume info'? Anyway, from the code it looks like
> 'TCP_USER_TIMEOUT' can take the value zero; not sure why it failed.
>
> Niels, any thoughts?
>
> Thanks,
> Soumya
>
> On 07/21/2015 08:15 PM, Geoffrey Letessier wrote:
>> [2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify]
>> 0-vol_shared-client-0: parent translators are ready, attempting connect
>> on transport
>> [2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive]
>> 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocole non
>> disponible
>> [2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect]
>> 0-vol_shared-client-0: Failed to set keep-alive: Protocole non disponible
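Soumya's check above can be run along these lines (a sketch; `gluster volume
get` is available from GlusterFS 3.7 onwards, which matches the 3.7.2-3
installation mentioned in this thread):

```shell
# Show the current value; the default ping-timeout is 42 seconds
gluster volume get vol_workdir_amd network.ping-timeout

# Restore the default instead of leaving the option at 0
gluster volume reset vol_workdir_amd network.ping-timeout
```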
Niels de Vos
2015-Jul-21 21:49 UTC
[Gluster-users] Change transport-type on volume from tcp to rdma, tcp
On Tue, Jul 21, 2015 at 11:20:20PM +0200, Geoffrey Letessier wrote:
> Hello Soumya, Hello everybody,
>
> network.ping-timeout was set to 42 seconds. I set it to 0 but no
> difference. The problem was, after having re-set the transport-type to
> rdma,tcp, some bricks went down after a few minutes. Despite restarting
> the volumes, after a few minutes, some other/different bricks went down
> again.

I'm not sure whether the ping-timeout is handled differently when RDMA is
used. Adding two of the guys that know RDMA well on CC.

> Now, after re-creation of my volume, the bricks stay alive but, oddly,
> I'm not able to write on my volume. In addition, I defined a distributed
> volume with 2 servers and 4 bricks of 250GB each, yet my final volume
> seems to be sized at only 500GB.

As seen further below, the 500GB volume size is caused by two unreachable
bricks. When the bricks are not reachable, the size of those bricks can not
be detected by the client, and therefore 2x 250GB is missing.

It is unclear to me why writing to a pure distributed volume fails. When a
brick is not reachable and the file should be created there, it would
normally get created on another brick. When the brick that should have the
file comes online, and a new lookup for the file is done, a so-called "link
file" is created, which points to the file on the other brick. I guess the
failure has to do with the connection issues, and I would suggest getting
those solved first.
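The "no subvolume for hash" warnings in the client log follow from DHT's
layout mechanics described above. A minimal sketch of the idea, reproducing
the hash value from the log (the even 4-way split of the 32-bit hash space
is an assumption for illustration; Gluster computes real layouts
per-directory):

```shell
# Each brick owns a slice of the 32-bit hash space; a brick that is
# unreachable contributes no slice, so hashes in its range map to nothing.
hash=1072520554                        # value from the log above
range_per_brick=$(( 0xFFFFFFFF / 4 ))  # 4 bricks, equal slices (assumed)
owner=$(( hash / range_per_brick ))    # 0..3: which brick owns this hash

# In the log, client-0 and client-2 are rejected, so only 1 and 3 are up
up_bricks="1 3"

if echo "$up_bricks" | grep -qw "$owner"; then
    echo "subvolume client-$owner holds hash $hash"
else
    echo "no subvolume for hash (value) = $hash"
fi
```

Here the hash falls into brick 0's slice, and since client-0 is one of the
unreachable bricks, the lookup fails exactly as the log shows.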
HTH,
Niels