Hu Bert
2019-Jan-24 07:17 UTC
[Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC
Good morning,

we are currently transferring some data to a new glusterfs volume; to check the throughput of the new volume/setup while the transfer is running, I decided to create some files on one of the gluster servers with dd in a loop:

    while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M count=1024; rm /shared/private/1G.file; done

/shared/private is the mount point of the glusterfs volume. The dd loop should run for about an hour, but it has now happened twice that during this loop the transport endpoint gets disconnected:

    dd: failed to open '/shared/private/1G.file': Transport endpoint is not connected
    rm: cannot remove '/shared/private/1G.file': Transport endpoint is not connected

In /var/log/glusterfs/shared-private.log I see:

    [2019-01-24 07:03:28.938745] W [MSGID: 108001] [afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0: 7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is not met [Transport endpoint is not connected]
    [2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free] (-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c) [0x7eff84248e8c] -->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18) [0x7eff84248e18] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6) [0x7eff8a9485a6] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
    [----snip----]

The whole output can be found here: https://pastebin.com/qTMmFxx0
gluster volume info here: https://pastebin.com/ENTWZ7j3

After an umount + mount the transport endpoint is connected again - until the next disconnect. A /core file gets generated. Maybe someone wants to have a look at this file?
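In case someone does pick up the core file: a minimal sketch of pulling a backtrace from it, assuming the crashing process is the stock fuse client binary /usr/sbin/glusterfs and its debug symbols are installed (binary path and core location may differ per distro):

    # hedged example -- binary path and core location are assumptions
    gdb /usr/sbin/glusterfs /core
    (gdb) bt full                 # full backtrace of the crashing thread
    (gdb) thread apply all bt     # backtraces of all threads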
Amar Tumballi Suryanarayan
2019-Jan-24 07:29 UTC
[Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC
On Thu, Jan 24, 2019 at 12:47 PM Hu Bert <revirii at googlemail.com> wrote:

> [original report snipped]

Hi Hu Bert,

Thanks for these logs and the report. 'Transport endpoint is not connected' on a mount occurs for two reasons:

1. When the brick holding the file (in the case of a replica, all of its bricks) is unreachable or down. This returns to normal once the bricks are restarted.

2. When the client process crashes/asserts. In this case /dev/fuse is no longer connected to a process, but the mount still holds a reference. An 'umount' and a fresh mount are needed to recover.

We will look into this issue and get back to you.

Regards,
Amar

--
Amar Tumballi (amarts)
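For reference, a rough sketch of telling these two cases apart; the mount path is taken from the report above, and the process-name pattern is an assumption about how the fuse client shows up in the process list:

    # case 1: check whether all bricks are up (run on a gluster server)
    gluster volume status

    # case 2: check whether the fuse client process for the mount is
    # still alive; if the mount errors out but no matching glusterfs
    # process exists, the client process has crashed
    ps ax | grep '[g]lusterfs.*shared/private'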
Nithya Balachandran
2019-Feb-06 08:25 UTC
[Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC
Hi,

The client log indicates that the mount process has crashed. Please try mounting the volume with the option lru-limit=0 and see if it still crashes.

Thanks,
Nithya

On Thu, 24 Jan 2019 at 12:47, Hu Bert <revirii at googlemail.com> wrote:

> [original report snipped]
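For reference, a minimal sketch of remounting with that option; <server> and <volname> are placeholders for the actual gluster server and volume name, and lru-limit is passed as a fuse mount option here:

    # hedged example -- <server> and <volname> are placeholders;
    # lru-limit=0 is meant to disable the fuse inode lru limit
    umount /shared/private
    mount -t glusterfs -o lru-limit=0 <server>:/<volname> /shared/private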