Dear Gluster Community,

I have this setup: a 4-node GlusterFS v5.5 cluster, using Samba/CTDB v4.8 to access the volumes (each node has a VIP).

I was testing this failover scenario:

1. Start writing 940 GB of small files (64K-100K) from a Win10 client to node1.
2. During the write process, hard-shutdown node1 (the node the client is connected to via its VIP) by cutting the power.

My expectation was that the write process stops and after a while the Win10 client offers me a Retry, so I can continue the write on a different node (which by then has taken over the VIP of node1). I observed exactly that in the past, but now the system shows a strange behaviour:

The Win10 client does nothing and Explorer freezes; in the backend, CTDB cannot perform the failover and throws errors. glusterd on node2 and node3 logs these messages:

[2019-04-16 14:47:31.828323] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive1 not held
[2019-04-16 14:47:31.828350] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive1
[2019-04-16 14:47:31.828369] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive2 not held
[2019-04-16 14:47:31.828376] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive2
[2019-04-16 14:47:31.828412] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol gluster_shared_storage not held
[2019-04-16 14:47:31.828423] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for gluster_shared_storage

*In my opinion, Samba/CTDB cannot perform the failover correctly and continue the write process because glusterfs didn't release the lock.*

What do you think? It seems like a bug to me, because in the past the failover worked correctly.

Regards
David Spisla
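P.S. For completeness, here is roughly how the VIPs and shares are set up. The interface name, addresses and share options below are placeholders, not our real values:

# /etc/ctdb/public_addresses -- one floating address per node;
# on failover, CTDB moves the dead node's address to a surviving node
192.168.10.101/24 eth0
192.168.10.102/24 eth0
192.168.10.103/24 eth0
192.168.10.104/24 eth0

# smb.conf share definition using the glusterfs VFS module
[archive1]
    path = /
    vfs objects = glusterfs
    glusterfs:volume = archive1
    glusterfs:logfile = /var/log/samba/glusterfs-archive1.%I.log
    read only = no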
On Wed, Apr 17, 2019 at 5:02 PM, David Spisla <spisla80 at gmail.com> wrote:
> Dear Gluster Community,
>
> I have this setup: a 4-node GlusterFS v5.5 cluster, using Samba/CTDB v4.8
> to access the volumes (each node has a VIP).
> [...]

Hi,

I have some questions about your testing:

1. Which glusterfs version did you use in the past, when the failover still worked?
2. What is your volume configuration?
3. Did the CTDB VIP fail over correctly? If so, could you attach /var/log/samba/glusterfs-volname.win10.ip.log?

Best Regards

- kpkim
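P.S. For the third point: the VIP takeover can be verified independently of the Windows client with CTDB's own tools. On one of the surviving nodes, something like this (standard ctdb commands; output omitted):

# which node currently serves which public address
ctdb ip
# overall node states; the powered-off node should show up as DISCONNECTED
ctdb status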
Amar Tumballi Suryanarayan
2019-May-01 12:46 UTC
[Gluster-users] Hard Failover with Samba and Glusterfs
On Wed, Apr 17, 2019 at 1:33 PM David Spisla <spisla80 at gmail.com> wrote:
> Dear Gluster Community,
>
> I have this setup: a 4-node GlusterFS v5.5 cluster, using Samba/CTDB v4.8
> to access the volumes (each node has a VIP).
> [...]
> *In my opinion, Samba/CTDB cannot perform the failover correctly and
> continue the write process because glusterfs didn't release the lock.*
> What do you think? It seems like a bug to me, because in the past the
> failover worked correctly.

Thanks for the report, David. It surely looks like a bug, and I would let some experts in this domain answer the question. One request for such cases: please file a bug (preferred) or a GitHub issue, so that it is tracked in the system.

--
Amar Tumballi (amarts)
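P.S. One workaround you could try in the meantime (a sketch, not a verified fix): the mgmt_v3 locks are held in glusterd's memory, so restarting glusterd on the peers that log "Lock not released" normally clears the stale lock. A glusterd restart only touches the management daemon; brick processes and client I/O keep running.

# on the nodes that log the stale-lock warnings, here node2 and node3
systemctl restart glusterd
# then check that volume management operations respond again
gluster volume status archive1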
On Thu, Apr 18, 2019 at 9:21 AM, hgichon <hgichon at gmail.com> wrote:
> Hi,
>
> I have some questions about your testing:
>
> 1. Which glusterfs version did you use in the past, when the failover still worked?
> 2. What is your volume configuration?
> 3. Did the CTDB VIP fail over correctly? If so, could you attach
> /var/log/samba/glusterfs-volname.win10.ip.log?
> [...]

All answers to these questions are in this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1706842