Gandalf Corvotempesta
2017-Sep-08 11:53 UTC
[Gluster-users] GlusterFS as virtual machine storage
2017-09-08 13:44 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
> I did not test SIGKILL because I suppose that if a graceful exit is bad,
> SIGKILL will be as well. This assumption might be wrong, so I will test it.
> It would be interesting to see the client keep working after a crash
> (SIGKILL) but not after a graceful exit of glusterfsd.

Exactly. If that happens, there is probably a bug in Gluster's signal handling.
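
In concrete terms, the two shutdown paths being compared on a brick node are
(a minimal sketch, using the same commands as in the tests discussed here):

    # graceful exit: glusterfsd catches SIGTERM and runs its cleanup path
    killall glusterfsd

    # hard kill: SIGKILL cannot be caught, so no cleanup code runs at all
    killall -9 glusterfsd

If clients stay healthy after the second but break after the first, the
cleanup path itself is the natural suspect.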

I added a firewall rule to block all traffic from the Gluster VLAN on one of
the nodes. Approximately 3 minutes in and no crash so far. Errors about the
missing node are present in the qemu instance log, but this is normal.

-ps

On Fri, Sep 8, 2017 at 1:53 PM, Gandalf Corvotempesta
<gandalf.corvotempesta at gmail.com> wrote:
> 2017-09-08 13:44 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
>> I did not test SIGKILL because I suppose that if a graceful exit is bad,
>> SIGKILL will be as well. This assumption might be wrong, so I will test it.
>> It would be interesting to see the client keep working after a crash
>> (SIGKILL) but not after a graceful exit of glusterfsd.
>
> Exactly. If that happens, there is probably a bug in Gluster's signal handling.
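
For anyone wanting to reproduce this, a rule along these lines does the job
(a sketch assuming iptables; the interface name eth0.100 is a placeholder for
your actual Gluster VLAN interface):

    # drop all traffic arriving from the Gluster VLAN on this node
    iptables -I INPUT -i eth0.100 -j DROP
    # and anything this node would send back out on it
    iptables -I OUTPUT -o eth0.100 -j DROP

Unlike killing the process, this leaves glusterfsd running, so it simulates a
network partition rather than a daemon failure.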

I currently only have a Windows 2012 R2 server VM in testing on top of the
Gluster storage, so I will have to take some time to provision a couple of
Linux VMs with both ext4 and XFS to see what happens on those.

The Windows server VM is OK with killall glusterfsd, but when the 42-second
timeout goes into effect, it gets paused and I have to go into RHEVM to
un-pause it.

Diego

On Fri, Sep 8, 2017 at 7:53 AM, Gandalf Corvotempesta
<gandalf.corvotempesta at gmail.com> wrote:
> 2017-09-08 13:44 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
>> I did not test SIGKILL because I suppose that if a graceful exit is bad,
>> SIGKILL will be as well. This assumption might be wrong, so I will test it.
>> It would be interesting to see the client keep working after a crash
>> (SIGKILL) but not after a graceful exit of glusterfsd.
>
> Exactly. If that happens, there is probably a bug in Gluster's signal handling.
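
That 42-second window is Gluster's network.ping-timeout, which defaults to 42
seconds and can be inspected and tuned per volume (the volume name below is a
placeholder):

    # show the current value (the default is 42 seconds)
    gluster volume get myvol network.ping-timeout

    # shorten the pause window, at the cost of more spurious disconnects
    # under heavy load or brief network hiccups
    gluster volume set myvol network.ping-timeout 10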

Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O even after a few
minutes. SIGTERM, on the other hand, causes a crash, but this time it is not
a read-only remount; instead I/O drops to around 10 IOPS at peak and 2 IOPS
on average.

-ps

On Fri, Sep 8, 2017 at 1:56 PM, Diego Remolina <dijuremo at gmail.com> wrote:
> I currently only have a Windows 2012 R2 server VM in testing on top of the
> Gluster storage, so I will have to take some time to provision a couple of
> Linux VMs with both ext4 and XFS to see what happens on those.
>
> The Windows server VM is OK with killall glusterfsd, but when the 42-second
> timeout goes into effect, it gets paused and I have to go into RHEVM to
> un-pause it.
>
> Diego
>
> On Fri, Sep 8, 2017 at 7:53 AM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>> 2017-09-08 13:44 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
>>> I did not test SIGKILL because I suppose that if a graceful exit is bad,
>>> SIGKILL will be as well. This assumption might be wrong, so I will test it.
>>> It would be interesting to see the client keep working after a crash
>>> (SIGKILL) but not after a graceful exit of glusterfsd.
>>
>> Exactly. If that happens, there is probably a bug in Gluster's signal handling.
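
To attach numbers like these to a test, a guest-side fio job is handy (a
sketch; the filename, size, and runtime are arbitrary, and libaio is assumed
to be available in the guest):

    # random 4k direct writes, reporting live status every second so the
    # IOPS collapse during the failure window is visible as it happens
    fio --name=failover-test --filename=/root/fio.test --size=256M \
        --rw=randwrite --bs=4k --direct=1 --ioengine=libaio \
        --time_based --runtime=300 --status-interval=1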