Gandalf Corvotempesta
2017-Sep-08 11:11 UTC
[Gluster-users] GlusterFS as virtual machine storage
2017-09-08 13:07 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
> OK, so killall seems to be ok after several attempts i.e. iops do not stop
> on VM. Reboot caused I/O errors after maybe 20 seconds since issuing the
> command. I will check the servers console during reboot to see if the VM
> errors appear just after the power cycle and will try to crash the VM after
> killall again...

Also try to kill the Gluster VM without killing glusterfsd, simulating a server hard crash. Or try to remove the network interface.
So even the killall situation eventually kills the VM (I/O errors).

Gandalf, isn't testing a server hard crash too much? I mean, if a reboot reliably kills the VM, there is no doubt that a network crash or poweroff will as well.

I am tempted to test this setup on DigitalOcean to rule out my hardware/network. But if Diego is able to reproduce the "reboot crash", my doubts about hardware/network problems are close to none.

-ps

On Fri, Sep 8, 2017 at 1:11 PM, Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com> wrote:
> 2017-09-08 13:07 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
>> OK, so killall seems to be ok after several attempts i.e. iops do not stop
>> on VM. Reboot caused I/O errors after maybe 20 seconds since issuing the
>> command. I will check the servers console during reboot to see if the VM
>> errors appear just after the power cycle and will try to crash the VM after
>> killall again...
>
> Also try to kill the Gluster VM without killing glusterfsd, simulating a
> server hard-crash . Or try to remove the network interface.
Gandalf Corvotempesta
2017-Sep-08 11:36 UTC
[Gluster-users] GlusterFS as virtual machine storage
2017-09-08 13:21 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
> Gandalf, isn't possible server hard-crash too much? I mean if reboot
> reliably kills the VM, there is no doubt network crash or poweroff
> will as well.

IIUC, the only way to keep I/O running is to exit glusterfsd gracefully. killall should send signal 15 (SIGTERM) to the process, so maybe there is a bug in signal handling on the gluster side? The kernel is already telling glusterfsd to exit through signal 15, but glusterfsd seems to handle this badly.

A server hard crash doesn't send any signal. I think this could also be similar to SIGKILL (9), which can't be caught or ignored by software.

In other words: is this a bug in gluster's signal handling (if SIGKILL works and SIGTERM doesn't, I'm almost sure it is), an engineering bug (relying only on a graceful exit to preserve I/O on clients, though even SIGTERM should be treated as a graceful exit), or something else?
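
To illustrate the SIGTERM vs. SIGKILL distinction above, here is a minimal Python sketch. It is not Gluster code and says nothing about how glusterfsd actually handles signals; the function names (on_sigterm, can_install_handler, deliver_sigterm_to_self) are made up for this example, and it assumes a POSIX system with the code running in the main thread. It only shows that a process may install a handler for SIGTERM (and so gets a chance to shut down cleanly), while the kernel refuses any handler for SIGKILL.

```python
import os
import signal

received = []  # signal numbers seen by our handler


def on_sigterm(signum, frame):
    # A real daemon would flush state and close connections here.
    received.append(signum)


def _restore(sig, previous):
    # previous is None when the old handler was not installed from Python.
    if previous is not None:
        signal.signal(sig, previous)


def can_install_handler(sig):
    """Return True if this process is allowed to catch `sig`."""
    try:
        previous = signal.signal(sig, on_sigterm)
    except (OSError, ValueError):
        return False
    _restore(sig, previous)
    return True


def deliver_sigterm_to_self():
    """Send SIGTERM to this process and return the signal the handler saw."""
    previous = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        os.kill(os.getpid(), signal.SIGTERM)
    finally:
        _restore(signal.SIGTERM, previous)
    return received[-1] if received else None


if __name__ == "__main__":
    print("SIGTERM catchable:", can_install_handler(signal.SIGTERM))
    print("SIGKILL catchable:", can_install_handler(signal.SIGKILL))
    print("handler saw signal:", deliver_sigterm_to_self())
```

From the client's point of view, a process killed with SIGKILL disappears without running any cleanup code, which is why it resembles a hard crash more than a graceful exit, although with SIGKILL the kernel on the server is still alive and can close the process's sockets.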