Hello,

I wonder what the "open error -1" / "release open_disk error" messages in sanlock.log actually mean.

I saw these messages in the log on a KVM host that rebooted, and after running "/usr/sbin/virt-sanlock-cleanup" on that host. The resources were disks from 2 guests running on another KVM host.

So in fact the disks are still in use, but got cleaned up by the other KVM host because it thought they were no longer in use...

Several questions:
1. Could this indicate a problem with the lease of these resources on the server that hosts these guests?
2. Is there a way to re-register these resources, because they are still in use (but not for sanlock, it seems)?
3. Overall, is it a bad idea to run virt-sanlock-cleanup?

Some entries from the log file:

538 s1:r35 resource __LIBVIRT__DISKS__:e133401bfdff6979f76d9544ec8a5529:/var/lib/libvirt/sanlock/e133401bfdff6979f76d9544ec8a5529:0 for 1,9,4218
539 open error -1 /var/lib/libvirt/sanlock/e133401bfdff6979f76d9544ec8a5529
539 r35 release open_disk error /var/lib/libvirt/sanlock/e133401bfdff6979f76d9544ec8a5529
598 s1:r40 resource __LIBVIRT__DISKS__:fc82f9999b5afddb8aee9955db9b3477:/var/lib/libvirt/sanlock/fc82f9999b5afddb8aee9955db9b3477:0 for 2,12,4230
598 open error -1 /var/lib/libvirt/sanlock/fc82f9999b5afddb8aee9955db9b3477
598 r40 release open_disk error /var/lib/libvirt/sanlock/fc82f9999b5afddb8aee9955db9b3477
598 s1:r42 resource __LIBVIRT__DISKS__:ff9b82672a4dd869f9c9b38a6cbe3900:/var/lib/libvirt/sanlock/ff9b82672a4dd869f9c9b38a6cbe3900:0 for 2,12,4232
598 open error -1 /var/lib/libvirt/sanlock/ff9b82672a4dd869f9c9b38a6cbe3900
598 r42 release open_disk error /var/lib/libvirt/sanlock/ff9b82672a4dd869f9c9b38a6cbe3900

--
Frido Roose
On Tuesday 8 May 2012 at 18:56, Frido Roose wrote:

> Hello,
>
> I wonder what the "open error -1" / "release open_disk error" messages in sanlock.log actually mean.
>
> I saw these messages in the log on a KVM host that rebooted, and after running "/usr/sbin/virt-sanlock-cleanup" on that host.
> The resources were disks from 2 guests running on another KVM host.
>
> So in fact the disks are still in use, but got cleaned up by the other KVM host because it thought they were no longer in use...
>
> Several questions:
> 1. Could this indicate a problem with the lease of these resources on the server that hosts these guests?

Let me try to answer myself from what I've learned so far... Running "sanlock client status" revealed that virt-sanlock-cleanup had in fact removed the resource, so I suspect the devices were no longer 'locked' at that point. The reason for this is still unclear to me, but I think I found a way to re-register the resources.

> 2. Is there a way to re-register these resources, because they are still in use (but not for sanlock, it seems)?

I made a script to re-register the resources by looking up the devices that were assigned to the VM and by running some sanlock commands. The crucial part is:

dd if=/dev/zero of=/var/lib/libvirt/sanlock/$MD5 bs=1M count=1
chmod 600 /var/lib/libvirt/sanlock/$MD5
sanlock direct init -r __LIBVIRT__DISKS__:$MD5:/var/lib/libvirt/sanlock/$MD5:0
sanlock client acquire -r __LIBVIRT__DISKS__:$MD5:/var/lib/libvirt/sanlock/$MD5:0 -p $pid

The $pid of the VM was still registered with the sanlock daemon, so running these commands was sufficient.

> 3. Overall, is it a bad idea to run virt-sanlock-cleanup?

Running virt-sanlock-cleanup now leaves the resources assigned, so I don't know why they were cleaned up. Perhaps sanlock was not completely ready with its initialization after the reboot? I'm not sure. I guess time will tell.

Best regards,
Frido
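[Editor's note] The commands above can be wrapped into a small helper script. The sketch below is a non-authoritative illustration, not Frido's original script: it assumes that libvirt's sanlock plugin derives the lease/resource name (the $MD5 in the commands above) from the MD5 hash of the disk source path, that the leases live in /var/lib/libvirt/sanlock as in the log entries, and that "virsh domblklist --details" can be used to enumerate a guest's disks. The function names are made up for this sketch.

```shell
#!/bin/sh
# Sketch: re-register sanlock leases for a running libvirt guest.
# Assumptions (not confirmed by the thread): the resource name is the
# MD5 hash of the disk path, and the VM pid is still registered with
# the sanlock daemon so "sanlock client acquire -p $pid" succeeds.

LEASE_DIR=/var/lib/libvirt/sanlock

# MD5 of the disk path (no trailing newline) -> lease/resource name.
lease_name() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Re-create the lease file and re-acquire it for one disk.
# $1 = disk source path, $2 = pid of the running VM.
reregister_disk() {
    disk=$1; pid=$2
    md5=$(lease_name "$disk")
    dd if=/dev/zero of="$LEASE_DIR/$md5" bs=1M count=1
    chmod 600 "$LEASE_DIR/$md5"
    sanlock direct init -r "__LIBVIRT__DISKS__:$md5:$LEASE_DIR/$md5:0"
    sanlock client acquire -r "__LIBVIRT__DISKS__:$md5:$LEASE_DIR/$md5:0" -p "$pid"
}

# Re-register every disk of a domain. $1 = domain name, $2 = VM pid.
reregister_domain() {
    virsh domblklist "$1" --details | awk '$2 == "disk" {print $4}' |
    while read -r disk; do
        reregister_disk "$disk" "$2"
    done
}
```

Usage would be something like "reregister_domain guest1 4218", where the domain name and pid are placeholders for your own values.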