On Monday, 2 August 2021 14:30:05 CEST Peter Krempa wrote:
> On Mon, Aug 02, 2021 at 14:20:44 +0200, Vojtech Juranek wrote:
> > Hi,
> > as a follow-up of BZ #1883399 [1], we are reviewing vdsm VM migration
> > flows and solving a few follow-up bugs, e.g. BZ #1981079 [2]. I have a
> > couple of questions related to libvirt:
> >
> > * if we run disk extend during migration, it can happen that migration
> > finishes sooner than disk extend. In such a case we will try to set the
> > disk threshold on an already stopped VM (we handle the libvirt event that
> > the VM was stopped, but due to the Python GIL there can be a delay between
> > obtaining the appropriate signal from libvirt and handling it). In such a
> > case we get libvirt VIR_ERR_OPERATION_INVALID when setting the disk
> > threshold.

Actually, I was wrong here: the issue is caused by a delay in the libvirt
setBlockThreshold() call. From the vdsm log:

2021-08-02 09:06:01,918-0400 WARN (mailbox-hsm/3) [virt.vm] (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') setting theshold using dom <vdsm.virt.virdomain.Notifying object at 0x7fd06610df28> (drivemonitor:122)

[...]

2021-08-02 09:06:03,967-0400 WARN (libvirt/events) [virt.vm] (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') libvirt event Stopped detail 3 opaque None (vm:5657)

[...]

2021-08-02 09:06:03,969-0400 WARN (mailbox-hsm/3) [virt.vm] (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') Domain not connected, skipping set block threshold for drive 'sdc' (drivemonitor:133)

So it took about 2 seconds for the libvirt setBlockThreshold() call to return.
In the meantime the migration finished, and we got a VIR_ERR_OPERATION_INVALID
error from the setBlockThreshold() call.

What is the reason for this delay? Is this operation intentionally delayed
until migration finishes?

I posted the relevant libvirt debug log at https://pastebin.com/YkdKYKM5

> > Is it safe to
> > catch this exception and ignore it, or is it thrown for various reasons and
> > the root cause can be something other than a stopped VM?
>
> The API to set the block threshold level can return the following errors,
> including cases when it can happen:
>
> VIR_ERR_OPERATION_UNSUPPORTED <- unlikely, new qemu supports it
> VIR_ERR_INVALID_ARG <- disk was not found in VM definition
> VIR_ERR_INTERNAL_ERROR <- on error from qemu
>
> Thus VIR_ERR_OPERATION_INVALID seems to be safe to ignore in your
> specific case, while not ignoring others can be used to catch problems.

Thanks for your answer.

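A minimal sketch of the error handling discussed above: swallow only VIR_ERR_OPERATION_INVALID and let every other error propagate. It assumes the libvirt-python bindings, where a domain object has setBlockThreshold() and libvirtError exposes get_error_code(). The helper name set_threshold_tolerating_stop, the libvirt_mod parameter, and the Fake* stand-ins (with placeholder error codes) are hypothetical and only there so the example runs without a libvirt connection; this is not vdsm's actual code.

```python
import types


def set_threshold_tolerating_stop(dom, drive, threshold, libvirt_mod):
    """Set a block threshold; return False if the domain is no longer running.

    libvirt_mod is the libvirt module (or a stand-in with the same names).
    Only VIR_ERR_OPERATION_INVALID is swallowed; anything else is re-raised
    so that real problems (unknown disk, qemu error) stay visible.
    """
    try:
        dom.setBlockThreshold(drive, threshold)
    except libvirt_mod.libvirtError as e:
        if e.get_error_code() == libvirt_mod.VIR_ERR_OPERATION_INVALID:
            return False
        raise
    return True


# --- Stand-ins for demonstration only; not the real libvirt bindings. ---
class FakeLibvirtError(Exception):
    def __init__(self, code):
        super().__init__("fake libvirt error %d" % code)
        self._code = code

    def get_error_code(self):
        return self._code


fake_libvirt = types.SimpleNamespace(
    libvirtError=FakeLibvirtError,
    VIR_ERR_OPERATION_INVALID=1001,  # placeholder value, not libvirt's
    VIR_ERR_INTERNAL_ERROR=1002,     # placeholder value, not libvirt's
)


class FakeDomain:
    """Raises the configured error code from setBlockThreshold(), if any."""

    def __init__(self, error_code=None):
        self._error_code = error_code

    def setBlockThreshold(self, dev, threshold, flags=0):
        if self._error_code is not None:
            raise FakeLibvirtError(self._error_code)
```

With the real bindings, the call would presumably look like set_threshold_tolerating_stop(dom, 'sdc', threshold, libvirt) after "import libvirt".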
On Mon, Aug 02, 2021 at 15:34:52 +0200, Vojtech Juranek wrote:
> On Monday, 2 August 2021 14:30:05 CEST Peter Krempa wrote:
> > On Mon, Aug 02, 2021 at 14:20:44 +0200, Vojtech Juranek wrote:
> > > Hi,
> > > as a follow-up of BZ #1883399 [1], we are reviewing vdsm VM migration
> > > flows and solving a few follow-up bugs, e.g. BZ #1981079 [2]. I have a
> > > couple of questions related to libvirt:
> > >
> > > * if we run disk extend during migration, it can happen that migration
> > > finishes sooner than disk extend. In such a case we will try to set the
> > > disk threshold on an already stopped VM (we handle the libvirt event that
> > > the VM was stopped, but due to the Python GIL there can be a delay between
> > > obtaining the appropriate signal from libvirt and handling it). In such a
> > > case we get libvirt VIR_ERR_OPERATION_INVALID when setting the disk
> > > threshold.
>
> Actually, I was wrong here: the issue is caused by a delay in the libvirt
> setBlockThreshold() call. From the vdsm log:
>
> 2021-08-02 09:06:01,918-0400 WARN (mailbox-hsm/3) [virt.vm] (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') setting theshold using dom <vdsm.virt.virdomain.Notifying object at 0x7fd06610df28> (drivemonitor:122)
>
> [...]
>
> 2021-08-02 09:06:03,967-0400 WARN (libvirt/events) [virt.vm] (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') libvirt event Stopped detail 3 opaque None (vm:5657)
>
> [...]
>
> 2021-08-02 09:06:03,969-0400 WARN (mailbox-hsm/3) [virt.vm] (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') Domain not connected, skipping set block threshold for drive 'sdc' (drivemonitor:133)
>
> So it took about 2 seconds for the libvirt setBlockThreshold() call to return.
> In the meantime the migration finished, and we got a VIR_ERR_OPERATION_INVALID
> error from the setBlockThreshold() call.
>
> What is the reason for this delay? Is this operation intentionally delayed
> until migration finishes?

Actually, qemuDomainSetBlockThreshold, which is the backend for
virDomainSetBlockThreshold, requires a QEMU_JOB_MODIFY job on the domain, so
this can't even be set _during_ migration.

In fact, what happens is that the API call waits until it can obtain the
MODIFY job, and that can happen only after the migration has finished. Thus it
always serializes after the migration.
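The serialization described above can be illustrated with a toy model. This is not libvirt's job implementation: the single threading.Lock standing in for the domain job, the ToyDomain class, and the log strings are all assumptions made for illustration. It only shows why a call that needs the job during migration returns after the migration, by which time the domain is already stopped.

```python
import threading


class ToyDomain:
    """Toy model: one lock plays the role of the per-domain job."""

    def __init__(self):
        self.job = threading.Lock()
        self.running = True
        self.log = []

    def migrate(self, started, may_finish):
        # Migration holds the job for its whole duration.
        with self.job:
            self.log.append("migration: job acquired")
            started.set()
            may_finish.wait()
            self.running = False  # source domain stops once migration ends
            self.log.append("migration: finished, domain stopped")

    def set_block_threshold(self):
        # Needs the same job, so it blocks until migration releases it.
        with self.job:
            if not self.running:
                self.log.append("threshold: operation invalid")
                return False
            self.log.append("threshold: set")
            return True


def demo():
    dom = ToyDomain()
    started, may_finish = threading.Event(), threading.Event()
    results = []
    mig = threading.Thread(target=dom.migrate, args=(started, may_finish))
    thr = threading.Thread(
        target=lambda: results.append(dom.set_block_threshold()))
    mig.start()
    started.wait()      # migration now holds the job
    thr.start()         # this call queues behind the migration
    may_finish.set()    # let the migration complete
    mig.join()
    thr.join()
    return dom.log, results
```

In this model the threshold call is issued while the migration is in progress but only runs after it, which matches the roughly 2 second gap seen in the vdsm log.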