Yedidyah Bar David
2020-Oct-14 05:54 UTC
Re: [Libguestfs] virt-sparsify failed (was: [oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 479 - Failure!)
On Tue, Oct 13, 2020 at 8:40 PM Richard W.M. Jones <rjones@redhat.com> wrote:
>
> On Tue, Oct 13, 2020 at 07:56:29PM +0300, Nir Soffer wrote:
> > On Tue, Oct 13, 2020 at 7:15 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 06:45:42PM +0300, Nir Soffer wrote:
> > > > I think this is the right solution - when virt-something tool fails,
> > > > it should log the reason for the failure - the error that caused the
> > > > tool to fail. I'm not sure this is easy to do as the failing code
> > > > run inside a special VM. Maybe the code running in the VM should log
> > > > the output in a machine readable way, so once an error is detected
> > > > virt-something can report the error as the reason, without running
> > > > in debug mode.
> > >
> > > All the virt-* tools that I've written have a non-zero exit code and
> > > print an error message on stderr when they fail. Errors from inside
> > > the appliance are propagated to the library and thence to the tool
> > > correctly.
> > >
> > > I think the best thing to do is:
> > >
> > > - spool up stdout + stderr from the tool
> > >
> > > - if the exit code != 0, save the spooled output for analysis
> > >
> > > - if the exit code == 0, discard it (or keep it if you like)
> >
> > This is what we already do, and the result is not helpful. If you look
> > at the log message in the previous message, basically the only
> > info about the error is:
> >
> > libguestfs error: guestfs_launch failed
> >
> > I don't see what we can do with this error message.
>
> Right, so in this particular instance the error message would tell us
> that you should run libguestfs-test-tool because your qemu/kernel/etc
> is broken in some way :-/
>
> There's not a particularly good answer here if you don't want to ever
> use LIBGUESTFS_DEBUG/LIBGUESTFS_TRACE, but perhaps you could run
> libguestfs-test-tool if you see any error which matches the substring
> /guestfs_launch/ ?

Another (orthogonal?) option:

Make LIBGUESTFS_DEBUG/LIBGUESTFS_TRACE log elsewhere, not to stdout/err
(e.g. some other file descriptor, or to a file passed via env or whatever).
This way, it might make sense for vdsm to always pass these vars, continue
logging all stdout/err, and log/keep debug/trace logs only on errors.

Best regards,
--
Didi
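For illustration, here is a rough caller-side approximation of the "spool, and keep only on failure" idea combined with always-on debug variables. It is only a sketch and not what vdsm does: `run_spooled` and `keep_dir` are hypothetical names, and it assumes the caller does not need to parse the tool's stderr separately, because the debug and trace output currently lands on stderr and so ends up in the same spool file.

```python
import os
import subprocess
import sys
import tempfile


def run_spooled(cmd, keep_dir, debug=True):
    """Run cmd, spooling stdout+stderr to a file that is kept only on failure.

    Returns (exit_code, saved_path); saved_path is None when the tool succeeded.
    This is a hypothetical helper, not part of vdsm or libguestfs.
    """
    env = dict(os.environ)
    if debug:
        # These are the variables named in the virt-sparsify error message.
        env["LIBGUESTFS_DEBUG"] = "1"
        env["LIBGUESTFS_TRACE"] = "1"

    os.makedirs(keep_dir, exist_ok=True)
    fd, path = tempfile.mkstemp(prefix="spool-", suffix=".log", dir=keep_dir)
    with os.fdopen(fd, "wb") as spool:
        # Merge stderr into stdout so the spool preserves message ordering.
        proc = subprocess.run(
            cmd, stdout=spool, stderr=subprocess.STDOUT, env=env
        )

    if proc.returncode == 0:
        # Success: discard the (possibly large) debug output.
        os.unlink(path)
        return 0, None
    # Failure: leave the spool file in keep_dir for later analysis.
    return proc.returncode, path
```

The trade-off is that the normal error message and the debug output are interleaved in one file, which is exactly what a separate log destination inside libguestfs would avoid.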
Yedidyah Bar David
2020-Nov-09 07:16 UTC
[Libguestfs] virt-sparsify failed (was: [oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 479 - Failure!)
On Wed, Oct 14, 2020 at 8:54 AM Yedidyah Bar David <didi at redhat.com> wrote:
> Another (orthogonal?) option:
>
> Make LIBGUESTFS_DEBUG/LIBGUESTFS_TRACE log elsewhere, not to stdout/err
> (e.g. some other file descriptor, or to a file passed via env or whatever).
> This way, it might make sense for vdsm to always pass these vars, continue
> logging all stdout/err, and log/keep debug/trace logs only on errors.

This now happened again:

https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_nightly/565/

https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_nightly/565/artifact/exported-artifacts/test_logs/basic-suite-master/lago-basic-suite-master-host-1/_var_log/vdsm/vdsm.log

2020-11-09 01:05:42,031-0500 INFO (jsonrpc/4) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'code': 0, 'message': 'Done'}, 'io_tune_policies_dict': {'c189ecb3-8f2e-4726-8766-7d2d9b514687': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/1d093232-d41e-483f-a915-62f8db3c972f/images/e7ee6417-b319-4d84-81a5-5d77cbce2385/710d2c10-e6b7-4d16-bd37-50a9d4e14a80', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}}} from=::1,34002 (api:54)
2020-11-09 01:05:42,038-0500 DEBUG (jsonrpc/4) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'c189ecb3-8f2e-4726-8766-7d2d9b514687': {'policy': [], 'current_values': [{'name': 'vda', 'path': '/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/1d093232-d41e-483f-a915-62f8db3c972f/images/e7ee6417-b319-4d84-81a5-5d77cbce2385/710d2c10-e6b7-4d16-bd37-50a9d4e14a80', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}} (__init__:360)
2020-11-09 01:05:42,435-0500 DEBUG (tasks/3) [common.commands] FAILED: <err> = b"virt-sparsify: error: libguestfs error: guestfs_launch failed.\nThis usually means the libguestfs appliance failed to start or crashed.\nDo:\n export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\nand run the command again. For further information, read:\n http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou can also run 'libguestfs-test-tool' and post the *complete* output\ninto a bug report or message to the libguestfs mailing list.\n\nIf reporting bugs, run virt-sparsify with debugging enabled and include the \ncomplete output:\n\n virt-sparsify -v -x [...]\n"; <rc> = 1 (commands:98)

I suggest that if we have come to a dead-end and no-one has any clue, then
we either patch something (vdsm?) to allow getting more information if this
happens again, or open a bug for further discussion/prioritization.

Best regards,
--
Didi
Sam Eiderman
2020-Nov-09 08:56 UTC
Re: [Libguestfs] virt-sparsify failed (was: [oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 479 - Failure!)
Hi,

Not sure if this is the same issue or not, but I opened a thread about
guestfs_launch failing not too long ago.

https://www.redhat.com/archives/libguestfs/2020-August/msg00352.html

Since we are using our own tool, we ended up retrying guestfs_launch (and
also guestfs_open, since the documentation suggests not reusing handles).
Not sure if this solved the issue, since we introduced the retry only
recently.

Sam

On Mon, Nov 9, 2020 at 9:57 AM Yedidyah Bar David <didi@redhat.com> wrote:
> I suggest that if we have come to a dead-end and no-one has any clue, then
> we either patch something (vdsm?) to allow getting more information if this
> happens again, or open a bug for further discussion/prioritization.
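To make the retry idea concrete, here is a minimal sketch of "open a fresh handle and launch again" in Python. It is not Sam's actual tool: `launch_with_retry`, `make_handle`, `attempts` and `delay` are hypothetical names, and it assumes the handle object exposes `launch()` and `close()` and reports failures as `RuntimeError`, which is how I understand the libguestfs Python bindings to behave.

```python
import time


def launch_with_retry(make_handle, attempts=3, delay=1.0):
    """Create a fresh handle and launch it, retrying on failure.

    make_handle is a caller-supplied factory (hypothetical) that returns a
    new, fully configured handle each time. A failed handle is closed and
    never reused, following the advice not to reuse handles after a failed
    launch.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        handle = make_handle()
        try:
            handle.launch()
            return handle
        except RuntimeError as e:
            last_error = e
            handle.close()
            if attempt < attempts:
                # Fixed pause between attempts; the value is an arbitrary choice.
                time.sleep(delay)
    raise RuntimeError(
        "launch failed after %d attempts" % attempts
    ) from last_error
```

With the Python bindings, `make_handle` would presumably be a small function that creates a `guestfs.GuestFS` object and adds the drives before returning it. Note that a retry only hides an intermittent launch failure; it does not explain it.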
Richard W.M. Jones
2020-Nov-09 09:03 UTC
Re: [Libguestfs] virt-sparsify failed (was: [oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 479 - Failure!)
On Mon, Nov 09, 2020 at 09:16:33AM +0200, Yedidyah Bar David wrote:
> I suggest that if we have come to a dead-end and no-one has any clue, then
> we either patch something (vdsm?) to allow getting more information if this
> happens again, or open a bug for further discussion/prioritization.

You definitely need to run libguestfs-test-tool from vdsm if we're
going to get any further with this.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
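As a sketch of what running libguestfs-test-tool from the caller might look like, something along these lines could be hooked in where the failed command's stderr is already available. This is not existing vdsm code: `diagnose_launch_failure` is a hypothetical helper, the 600-second timeout is an arbitrary assumption, and the trigger is the plain substring match on `guestfs_launch` suggested earlier in the thread.

```python
import subprocess


def diagnose_launch_failure(stderr_text,
                            test_tool=("libguestfs-test-tool",),
                            timeout=600):
    """Run the diagnostic tool if stderr_text mentions guestfs_launch.

    Returns the tool's combined stdout+stderr as text, a short message if
    the tool could not be run, or None when the error is unrelated.
    """
    if "guestfs_launch" not in stderr_text:
        return None
    try:
        proc = subprocess.run(
            list(test_tool),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as e:
        # Tool missing, not executable, or hung: report it instead of raising.
        return "could not run %s: %s" % (test_tool[0], e)
    return proc.stdout.decode("utf-8", errors="replace")
```

The returned text could then be written to the log next to the original error, so the next CI failure comes with the complete test-tool output attached.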