On Thu, Sep 8, 2022 at 2:56 PM Daniel P. Berrangé <berrange at redhat.com> wrote:

> On Thu, Sep 08, 2022 at 02:24:00PM +0200, Roman Mohr wrote:
> > Hi,
> >
> > I have a question regarding capability caching in the context of
> > KubeVirt. Since we start in KubeVirt one libvirt instance per VM,
> > libvirt has to re-discover on every VM start the qemu capabilities,
> > which leads to a 1-2s+ delay in startup.
> >
> > We already discover the features in a dedicated KubeVirt pod on each
> > node. Therefore I tried to copy the capabilities over to see if that
> > would work.
> >
> > It looks like in general it could work, but libvirt seems to detect a
> > mismatch in the exposed KVM CPU ID in every pod. Therefore it invalidates
> > the cache. The recreated capability cache looks exactly like the original
> > one though ...
> >
> > The check responsible for the invalidation is this:
> >
> > ```
> > Outdated capabilities for '%s': host cpuid changed
> > ```
> >
> > So the KVM_GET_SUPPORTED_CPUID call seems to return
> > slightly different values in different containers.
> >
> > After trying out the attached golang scripts in different containers, I
> > could indeed see differences.
> >
> > I can however not really judge what the differences in these KVM function
> > registers mean and I am curious if someone else knows. The files are
> > attached too (as json for easy diffing).
>
> Can you confirm whether the two attached data files were captured
> by containers running on the same physical host, or could each
> container have run on a different host?
>

They are coming from the same host, which is the most surprising bit for me.
I am also very sure that this is the case, because I only had one k8s node
from which I took these.

The containers however differ (obviously) in namespaces and in the
privilege level (less obvious). The handler dump is from a fully privileged
container.

Thanks for checking.

Best regards,
Roman

> My understanding is that KVM_GET_SUPPORTED_CPUID returns the intersection
> of CPUID flags supported by the physical CPUs and CPUID flags supported by
> the KVM kernel module.
>
> IOW, I believe the results should only differ if run across hosts with
> differing CPU models and/or kernel versions.
>
> I've not tried to diagnose exactly which feature bits are different
> in your dumps yet.
>
> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Sep 08, 2022 at 03:10:09PM +0200, Roman Mohr wrote:
> On Thu, Sep 8, 2022 at 2:56 PM Daniel P. Berrangé <berrange at redhat.com>
> wrote:
>
> > On Thu, Sep 08, 2022 at 02:24:00PM +0200, Roman Mohr wrote:
> > > Hi,
> > >
> > > I have a question regarding capability caching in the context of
> > > KubeVirt. Since we start in KubeVirt one libvirt instance per VM,
> > > libvirt has to re-discover on every VM start the qemu capabilities,
> > > which leads to a 1-2s+ delay in startup.
> > >
> > > We already discover the features in a dedicated KubeVirt pod on each
> > > node. Therefore I tried to copy the capabilities over to see if that
> > > would work.
> > >
> > > It looks like in general it could work, but libvirt seems to detect a
> > > mismatch in the exposed KVM CPU ID in every pod. Therefore it invalidates
> > > the cache. The recreated capability cache looks exactly like the original
> > > one though ...
> > >
> > > The check responsible for the invalidation is this:
> > >
> > > ```
> > > Outdated capabilities for '%s': host cpuid changed
> > > ```
> > >
> > > So the KVM_GET_SUPPORTED_CPUID call seems to return
> > > slightly different values in different containers.
> > >
> > > After trying out the attached golang scripts in different containers, I
> > > could indeed see differences.
> > >
> > > I can however not really judge what the differences in these KVM function
> > > registers mean and I am curious if someone else knows. The files are
> > > attached too (as json for easy diffing).
> >
> > Can you confirm whether the two attached data files were captured
> > by containers running on the same physical host, or could each
> > container have run on a different host?
> >
>
> They are coming from the same host, which is the most surprising bit for me.
> I am also very sure that this is the case, because I only had one k8s node
> from which I took these.
>
> The containers however differ (obviously) in namespaces and in the
> privilege level (less obvious). The handler dump is from a fully privileged
> container.

The privilege level sounds like something that might be impactful, so
I'll investigate that. I'd be pretty surprised for namespaces to have
any impact though.

With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|