Daniel P. Berrangé
2018-Mar-21 18:39 UTC
Re: [ovirt-devel] [libvirt] [virt-tools-list] Project for profiles and defaults for libvirt domains
On Wed, Mar 21, 2018 at 03:00:41PM -0300, Eduardo Habkost wrote:
> On Tue, Mar 20, 2018 at 03:10:12PM +0000, Daniel P. Berrangé wrote:
> > On Tue, Mar 20, 2018 at 03:20:31PM +0100, Martin Kletzander wrote:
> > > 1) Default devices/values
> > >
> > > Libvirt itself must default to whatever values there were before any particular element was introduced due to the fact that it strives to keep the guest ABI stable. That means, for example, that it can't just add -vmcoreinfo option (for KASLR support) or magically add the pvpanic device to all QEMU machines, even though it would be useful, as that would change the guest ABI.
> > >
> > > For default values this is even more obvious. Let's say someone figures out some "pretty good" default values for various HyperV enlightenment feature tunables. Libvirt can't magically change them, but each one of the projects building on top of it doesn't want to keep that list updated and take care of setting them in every new XML. Some projects don't even expose those to the end user as a knob, while others might.
> >
> > This gets very tricky, very fast.
> >
> > Lets say that you have an initial good set of hyperv config tunables. Now sometime passes and it is decided that there is a different, better set of config tunables. If the module that is providing this policy to apps like OpenStack just updates itself to provide this new policy, this can cause problems with the existing deployed applications in a number of ways.
> >
> > First the new config probably depends on specific versions of libvirt and QEMU, and you can't mandate to consuming apps which versions they must be using. [...]
>
> This is true.
>
> > [...] So you need a matrix of libvirt + QEMU + config option settings.
>
> But this is not. If config options need support on the lower levels of the stack (libvirt and/or QEMU and/or KVM and/or host hardware), it already has to be represented by libvirt host capabilities somehow, so management layers know it's available.
>
> This means any new config generation system can (and must) use host(s) capabilities as input before generating the configuration.

I don't think it is that simple. The capabilities reflect only what the current host is capable of, not whether it is desirable to actually use those capabilities. Just because a host reports that it has the q35-2.11.0 machine type doesn't mean that it should be used. The mgmt app may only wish to use it if it is available on all hosts in a particular grouping. The config generation library can't query every host directly to determine this. The mgmt app may have a way to collate capabilities info from hosts, but it is probably then stored in an app-specific format and data source, or it may just end up being a global per-host config parameter in the mgmt app.

There have been a number of times where a feature is available in libvirt and/or QEMU, and the mgmt app may still not wish to use it because it is known to be broken or incompatible with certain usage patterns. So the mgmt app would require an arbitrarily newer libvirt/qemu before considering using it, regardless of whether host capabilities report it is available.
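For illustration, a minimal sketch of the machine-type information a single host's capabilities expose, assuming the libvirt-python bindings and a local qemu:///system connection; it shows availability on one host only, which is exactly not the same as desirability across a host group:

    # Sketch only: list the machine types one host advertises in its
    # libvirt capabilities XML.  Seeing "pc-q35-2.11" here says nothing
    # about whether the rest of the hosts in the grouping have it too.
    import libvirt
    import xml.etree.ElementTree as ET

    conn = libvirt.open("qemu:///system")
    caps = ET.fromstring(conn.getCapabilities())
    machines = {m.text
                for arch in caps.findall("./guest/arch")
                for m in arch.findall("machine")}
    print(sorted(machines))
    conn.close()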
> > Even if you have the matching libvirt & QEMU versions, it is not safe to assume the application will want to use the new policy. An application may need live migration compatibility with older versions. Or it may need to retain guaranteed ABI compatibility with the way the VM was previously launched and be using transient guests, generating the XML fresh each time.
>
> Why is that a problem? If you want live migration or ABI guarantees, you simply don't use this system to generate a new configuration. The same way you don't use the "pc" machine-type if you want to ensure compatibility with existing VMs.

In many mgmt apps, every VM potentially needs live migration, so unless I'm misunderstanding, you're effectively saying don't ever use this config generator in these apps.

> > The application will have knowledge about when it wants to use new vs old hyperv tunable policy, but exposing that to your policy module is very tricky because it is inherantly application specific logic largely determined by the way the application code is written.
>
> We have a huge set of features where this is simply not a problem. For most virtual hardware features, enabling them is not even a policy decision: it's just about telling the guest that the feature is now available. QEMU have been enabling new features in the "pc" machine-type for years.
>
> Now, why can't higher layers in the stack do something similar?
>
> The proposal is equivalent to what already happens when people use the "pc" machine-type in their configurations, but:
> 1) the new defaults/features wouldn't be hidden behind a opaque machine-type name, and would appear in the domain XML explicitly;
> 2) the higher layers won't depend on QEMU introducing a new machine-type just to have new features enabled by default;
> 3) features that depend on host capabilities but are available on all hosts in a cluster can now be enabled automatically if desired (which is something QEMU can't do because it doesn't have enough information about the other hosts).
>
> Choosing reasonable defaults might not be a trivial problem, but the current approach of pushing the responsibility to management layers doesn't improve the situation.

The simple cases have been added to the "pc" machine type, but more complex cases have not been dealt with, as they often require contextual knowledge of either the host setup or the guest OS choice.

We had a long debate over the best aio=threads,native setting for OpenStack. Understanding the right defaults required knowledge about the various different ways that Nova would set up its storage stack. We certainly know enough now to be able to provide good recommendations for the choice, with perf data to back them up, but interpreting those recommendations still requires app-specific knowledge about the storage mgmt approach, so it ends up being code dev work.

Another case is the pvpanic device - while in theory that could have been enabled by default for all guests, by QEMU or a config generator library, doing so is not useful on its own. The hard bit of the work is adding code to the mgmt app to choose the action for when pvpanic triggers, and code to handle the results of that action.

Regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
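To make the pvpanic point above concrete, a minimal sketch of the guest XML side, using standard libvirt elements; the app-specific part described above, deciding how to react when the panic event actually fires, cannot be expressed here:

    <domain type='kvm'>
      <!-- name, memory, os, ... elided -->
      <!-- the action libvirt should take when the guest reports a panic -->
      <on_crash>coredump-restart</on_crash>
      <devices>
        <!-- the panic notifier device; model='isa' is the pvpanic variant on x86 -->
        <panic model='isa'/>
      </devices>
    </domain>

Enabling the device is the easy half; the mgmt app still has to watch for the crash lifecycle event and do something useful with the resulting core dump, which is the code dev work referred to above.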
Eduardo Habkost
2018-Mar-21 19:34 UTC
Re: [ovirt-devel] [libvirt] [virt-tools-list] Project for profiles and defaults for libvirt domains
On Wed, Mar 21, 2018 at 06:39:52PM +0000, Daniel P. Berrangé wrote:
> On Wed, Mar 21, 2018 at 03:00:41PM -0300, Eduardo Habkost wrote:
> > On Tue, Mar 20, 2018 at 03:10:12PM +0000, Daniel P. Berrangé wrote:
> > > On Tue, Mar 20, 2018 at 03:20:31PM +0100, Martin Kletzander wrote:
> > > > 1) Default devices/values
> > > >
> > > > Libvirt itself must default to whatever values there were before any particular element was introduced due to the fact that it strives to keep the guest ABI stable. That means, for example, that it can't just add -vmcoreinfo option (for KASLR support) or magically add the pvpanic device to all QEMU machines, even though it would be useful, as that would change the guest ABI.
> > > >
> > > > For default values this is even more obvious. Let's say someone figures out some "pretty good" default values for various HyperV enlightenment feature tunables. Libvirt can't magically change them, but each one of the projects building on top of it doesn't want to keep that list updated and take care of setting them in every new XML. Some projects don't even expose those to the end user as a knob, while others might.
> > >
> > > This gets very tricky, very fast.
> > >
> > > Lets say that you have an initial good set of hyperv config tunables. Now sometime passes and it is decided that there is a different, better set of config tunables. If the module that is providing this policy to apps like OpenStack just updates itself to provide this new policy, this can cause problems with the existing deployed applications in a number of ways.
> > >
> > > First the new config probably depends on specific versions of libvirt and QEMU, and you can't mandate to consuming apps which versions they must be using. [...]
> >
> > This is true.
> >
> > > [...] So you need a matrix of libvirt + QEMU + config option settings.
> >
> > But this is not. If config options need support on the lower levels of the stack (libvirt and/or QEMU and/or KVM and/or host hardware), it already has to be represented by libvirt host capabilities somehow, so management layers know it's available.
> >
> > This means any new config generation system can (and must) use host(s) capabilities as input before generating the configuration.
>
> I don't think it is that simple. The capabilities reflect only what the current host is capable of, not whether it is desirable to actually use those capabilities. Just because a host reports that it has the q35-2.11.0 machine type doesn't mean that it should be used. The mgmt app may only wish to use it if it is available on all hosts in a particular grouping. The config generation library can't query every host directly to determine this. The mgmt app may have a way to collate capabilities info from hosts, but it is probably then stored in an app-specific format and data source, or it may just end up being a global per-host config parameter in the mgmt app.

In other words, you need host capabilities from all hosts as input when generating a new config XML. We already have a format to represent host capabilities defined by libvirt; users of the new system would just need to reproduce the data they got from libvirt and give it to the config generator.

Not completely trivial, but maybe worth the effort if you want to benefit from work done by other people to find good defaults?
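A minimal sketch of that flow, assuming the libvirt-python bindings (the helper name is invented; baselineCPU is the existing libvirt API): the management app passes in the capabilities XML it already collected from each host, and gets back a guest CPU definition that every host can honour:

    # Sketch only: derive a CPU definition common to all hosts from the
    # capabilities XML previously gathered from each of them.
    import libvirt
    import xml.etree.ElementTree as ET

    def common_guest_cpu(per_host_caps_xml):
        """per_host_caps_xml: list of libvirt capabilities XML strings, one per host."""
        host_cpus = [ET.tostring(ET.fromstring(caps).find("./host/cpu"),
                                 encoding="unicode")
                     for caps in per_host_caps_xml]
        conn = libvirt.open("qemu:///system")  # any libvirt connection will do
        try:
            return conn.baselineCPU(host_cpus,
                                    libvirt.VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES)
        finally:
            conn.close()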
> There have been a number of times where a feature is available in libvirt and/or QEMU, and the mgmt app may still not wish to use it because it is known to be broken or incompatible with certain usage patterns. So the mgmt app would require an arbitrarily newer libvirt/qemu before considering using it, regardless of whether host capabilities report it is available.

If this happens sometimes, why is it better for the teams maintaining management layers to duplicate the work of finding what works, instead of solving the problem only once?

> > > Even if you have the matching libvirt & QEMU versions, it is not safe to assume the application will want to use the new policy. An application may need live migration compatibility with older versions. Or it may need to retain guaranteed ABI compatibility with the way the VM was previously launched and be using transient guests, generating the XML fresh each time.
> >
> > Why is that a problem? If you want live migration or ABI guarantees, you simply don't use this system to generate a new configuration. The same way you don't use the "pc" machine-type if you want to ensure compatibility with existing VMs.
>
> In many mgmt apps, every VM potentially needs live migration, so unless I'm misunderstanding, you're effectively saying don't ever use this config generator in these apps.

If you only need live migration, you can choose between:
a) not using it;
b) using an empty host capability list as input when generating the XML (maybe this would be completely useless, but it's still an option);
c) using only host _software_ capabilities as input, if you control the software that runs on all hosts;
d) using an intersection of the software+host capabilities of all hosts as input.

If you care about 100% static guest ABI (not just live migration), you either generate the XML once and save it for later, or you don't use the config generation system. (IOW, the same limitations as the "pc" machine-type alias.)

> > > The application will have knowledge about when it wants to use new vs old hyperv tunable policy, but exposing that to your policy module is very tricky because it is inherantly application specific logic largely determined by the way the application code is written.
> >
> > We have a huge set of features where this is simply not a problem. For most virtual hardware features, enabling them is not even a policy decision: it's just about telling the guest that the feature is now available. QEMU have been enabling new features in the "pc" machine-type for years.
> >
> > Now, why can't higher layers in the stack do something similar?
> >
> > The proposal is equivalent to what already happens when people use the "pc" machine-type in their configurations, but:
> > 1) the new defaults/features wouldn't be hidden behind a opaque machine-type name, and would appear in the domain XML explicitly;
> > 2) the higher layers won't depend on QEMU introducing a new machine-type just to have new features enabled by default;
> > 3) features that depend on host capabilities but are available on all hosts in a cluster can now be enabled automatically if desired (which is something QEMU can't do because it doesn't have enough information about the other hosts).
> >
> > Choosing reasonable defaults might not be a trivial problem, but the current approach of pushing the responsibility to management layers doesn't improve the situation.
>
> The simple cases have been added to the "pc" machine type, but more complex cases have not been dealt with, as they often require contextual knowledge of either the host setup or the guest OS choice.

Exactly. But in how many of those cases does the decision require knowledge that is specific to the management stack being used (like the ones you listed below), and in how many could the decision be made by simply looking at the host software/hardware and the guest OS? I am under the impression that we have a reasonable number of cases of the latter.

The ones I remember are all related to CPU configuration:
* Automatically enabling useful CPU features when they are available on all hosts;
* Always enabling check='full' by default.

Do we have other examples?

> We had a long debate over the best aio=threads,native setting for OpenStack. Understanding the right defaults required knowledge about the various different ways that Nova would set up its storage stack. We certainly know enough now to be able to provide good recommendations for the choice, with perf data to back them up, but interpreting those recommendations still requires app-specific knowledge about the storage mgmt approach, so it ends up being code dev work.
>
> Another case is the pvpanic device - while in theory that could have been enabled by default for all guests, by QEMU or a config generator library, doing so is not useful on its own. The hard bit of the work is adding code to the mgmt app to choose the action for when pvpanic triggers, and code to handle the results of that action.
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

--
Eduardo
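For reference, the two CPU items above map to domain XML along these lines; a sketch only, with a placeholder model and feature name. The explicit model and required features would be whatever the cross-host intersection produced, and check='full' makes libvirt refuse to start the guest if the virtual CPU the hypervisor actually provides does not match the requested definition:

    <cpu mode='custom' match='exact' check='full'>
      <!-- model and features here would come from the cross-host intersection -->
      <model fallback='forbid'>Skylake-Client</model>
      <feature policy='require' name='spec-ctrl'/>
    </cpu>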
Daniel P. Berrangé
2018-Mar-22 09:56 UTC
Re: [ovirt-devel] [kubevirt-dev] Re: [libvirt] [virt-tools-list] Project for profiles and defaults for libvirt domains
On Wed, Mar 21, 2018 at 04:34:23PM -0300, Eduardo Habkost wrote:
> On Wed, Mar 21, 2018 at 06:39:52PM +0000, Daniel P. Berrangé wrote:
> > On Wed, Mar 21, 2018 at 03:00:41PM -0300, Eduardo Habkost wrote:
> > > On Tue, Mar 20, 2018 at 03:10:12PM +0000, Daniel P. Berrangé wrote:
> > > > On Tue, Mar 20, 2018 at 03:20:31PM +0100, Martin Kletzander wrote:
> > > > > 1) Default devices/values
> > > > >
> > > > > Libvirt itself must default to whatever values there were before any particular element was introduced due to the fact that it strives to keep the guest ABI stable. That means, for example, that it can't just add -vmcoreinfo option (for KASLR support) or magically add the pvpanic device to all QEMU machines, even though it would be useful, as that would change the guest ABI.
> > > > >
> > > > > For default values this is even more obvious. Let's say someone figures out some "pretty good" default values for various HyperV enlightenment feature tunables. Libvirt can't magically change them, but each one of the projects building on top of it doesn't want to keep that list updated and take care of setting them in every new XML. Some projects don't even expose those to the end user as a knob, while others might.
> > > >
> > > > This gets very tricky, very fast.
> > > >
> > > > Lets say that you have an initial good set of hyperv config tunables. Now sometime passes and it is decided that there is a different, better set of config tunables. If the module that is providing this policy to apps like OpenStack just updates itself to provide this new policy, this can cause problems with the existing deployed applications in a number of ways.
> > > >
> > > > First the new config probably depends on specific versions of libvirt and QEMU, and you can't mandate to consuming apps which versions they must be using. [...]
> > >
> > > This is true.
> > >
> > > > [...] So you need a matrix of libvirt + QEMU + config option settings.
> > >
> > > But this is not. If config options need support on the lower levels of the stack (libvirt and/or QEMU and/or KVM and/or host hardware), it already has to be represented by libvirt host capabilities somehow, so management layers know it's available.
> > >
> > > This means any new config generation system can (and must) use host(s) capabilities as input before generating the configuration.
> >
> > I don't think it is that simple. The capabilities reflect only what the current host is capable of, not whether it is desirable to actually use those capabilities. Just because a host reports that it has the q35-2.11.0 machine type doesn't mean that it should be used. The mgmt app may only wish to use it if it is available on all hosts in a particular grouping. The config generation library can't query every host directly to determine this. The mgmt app may have a way to collate capabilities info from hosts, but it is probably then stored in an app-specific format and data source, or it may just end up being a global per-host config parameter in the mgmt app.
>
> In other words, you need host capabilities from all hosts as input when generating a new config XML. We already have a format to represent host capabilities defined by libvirt; users of the new system would just need to reproduce the data they got from libvirt and give it to the config generator.
Things aren't that simple - when OpenStack reports info from each host it doesn't do it in any libvirt format - it uses an arbitrary format it defines itself. Going from libvirt host capabilities to the app-specific format and back to libvirt host capabilities will lose information. Then you also have the matter of hosts coming & going over time, so it is fragile to assume that the set of host capabilities you currently see is representative of the steady state you desire.

> Not completely trivial, but maybe worth the effort if you want to benefit from work done by other people to find good defaults?

Perhaps, but there are many ways to share the work of figuring out good defaults. Beyond what's represented in the libosinfo database, no one has even tried to document what the current desirable defaults are. Jumping straight from no documented best practices to "let's build an API" is a big ask, particularly when the suggestion involves major architectural changes to any app that wants to use it.

For the most immediate benefit, actually documenting some best practices would be the most tangible win for application developers, as they can much more easily adapt existing code to follow them. Also, expanding the range of info we record in libosinfo would be beneficial, since there is still plenty of OS-specific data not captured. Not to mention that most applications aren't even leveraging much of the stuff already available.

> > There have been a number of times where a feature is available in libvirt and/or QEMU, and the mgmt app may still not wish to use it because it is known to be broken or incompatible with certain usage patterns. So the mgmt app would require an arbitrarily newer libvirt/qemu before considering using it, regardless of whether host capabilities report it is available.
>
> If this happens sometimes, why is it better for the teams maintaining management layers to duplicate the work of finding what works, instead of solving the problem only once?

This point was in relation to my earlier reply in this thread, where I said that it would be necessary to maintain a matrix of policy vs QEMU and libvirt versions, not merely relying on host capabilities.
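A minimal sketch of the kind of matrix meant here, assuming Python; the tunable names and version numbers are placeholders only, not real support data:

    # Sketch only: policy keyed on stack versions, not just on whatever the
    # current host's capabilities happen to report.
    POLICY_MATRIX = {
        # tunable             (min libvirt,  min QEMU)   -- placeholder values
        "hyperv-tunable-a": ((3, 0, 0),    (2, 9, 0)),
        "hyperv-tunable-b": ((4, 0, 0),    (2, 11, 0)),
    }

    def tunable_usable(name, libvirt_version, qemu_version):
        """True if this libvirt/QEMU combination is known to handle the tunable well."""
        min_libvirt, min_qemu = POLICY_MATRIX[name]
        return libvirt_version >= min_libvirt and qemu_version >= min_qemu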
> > > Now, why can't higher layers in the stack do something similar?
> > >
> > > The proposal is equivalent to what already happens when people use the "pc" machine-type in their configurations, but:
> > > 1) the new defaults/features wouldn't be hidden behind a opaque machine-type name, and would appear in the domain XML explicitly;
> > > 2) the higher layers won't depend on QEMU introducing a new machine-type just to have new features enabled by default;
> > > 3) features that depend on host capabilities but are available on all hosts in a cluster can now be enabled automatically if desired (which is something QEMU can't do because it doesn't have enough information about the other hosts).
> > >
> > > Choosing reasonable defaults might not be a trivial problem, but the current approach of pushing the responsibility to management layers doesn't improve the situation.
> >
> > The simple cases have been added to the "pc" machine type, but more complex cases have not been dealt with, as they often require contextual knowledge of either the host setup or the guest OS choice.
>
> Exactly. But in how many of those cases does the decision require knowledge that is specific to the management stack being used (like the ones you listed below), and in how many could the decision be made by simply looking at the host software/hardware and the guest OS? I am under the impression that we have a reasonable number of cases of the latter.

Anything to do with virtual hardware that is guest OS dependent should be in scope of the libosinfo project / database. For other things, I think it would be useful if we at least started to document some recommended best practices, so we have a better idea of what we're trying to address. It would also give apps an idea of what they're missing right now, letting them fix gaps if desired.

> The ones I remember are all related to CPU configuration:
> * Automatically enabling useful CPU features when they are available on all hosts;

This is really hard to do in an automated fashion, because it relies on having an accessible and accurate global view of all hosts. I can easily see a situation where you have 20 hosts, 5 with old CPUs and 15 with new CPUs, and the old ones are coincidentally offline for maintenance or a software upgrade. Meanwhile you spawn a guest, check the available host capabilities, never see the info from the older CPUs, and so automatically enable a bunch of features that we really did not want. It is more reliable if you just declare this in the application config file, and have a mgmt tool that can do distributed updates of the config file when needed.

> * Always enabling check='full' by default.
>
> Do we have other examples?

I'm sure we can find plenty, but it's a matter of someone doing the work to investigate & pull together docs.

> > We had a long debate over the best aio=threads,native setting for OpenStack. Understanding the right defaults required knowledge about the various different ways that Nova would set up its storage stack. We certainly know enough now to be able to provide good recommendations for the choice, with perf data to back them up, but interpreting those recommendations still requires app-specific knowledge about the storage mgmt approach, so it ends up being code dev work.
> >
> > Another case is the pvpanic device - while in theory that could have been enabled by default for all guests, by QEMU or a config generator library, doing so is not useful on its own. The hard bit of the work is adding code to the mgmt app to choose the action for when pvpanic triggers, and code to handle the results of that action.

Regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
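To ground the aio discussion quoted above: in the domain XML that choice surfaces as the io attribute on the disk driver element, and which value is right depends on how the application manages its storage, which is exactly the app-specific knowledge being described. A sketch with placeholder paths:

    <disk type='file' device='disk'>
      <!-- io='native' vs io='threads' is the aio= choice discussed above;
           cache='none' is the setting io='native' is commonly paired with -->
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/var/lib/libvirt/images/guest.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>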