Jason Wang
2021-Dec-15 02:06 UTC
vdpa legacy guest support (was Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero)
On Wed, Dec 15, 2021 at 9:05 AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:
>
> On 12/13/2021 9:06 PM, Michael S. Tsirkin wrote:
> > On Mon, Dec 13, 2021 at 05:59:45PM -0800, Si-Wei Liu wrote:
> >> On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote:
> >>> On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote:
> >>>> Sorry for reviving this ancient thread. I was kinda lost as to the conclusion it ended up with. I have the following questions,
> >>>>
> >>>> 1. legacy guest support: from the past conversations it doesn't seem the support will be completely dropped from the table, is my understanding correct? Actually we're interested in supporting virtio v0.95 guests for x86, which is backed by the spec at https://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf . Though I'm not sure if there's a request/need to support older legacy virtio versions beyond that.
> >>> I personally feel it's less work to add in kernel than try to work around it in userspace. Jason feels differently. Maybe post the patches and this will prove to Jason it's not too terrible?
> >> I suppose if the vdpa vendor does support 0.95 at the datapath and ring layout level and is limited to x86 only, there should be an easy way out.
> > Note a subtle difference: what matters is that the guest, not the host, is x86. Matters for emulators which might reorder memory accesses.
> I guess this enforcement belongs in QEMU then?
> Right, I mean to get started, the initial guest driver support and the corresponding QEMU support for a transitional vdpa backend can be limited to x86 guest/host only. Since the config space is emulated in QEMU, I suppose it's not hard to enforce in QEMU.

It's more than just config space, most devices have headers before the buffer.

> QEMU can drive GET_LEGACY, GET_ENDIAN et al ioctls in advance to get the capability from the individual vendor driver. For that, we need another negotiation protocol similar to vhost_user's protocol_features between the vdpa kernel and QEMU, way before the guest driver is ever probed and its feature negotiation kicks in. Not sure we need a GET_MEMORY_ORDER ioctl call from the device, but we can assume weak ordering for legacy at this point (x86 only)?

I'm lost here, we have get_features() so:

1) VERSION_1 means the device uses LE if provided, otherwise native
2) ORDER_PLATFORM means the device requires platform ordering

Any reason for having a new API for this?

> >> I checked with Eli and other Mellanox/NVIDIA folks for hardware/firmware level 0.95 support, it seems all the ingredients had been there already, dating back to the DPDK days. The only major limiting factor is in the vDPA software: the current vdpa core has the assumption around VIRTIO_F_ACCESS_PLATFORM for a few DMA setup ops, which is virtio 1.0 only.
> >>
> >>>> 2. suppose some form of legacy guest support needs to be there, how do we deal with the bogus assumption below in vdpa_get_config() in the short term? It looks like one intuitive fix is to move the vdpa_set_features call out of vdpa_get_config() to vdpa_set_config().
> >>>>
> >>>> /*
> >>>>  * Config accesses aren't supposed to trigger before features are set.
> >>>>  * If it does happen we assume a legacy guest.
> >>>>  */
> >>>> if (!vdev->features_valid)
> >>>>         vdpa_set_features(vdev, 0);
> >>>> ops->get_config(vdev, offset, buf, len);
> >>>>
> >>>> I can post a patch to fix 2) if there's consensus already reached.
> >>>>
> >>>> Thanks,
> >>>> -Siwei
> >>> I'm not sure how important it is to change that. In any case it only affects transitional devices, right? Legacy only should not care ...
> >> Yes, I'd like to distinguish a legacy driver (suppose it is 0.95) from a modern one in a transitional device model rather than being legacy only. That way a v0.95 and v1.0 supporting vdpa parent can support both types of guests without having to reconfigure. Or are you suggesting that limiting to legacy only at the time of vdpa creation would simplify the implementation a lot?
> >>
> >> Thanks,
> >> -Siwei
> > I don't know for sure. Take a look at the work Halil was doing to try and support transitional devices with BE guests.
> Hmmm, we can have those endianness ioctls defined but the initial QEMU implementation can be started to support x86 guest/host with little endian and weak memory ordering first. The real trick is to detect a legacy guest - I am not sure if it's feasible to shift all the legacy detection work to QEMU, or whether the kernel has to be part of the detection as well (e.g. for the kick-before-DRIVER_OK case we'd have to duplicate the tracking effort in QEMU). Let me take a further look and get back.

Michael may think differently but I think doing this in Qemu is much easier.

Thanks

> Meanwhile, I'll check internally to see if a legacy only model would work. Thanks.
>
> Thanks,
> -Siwei
>
> >>>> On 3/2/2021 2:53 AM, Jason Wang wrote:
> >>>>> On 2021/3/2 5:47, Michael S. Tsirkin wrote:
> >>>>>> On Mon, Mar 01, 2021 at 11:56:50AM +0800, Jason Wang wrote:
> >>>>>>> On 2021/3/1 5:34, Michael S. Tsirkin wrote:
> >>>>>>>> On Wed, Feb 24, 2021 at 10:24:41AM -0800, Si-Wei Liu wrote:
> >>>>>>>>>> Detecting it isn't enough though, we will need a new ioctl to notify the kernel that it's a legacy guest. Ugh :(
> >>>>>>>>> Well, although I think adding an ioctl is doable, may I know what the use case will be for the kernel to leverage such info directly? Is there a case QEMU can't handle with dedicated ioctls later if differentiation (legacy vs. modern) is indeed needed?
> >>>>>>>> BTW a good API could be
> >>>>>>>>
> >>>>>>>> #define VHOST_SET_ENDIAN _IOW(VHOST_VIRTIO, ?, int)
> >>>>>>>> #define VHOST_GET_ENDIAN _IOW(VHOST_VIRTIO, ?, int)
> >>>>>>>>
> >>>>>>>> we did it per vring but maybe that was a mistake ...
> >>>>>>> Actually, I wonder whether it's a good time to just not support legacy drivers for vDPA. Consider:
> >>>>>>>
> >>>>>>> 1) Its definition is non-normative
> >>>>>>> 2) A lot of burden of code
> >>>>>>>
> >>>>>>> So qemu can still present the legacy device since the config space or other stuff that is presented by vhost-vDPA is not expected to be accessed by the guest directly. Qemu can do the endian conversion when necessary in this case?
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>> Overall I would be fine with this approach but we need to avoid breaking working userspace, qemu releases with vdpa support are out there and seem to work for people. Any changes need to take that into account and document compatibility concerns.
> >>>>> Agree, let me check.
> >>>>>
> >>>>>> I note that any hardware implementation is already broken for legacy except on platforms with strong ordering which might be helpful in reducing the scope.
> >>>>> Yes.
> >>>>>
> >>>>> Thanks
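As a concrete illustration of the point Jason makes above (the existing feature words already carry the endianness and ordering information that new GET_ENDIAN/GET_MEMORY_ORDER ioctls would duplicate), a minimal userspace sketch could read the backend features through the existing VHOST_GET_FEATURES ioctl and key off VIRTIO_F_VERSION_1 and VIRTIO_F_ORDER_PLATFORM. This is a sketch only; the device node name is just an example and error handling is minimal.

/* Sketch: derive endianness/ordering hints from the device feature bits
 * exposed by vhost-vdpa, instead of adding new capability ioctls. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>          /* VHOST_GET_FEATURES */
#include <linux/virtio_config.h>  /* VIRTIO_F_VERSION_1, VIRTIO_F_ORDER_PLATFORM */

int main(void)
{
        uint64_t features;
        int fd = open("/dev/vhost-vdpa-0", O_RDWR);  /* example node name */

        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* For vhost-vdpa this returns the device (backend) feature bits. */
        if (ioctl(fd, VHOST_GET_FEATURES, &features) < 0) {
                perror("VHOST_GET_FEATURES");
                return 1;
        }
        if (features & (1ULL << VIRTIO_F_VERSION_1))
                printf("VERSION_1 offered: vrings are little-endian when negotiated\n");
        else
                printf("legacy-only device: vrings use guest-native endianness\n");

        if (features & (1ULL << VIRTIO_F_ORDER_PLATFORM))
                printf("device requires platform (weak) memory ordering\n");
        else
                printf("device assumes the stronger legacy virtio ordering\n");
        close(fd);
        return 0;
}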
Si-Wei Liu
2021-Dec-15 20:52 UTC
vdpa legacy guest support (was Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero)
On 12/14/2021 6:06 PM, Jason Wang wrote:
> On Wed, Dec 15, 2021 at 9:05 AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:
>>
>> On 12/13/2021 9:06 PM, Michael S. Tsirkin wrote:
>>> On Mon, Dec 13, 2021 at 05:59:45PM -0800, Si-Wei Liu wrote:
>>>> On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote:
>>>>> On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote:
>>>>>> Sorry for reviving this ancient thread. I was kinda lost as to the conclusion it ended up with. I have the following questions,
>>>>>>
>>>>>> 1. legacy guest support: from the past conversations it doesn't seem the support will be completely dropped from the table, is my understanding correct? Actually we're interested in supporting virtio v0.95 guests for x86, which is backed by the spec at https://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf . Though I'm not sure if there's a request/need to support older legacy virtio versions beyond that.
>>>>> I personally feel it's less work to add in kernel than try to work around it in userspace. Jason feels differently. Maybe post the patches and this will prove to Jason it's not too terrible?
>>>> I suppose if the vdpa vendor does support 0.95 at the datapath and ring layout level and is limited to x86 only, there should be an easy way out.
>>> Note a subtle difference: what matters is that the guest, not the host, is x86. Matters for emulators which might reorder memory accesses. I guess this enforcement belongs in QEMU then?
>> Right, I mean to get started, the initial guest driver support and the corresponding QEMU support for a transitional vdpa backend can be limited to x86 guest/host only. Since the config space is emulated in QEMU, I suppose it's not hard to enforce in QEMU.
> It's more than just config space, most devices have headers before the buffer.
The ordering in the datapath (data VQs) would have to rely on the vendor's support. Since ORDER_PLATFORM is pretty new (v1.1), I guess vdpa h/w vendors nowadays can/should support the case where ORDER_PLATFORM is not acked by the driver (actually this feature is filtered out by the QEMU vhost-vdpa driver today), even with a v1.0-conforming, modern-only vDPA device. The control VQ is implemented in software in the kernel, which can easily be accommodated/fixed when needed.

>
>> QEMU can drive GET_LEGACY, GET_ENDIAN et al ioctls in advance to get the capability from the individual vendor driver. For that, we need another negotiation protocol similar to vhost_user's protocol_features between the vdpa kernel and QEMU, way before the guest driver is ever probed and its feature negotiation kicks in. Not sure we need a GET_MEMORY_ORDER ioctl call from the device, but we can assume weak ordering for legacy at this point (x86 only)?
> I'm lost here, we have get_features() so:
I assume here you refer to get_device_features(), which Eli just renamed.

>
> 1) VERSION_1 means the device uses LE if provided, otherwise native
> 2) ORDER_PLATFORM means the device requires platform ordering
>
> Any reason for having a new API for this?
Are you going to require all vDPA hardware vendors to support the transitional model for legacy guests, meaning a guest not acknowledging VERSION_1 would instead use the legacy interfaces captured in spec section 7.4 (ring layout, native endianness, message framing, vq alignment of 4096, 32-bit features, no FEATURES_OK bit in status, I/O port interface, i.e. all of it)? Note that we don't yet have a set_device_features() that allows the vdpa device to tell whether it is operating in transitional or modern-only mode. For software virtio, all the support for the legacy part of a transitional model has already been built up there; however, it's not easy for vDPA vendors to implement all the requirements for all-or-nothing legacy guest support (a big-endian guest, for example). To these vendors, legacy support within a transitional model is more of a feature, and it's best to leave them some flexibility to implement partial support for legacy. That in turn calls out the need for a negotiation API, similar to vhost-user protocol features, that can prohibit unsupported guest setups as early as backend_init, before launching the VM.

>
>>>> I checked with Eli and other Mellanox/NVIDIA folks for hardware/firmware level 0.95 support, it seems all the ingredients had been there already, dating back to the DPDK days. The only major limiting factor is in the vDPA software: the current vdpa core has the assumption around VIRTIO_F_ACCESS_PLATFORM for a few DMA setup ops, which is virtio 1.0 only.
>>>>
>>>>>> 2. suppose some form of legacy guest support needs to be there, how do we deal with the bogus assumption below in vdpa_get_config() in the short term? It looks like one intuitive fix is to move the vdpa_set_features call out of vdpa_get_config() to vdpa_set_config().
>>>>>>
>>>>>> /*
>>>>>>  * Config accesses aren't supposed to trigger before features are set.
>>>>>>  * If it does happen we assume a legacy guest.
>>>>>>  */
>>>>>> if (!vdev->features_valid)
>>>>>>         vdpa_set_features(vdev, 0);
>>>>>> ops->get_config(vdev, offset, buf, len);
>>>>>>
>>>>>> I can post a patch to fix 2) if there's consensus already reached.
>>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>> I'm not sure how important it is to change that. In any case it only affects transitional devices, right? Legacy only should not care ...
>>>> Yes, I'd like to distinguish a legacy driver (suppose it is 0.95) from a modern one in a transitional device model rather than being legacy only. That way a v0.95 and v1.0 supporting vdpa parent can support both types of guests without having to reconfigure. Or are you suggesting that limiting to legacy only at the time of vdpa creation would simplify the implementation a lot?
>>>>
>>>> Thanks,
>>>> -Siwei
>>> I don't know for sure. Take a look at the work Halil was doing to try and support transitional devices with BE guests.
>> Hmmm, we can have those endianness ioctls defined but the initial QEMU implementation can be started to support x86 guest/host with little endian and weak memory ordering first. The real trick is to detect a legacy guest - I am not sure if it's feasible to shift all the legacy detection work to QEMU, or whether the kernel has to be part of the detection as well (e.g. for the kick-before-DRIVER_OK case we'd have to duplicate the tracking effort in QEMU). Let me take a further look and get back.
> Michael may think differently but I think doing this in Qemu is much easier.
I think the key is whether we position this as QEMU emulating the legacy interfaces and doing translation on top of a v1.0 modern-only device in the kernel, or we allow the vdpa core (or, you could say, vhost-vdpa) and the vendor driver to support a transitional model in the kernel that can work with both v0.95 and v1.0 drivers, with some slight aid from QEMU for detection/emulation/shadowing (e.g. CVQ, I/O port relay). I guess even for the former we still rely on the vendor for a performant data VQ implementation, which leaves the question of whether what eventually ends up in the kernel is effectively the latter.

Thanks,
-Siwei

>
> Thanks
>
>
>> Meanwhile, I'll check internally to see if a legacy only model would work. Thanks.
>>
>> Thanks,
>> -Siwei
>>
>>>>>> On 3/2/2021 2:53 AM, Jason Wang wrote:
>>>>>>> On 2021/3/2 5:47, Michael S. Tsirkin wrote:
>>>>>>>> On Mon, Mar 01, 2021 at 11:56:50AM +0800, Jason Wang wrote:
>>>>>>>>> On 2021/3/1 5:34, Michael S. Tsirkin wrote:
>>>>>>>>>> On Wed, Feb 24, 2021 at 10:24:41AM -0800, Si-Wei Liu wrote:
>>>>>>>>>>>> Detecting it isn't enough though, we will need a new ioctl to notify the kernel that it's a legacy guest. Ugh :(
>>>>>>>>>>> Well, although I think adding an ioctl is doable, may I know what the use case will be for the kernel to leverage such info directly? Is there a case QEMU can't handle with dedicated ioctls later if differentiation (legacy vs. modern) is indeed needed?
>>>>>>>>>> BTW a good API could be
>>>>>>>>>>
>>>>>>>>>> #define VHOST_SET_ENDIAN _IOW(VHOST_VIRTIO, ?, int)
>>>>>>>>>> #define VHOST_GET_ENDIAN _IOW(VHOST_VIRTIO, ?, int)
>>>>>>>>>>
>>>>>>>>>> we did it per vring but maybe that was a mistake ...
>>>>>>>>> Actually, I wonder whether it's a good time to just not support legacy drivers for vDPA. Consider:
>>>>>>>>>
>>>>>>>>> 1) Its definition is non-normative
>>>>>>>>> 2) A lot of burden of code
>>>>>>>>>
>>>>>>>>> So qemu can still present the legacy device since the config space or other stuff that is presented by vhost-vDPA is not expected to be accessed by the guest directly. Qemu can do the endian conversion when necessary in this case?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>> Overall I would be fine with this approach but we need to avoid breaking working userspace, qemu releases with vdpa support are out there and seem to work for people. Any changes need to take that into account and document compatibility concerns.
>>>>>>> Agree, let me check.
>>>>>>>
>>>>>>>> I note that any hardware implementation is already broken for legacy except on platforms with strong ordering which might be helpful in reducing the scope.
>>>>>>> Yes.
>>>>>>>
>>>>>>> Thanks
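To make "question 2" above concrete, below is a rough sketch of the direction Si-Wei floats: taking the implicit feature commit out of the config-read path. The first function mirrors the vdpa_get_config() snippet quoted in the thread; the "proposed" variant and the comment about who commits features instead are illustrative assumptions, not the patch that was actually posted.

/* Sketch only, based on the snippet quoted in the thread. */
#include <linux/vdpa.h>  /* struct vdpa_device, vdpa_set_features() */

/* Today (as quoted): a config access before feature negotiation silently
 * commits features = 0, i.e. assumes a legacy guest. */
static void vdpa_get_config_today(struct vdpa_device *vdev, unsigned int offset,
                                  void *buf, unsigned int len)
{
        const struct vdpa_config_ops *ops = vdev->config;

        /*
         * Config accesses aren't supposed to trigger before features are set.
         * If it does happen we assume a legacy guest.
         */
        if (!vdev->features_valid)
                vdpa_set_features(vdev, 0);
        ops->get_config(vdev, offset, buf, len);
}

/* Proposed direction (assumption): the config read no longer guesses; the
 * feature commit happens explicitly on the set-features path, e.g. when the
 * bus driver (vhost-vdpa or virtio-vdpa) decides the guest is legacy. */
static void vdpa_get_config_proposed(struct vdpa_device *vdev, unsigned int offset,
                                     void *buf, unsigned int len)
{
        vdev->config->get_config(vdev, offset, buf, len);
}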