thr3ads.net - Virtualization - vdpa legacy guest support (was Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero) [Dec 2021]

Si-Wei Liu

2021-Dec-14 01:59 UTC

vdpa legacy guest support (was Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero)

On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote:
> On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote:
>> Sorry for reviving this ancient thread. I was kinda lost for the conclusion
>> it ended up with. I have the following questions,
>>
>> 1. legacy guest support: from the past conversations it doesn't seem the
>> support will be completely dropped from the table, is my understanding
>> correct? Actually we're interested in supporting virtio v0.95 guest for x86,
>> which is backed by the spec at
>> https://urldefense.com/v3/__https://ozlabs.org/*rusty/virtio-spec/virtio-0.9.5.pdf__;fg!!ACWV5N9M2RV99hQ!dTKmzJwwRsFM7BtSuTDu1cNly5n4XCotH0WYmidzGqHSXt40i7ZU43UcNg7GYxZg$
>> . Though I'm not sure
>> if there's request/need to support wilder legacy virtio versions earlier
>> beyond.
> I personally feel it's less work to add in kernel than try to
> work around it in userspace. Jason feels differently.
> Maybe post the patches and this will prove to Jason it's not
> too terrible?
I suppose if the vdpa vendor does support 0.95 at the datapath and ring
layout level and is limited to x86 only, there should be an easy way out. I
checked with Eli and other Mellanox/NVIDIA folks about hardware/firmware
level 0.95 support, and it seems all the ingredients have been there
since the DPDK days. The only major limitation is in the vDPA software:
the current vdpa core assumes VIRTIO_F_ACCESS_PLATFORM for a few DMA
setup ops, and that feature is virtio 1.0 only.
>
>> 2. suppose some form of legacy guest support needs to be there, how do we
>> deal with the bogus assumption below in vdpa_get_config() in the short term?
>> It looks one of the intuitive fix is to move the vdpa_set_features call out
>> of vdpa_get_config() to vdpa_set_config().
>>
>>         /*
>>          * Config accesses aren't supposed to trigger before features are set.
>>          * If it does happen we assume a legacy guest.
>>          */
>>         if (!vdev->features_valid)
>>                 vdpa_set_features(vdev, 0);
>>         ops->get_config(vdev, offset, buf, len);
>>
>> I can post a patch to fix 2) if there's consensus already reached.
>>
>> Thanks,
>> -Siwei
> I'm not sure how important it is to change that.
> In any case it only affects transitional devices, right?
> Legacy only should not care ...
Yes, I'd like to distinguish a legacy driver (suppose it is 0.95) from
the modern one in a transitional device model, rather than being legacy
only. That way a vdpa parent supporting both v0.95 and v1.0 can serve both
types of guests without having to be reconfigured. Or are you suggesting
that limiting to legacy only at vdpa creation time would simplify the
implementation a lot?

Thanks,
-Siwei
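
For reference, the vdpa_get_config() behaviour quoted above, and the alternative of moving the legacy assumption to the config write path, can be sketched roughly as below. This is a user-space mock only: `struct mock_vdpa_dev` and every `mock_*` function are hypothetical stand-ins, not the kernel's vdpa API, and the 8-byte config space is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device state; not the kernel structure. */
struct mock_vdpa_dev {
    bool features_valid;   /* true once features have been written */
    uint64_t features;     /* negotiated feature bits */
    uint8_t config[8];     /* assumed 8-byte config space */
};

static void mock_set_features(struct mock_vdpa_dev *vdev, uint64_t features)
{
    vdev->features = features;
    vdev->features_valid = true;
}

/* Mirrors the quoted logic: a config read before features are set is
 * taken as the mark of a legacy guest, so features are forced to 0. */
static void mock_get_config(struct mock_vdpa_dev *vdev, unsigned int offset,
                            void *buf, unsigned int len)
{
    if (offset > sizeof(vdev->config) || len > sizeof(vdev->config) - offset)
        return;
    if (!vdev->features_valid)
        mock_set_features(vdev, 0);
    memcpy(buf, vdev->config + offset, len);
}

/* Alternative floated in the thread: reads have no side effect ... */
static void mock_get_config_plain(struct mock_vdpa_dev *vdev,
                                  unsigned int offset, void *buf,
                                  unsigned int len)
{
    if (offset > sizeof(vdev->config) || len > sizeof(vdev->config) - offset)
        return;
    memcpy(buf, vdev->config + offset, len);
}

/* ... and only a config write before features implies a legacy guest. */
static void mock_set_config(struct mock_vdpa_dev *vdev, unsigned int offset,
                            const void *buf, unsigned int len)
{
    if (offset > sizeof(vdev->config) || len > sizeof(vdev->config) - offset)
        return;
    if (!vdev->features_valid)
        mock_set_features(vdev, 0);
    memcpy(vdev->config + offset, buf, len);
}
```

With the first variant, a driver that merely reads config space before negotiating features is already committed to features == 0; with the second, that only happens on a write.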
>
>> On 3/2/2021 2:53 AM, Jason Wang wrote:
>>> On 2021/3/2 5:47, Michael S. Tsirkin wrote:
>>>> On Mon, Mar 01, 2021 at 11:56:50AM +0800, Jason Wang wrote:
>>>>> On 2021/3/1 5:34, Michael S. Tsirkin wrote:
>>>>>> On Wed, Feb 24, 2021 at 10:24:41AM -0800, Si-Wei Liu wrote:
>>>>>>>> Detecting it isn't enough though, we will need a new ioctl to notify
>>>>>>>> the kernel that it's a legacy guest. Ugh :(
>>>>>>> Well, although I think adding an ioctl is doable, may I know what the use
>>>>>>> case there will be for kernel to leverage such info directly? Is there a
>>>>>>> case QEMU can't do with dedicate ioctls later if there's indeed
>>>>>>> differentiation (legacy v.s. modern) needed?
>>>>>> BTW a good API could be
>>>>>>
>>>>>> #define VHOST_SET_ENDIAN _IOW(VHOST_VIRTIO, ?, int)
>>>>>> #define VHOST_GET_ENDIAN _IOW(VHOST_VIRTIO, ?, int)
>>>>>>
>>>>>> we did it per vring but maybe that was a mistake ...
>>>>> Actually, I wonder whether it's good time to just not support legacy driver
>>>>> for vDPA. Consider:
>>>>>
>>>>> 1) Its definition is non-normative
>>>>> 2) A lot of burden of code
>>>>>
>>>>> So qemu can still present the legacy device since the config space or other
>>>>> stuffs that is presented by vhost-vDPA is not expected to be accessed by
>>>>> guest directly. Qemu can do the endian conversion when necessary in this
>>>>> case?
>>>>>
>>>>> Thanks
>>>>>
>>>> Overall I would be fine with this approach but we need to avoid breaking
>>>> working userspace, qemu releases with vdpa support are out there and
>>>> seem to work for people. Any changes need to take that into account
>>>> and document compatibility concerns.
>>>
>>> Agree, let me check.
>>>
>>>
>>>> I note that any hardware
>>>> implementation is already broken for legacy except on platforms with
>>>> strong ordering which might be helpful in reducing the scope.
>>>
>>> Yes.
>>>
>>> Thanks
>>>

Jason Wang

2021-Dec-14 03:01 UTC

On Tue, Dec 14, 2021 at 10:00 AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:
>
>
> On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote:
> > On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote:
> >> Sorry for reviving this ancient thread. I was kinda lost for the conclusion
> >> it ended up with. I have the following questions,
> >>
> >> 1. legacy guest support: from the past conversations it doesn't seem the
> >> support will be completely dropped from the table, is my understanding
> >> correct? Actually we're interested in supporting virtio v0.95 guest for x86,
> >> which is backed by the spec at
> >> https://urldefense.com/v3/__https://ozlabs.org/*rusty/virtio-spec/virtio-0.9.5.pdf__;fg!!ACWV5N9M2RV99hQ!dTKmzJwwRsFM7BtSuTDu1cNly5n4XCotH0WYmidzGqHSXt40i7ZU43UcNg7GYxZg$
> >> . Though I'm not sure
> >> if there's request/need to support wilder legacy virtio versions earlier
> >> beyond.
> > I personally feel it's less work to add in kernel than try to
> > work around it in userspace. Jason feels differently.
> > Maybe post the patches and this will prove to Jason it's not
> > too terrible?
> I suppose if the vdpa vendor does support 0.95 in the datapath and ring
> layout level and is limited to x86 only, there should be easy way out.
Note that although I tried to mandate a 1.0 device when writing the code,
the core vdpa doesn't mandate it, and we already have one parent that
is based on the 0.95 spec, namely eni_vdpa:

1) it depends on X86 (so no endian and ordering issues)
2) it has various subtle quirks, e.g. it can't work well without the
mrg_rxbuf feature negotiated, since the device assumes a fixed vnet
header length.
3) it can only be used by legacy drivers in the guest (no VERSION_1,
since the device mandates a 4096 alignment which doesn't comply with
1.0)
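
To make the 4096 alignment in point 3 concrete, here is a minimal sketch of how the legacy split-ring layout is derived from the queue size alone, assuming the layout described for the legacy interface in the virtio spec (16-byte descriptors, 2-byte avail entries, 8-byte used elements, event-index fields included). The struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LEGACY_VRING_ALIGN 4096u  /* alignment of the used ring in the legacy layout */

/* Hypothetical helper type: byte offsets of each ring part from the base. */
struct legacy_vring_layout {
    size_t desc_off;   /* descriptor table */
    size_t avail_off;  /* available ring, directly after the descriptors */
    size_t used_off;   /* used ring, aligned up to LEGACY_VRING_ALIGN */
    size_t total;      /* total bytes the driver must allocate */
};

static size_t align_up(size_t x, size_t a)
{
    return (x + a - 1) & ~(a - 1);  /* a must be a power of two */
}

static struct legacy_vring_layout legacy_vring_compute(unsigned int qsz)
{
    struct legacy_vring_layout l;

    l.desc_off = 0;
    l.avail_off = 16 * (size_t)qsz;
    /* avail: flags + idx + ring[qsz] + used_event, 2 bytes each */
    l.used_off = align_up(l.avail_off + 2 * (3 + (size_t)qsz),
                          LEGACY_VRING_ALIGN);
    /* used: flags + idx + avail_event (2 bytes each) + 8-byte elements */
    l.total = l.used_off + align_up(2 * 3 + 8 * (size_t)qsz,
                                    LEGACY_VRING_ALIGN);
    return l;
}
```

A legacy driver only tells the device the base page of the ring, so the device has to derive the other addresses this way; a 1.0 driver instead programs the three addresses separately and is free to use much smaller alignments.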

So it's a proof of 0.95 parent support in the vDPA core.

And we have a modern-only parent, vp_vdpa (though
it's not hard to add legacy support).

So for all the other vendors, assume they have full support for
transitional devices on x86. As discussed, we need to handle:

1) config access before features
2) kick before driver_ok

Anything else? If not, it looks easier to do them in userspace.
The only advantage of doing it in the kernel is to make it work for
virtio-vdpa. But virtio-vdpa doesn't need transitional devices.
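
As an illustration of item 2 handled in userspace, a VMM could hold back kicks that arrive before DRIVER_OK and replay them once the driver sets that status bit. The sketch below is only a mock under that assumption: `struct mock_backend`, the `mock_*` functions and the limit of 8 virtqueues are hypothetical, and only the value of VIRTIO_CONFIG_S_DRIVER_OK comes from the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_CONFIG_S_DRIVER_OK 4  /* device status bit from the virtio spec */
#define MOCK_MAX_VQS 8               /* arbitrary limit for this sketch */

/* Hypothetical userspace view of one vdpa-backed device. */
struct mock_backend {
    uint8_t status;                            /* last status written by the guest */
    bool pending_kick[MOCK_MAX_VQS];           /* kicks seen before DRIVER_OK */
    unsigned int kicks_forwarded[MOCK_MAX_VQS]; /* kicks passed on to the parent */
};

/* Stand-in for actually notifying the vdpa parent. */
static void mock_forward_kick(struct mock_backend *b, unsigned int vq)
{
    b->kicks_forwarded[vq]++;
}

/* Guest notification: forward only once the device is live. */
static void mock_guest_kick(struct mock_backend *b, unsigned int vq)
{
    if (vq >= MOCK_MAX_VQS)
        return;
    if (b->status & VIRTIO_CONFIG_S_DRIVER_OK)
        mock_forward_kick(b, vq);
    else
        b->pending_kick[vq] = true;  /* legacy driver kicked early: remember it */
}

/* Status write: on the transition to DRIVER_OK, replay deferred kicks. */
static void mock_set_status(struct mock_backend *b, uint8_t status)
{
    bool was_ok = (b->status & VIRTIO_CONFIG_S_DRIVER_OK) != 0;

    b->status = status;
    if (was_ok || !(status & VIRTIO_CONFIG_S_DRIVER_OK))
        return;
    for (unsigned int vq = 0; vq < MOCK_MAX_VQS; vq++) {
        if (b->pending_kick[vq]) {
            b->pending_kick[vq] = false;
            mock_forward_kick(b, vq);
        }
    }
}
```

Several early kicks on the same queue collapse into one replayed kick here, which should be harmless since a kick only says "look at the ring".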
> I
> checked with Eli and other Mellanox/NVIDIA folks for hardware/firmware
> level 0.95 support, it seems all the ingredient had been there already
> dated back to the DPDK days. The only major thing limiting is in the
> vDPA software that the current vdpa core has the assumption around
> VIRTIO_F_ACCESS_PLATFORM for a few DMA setup ops, which is virtio 1.0 only.
The code doesn't have such an assumption, or is there anything I missed?
Or did you mean vhost-vdpa, which tries to talk to the IOMMU layer directly?
That should be OK since the host IOMMU is hidden from the guest anyway.
>
> >
> >> 2. suppose some form of legacy guest support needs to be there, how do we
> >> deal with the bogus assumption below in vdpa_get_config() in the short term?
> >> It looks one of the intuitive fix is to move the vdpa_set_features call out
> >> of vdpa_get_config() to vdpa_set_config().
> >>
> >>          /*
> >>           * Config accesses aren't supposed to trigger before features are set.
> >>           * If it does happen we assume a legacy guest.
> >>           */
> >>          if (!vdev->features_valid)
> >>                  vdpa_set_features(vdev, 0);
> >>          ops->get_config(vdev, offset, buf, len);
> >>
> >> I can post a patch to fix 2) if there's consensus already reached.
> >>
> >> Thanks,
> >> -Siwei
> > I'm not sure how important it is to change that.
> > In any case it only affects transitional devices, right?
> > Legacy only should not care ...
> Yes I'd like to distinguish legacy driver (suppose it is 0.95) against
> the modern one in a transitional device model rather than being legacy
> only. That way a v0.95 and v1.0 supporting vdpa parent can support both
> types of guests without having to reconfigure.
I think this is how a transitional device is expected to work.

Thanks
> Or are you suggesting
> limit to legacy only at the time of vdpa creation would simplify the
> implementation a lot?
>
> Thanks,
> -Siwei

Michael S. Tsirkin

2021-Dec-14 05:06 UTC

On Mon, Dec 13, 2021 at 05:59:45PM -0800, Si-Wei Liu wrote:
>
>
> On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote:
> > On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote:
> > > Sorry for reviving this ancient thread. I was kinda lost for the conclusion
> > > it ended up with. I have the following questions,
> > >
> > > 1. legacy guest support: from the past conversations it doesn't seem the
> > > support will be completely dropped from the table, is my understanding
> > > correct? Actually we're interested in supporting virtio v0.95 guest for x86,
> > > which is backed by the spec at
> > > https://urldefense.com/v3/__https://ozlabs.org/*rusty/virtio-spec/virtio-0.9.5.pdf__;fg!!ACWV5N9M2RV99hQ!dTKmzJwwRsFM7BtSuTDu1cNly5n4XCotH0WYmidzGqHSXt40i7ZU43UcNg7GYxZg$
> > > . Though I'm not sure
> > > if there's request/need to support wilder legacy virtio versions earlier
> > > beyond.
> > I personally feel it's less work to add in kernel than try to
> > work around it in userspace. Jason feels differently.
> > Maybe post the patches and this will prove to Jason it's not
> > too terrible?
> I suppose if the vdpa vendor does support 0.95 in the datapath and ring
> layout level and is limited to x86 only, there should be easy way out.
Note a subtle difference: what matters is that the guest, not the host, is x86.
This matters for emulators, which might reorder memory accesses.
I guess this enforcement belongs in QEMU then?
> I
> checked with Eli and other Mellanox/NVIDIA folks for hardware/firmware level
> 0.95 support, it seems all the ingredient had been there already dated back
> to the DPDK days. The only major thing limiting is in the vDPA software that
> the current vdpa core has the assumption around VIRTIO_F_ACCESS_PLATFORM for
> a few DMA setup ops, which is virtio 1.0 only.
>
> >
> > > 2. suppose some form of legacy guest support needs to be there, how do we
> > > deal with the bogus assumption below in vdpa_get_config() in the short term?
> > > It looks one of the intuitive fix is to move the vdpa_set_features call out
> > > of vdpa_get_config() to vdpa_set_config().
> > >
> > >         /*
> > >          * Config accesses aren't supposed to trigger before features are set.
> > >          * If it does happen we assume a legacy guest.
> > >          */
> > >         if (!vdev->features_valid)
> > >                 vdpa_set_features(vdev, 0);
> > >         ops->get_config(vdev, offset, buf, len);
> > >
> > > I can post a patch to fix 2) if there's consensus already reached.
> > >
> > > Thanks,
> > > -Siwei
> > I'm not sure how important it is to change that.
> > In any case it only affects transitional devices, right?
> > Legacy only should not care ...
> Yes I'd like to distinguish legacy driver (suppose it is 0.95) against the
> modern one in a transitional device model rather than being legacy only.
> That way a v0.95 and v1.0 supporting vdpa parent can support both types of
> guests without having to reconfigure. Or are you suggesting limit to legacy
> only at the time of vdpa creation would simplify the implementation a lot?
> 
> Thanks,
> -Siwei

I don't know for sure. Take a look at the work Halil was doing
to try and support transitional devices with BE guests.
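
For context on why BE guests are the hard case: a legacy driver uses guest-native byte order for device fields, while a VERSION_1 driver always uses little-endian, so whoever sits between guest and device has to know which one it is talking to. The following is a rough sketch of that decision for a 16-bit field, with hypothetical function names and an explicit guest_is_big_endian flag standing in for however the VMM would learn the guest's byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode two raw bytes as little-endian. */
static uint16_t mock_le16_decode(const uint8_t raw[2])
{
    return (uint16_t)(raw[0] | (raw[1] << 8));
}

/* Decode two raw bytes as big-endian. */
static uint16_t mock_be16_decode(const uint8_t raw[2])
{
    return (uint16_t)((raw[0] << 8) | raw[1]);
}

/*
 * Decode a 16-bit virtio field as the guest driver wrote it:
 * - VERSION_1 negotiated: always little-endian;
 * - legacy driver: guest-native order, i.e. big-endian on a BE guest.
 * guest_is_big_endian is an assumed input for this sketch.
 */
static uint16_t mock_virtio16_decode(const uint8_t raw[2], bool version_1,
                                     bool guest_is_big_endian)
{
    if (version_1 || !guest_is_big_endian)
        return mock_le16_decode(raw);
    return mock_be16_decode(raw);
}
```

On an x86 guest both branches give the same answer, which is why restricting legacy support to x86 guests sidesteps the issue.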

