Si-Wei Liu
2021-Dec-15 01:05 UTC
vdpa legacy guest support (was Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero)
On 12/13/2021 9:06 PM, Michael S. Tsirkin wrote:> On Mon, Dec 13, 2021 at 05:59:45PM -0800, Si-Wei Liu wrote: >> >> On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote: >>> On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote: >>>> Sorry for reviving this ancient thread. I was kinda lost for the conclusion >>>> it ended up with. I have the following questions, >>>> >>>> 1. legacy guest support: from the past conversations it doesn't seem the >>>> support will be completely dropped from the table, is my understanding >>>> correct? Actually we're interested in supporting virtio v0.95 guest for x86, >>>> which is backed by the spec at >>>> https://urldefense.com/v3/__https://ozlabs.org/*rusty/virtio-spec/virtio-0.9.5.pdf__;fg!!ACWV5N9M2RV99hQ!dTKmzJwwRsFM7BtSuTDu1cNly5n4XCotH0WYmidzGqHSXt40i7ZU43UcNg7GYxZg$ . Though I'm not sure >>>> if there's request/need to support wilder legacy virtio versions earlier >>>> beyond. >>> I personally feel it's less work to add in kernel than try to >>> work around it in userspace. Jason feels differently. >>> Maybe post the patches and this will prove to Jason it's not >>> too terrible? >> I suppose if the vdpa vendor does support 0.95 in the datapath and ring >> layout level and is limited to x86 only, there should be easy way out. > Note a subtle difference: what matters is that guest, not host is x86. > Matters for emulators which might reorder memory accesses. > I guess this enforcement belongs in QEMU then?Right, I mean to get started, the initial guest driver support and the corresponding QEMU support for transitional vdpa backend can be limited to x86 guest/host only. Since the config space is emulated in QEMU, I suppose it's not hard to enforce in QEMU. QEMU can drive GET_LEGACY, GET_ENDIAN et al ioctls in advance to get the capability from the individual vendor driver. For that, we need another negotiation protocol similar to vhost_user's protocol_features between the vdpa kernel and QEMU, way before the guest driver is ever probed and its feature negotiation kicks in. Not sure we need a GET_MEMORY_ORDER ioctl call from the device, but we can assume weak ordering for legacy at this point (x86 only)?> >> I >> checked with Eli and other Mellanox/NVDIA folks for hardware/firmware level >> 0.95 support, it seems all the ingredient had been there already dated back >> to the DPDK days. The only major thing limiting is in the vDPA software that >> the current vdpa core has the assumption around VIRTIO_F_ACCESS_PLATFORM for >> a few DMA setup ops, which is virtio 1.0 only. >> >>>> 2. suppose some form of legacy guest support needs to be there, how do we >>>> deal with the bogus assumption below in vdpa_get_config() in the short term? >>>> It looks one of the intuitive fix is to move the vdpa_set_features call out >>>> of vdpa_get_config() to vdpa_set_config(). >>>> >>>> ??????? /* >>>> ???????? * Config accesses aren't supposed to trigger before features are >>>> set. >>>> ???????? * If it does happen we assume a legacy guest. >>>> ???????? */ >>>> ??????? if (!vdev->features_valid) >>>> ??????????????? vdpa_set_features(vdev, 0); >>>> ??????? ops->get_config(vdev, offset, buf, len); >>>> >>>> I can post a patch to fix 2) if there's consensus already reached. >>>> >>>> Thanks, >>>> -Siwei >>> I'm not sure how important it is to change that. >>> In any case it only affects transitional devices, right? >>> Legacy only should not care ... >> Yes I'd like to distinguish legacy driver (suppose it is 0.95) against the >> modern one in a transitional device model rather than being legacy only. >> That way a v0.95 and v1.0 supporting vdpa parent can support both types of >> guests without having to reconfigure. Or are you suggesting limit to legacy >> only at the time of vdpa creation would simplify the implementation a lot? >> >> Thanks, >> -Siwei > > I don't know for sure. Take a look at the work Halil was doing > to try and support transitional devices with BE guests.Hmmm, we can have those endianness ioctls defined but the initial QEMU implementation can be started to support x86 guest/host with little endian and weak memory ordering first. The real trick is to detect legacy guest - I am not sure if it's feasible to shift all the legacy detection work to QEMU, or the kernel has to be part of the detection (e.g. the kick before DRIVER_OK thing we have to duplicate the tracking effort in QEMU) as well. Let me take a further look and get back. Meanwhile, I'll check internally to see if a legacy only model would work. Thanks. Thanks, -Siwei> > >>>> On 3/2/2021 2:53 AM, Jason Wang wrote: >>>>> On 2021/3/2 5:47 ??, Michael S. Tsirkin wrote: >>>>>> On Mon, Mar 01, 2021 at 11:56:50AM +0800, Jason Wang wrote: >>>>>>> On 2021/3/1 5:34 ??, Michael S. Tsirkin wrote: >>>>>>>> On Wed, Feb 24, 2021 at 10:24:41AM -0800, Si-Wei Liu wrote: >>>>>>>>>> Detecting it isn't enough though, we will need a new ioctl to notify >>>>>>>>>> the kernel that it's a legacy guest. Ugh :( >>>>>>>>> Well, although I think adding an ioctl is doable, may I >>>>>>>>> know what the use >>>>>>>>> case there will be for kernel to leverage such info >>>>>>>>> directly? Is there a >>>>>>>>> case QEMU can't do with dedicate ioctls later if there's indeed >>>>>>>>> differentiation (legacy v.s. modern) needed? >>>>>>>> BTW a good API could be >>>>>>>> >>>>>>>> #define VHOST_SET_ENDIAN _IOW(VHOST_VIRTIO, ?, int) >>>>>>>> #define VHOST_GET_ENDIAN _IOW(VHOST_VIRTIO, ?, int) >>>>>>>> >>>>>>>> we did it per vring but maybe that was a mistake ... >>>>>>> Actually, I wonder whether it's good time to just not support >>>>>>> legacy driver >>>>>>> for vDPA. Consider: >>>>>>> >>>>>>> 1) It's definition is no-normative >>>>>>> 2) A lot of budren of codes >>>>>>> >>>>>>> So qemu can still present the legacy device since the config >>>>>>> space or other >>>>>>> stuffs that is presented by vhost-vDPA is not expected to be >>>>>>> accessed by >>>>>>> guest directly. Qemu can do the endian conversion when necessary >>>>>>> in this >>>>>>> case? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>> Overall I would be fine with this approach but we need to avoid breaking >>>>>> working userspace, qemu releases with vdpa support are out there and >>>>>> seem to work for people. Any changes need to take that into account >>>>>> and document compatibility concerns. >>>>> Agree, let me check. >>>>> >>>>> >>>>>> ? I note that any hardware >>>>>> implementation is already broken for legacy except on platforms with >>>>>> strong ordering which might be helpful in reducing the scope. >>>>> Yes. >>>>> >>>>> Thanks >>>>> >>>>>
Jason Wang
2021-Dec-15 02:06 UTC
vdpa legacy guest support (was Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero)
On Wed, Dec 15, 2021 at 9:05 AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:> > > > On 12/13/2021 9:06 PM, Michael S. Tsirkin wrote: > > On Mon, Dec 13, 2021 at 05:59:45PM -0800, Si-Wei Liu wrote: > >> > >> On 12/12/2021 1:26 AM, Michael S. Tsirkin wrote: > >>> On Fri, Dec 10, 2021 at 05:44:15PM -0800, Si-Wei Liu wrote: > >>>> Sorry for reviving this ancient thread. I was kinda lost for the conclusion > >>>> it ended up with. I have the following questions, > >>>> > >>>> 1. legacy guest support: from the past conversations it doesn't seem the > >>>> support will be completely dropped from the table, is my understanding > >>>> correct? Actually we're interested in supporting virtio v0.95 guest for x86, > >>>> which is backed by the spec at > >>>> https://urldefense.com/v3/__https://ozlabs.org/*rusty/virtio-spec/virtio-0.9.5.pdf__;fg!!ACWV5N9M2RV99hQ!dTKmzJwwRsFM7BtSuTDu1cNly5n4XCotH0WYmidzGqHSXt40i7ZU43UcNg7GYxZg$ . Though I'm not sure > >>>> if there's request/need to support wilder legacy virtio versions earlier > >>>> beyond. > >>> I personally feel it's less work to add in kernel than try to > >>> work around it in userspace. Jason feels differently. > >>> Maybe post the patches and this will prove to Jason it's not > >>> too terrible? > >> I suppose if the vdpa vendor does support 0.95 in the datapath and ring > >> layout level and is limited to x86 only, there should be easy way out. > > Note a subtle difference: what matters is that guest, not host is x86. > > Matters for emulators which might reorder memory accesses. > > I guess this enforcement belongs in QEMU then? > Right, I mean to get started, the initial guest driver support and the > corresponding QEMU support for transitional vdpa backend can be limited > to x86 guest/host only. Since the config space is emulated in QEMU, I > suppose it's not hard to enforce in QEMU.It's more than just config space, most devices have headers before the buffer.> QEMU can drive GET_LEGACY, > GET_ENDIAN et al ioctls in advance to get the capability from the > individual vendor driver. For that, we need another negotiation protocol > similar to vhost_user's protocol_features between the vdpa kernel and > QEMU, way before the guest driver is ever probed and its feature > negotiation kicks in. Not sure we need a GET_MEMORY_ORDER ioctl call > from the device, but we can assume weak ordering for legacy at this > point (x86 only)?I'm lost here, we have get_features() so: 1) VERSION_1 means the device uses LE if provided, otherwise natvie 2) ORDER_PLATFORM means device requires platform ordering Any reason for having a new API for this?> > > > >> I > >> checked with Eli and other Mellanox/NVDIA folks for hardware/firmware level > >> 0.95 support, it seems all the ingredient had been there already dated back > >> to the DPDK days. The only major thing limiting is in the vDPA software that > >> the current vdpa core has the assumption around VIRTIO_F_ACCESS_PLATFORM for > >> a few DMA setup ops, which is virtio 1.0 only. > >> > >>>> 2. suppose some form of legacy guest support needs to be there, how do we > >>>> deal with the bogus assumption below in vdpa_get_config() in the short term? > >>>> It looks one of the intuitive fix is to move the vdpa_set_features call out > >>>> of vdpa_get_config() to vdpa_set_config(). > >>>> > >>>> /* > >>>> * Config accesses aren't supposed to trigger before features are > >>>> set. > >>>> * If it does happen we assume a legacy guest. > >>>> */ > >>>> if (!vdev->features_valid) > >>>> vdpa_set_features(vdev, 0); > >>>> ops->get_config(vdev, offset, buf, len); > >>>> > >>>> I can post a patch to fix 2) if there's consensus already reached. > >>>> > >>>> Thanks, > >>>> -Siwei > >>> I'm not sure how important it is to change that. > >>> In any case it only affects transitional devices, right? > >>> Legacy only should not care ... > >> Yes I'd like to distinguish legacy driver (suppose it is 0.95) against the > >> modern one in a transitional device model rather than being legacy only. > >> That way a v0.95 and v1.0 supporting vdpa parent can support both types of > >> guests without having to reconfigure. Or are you suggesting limit to legacy > >> only at the time of vdpa creation would simplify the implementation a lot? > >> > >> Thanks, > >> -Siwei > > > > I don't know for sure. Take a look at the work Halil was doing > > to try and support transitional devices with BE guests. > Hmmm, we can have those endianness ioctls defined but the initial QEMU > implementation can be started to support x86 guest/host with little > endian and weak memory ordering first. The real trick is to detect > legacy guest - I am not sure if it's feasible to shift all the legacy > detection work to QEMU, or the kernel has to be part of the detection > (e.g. the kick before DRIVER_OK thing we have to duplicate the tracking > effort in QEMU) as well. Let me take a further look and get back.Michael may think differently but I think doing this in Qemu is much easier. Thanks> > Meanwhile, I'll check internally to see if a legacy only model would > work. Thanks. > > Thanks, > -Siwei > > > > > > > >>>> On 3/2/2021 2:53 AM, Jason Wang wrote: > >>>>> On 2021/3/2 5:47 ??, Michael S. Tsirkin wrote: > >>>>>> On Mon, Mar 01, 2021 at 11:56:50AM +0800, Jason Wang wrote: > >>>>>>> On 2021/3/1 5:34 ??, Michael S. Tsirkin wrote: > >>>>>>>> On Wed, Feb 24, 2021 at 10:24:41AM -0800, Si-Wei Liu wrote: > >>>>>>>>>> Detecting it isn't enough though, we will need a new ioctl to notify > >>>>>>>>>> the kernel that it's a legacy guest. Ugh :( > >>>>>>>>> Well, although I think adding an ioctl is doable, may I > >>>>>>>>> know what the use > >>>>>>>>> case there will be for kernel to leverage such info > >>>>>>>>> directly? Is there a > >>>>>>>>> case QEMU can't do with dedicate ioctls later if there's indeed > >>>>>>>>> differentiation (legacy v.s. modern) needed? > >>>>>>>> BTW a good API could be > >>>>>>>> > >>>>>>>> #define VHOST_SET_ENDIAN _IOW(VHOST_VIRTIO, ?, int) > >>>>>>>> #define VHOST_GET_ENDIAN _IOW(VHOST_VIRTIO, ?, int) > >>>>>>>> > >>>>>>>> we did it per vring but maybe that was a mistake ... > >>>>>>> Actually, I wonder whether it's good time to just not support > >>>>>>> legacy driver > >>>>>>> for vDPA. Consider: > >>>>>>> > >>>>>>> 1) It's definition is no-normative > >>>>>>> 2) A lot of budren of codes > >>>>>>> > >>>>>>> So qemu can still present the legacy device since the config > >>>>>>> space or other > >>>>>>> stuffs that is presented by vhost-vDPA is not expected to be > >>>>>>> accessed by > >>>>>>> guest directly. Qemu can do the endian conversion when necessary > >>>>>>> in this > >>>>>>> case? > >>>>>>> > >>>>>>> Thanks > >>>>>>> > >>>>>> Overall I would be fine with this approach but we need to avoid breaking > >>>>>> working userspace, qemu releases with vdpa support are out there and > >>>>>> seem to work for people. Any changes need to take that into account > >>>>>> and document compatibility concerns. > >>>>> Agree, let me check. > >>>>> > >>>>> > >>>>>> I note that any hardware > >>>>>> implementation is already broken for legacy except on platforms with > >>>>>> strong ordering which might be helpful in reducing the scope. > >>>>> Yes. > >>>>> > >>>>> Thanks > >>>>> > >>>>> >