thr3ads.net - Virtualization - [PATCH] virtio: vdpa: new SolidNET DPU driver. [Nov 2022]

If this information is useful, please help other people find it:
Share via:

Alvaro Karsz

2022-Nov-09 09:44 UTC

[PATCH] virtio: vdpa: new SolidNET DPU driver.

Hi Jason,
> If I was not wrong, this value seems doesn't change. If yes, I wonder
we
> can move the kick_offset logic to snet_alloc_dev()?

There is not a real reason to have this logic in snet_set_vq_address,
so it could be moved to snet_build_vqs (which is called within
snet_alloc_dev).
> -EOPNOTSUPP?

Returning an error from the set_vq_state callback leads to probe failure.
This code is taken from drivers/virtio/virtio_vdpa.c,
virtio_vdpa_setup_vq function:
>  struct vdpa_vq_state state = {0};
>
> ......
>
> err = ops->set_vq_state(vdpa, index, &state);
> if (err)
>         goto err_vq;

I could check struct vdpa_vq_state, and return 0 if struct
vdpa_vq_state value is 0, -EOPNOTSUPP otherwise.
What do you think?
> I fail to understand how can this work. E.g after reset there should be
> no interaction from the device like DMA, otherwise there could have
> security implications.

You're right, I'll add a proper reset callback.
> Since read is ordered with write, a more easy way is to perform a ioread
> here.
> Interesting, which barrier is this paired?

Usually reads are slow, but we don't care about speed when sending
messages (since we only send a "destroy" message so far, meaning that
the pci remove callback was called), so the memory barrier can be
replaced with a read operation.
>
> Do we need to do endian conversion here (cpu_to_le64())?

Yes, I'll add it.
>
> Need to take endianess into account.

I'm not sure about that.
The endianness appears to be handled by the caller.
Function from include/linux/virtio_config.h
> static inline void virtio_cwrite16(struct virtio_device *vdev,
>   unsigned int offset, u16 val)
> {
>     __virtio16 v;
>
>
>     might_sleep();
>     v = cpu_to_virtio16(vdev, val);
>     vdev->config->set(vdev, offset, &v, sizeof(v));
> }
> static inline void virtio_cwrite32(struct virtio_device *vdev,
>   unsigned int offset, u32 val)
> {
>     __virtio32 v;
>
>
>     might_sleep();
>     v = cpu_to_virtio32(vdev, val);
>     vdev->config->set(vdev, offset, &v, sizeof(v));
> }
>
> static inline void virtio_cwrite64(struct virtio_device *vdev,
>   unsigned int offset, u64 val)
> {
>     __virtio64 v;
>
>
>     might_sleep();
>     v = cpu_to_virtio64(vdev, val);
>     vdev->config->set(vdev, offset, &v, sizeof(v));
> }

I'm not sure how the endianness can be handled by the vDPA driver.
This function may be called for a 8bit, 16bit, 32bit or 64bit variables.
It theoretically may be called to change multiple variables at once.
It may be called to change part of a variable.

>  If I was not wrong, the device depends on the platform IOMMU to work. So
> unless device have a more strict iova range than what platform IOMMU can
> provide, we can simply not advertise this and use the one that is
> provided by the IOMMU:
>
>
> static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
> {
>          struct vdpa_iova_range *range = &v->range;
>          struct vdpa_device *vdpa = v->vdpa;
>          const struct vdpa_config_ops *ops = vdpa->config;
>
>
>          if (ops->get_iova_range) {
>                  *range = ops->get_iova_range(vdpa);
>          } else if (v->domain &&
v->domain->geometry.force_aperture) {
>                  range->first =
v->domain->geometry.aperture_start;
>                  range->last = v->domain->geometry.aperture_end;
>          } else {
>                  range->first = 0;
>                  range->last = ULLONG_MAX;
>          }
> }

I'll delete the snet_get_iova_range function.
> Any chance to use device managed region helper here? It seems to
> simplify the codes (e.g the cleanup stuffs).

Ok, I'll do it.
> Is this better to say "config is not ready"? Btw, I wonder if it
makes
> more sense to wait until the confg is ready with a timeout?

Good idea, I'll implement the wait & timeout.
> I wonder if it's worth to bother consider we're using devres to
manage irqs.

You're right, this isn't required, I'll remove it.
>
> It looks to me all the devices created here use the same dma_dev (the
> PCI device), I wonder how the DMA is isolated among the vDPA devices
> created here.

All vDPA devices indeed use the same DMA device, there is no isolation
between the devices.
I'm not sure why there should be isolation in this case.
> Btw, the vDPA has been switched to use vDPA tool to create devices, it
> is suggested to implement the mgmt devices as what other parents did.
> Then the snet_alloc_dev() could be used for dev_add().

We want to leave control to the DPU at the moment, the number/type of
devices is determined by the DPU's config, and can't be controlled
from userspace.
> There looks like a race, the vDPA device could be registered to the bus
> and used by userspace by bus master/drvdata is set.

You're right, the vdpa registration should be done after the
master/drvdata is set.
> I think devres should take care of this since we're using
> pcim_enable_device()?

You're right, this isn't required, I'll remove it.
> According to the code, this seems to be the driver features and

These are the negotiated features
We're not keeping record of the driver features, when
set_driver_features is called, we just logic AND the driver features
with the supported features received from the DPU.
I'll rename it to be 'negotiated_features", this seems more
accurate.
> static int snet_set_drv_features(struct vdpa_device *vdev, u64 features)
> {
>     struct snet *snet = vdpa_to_snet(vdev);
>
>
>     snet->used_features = snet->cfg->features & features;
>     return 0;
> }

> This seems to be unused.

You're right, I'll remove it.


Thank you for your comments.
I'll send a new version once I finish working on the comments.

Alvaro

Jason Wang

2022-Nov-10 01:54 UTC

head link

[PATCH] virtio: vdpa: new SolidNET DPU driver.

? 2022/11/9 17:44, Alvaro Karsz ??:> Hi Jason,
>
>> If I was not wrong, this value seems doesn't change. If yes, I
wonder we
>> can move the kick_offset logic to snet_alloc_dev()?
>
> There is not a real reason to have this logic in snet_set_vq_address,
> so it could be moved to snet_build_vqs (which is called within
> snet_alloc_dev).
>
>> -EOPNOTSUPP?
>
> Returning an error from the set_vq_state callback leads to probe failure.
> This code is taken from drivers/virtio/virtio_vdpa.c,
> virtio_vdpa_setup_vq function:
>
>>   struct vdpa_vq_state state = {0};
>>
>> ......
>>
>> err = ops->set_vq_state(vdpa, index, &state);
>> if (err)
>>          goto err_vq;
>
> I could check struct vdpa_vq_state, and return 0 if struct
> vdpa_vq_state value is 0, -EOPNOTSUPP otherwise.
> What do you think?

That would be fine. (It could be something similar to 
vp_vdpa_set_vq_state_split()).

>
>> I fail to understand how can this work. E.g after reset there should be
>> no interaction from the device like DMA, otherwise there could have
>> security implications.
>
> You're right, I'll add a proper reset callback.
>
>> Since read is ordered with write, a more easy way is to perform a
ioread
>> here.
>> Interesting, which barrier is this paired?
>
> Usually reads are slow, but we don't care about speed when sending
> messages (since we only send a "destroy" message so far, meaning
that
> the pci remove callback was called), so the memory barrier can be
> replaced with a read operation.

Yes, actually, the memory barrier can't guarantee the write has been 
processed by the device.

>
>> Do we need to do endian conversion here (cpu_to_le64())?
>
> Yes, I'll add it.
>
>> Need to take endianess into account.
>
> I'm not sure about that.
> The endianness appears to be handled by the caller.
> Function from include/linux/virtio_config.h
>
>> static inline void virtio_cwrite16(struct virtio_device *vdev,
>>    unsigned int offset, u16 val)
>> {
>>      __virtio16 v;
>>
>>
>>      might_sleep();
>>      v = cpu_to_virtio16(vdev, val);
>>      vdev->config->set(vdev, offset, &v, sizeof(v));
>> }

You are right.

>
>> static inline void virtio_cwrite32(struct virtio_device *vdev,
>>    unsigned int offset, u32 val)
>> {
>>      __virtio32 v;
>>
>>
>>      might_sleep();
>>      v = cpu_to_virtio32(vdev, val);
>>      vdev->config->set(vdev, offset, &v, sizeof(v));
>> }
>>
>
>> static inline void virtio_cwrite64(struct virtio_device *vdev,
>>    unsigned int offset, u64 val)
>> {
>>      __virtio64 v;
>>
>>
>>      might_sleep();
>>      v = cpu_to_virtio64(vdev, val);
>>      vdev->config->set(vdev, offset, &v, sizeof(v));
>> }
>
> I'm not sure how the endianness can be handled by the vDPA driver.
> This function may be called for a 8bit, 16bit, 32bit or 64bit variables.
> It theoretically may be called to change multiple variables at once.
> It may be called to change part of a variable.
>
>
>>   If I was not wrong, the device depends on the platform IOMMU to work.
So
>> unless device have a more strict iova range than what platform IOMMU
can
>> provide, we can simply not advertise this and use the one that is
>> provided by the IOMMU:
>>
>>
>> static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
>> {
>>           struct vdpa_iova_range *range = &v->range;
>>           struct vdpa_device *vdpa = v->vdpa;
>>           const struct vdpa_config_ops *ops = vdpa->config;
>>
>>
>>           if (ops->get_iova_range) {
>>                   *range = ops->get_iova_range(vdpa);
>>           } else if (v->domain &&
v->domain->geometry.force_aperture) {
>>                   range->first =
v->domain->geometry.aperture_start;
>>                   range->last =
v->domain->geometry.aperture_end;
>>           } else {
>>                   range->first = 0;
>>                   range->last = ULLONG_MAX;
>>           }
>> }
>
> I'll delete the snet_get_iova_range function.
>
>> Any chance to use device managed region helper here? It seems to
>> simplify the codes (e.g the cleanup stuffs).
>
> Ok, I'll do it.
>
>> Is this better to say "config is not ready"? Btw, I wonder if
it makes
>> more sense to wait until the confg is ready with a timeout?
>
> Good idea, I'll implement the wait & timeout.
>
>> I wonder if it's worth to bother consider we're using devres to
manage irqs.
>
> You're right, this isn't required, I'll remove it.
>
>> It looks to me all the devices created here use the same dma_dev (the
>> PCI device), I wonder how the DMA is isolated among the vDPA devices
>> created here.
>
> All vDPA devices indeed use the same DMA device, there is no isolation
> between the devices.
> I'm not sure why there should be isolation in this case.

Each vDPA needs to be able to assigned to a userspace(VM) independently 
when bound to vhost-vDPA. If they share the same IOMMU domain, there 
will be security issues.

>
>> Btw, the vDPA has been switched to use vDPA tool to create devices, it
>> is suggested to implement the mgmt devices as what other parents did.
>> Then the snet_alloc_dev() could be used for dev_add().
>
> We want to leave control to the DPU at the moment, the number/type of
> devices is determined by the DPU's config, and can't be controlled
> from userspace.

That's fine, it might be better to state this in the changelog or as a 
comment somewhere in the code.

>
>> There looks like a race, the vDPA device could be registered to the bus
>> and used by userspace by bus master/drvdata is set.
>
> You're right, the vdpa registration should be done after the
> master/drvdata is set.
>
>> I think devres should take care of this since we're using
>> pcim_enable_device()?
>
> You're right, this isn't required, I'll remove it.
>
>> According to the code, this seems to be the driver features and
>
> These are the negotiated features
> We're not keeping record of the driver features, when
> set_driver_features is called, we just logic AND the driver features
> with the supported features received from the DPU.
> I'll rename it to be 'negotiated_features", this seems more
accurate.

Ok.

>
>> static int snet_set_drv_features(struct vdpa_device *vdev, u64
features)
>> {
>>      struct snet *snet = vdpa_to_snet(vdev);
>>
>>
>>      snet->used_features = snet->cfg->features & features;
>>      return 0;
>> }
>
>
>> This seems to be unused.
>
> You're right, I'll remove it.
>
>
> Thank you for your comments.
> I'll send a new version once I finish working on the comments.
>
> Alvaro

Thanks

>

Virtualization - Nov 2022 - [PATCH] virtio: vdpa: new SolidNET DPU driver.

[PATCH] virtio: vdpa: new SolidNET DPU driver.

[PATCH] virtio: vdpa: new SolidNET DPU driver.