? 2022/11/9 17:44, Alvaro Karsz ??:> Hi Jason,
>
>> If I was not wrong, this value seems doesn't change. If yes, I
wonder we
>> can move the kick_offset logic to snet_alloc_dev()?
>
> There is not a real reason to have this logic in snet_set_vq_address,
> so it could be moved to snet_build_vqs (which is called within
> snet_alloc_dev).
>
>> -EOPNOTSUPP?
>
> Returning an error from the set_vq_state callback leads to probe failure.
> This code is taken from drivers/virtio/virtio_vdpa.c,
> virtio_vdpa_setup_vq function:
>
>> struct vdpa_vq_state state = {0};
>>
>> ......
>>
>> err = ops->set_vq_state(vdpa, index, &state);
>> if (err)
>> goto err_vq;
>
> I could check struct vdpa_vq_state, and return 0 if struct
> vdpa_vq_state value is 0, -EOPNOTSUPP otherwise.
> What do you think?
That would be fine. (It could be something similar to
vp_vdpa_set_vq_state_split()).
>
>> I fail to understand how can this work. E.g after reset there should be
>> no interaction from the device like DMA, otherwise there could have
>> security implications.
>
> You're right, I'll add a proper reset callback.
>
>> Since read is ordered with write, a more easy way is to perform a
ioread
>> here.
>> Interesting, which barrier is this paired?
>
> Usually reads are slow, but we don't care about speed when sending
> messages (since we only send a "destroy" message so far, meaning
that
> the pci remove callback was called), so the memory barrier can be
> replaced with a read operation.
Yes, actually, the memory barrier can't guarantee the write has been
processed by the device.
>
>> Do we need to do endian conversion here (cpu_to_le64())?
>
> Yes, I'll add it.
>
>> Need to take endianess into account.
>
> I'm not sure about that.
> The endianness appears to be handled by the caller.
> Function from include/linux/virtio_config.h
>
>> static inline void virtio_cwrite16(struct virtio_device *vdev,
>> unsigned int offset, u16 val)
>> {
>> __virtio16 v;
>>
>>
>> might_sleep();
>> v = cpu_to_virtio16(vdev, val);
>> vdev->config->set(vdev, offset, &v, sizeof(v));
>> }
You are right.
>
>> static inline void virtio_cwrite32(struct virtio_device *vdev,
>> unsigned int offset, u32 val)
>> {
>> __virtio32 v;
>>
>>
>> might_sleep();
>> v = cpu_to_virtio32(vdev, val);
>> vdev->config->set(vdev, offset, &v, sizeof(v));
>> }
>>
>
>> static inline void virtio_cwrite64(struct virtio_device *vdev,
>> unsigned int offset, u64 val)
>> {
>> __virtio64 v;
>>
>>
>> might_sleep();
>> v = cpu_to_virtio64(vdev, val);
>> vdev->config->set(vdev, offset, &v, sizeof(v));
>> }
>
> I'm not sure how the endianness can be handled by the vDPA driver.
> This function may be called for a 8bit, 16bit, 32bit or 64bit variables.
> It theoretically may be called to change multiple variables at once.
> It may be called to change part of a variable.
>
>
>> If I was not wrong, the device depends on the platform IOMMU to work.
So
>> unless device have a more strict iova range than what platform IOMMU
can
>> provide, we can simply not advertise this and use the one that is
>> provided by the IOMMU:
>>
>>
>> static void vhost_vdpa_set_iova_range(struct vhost_vdpa *v)
>> {
>> struct vdpa_iova_range *range = &v->range;
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>>
>>
>> if (ops->get_iova_range) {
>> *range = ops->get_iova_range(vdpa);
>> } else if (v->domain &&
v->domain->geometry.force_aperture) {
>> range->first =
v->domain->geometry.aperture_start;
>> range->last =
v->domain->geometry.aperture_end;
>> } else {
>> range->first = 0;
>> range->last = ULLONG_MAX;
>> }
>> }
>
> I'll delete the snet_get_iova_range function.
>
>> Any chance to use device managed region helper here? It seems to
>> simplify the codes (e.g the cleanup stuffs).
>
> Ok, I'll do it.
>
>> Is this better to say "config is not ready"? Btw, I wonder if
it makes
>> more sense to wait until the confg is ready with a timeout?
>
> Good idea, I'll implement the wait & timeout.
>
>> I wonder if it's worth to bother consider we're using devres to
manage irqs.
>
> You're right, this isn't required, I'll remove it.
>
>> It looks to me all the devices created here use the same dma_dev (the
>> PCI device), I wonder how the DMA is isolated among the vDPA devices
>> created here.
>
> All vDPA devices indeed use the same DMA device, there is no isolation
> between the devices.
> I'm not sure why there should be isolation in this case.
Each vDPA needs to be able to assigned to a userspace(VM) independently
when bound to vhost-vDPA. If they share the same IOMMU domain, there
will be security issues.
>
>> Btw, the vDPA has been switched to use vDPA tool to create devices, it
>> is suggested to implement the mgmt devices as what other parents did.
>> Then the snet_alloc_dev() could be used for dev_add().
>
> We want to leave control to the DPU at the moment, the number/type of
> devices is determined by the DPU's config, and can't be controlled
> from userspace.
That's fine, it might be better to state this in the changelog or as a
comment somewhere in the code.
>
>> There looks like a race, the vDPA device could be registered to the bus
>> and used by userspace by bus master/drvdata is set.
>
> You're right, the vdpa registration should be done after the
> master/drvdata is set.
>
>> I think devres should take care of this since we're using
>> pcim_enable_device()?
>
> You're right, this isn't required, I'll remove it.
>
>> According to the code, this seems to be the driver features and
>
> These are the negotiated features
> We're not keeping record of the driver features, when
> set_driver_features is called, we just logic AND the driver features
> with the supported features received from the DPU.
> I'll rename it to be 'negotiated_features", this seems more
accurate.
Ok.
>
>> static int snet_set_drv_features(struct vdpa_device *vdev, u64
features)
>> {
>> struct snet *snet = vdpa_to_snet(vdev);
>>
>>
>> snet->used_features = snet->cfg->features & features;
>> return 0;
>> }
>
>
>> This seems to be unused.
>
> You're right, I'll remove it.
>
>
> Thank you for your comments.
> I'll send a new version once I finish working on the comments.
>
> Alvaro
Thanks
>