thr3ads.net - Linux Virtualization - [PATCH vhost next 10/10] vdpa/mlx5: Add VDPA driver for supported mlx5 devices [Jul 2020]

If this information is useful, please help other people find it:
Share via:
Jason Wang
2020-Jul-17 08:57 UTC
[PATCH vhost next 10/10] vdpa/mlx5: Add VDPA driver for supported mlx5 devices

On 2020/7/16 ??7:54, Eli Cohen wrote:> On Thu, Jul 16, 2020 at 05:14:32PM +0800, Jason Wang wrote:
>>> +static void suspend_vqs(struct mlx5_vdpa_net *ndev)
>>> +{
>>> +	int i;
>>> +
>>> +	for (i = 0; i < MLX5_MAX_SUPPORTED_VQS; i++)
>>> +		suspend_vq(ndev, &ndev->vqs[i]);
>>
>> In teardown_virtqueues() it has a check of mvq->num_ent, any reason
>> not doing it here?
>>
> Looks like checking intialized is enough. Will fix this.
>   
>>> +
>>> +static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16
idx, bool ready)
>>> +{
>>> +	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>>> +	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>>> +	struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx];
>>> +	int err;
>>> +
>>> +	if (!mvq->ready && ready && mvq->fw_state
!= MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY) {
>>> +		err = modify_virtqueue(ndev, mvq,
MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY);
>>> +		if (err) {
>>> +			mlx5_vdpa_warn(mvdev, "failed to modify
virtqueue(%d)\n", err);
>>> +			return;
>>> +		}
>>> +	}
>>
>> I wonder what's the reason of changing vq state on the hardware
>> here. I think we can defer it to set_status().
>>
> I can defer this to set status.
>
> I just wonder if it is possible that the core vdpa driver may call this
> function with ready equals false and after some time call it with ready
> equals true.

Good point, so I think we can keep the logic. But looks like the code 
can not work if ready equals false since it only tries to modify vq 
state to RDY.

>
>
>> (Anyhow we don't sync vq address in set_vq_address()).
>>
>>
>>> +static u64 mlx5_vdpa_get_features(struct vdpa_device *vdev)
>>> +{
>>> +	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>>> +	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>>> +	u16 dev_features;
>>> +
>>> +	dev_features = MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev,
device_features_bits_mask);
>>> +	ndev->mvdev.mlx_features =
mlx_to_vritio_features(dev_features);
>>> +	if (MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev,
virtio_version_1_0))
>>> +		ndev->mvdev.mlx_features |= BIT(VIRTIO_F_VERSION_1);
>>
>> This is interesting. This suggests !VIRTIO_F_VERSION_1 &&
>> VIRTIO_F_IOMMU_PLATFORM is valid. But virito spec doesn't allow
such
>> configuration.
> Will fix
>> So I think you need either:
>>
>> 1) Fail vDPA device probe when VERSION_1 is not supported
>> 2) clear IOMMU_PLATFORM if VERSION_1 is not negotiated
>>
>> For 2) I guess it can't work, according to the spec, without
>> IOMMU_PLATFORM, device need to use PA to access the memory
>>
>>
>>> +	ndev->mvdev.mlx_features |= BIT(VIRTIO_F_ANY_LAYOUT);
> I think this should be removed too

Yes, I guess for hardware vDPA which depends on IOMMU_PLATFORM which 
implies VERSION_1 that implies ANY_LAYOUT.

>
>>> +	ndev->mvdev.mlx_features |= BIT(VIRTIO_F_IOMMU_PLATFORM);
>>> +	if (mlx5_vdpa_max_qps(ndev->mvdev.max_vqs) > 1)
>>> +		ndev->mvdev.mlx_features |= BIT(VIRTIO_NET_F_MQ);
> Also this, since multqueue requires configuration vq which is not
> supported in this version.

Yes.

>
>>> +
>>> +	print_features(mvdev, ndev->mvdev.mlx_features, false);
>>> +	return ndev->mvdev.mlx_features;
>>> +}
>>> +
>>> +static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64
features)
>>> +{
>>> +	/* FIXME: qemu currently does not set all the feaures due to a
bug.
>>> +	 * Add checks when this is fixed.
>>> +	 */
>>
>> I think we should add the check now then qemu can get notified. (E.g
>> IOMMU_PLATFORM)
> Will do.
>>
>>> +}
>>> +
>>> +#define MLX5_VDPA_MAX_VQ_ENTRIES 256
>>
>> Is this a hardware limitation, qemu can support up to 1K which the
>> requirement of some NFV cases.
>>
> Theoretically the device is limit is much higher. In the near future we
> will have a device capability for this. I wanted to stay on the safe side
> with this but I can change this if you think it's necessary.

I see, that's fine. Let keep this untouched.

>>> +
>>> +static void mlx5_vdpa_get_config(struct vdpa_device *vdev,
unsigned int offset, void *buf,
>>> +				 unsigned int len)
>>> +{
>>> +	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>>> +	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>>> +
>>> +	if (offset + len < sizeof(struct virtio_net_config))
>>> +		memcpy(buf, &ndev->config + offset, len);
>>
>> Note that guest expect LE, what's the endian for ndev->config?
> This is struct virtio_net_config from include/uapi/linux/virtio_net.h.
> Looking there I see it has mixed endianess. I currently don't touch it
> but if I do, I should follow endianess guidance per the struct
> definition. So I don't think I should care about endianess when
copying.

So guest would expect LE, we need be careful when modify config space 
(e.g mtu or status). Consider we don't support VIRTIO_NET_F_STATUS and 
VIRTIO_NET_F_MTU, we're probably fine.

>
>>
>>> +}
>>> +
>>> +static void mlx5_vdpa_set_config(struct vdpa_device *vdev,
unsigned int offset, const void *buf,
>>> +				 unsigned int len)
>>> +{
>>> +	/* not supported */
>>> +}
>>> +
>>> +static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
>>> +{
>>> +	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>>> +
>>> +	return mvdev->generation;
>>> +}
>>> +
>>> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct
vhost_iotlb *iotlb)
>>> +{
>>> +	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>>> +	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>>> +	bool change_map;
>>> +	int err;
>>> +
>>> +	err = mlx5_vdpa_handle_set_map(mvdev, iotlb, &change_map);
>>> +	if (err) {
>>> +		mlx5_vdpa_warn(mvdev, "set map failed(%d)\n", err);
>>> +		return err;
>>> +	}
>>> +
>>> +	if (change_map)
>>> +		return mlx5_vdpa_change_map(ndev, iotlb);
>>
>> Any reason for not doing this inside mlx5_handle_set_map()?
>>
> All memory registration related operations are done inside mr.c in the
> common code directory. But change map involves operations on other
> objects managed in this file.

Ok.

(Note that we can do more generalization in the future, since the only 
network specific part is the config space and control vq)

>
>>> +
>>> +void mlx5_vdpa_remove_dev(struct mlx5_vdpa_dev *mvdev)
>>> +{
>>> +	struct mlx5_vdpa_net *ndev;
>>> +
>>> +	mvdev->status = 0;
>>
>> This is probably unnecessary.
>>
> Right, will remove.

Thanks

>
Possibly Parallel Threads

Search for more reasonably related threads
Linux Virtualization - Jul 2020 - [PATCH vhost next 10/10] vdpa/mlx5: Add VDPA driver for supported mlx5 devices

[PATCH vhost next 10/10] vdpa/mlx5: Add VDPA driver for supported mlx5 devices

Possibly Parallel Threads

Wisdom of the Ancients