Jason Wang
2020-Jul-17  08:57 UTC
[PATCH vhost next 10/10] vdpa/mlx5: Add VDPA driver for supported mlx5 devices
On 2020/7/16 ??7:54, Eli Cohen wrote:> On Thu, Jul 16, 2020 at 05:14:32PM +0800, Jason Wang wrote: >>> +static void suspend_vqs(struct mlx5_vdpa_net *ndev) >>> +{ >>> + int i; >>> + >>> + for (i = 0; i < MLX5_MAX_SUPPORTED_VQS; i++) >>> + suspend_vq(ndev, &ndev->vqs[i]); >> >> In teardown_virtqueues() it has a check of mvq->num_ent, any reason >> not doing it here? >> > Looks like checking intialized is enough. Will fix this. > >>> + >>> +static void mlx5_vdpa_set_vq_ready(struct vdpa_device *vdev, u16 idx, bool ready) >>> +{ >>> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>> + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); >>> + struct mlx5_vdpa_virtqueue *mvq = &ndev->vqs[idx]; >>> + int err; >>> + >>> + if (!mvq->ready && ready && mvq->fw_state != MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY) { >>> + err = modify_virtqueue(ndev, mvq, MLX5_VIRTIO_NET_Q_OBJECT_STATE_RDY); >>> + if (err) { >>> + mlx5_vdpa_warn(mvdev, "failed to modify virtqueue(%d)\n", err); >>> + return; >>> + } >>> + } >> >> I wonder what's the reason of changing vq state on the hardware >> here. I think we can defer it to set_status(). >> > I can defer this to set status. > > I just wonder if it is possible that the core vdpa driver may call this > function with ready equals false and after some time call it with ready > equals true.Good point, so I think we can keep the logic. But looks like the code can not work if ready equals false since it only tries to modify vq state to RDY.> > >> (Anyhow we don't sync vq address in set_vq_address()). >> >> >>> +static u64 mlx5_vdpa_get_features(struct vdpa_device *vdev) >>> +{ >>> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>> + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); >>> + u16 dev_features; >>> + >>> + dev_features = MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, device_features_bits_mask); >>> + ndev->mvdev.mlx_features = mlx_to_vritio_features(dev_features); >>> + if (MLX5_CAP_DEV_VDPA_EMULATION(mvdev->mdev, virtio_version_1_0)) >>> + ndev->mvdev.mlx_features |= BIT(VIRTIO_F_VERSION_1); >> >> This is interesting. This suggests !VIRTIO_F_VERSION_1 && >> VIRTIO_F_IOMMU_PLATFORM is valid. But virito spec doesn't allow such >> configuration. > Will fix >> So I think you need either: >> >> 1) Fail vDPA device probe when VERSION_1 is not supported >> 2) clear IOMMU_PLATFORM if VERSION_1 is not negotiated >> >> For 2) I guess it can't work, according to the spec, without >> IOMMU_PLATFORM, device need to use PA to access the memory >> >> >>> + ndev->mvdev.mlx_features |= BIT(VIRTIO_F_ANY_LAYOUT); > I think this should be removed tooYes, I guess for hardware vDPA which depends on IOMMU_PLATFORM which implies VERSION_1 that implies ANY_LAYOUT.> >>> + ndev->mvdev.mlx_features |= BIT(VIRTIO_F_IOMMU_PLATFORM); >>> + if (mlx5_vdpa_max_qps(ndev->mvdev.max_vqs) > 1) >>> + ndev->mvdev.mlx_features |= BIT(VIRTIO_NET_F_MQ); > Also this, since multqueue requires configuration vq which is not > supported in this version.Yes.> >>> + >>> + print_features(mvdev, ndev->mvdev.mlx_features, false); >>> + return ndev->mvdev.mlx_features; >>> +} >>> + >>> +static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features) >>> +{ >>> + /* FIXME: qemu currently does not set all the feaures due to a bug. >>> + * Add checks when this is fixed. >>> + */ >> >> I think we should add the check now then qemu can get notified. (E.g >> IOMMU_PLATFORM) > Will do. >> >>> +} >>> + >>> +#define MLX5_VDPA_MAX_VQ_ENTRIES 256 >> >> Is this a hardware limitation, qemu can support up to 1K which the >> requirement of some NFV cases. >> > Theoretically the device is limit is much higher. In the near future we > will have a device capability for this. I wanted to stay on the safe side > with this but I can change this if you think it's necessary.I see, that's fine. Let keep this untouched.>>> + >>> +static void mlx5_vdpa_get_config(struct vdpa_device *vdev, unsigned int offset, void *buf, >>> + unsigned int len) >>> +{ >>> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>> + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); >>> + >>> + if (offset + len < sizeof(struct virtio_net_config)) >>> + memcpy(buf, &ndev->config + offset, len); >> >> Note that guest expect LE, what's the endian for ndev->config? > This is struct virtio_net_config from include/uapi/linux/virtio_net.h. > Looking there I see it has mixed endianess. I currently don't touch it > but if I do, I should follow endianess guidance per the struct > definition. So I don't think I should care about endianess when copying.So guest would expect LE, we need be careful when modify config space (e.g mtu or status). Consider we don't support VIRTIO_NET_F_STATUS and VIRTIO_NET_F_MTU, we're probably fine.> >> >>> +} >>> + >>> +static void mlx5_vdpa_set_config(struct vdpa_device *vdev, unsigned int offset, const void *buf, >>> + unsigned int len) >>> +{ >>> + /* not supported */ >>> +} >>> + >>> +static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev) >>> +{ >>> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>> + >>> + return mvdev->generation; >>> +} >>> + >>> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb) >>> +{ >>> + struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); >>> + struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); >>> + bool change_map; >>> + int err; >>> + >>> + err = mlx5_vdpa_handle_set_map(mvdev, iotlb, &change_map); >>> + if (err) { >>> + mlx5_vdpa_warn(mvdev, "set map failed(%d)\n", err); >>> + return err; >>> + } >>> + >>> + if (change_map) >>> + return mlx5_vdpa_change_map(ndev, iotlb); >> >> Any reason for not doing this inside mlx5_handle_set_map()? >> > All memory registration related operations are done inside mr.c in the > common code directory. But change map involves operations on other > objects managed in this file.Ok. (Note that we can do more generalization in the future, since the only network specific part is the config space and control vq)> >>> + >>> +void mlx5_vdpa_remove_dev(struct mlx5_vdpa_dev *mvdev) >>> +{ >>> + struct mlx5_vdpa_net *ndev; >>> + >>> + mvdev->status = 0; >> >> This is probably unnecessary. >> > Right, will remove.Thanks>