On Wed, Oct 27, 2021 at 10:47 AM Wu Zongyong
<wuzongyong at linux.alibaba.com> wrote:>
> On Mon, Oct 25, 2021 at 12:40:41PM +0800, Jason Wang wrote:
> >
> > ? 2021/10/25 ??11:21, Wu Zongyong ??:
> > > On Mon, Oct 25, 2021 at 10:27:31AM +0800, Jason Wang wrote:
> > > > On Fri, Oct 22, 2021 at 10:44 AM Wu Zongyong
> > > > <wuzongyong at linux.alibaba.com> wrote:
> > > > > This patch adds a new vDPA driver for Alibaba
ENI(Elastic Network
> > > > > Interface) which is build upon virtio 0.9.5
specification.
> > > > > And this driver doesn't support to run on BE host.
> > > > >
> > > > > Signed-off-by: Wu Zongyong <wuzongyong at
linux.alibaba.com>
> > > > > ---
> > > > > drivers/vdpa/Kconfig | 8 +
> > > > > drivers/vdpa/Makefile | 1 +
> > > > > drivers/vdpa/alibaba/Makefile | 3 +
> > > > > drivers/vdpa/alibaba/eni_vdpa.c | 553
++++++++++++++++++++++++++++++++
> > > > > 4 files changed, 565 insertions(+)
> > > > > create mode 100644 drivers/vdpa/alibaba/Makefile
> > > > > create mode 100644 drivers/vdpa/alibaba/eni_vdpa.c
> > > > >
> > > > > diff --git a/drivers/vdpa/Kconfig
b/drivers/vdpa/Kconfig
> > > > > index 3d91982d8371..c0232a2148a7 100644
> > > > > --- a/drivers/vdpa/Kconfig
> > > > > +++ b/drivers/vdpa/Kconfig
> > > > > @@ -78,4 +78,12 @@ config VP_VDPA
> > > > > help
> > > > > This kernel module bridges virtio PCI device
to vDPA bus.
> > > > >
> > > > > +config ALIBABA_ENI_VDPA
> > > > > + tristate "vDPA driver for Alibaba
ENI"
> > > > > + select VIRTIO_PCI_LEGACY_LIB
> > > > > + depends on PCI_MSI && !CPU_BIG_ENDIAN
> > > > > + help
> > > > > + VDPA driver for Alibaba ENI(Elastic Network
Interface) which is build upon
> > > > > + virtio 0.9.5 specification.
> > > > > +
> > > > > endif # VDPA
> > > > > diff --git a/drivers/vdpa/Makefile
b/drivers/vdpa/Makefile
> > > > > index f02ebed33f19..15665563a7f4 100644
> > > > > --- a/drivers/vdpa/Makefile
> > > > > +++ b/drivers/vdpa/Makefile
> > > > > @@ -5,3 +5,4 @@ obj-$(CONFIG_VDPA_USER) += vdpa_user/
> > > > > obj-$(CONFIG_IFCVF) += ifcvf/
> > > > > obj-$(CONFIG_MLX5_VDPA) += mlx5/
> > > > > obj-$(CONFIG_VP_VDPA) += virtio_pci/
> > > > > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += alibaba/
> > > > > diff --git a/drivers/vdpa/alibaba/Makefile
b/drivers/vdpa/alibaba/Makefile
> > > > > new file mode 100644
> > > > > index 000000000000..ef4aae69f87a
> > > > > --- /dev/null
> > > > > +++ b/drivers/vdpa/alibaba/Makefile
> > > > > @@ -0,0 +1,3 @@
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +obj-$(CONFIG_ALIBABA_ENI_VDPA) += eni_vdpa.o
> > > > > +
> > > > > diff --git a/drivers/vdpa/alibaba/eni_vdpa.c
b/drivers/vdpa/alibaba/eni_vdpa.c
> > > > > new file mode 100644
> > > > > index 000000000000..6a09f157d810
> > > > > --- /dev/null
> > > > > +++ b/drivers/vdpa/alibaba/eni_vdpa.c
> > > > > @@ -0,0 +1,553 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > > +/*
> > > > > + * vDPA bridge driver for Alibaba ENI(Elastic Network
Interface)
> > > > > + *
> > > > > + * Copyright (c) 2021, Alibaba Inc. All rights
reserved.
> > > > > + * Author: Wu Zongyong <wuzongyong at
linux.alibaba.com>
> > > > > + *
> > > > > + */
> > > > > +
> > > > > +#include "linux/bits.h"
> > > > > +#include <linux/interrupt.h>
> > > > > +#include <linux/module.h>
> > > > > +#include <linux/pci.h>
> > > > > +#include <linux/vdpa.h>
> > > > > +#include <linux/virtio.h>
> > > > > +#include <linux/virtio_config.h>
> > > > > +#include <linux/virtio_ring.h>
> > > > > +#include <linux/virtio_pci.h>
> > > > > +#include <linux/virtio_pci_legacy.h>
> > > > > +#include <uapi/linux/virtio_net.h>
> > > > > +
> > > > > +#define ENI_MSIX_NAME_SIZE 256
> > > > > +
> > > > > +#define ENI_ERR(pdev, fmt, ...) \
> > > > > + dev_err(&pdev->dev, "%s"fmt,
"eni_vdpa: ", ##__VA_ARGS__)
> > > > > +#define ENI_DBG(pdev, fmt, ...) \
> > > > > + dev_dbg(&pdev->dev, "%s"fmt,
"eni_vdpa: ", ##__VA_ARGS__)
> > > > > +#define ENI_INFO(pdev, fmt, ...) \
> > > > > + dev_info(&pdev->dev, "%s"fmt,
"eni_vdpa: ", ##__VA_ARGS__)
> > > > > +
> > > > > +struct eni_vring {
> > > > > + void __iomem *notify;
> > > > > + char msix_name[ENI_MSIX_NAME_SIZE];
> > > > > + struct vdpa_callback cb;
> > > > > + int irq;
> > > > > +};
> > > > > +
> > > > > +struct eni_vdpa {
> > > > > + struct vdpa_device vdpa;
> > > > > + struct virtio_pci_legacy_device ldev;
> > > > > + struct eni_vring *vring;
> > > > > + struct vdpa_callback config_cb;
> > > > > + char msix_name[ENI_MSIX_NAME_SIZE];
> > > > > + int config_irq;
> > > > > + int queues;
> > > > > + int vectors;
> > > > > +};
> > > > > +
> > > > > +static struct eni_vdpa *vdpa_to_eni(struct vdpa_device
*vdpa)
> > > > > +{
> > > > > + return container_of(vdpa, struct eni_vdpa,
vdpa);
> > > > > +}
> > > > > +
> > > > > +static struct virtio_pci_legacy_device
*vdpa_to_ldev(struct vdpa_device *vdpa)
> > > > > +{
> > > > > + struct eni_vdpa *eni_vdpa = vdpa_to_eni(vdpa);
> > > > > +
> > > > > + return &eni_vdpa->ldev;
> > > > > +}
> > > > > +
> > > > > +static u64 eni_vdpa_get_features(struct vdpa_device
*vdpa)
> > > > > +{
> > > > > + struct virtio_pci_legacy_device *ldev =
vdpa_to_ldev(vdpa);
> > > > > + u64 features = vp_legacy_get_features(ldev);
> > > > > +
> > > > > + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> > > > > + features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);
> > > > > +
> > > > > + return features;
> > > > > +}
> > > > > +
> > > > > +static int eni_vdpa_set_features(struct vdpa_device
*vdpa, u64 features)
> > > > > +{
> > > > > + struct virtio_pci_legacy_device *ldev =
vdpa_to_ldev(vdpa);
> > > > > +
> > > > > + if (!(features &
BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
> > > > > + ENI_ERR(ldev->pci_dev,
> > > > > + "VIRTIO_NET_F_MRG_RXBUF is
not negotiated\n");
> > > > > + return -EINVAL;
> > > > > + }
> > > > > +
> > > > > + vp_legacy_set_features(ldev, (u32)features);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > So my comments have not been addressed since v4. Please
address or
> > > > answer the questions before posting a new version.
> > > >
> > > > Thanks
> > > Sorry, I forgot to reply the comments on this patch.
> > >
> > >
> > > > > +static u64 eni_vdpa_get_features(struct vdpa_device
*vdpa)
> > > > > +{
> > > > > + struct virtio_pci_legacy_device *ldev =
vdpa_to_ldev(vdpa);
> > > > > + u64 features = vp_legacy_get_features(ldev);
> > > > > +
> > > > > + features |= BIT_ULL(VIRTIO_F_ACCESS_PLATFORM);
> > > > > + features |= BIT_ULL(VIRTIO_F_ORDER_PLATFORM);
> > > > VERSION_1 is also needed?
> > > >
> > > No, queue align of legacy devices should be 4096,
> >
> >
> > Let's use VIRTIO_PCI_VRING_ALIGN instead of PAGE_SIZE in
get_vq_align then
> > since PAGE_SIZE is not necessarily 4096.
>
> I will fix it.
> >
> >
> > > but queue align of
> > > devices with VERSION_1 are SMP_CACHE_BYTES which may not equals
to
> > > 4096.
> > > If we set the VERSION_1, ENI will not work due to the queue
align.
> >
> >
> > Interesting, so I think it can only be used with legacy virtio drivers
in
> > the guest.
> Yes, ENI can only work well with legacy virtio drivers in the guest from
> my test.
>
> >
> > One major drawbacks is that guest can only see 32 feature bits which
means
> > we can't advertise VIRTIO_F_ACCESS_PLATFORM and
VIRTIO_F_ORDER_PLATFORM to
> > guest:
> >
> > /* virtio config->get_features() implementation */
> > static u64 vp_get_features(struct virtio_device *vdev)
> > {
> > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> >
> > /* When someone needs more than 32 feature bits, we'll
need to
> > * steal a bit to indicate that the rest are somewhere else.
*/
> > return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
> > }
> >
> > For VIRTIO_F_ACCESS_PLATFORM, it should be fine. But how about
> > VIRTIO_F_ORDER_PLATFORM?
> >
> Is it necessary to advertise these two bits?
> ENI works well without these two bits regardless of bare metal or virtual
> machine.
For VIRTIO_F_ACCESS_PLATFORM. When using with virtio-vDPA it should be
fine since it will be negotiated. When using with vhost-vDPA, it can't
be advertised to the guest, so vIOMMU will be bypassed.
For VITIO_F_ORDER_PLATFORM, it's basically a question that whether
platform specific barrier (DMA barrier) is needed, see virtio_rmb():
static inline void virtio_rmb(bool weak_barriers)
{
if (weak_barriers)
virt_rmb();
else
dma_rmb();
}
So it looks to me x86 should be fine, but for others platform it might
not. Maybe we can enable this driver for x86 first.
Thanks
> >
> > >
> > > > > +
> > > > > + return features;
> > > > > +}
> > > > > +
> > > > > +static int eni_vdpa_set_features(struct vdpa_device
*vdpa, u64 features)
> > > > > +{
> > > > > + struct virtio_pci_legacy_device *ldev =
vdpa_to_ldev(vdpa);
> > > > > +
> > > > > + if (!(features &
BIT_ULL(VIRTIO_NET_F_MRG_RXBUF)) && features) {
> > > > > + ENI_ERR(ldev->pci_dev,
> > > > > + "VIRTIO_NET_F_MRG_RXBUF is
not negotiated\n");
> > > > > + return -EINVAL;
> > > > Do we need to make sure FEATURE_OK is not set in this case
or the ENI can do
> > > > this for us?
> > > Why we need to check this? I don't get what you worried
about.
> >
> >
> > I thought the plan is to advertise the VERSION_1, so failing when
without
> > mrg_rxbuf is a must. But looks like I was wrong, and there's no
need to
> > mandate mrg_rxbuf in future versions.
>
> I think it is necessary to check F_MRG_RXBUF here, since ENI will not
> work when use qemu with mrx_rxbuf=off.
> >
> > Thanks
> >
> >
> > >
> > > > Other looks good.
> > > >
> > > > Thanks
>