Si-Wei Liu
2021-Feb-06 12:29 UTC
[PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa, 22 extra bytes worth of MTU length is shown in guest. This is because the mlx5_query_port_max_mtu API returns the "hardware" MTU value, which does not just contain the Ethernet payload, but includes extra lengths starting from the Ethernet header up to the FCS altogether. Fix the MTU so packets won't get dropped silently. Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com> --- drivers/vdpa/mlx5/core/mlx5_vdpa.h | 4 ++++ drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 ++++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h index 08f742f..b6cc53b 100644 --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h @@ -4,9 +4,13 @@ #ifndef __MLX5_VDPA_H__ #define __MLX5_VDPA_H__ +#include <linux/etherdevice.h> +#include <linux/if_vlan.h> #include <linux/vdpa.h> #include <linux/mlx5/driver.h> +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN) + struct mlx5_vdpa_direct_mr { u64 start; u64 end; diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index dc88559..b8416c4 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx) .free = mlx5_vdpa_free, }; +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu) +{ + u16 hw_mtu; + int err; + + err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu); + if (err) + return err; + + *mtu = hw_mtu - MLX5V_ETH_HARD_MTU; + return 0; +} + static int alloc_resources(struct mlx5_vdpa_net *ndev) { struct mlx5_vdpa_net_resources *res = &ndev->res; @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev, init_mvqs(ndev); mutex_init(&ndev->reslock); config = &ndev->config; - err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu); + err = query_mtu(mdev, &ndev->mtu); if (err) goto err_mtu; -- 1.8.3.1
Si-Wei Liu
2021-Feb-06 12:29 UTC
[PATCH 2/3] mlx5_vdpa: fix feature negotiation across device reset
The mlx_features denotes the capability for which set of virtio features is supported by device. In principle, this field needs not be cleared during virtio device reset, as this capability is static and does not change across reset. In fact, the current code may have the assumption that mlx_features can be reloaded from firmware via the .get_features ops after device is reset (via the .set_status ops), which is unfortunately not true. The userspace VMM might save a copy of backend capable features and won't call into kernel again to get it on reset. This causes all virtio features getting disabled on newly created virtqs after device reset, while guest would hold mismatched view of available features. For e.g., the guest may still assume tx checksum offload is available after reset and feature negotiation, causing frames with bogus (incomplete) checksum transmitted on the wire. Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com> --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index b8416c4..aa6f8cd 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1788,7 +1788,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status) clear_virtqueues(ndev); mlx5_vdpa_destroy_mr(&ndev->mvdev); ndev->mvdev.status = 0; - ndev->mvdev.mlx_features = 0; ++mvdev->generation; return; } -- 1.8.3.1
Si-Wei Liu
2021-Feb-06 12:29 UTC
[PATCH 3/3] mlx5_vdpa: defer clear_virtqueues to until DRIVER_OK
While virtq is stopped, get_vq_state() is supposed to be called to get sync'ed with the latest internal avail_index from device. The saved avail_index is used to restate the virtq once device is started. Commit b35ccebe3ef7 introduced the clear_virtqueues() routine to reset the saved avail_index, however, the index gets cleared a bit earlier before get_vq_state() tries to read it. This would cause consistency problems when virtq is restarted, e.g. through a series of link down and link up events. We could defer the clearing of avail_index to until the device is to be started, i.e. until VIRTIO_CONFIG_S_DRIVER_OK is set again in set_status(). Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map") Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com> --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index aa6f8cd..444ab58 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1785,7 +1785,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status) if (!status) { mlx5_vdpa_info(mvdev, "performing device reset\n"); teardown_driver(ndev); - clear_virtqueues(ndev); mlx5_vdpa_destroy_mr(&ndev->mvdev); ndev->mvdev.status = 0; ++mvdev->generation; @@ -1794,6 +1793,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status) if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) { if (status & VIRTIO_CONFIG_S_DRIVER_OK) { + clear_virtqueues(ndev); err = setup_driver(ndev); if (err) { mlx5_vdpa_warn(mvdev, "failed to setup driver\n"); -- 1.8.3.1
Jason Wang
2021-Feb-08 04:38 UTC
[PATCH 1/3] mlx5_vdpa: should exclude header length and fcs from mtu
On 2021/2/6 ??8:29, Si-Wei Liu wrote:> When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa, > 22 extra bytes worth of MTU length is shown in guest. > This is because the mlx5_query_port_max_mtu API returns > the "hardware" MTU value, which does not just contain the > Ethernet payload, but includes extra lengths starting > from the Ethernet header up to the FCS altogether. > > Fix the MTU so packets won't get dropped silently. > > Signed-off-by: Si-Wei Liu <si-wei.liu at oracle.com>Acked-by: Jason Wang <jasowang at redhat.com>> --- > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 4 ++++ > drivers/vdpa/mlx5/net/mlx5_vnet.c | 15 ++++++++++++++- > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > index 08f742f..b6cc53b 100644 > --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > @@ -4,9 +4,13 @@ > #ifndef __MLX5_VDPA_H__ > #define __MLX5_VDPA_H__ > > +#include <linux/etherdevice.h> > +#include <linux/if_vlan.h> > #include <linux/vdpa.h> > #include <linux/mlx5/driver.h> > > +#define MLX5V_ETH_HARD_MTU (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN) > + > struct mlx5_vdpa_direct_mr { > u64 start; > u64 end; > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c > index dc88559..b8416c4 100644 > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > @@ -1907,6 +1907,19 @@ static int mlx5_get_vq_irq(struct vdpa_device *vdv, u16 idx) > .free = mlx5_vdpa_free, > }; > > +static int query_mtu(struct mlx5_core_dev *mdev, u16 *mtu) > +{ > + u16 hw_mtu; > + int err; > + > + err = mlx5_query_nic_vport_mtu(mdev, &hw_mtu); > + if (err) > + return err; > + > + *mtu = hw_mtu - MLX5V_ETH_HARD_MTU; > + return 0; > +} > + > static int alloc_resources(struct mlx5_vdpa_net *ndev) > { > struct mlx5_vdpa_net_resources *res = &ndev->res; > @@ -1992,7 +2005,7 @@ static int mlx5v_probe(struct auxiliary_device *adev, > init_mvqs(ndev); > mutex_init(&ndev->reslock); > config = &ndev->config; > - err = mlx5_query_nic_vport_mtu(mdev, &ndev->mtu); > + err = query_mtu(mdev, &ndev->mtu); > if (err) > goto err_mtu; >