Si-Wei Liu
2022-Aug-16 07:58 UTC
[PATCH 2/2] vDPA: conditionally read fields in virtio-net dev
On 8/15/2022 6:58 PM, Zhu, Lingshan wrote:> > > On 8/16/2022 7:32 AM, Si-Wei Liu wrote: >> >> >> On 8/15/2022 2:26 AM, Zhu Lingshan wrote: >>> Some fields of virtio-net device config space are >>> conditional on the feature bits, the spec says: >>> >>> "The mac address field always exists >>> (though is only valid if VIRTIO_NET_F_MAC is set)" >>> >>> "max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ >>> or VIRTIO_NET_F_RSS is set" >>> >>> "mtu only exists if VIRTIO_NET_F_MTU is set" >>> >>> so we should read MTU, MAC and MQ in the device config >>> space only when these feature bits are offered. >>> >>> For MQ, if both VIRTIO_NET_F_MQ and VIRTIO_NET_F_RSS are >>> not set, the virtio device should have >>> one queue pair as default value, so when userspace querying queue >>> pair numbers, >>> it should return mq=1 than zero. >>> >>> For MTU, if VIRTIO_NET_F_MTU is not set, we should not read >>> MTU from the device config sapce. >>> RFC894 <A Standard for the Transmission of IP Datagrams over >>> Ethernet Networks> >>> says:"The minimum length of the data field of a packet sent over an >>> Ethernet is 1500 octets, thus the maximum length of an IP datagram >>> sent over an Ethernet is 1500 octets.? Implementations are encouraged >>> to support full-length packets" >> Noted there's a typo in the above "The *maximum* length of the data >> field of a packet sent over an Ethernet is 1500 octets ..." and the >> RFC was written 1984. > the spec RFC894 says it is 1500, see <a > href="https://urldefense.com/v3/__https://www.rfc-editor.org/rfc/rfc894.txt__;!!ACWV5N9M2RV99hQ!MdgxZjw5sp5Qz-GKfwT1IWcw_L4Jo1-UekuJPFz1UrG3YuqirKz7P9ksdJFh1vB6zHJ7z8Q04fpT0-9jWXCtlWM$">https://www.rfc-editor.org/rfc/rfc894.txt</a> >> >> Apparently that is no longer true with the introduction of Jumbo size >> frame later in the 2000s. I'm not sure what is the point of mention >> this ancient RFC. It doesn't say default MTU of any Ethernet >> NIC/switch should be 1500 in either? case. > This could be a larger number for sure, we are trying to find out the > min value for Ethernet here, to support 1500 octets, MTU should be > 1500 at least, so I assume 1500 should be the default value for MTU >> >>> >>> virtio spec says:"The virtio network device is a virtual ethernet >>> card", >> Right, >>> so the default MTU value should be 1500 for virtio-net. >> ... but it doesn't say the default is 1500. At least, not in explicit >> way. Why it can't be 1492 or even lower? In practice, if the network >> backend has a MTU higher than 1500, there's nothing wrong for guest >> to configure default MTU more than 1500. > same as above >> >>> >>> For MAC, the spec says:"If the VIRTIO_NET_F_MAC feature bit is set, >>> the configuration space mac entry indicates the ?physical? address >>> of the network card, otherwise the driver would typically >>> generate a random local MAC address." So there is no >>> default MAC address if VIRTIO_NET_F_MAC not set. >>> >>> This commits introduces functions vdpa_dev_net_mtu_config_fill() >>> and vdpa_dev_net_mac_config_fill() to fill MTU and MAC. >>> It also fixes vdpa_dev_net_mq_config_fill() to report correct >>> MQ when _F_MQ is not present. >>> >>> These functions should check devices features than driver >>> features, and struct vdpa_device is not needed as a parameter >>> >>> The test & userspace tool output: >>> >>> Feature bit VIRTIO_NET_F_MTU, VIRTIO_NET_F_RSS, VIRTIO_NET_F_MQ >>> and VIRTIO_NET_F_MAC can be mask out by hardcode. >>> >>> However, it is challenging to "disable" the related fields >>> in the HW device config space, so let's just assume the values >>> are meaningless if the feature bits are not set. >>> >>> Before this change, when feature bits for RSS, MQ, MTU and MAC >>> are not set, iproute2 output: >>> $vdpa vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false mtu 1500 >>> ?? negotiated_features >>> >>> without this commit, function vdpa_dev_net_config_fill() >>> reads all config space fields unconditionally, so let's >>> assume the MAC and MTU are meaningless, and it checks >>> MQ with driver_features, so we don't see max_vq_pairs. >>> >>> After applying this commit, when feature bits for >>> MQ, RSS, MAC and MTU are not set,iproute2 output: >>> $vdpa dev config show vdpa0 >>> vdpa0: link up link_announce false max_vq_pairs 1 mtu 1500 >>> ?? negotiated_features >>> >>> As explained above: >>> Here is no MAC, because VIRTIO_NET_F_MAC is not set, >>> and there is no default value for MAC. It shows >>> max_vq_paris = 1 because even without MQ feature, >>> a functional virtio-net must have one queue pair. >>> mtu = 1500 is the default value as ethernet >>> required. >>> >>> This commit also add supplementary comments for >>> __virtio16_to_cpu(true, xxx) operations in >>> vdpa_dev_net_config_fill() and vdpa_fill_stats_rec() >>> >>> Signed-off-by: Zhu Lingshan <lingshan.zhu at intel.com> >>> --- >>> ? drivers/vdpa/vdpa.c | 60 >>> +++++++++++++++++++++++++++++++++++---------- >>> ? 1 file changed, 47 insertions(+), 13 deletions(-) >>> >>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c >>> index efb55a06e961..a74660b98979 100644 >>> --- a/drivers/vdpa/vdpa.c >>> +++ b/drivers/vdpa/vdpa.c >>> @@ -801,19 +801,44 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct >>> sk_buff *msg, struct netlink_callba >>> ????? return msg->len; >>> ? } >>> ? -static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev, >>> -?????????????????????? struct sk_buff *msg, u64 features, >>> +static int vdpa_dev_net_mq_config_fill(struct sk_buff *msg, u64 >>> features, >>> ???????????????????????? const struct virtio_net_config *config) >>> ? { >>> ????? u16 val_u16; >>> ? -??? if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0) >>> -??????? return 0; >>> +??? if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0 && >>> +??????? (features & BIT_ULL(VIRTIO_NET_F_RSS)) == 0) >>> +??????? val_u16 = 1; >>> +??? else >>> +??????? val_u16 = __virtio16_to_cpu(true, >>> config->max_virtqueue_pairs); >>> ? -??? val_u16 = le16_to_cpu(config->max_virtqueue_pairs); >>> ????? return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16); >>> ? } >>> ? +static int vdpa_dev_net_mtu_config_fill(struct sk_buff *msg, u64 >>> features, >>> +??????????????????? const struct virtio_net_config *config) >>> +{ >>> +??? u16 val_u16; >>> + >>> +??? if ((features & BIT_ULL(VIRTIO_NET_F_MTU)) == 0) >>> +??????? val_u16 = 1500; >> As said, there's no virtio spec defined value for MTU. Please leave >> this field out if feature VIRTIO_NET_F_MTU is not negotiated. > same as above >>> +??? else >>> +??????? val_u16 = __virtio16_to_cpu(true, config->mtu); >>> + >>> +??? return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16); >>> +} >>> + >>> +static int vdpa_dev_net_mac_config_fill(struct sk_buff *msg, u64 >>> features, >>> +??????????????????? const struct virtio_net_config *config) >>> +{ >>> +??? if ((features & BIT_ULL(VIRTIO_NET_F_MAC)) == 0) >>> +??????? return 0; >>> +??? else >>> +??????? return? nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, >>> +??????????????? sizeof(config->mac), config->mac); >>> +} >>> + >>> + >>> ? static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, >>> struct sk_buff *msg) >>> ? { >>> ????? struct virtio_net_config config = {}; >>> @@ -822,18 +847,16 @@ static int vdpa_dev_net_config_fill(struct >>> vdpa_device *vdev, struct sk_buff *ms >>> ? ????? vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config)); >>> ? -??? if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, >>> sizeof(config.mac), >>> -??????????? config.mac)) >>> -??????? return -EMSGSIZE; >>> +??? /* >>> +???? * Assume little endian for now, userspace can tweak this for >>> +???? * legacy guest support. >> You can leave it as a TODO for kernel (vdpa core limitation), but >> AFAIK there's nothing userspace needs to do to infer the endianness. >> IMHO it's the kernel's job to provide an abstraction rather than rely >> on userspace guessing it. > we have discussed it in another thread, and this comment is suggested > by MST.Can you provide the context or link? It shouldn't work like this, otherwise it is breaking uABI. E.g. how will a legacy/BE supporting kernel/device be backward compatible with older vdpa tool (which has knowledge of this endianness implication/assumption from day one)? -Siwei>> >>> +???? */ >>> +??? val_u16 = __virtio16_to_cpu(true, config.status); >>> ? ????? val_u16 = __virtio16_to_cpu(true, config.status); >>> ????? if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16)) >>> ????????? return -EMSGSIZE; >>> ? -??? val_u16 = __virtio16_to_cpu(true, config.mtu); >>> -??? if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16)) >>> -??????? return -EMSGSIZE; >>> - >>> ????? features_driver = vdev->config->get_driver_features(vdev); >>> ????? if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, >>> features_driver, >>> ??????????????????? VDPA_ATTR_PAD)) >>> @@ -846,7 +869,13 @@ static int vdpa_dev_net_config_fill(struct >>> vdpa_device *vdev, struct sk_buff *ms >>> ??????????????????? VDPA_ATTR_PAD)) >>> ????????? return -EMSGSIZE; >>> ? -??? return vdpa_dev_net_mq_config_fill(vdev, msg, >>> features_driver, &config); >>> +??? if (vdpa_dev_net_mac_config_fill(msg, features_device, &config)) >>> +??????? return -EMSGSIZE; >>> + >>> +??? if (vdpa_dev_net_mtu_config_fill(msg, features_device, &config)) >>> +??????? return -EMSGSIZE; >>> + >>> +??? return vdpa_dev_net_mq_config_fill(msg, features_device, &config); >>> ? } >>> ? ? static int >>> @@ -914,6 +943,11 @@ static int vdpa_fill_stats_rec(struct >>> vdpa_device *vdev, struct sk_buff *msg, >>> ????? } >>> ????? vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config)); >>> ? +??? /* >>> +???? * Assume little endian for now, userspace can tweak this for >>> +???? * legacy guest support. >>> +???? */ >>> + >> Ditto. > same as above > > Thanks >> >> Thanks, >> -Siwei >>> ????? max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs); >>> ????? if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp)) >>> ????????? return -EMSGSIZE; >> >
Si-Wei Liu
2022-Aug-16 08:08 UTC
[PATCH 2/2] vDPA: conditionally read fields in virtio-net dev
On 8/16/2022 12:58 AM, Si-Wei Liu wrote:>> > Can you provide the context or link? It shouldn't work like this, > otherwise it is breaking uABI. E.g. how will a legacy/BE supporting > kernel/device be backward compatible with older vdpa tool (which has > knowledges/has knowledge/has no knowledge/> of this endianness implication/assumption from day one)? > > -Siwei-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20220816/5fb6891f/attachment.html>