Jason Wang
2022-Mar-17 02:32 UTC
[PATCH v1 1/2] vdpa: Add support for querying vendor statistics
On Thu, Mar 17, 2022 at 6:00 AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:
>
>
> On 3/16/2022 12:10 AM, Eli Cohen wrote:
> >> From: Si-Wei Liu <si-wei.liu at oracle.com>
> >> Sent: Wednesday, March 16, 2022 8:52 AM
> >> To: Eli Cohen <elic at nvidia.com>
> >> Cc: mst at redhat.com; jasowang at redhat.com; virtualization at lists.linux-foundation.org; eperezma at redhat.com; amorenoz at redhat.com;
> >> lvivier at redhat.com; sgarzare at redhat.com; Parav Pandit <parav at nvidia.com>
> >> Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor statistics
> >>
> >>
> >> On 3/15/2022 2:10 AM, Eli Cohen wrote:
> >>
> >> <...snip...>
> >>
> >>>> Say you got a vdpa net device created with 4 data queue pairs and a
> >>>> control vq. On boot some guest firmware may support just F_CTRL_VQ but
> >>>> not F_MQ, then the index for the control vq in guest ends up with 2, as
> >>>> in this case there's only a single queue pair enabled for rx (index 0)
> >>>> and tx (index 1). From the host driver (e.g. mlx5_vdpa) perspective, the
> >>>> control vq is the last vq following 8
> >>> If the host sees F_MQ was not negotiated but F_CTRL_VQ was, then it knows
> >>> that control VQ index is 2
> >> Right, but I don't see this feature negotiation info getting returned
> >> from your vdpa_dev_vendor_stats_fill(), or did I miss something? How do
> >> you plan for host user to get this info? If you meant another "vdpa dev
> >> show" command to query negotiated features ahead, this won't get the
> >> same lock protected as the time you run the stat query. It's very easy
> >> to miss that ephemeral queue index.
> > Right, so I suggested to include the negotiated features in the netlink message
> > for the statistics. That would save us from using two system calls to get the
> > information required and it answers your concern with respect to locking.
> > I think Jason was reluctant to adding this attribute to the message but can't
> > find where he explained the reasoning.
> Maybe Jason can clarify and correct me, but I just did not get the same
> impression as what you said? I just skimmed through all of the emails in
> the thread, only finding that he didn't want device specific attribute
> such as queue type to get returned by the vdpa core, which I agree. I'm
> not sure if he's explicitly against piggyback negotiated features to aid
> userspace parsing the index.

I think we need to piggyback the negotiated features; otherwise, as you
mentioned, we will probably get an inconsistency.

But a question remains for the "host queue index", as mentioned before:
it's something that is not defined in the spec, so technically a vendor
can do any mapping between it and the index the guest can see. I feel
like we need to clarify it in the spec first.

Thanks

> Another way around, vdpa tool may pass down -1 to get_vq_vstat() to
> represent the queue index for the control queue - but that's less
> favorable as the vdpa core needs to maintain device specific knowledge.
>
>
>
> >>>> data vqs of all 4 pairs, hence got
> >>>> the 8th index in the rank. Since F_MQ is not negotiated and only 1 data
> >>>> queue pair enabled, in such event only host qindex 0,1 and 8 have vendor
> >>>> stats available, and the rest of qindex would get invalid/empty stat.
> >>>>
> >>>> Later on say boot continues towards loading the Linux virtio driver,
> >>>> then guest could successfully negotiate both F_CTRL_VQ and F_MQ
> >>>> features. In this case, all 8 data virtqueues are fully enabled, the
> >>>> index for the control vq ends up as 8, following tightly after all the 4
> >>>> data queue pairs. Only until both features are negotiated, the guest and
> >>>> host are able to see consistent view in identifying the control vq.
> >>>> Since F_MQ is negotiated, all host queues, indexed from 0 through 8,
> >>>> should have vendor stats available.
> >>>>
> >>>> That's why I said the guest qindex is ephemeral and hard to predict
> >>>> subjected to negotiated features, but host qindex is reliable and more
> >>>> eligible for command line identification purpose.
> >>>>
> >> <...snip...>
> >>>>> So what are you actually proposing? Display received and completed descriptors
> >>>>> per queue index without further interpretation?
> >>>> I'd suggest using a more stable queue id i.e. the host queue index to
> >>>> represent the qidx (which seems to be what you're doing now?), and
> >>>> displaying both the host qindex (queue_index_device in the example
> >>>> below), as well as the guest's (queue_index_driver as below) in the output:
> >>>>
> >>> Given that per vdpa device you can display statistics only after features have
> >>> been negotiated, you can always know the correct queue index for the control
> >>> VQ.
> >> The stats can be displayed only after features are negotiated, and only
> >> when the corresponding queue is enabled. If you know it from "vdpa dev
> >> show" on day 1 that the control vq and mq features are negotiated, but
> >> then on day2 you got nothing for the predicted control vq index, what
> >> would you recommend the host admin to do to get the right qindex again?
> >> Re-run the stat query on the same queue index, or check the "vdpa dev
> >> show" output again on day 3? This CLI design makes cloud administrator
> >> really challenging to follow the dynamics of guest activities were to
> >> manage hundreds or thousands of virtual machines...
> >>
> >> It would be easier, in my opinion, to grasp some well-defined handle
> >> that is easily predictable or fixed across the board, for looking up the
> >> control virtqueue. This could be a constant host queue index, or a
> >> special magic keyword like "qidx ctrlvq". If cloud admin runs vstat
> >> query on the control vq using a determined handle but get nothing back,
> >> then s/he knows *for sure* the control vq was not available for some
> >> reason at the point when the stat was being collected. S/he doesn't even
> >> need to care negotiated status via "vdpa dev show" at all. Why bother?
> > So, per my suggestion above, passing the negotiated attribute in the netlink
> > message would satisfy the requirements for atomicity, right?
> Yes, it satisfied the atomicity requirement, though not sure how you
> want to represent the queue index for control vq? Basically if cloud
> admin wants to dump control queue stats explicitly with a fixed handle
> or identifier, how that can be done with the negotiated attribute?
>
> Thanks,
> -Siwei
>
> >>> Do you still hold see your proposal required?
> >> Yes, this is essential to any cloud admin that runs stat query on all of
> >> the queues on periodic basis. You'd get some deterministic without
> >> blindly guessing or bothering other irrelevant command.
> >> > >> > >> Thanks, > >> -Siwei > >>>> $ vdpa -jp dev vstats show vdpa-a qidx 8 > >>>> { > >>>> "vstats": { > >>>> "vdpa-a": { > >>>> "queue_stats": [{ > >>>> "queue_index_device": 8, > >>>> "queue_index_driver": 2, > >>>> "queue_type": "control_vq", > >>>> "stat_name": [ "received_desc","completed_desc" ], > >>>> "stat_value": [ 417776,417775 ], > >>>> }] > >>>> } > >>>> } > >>>> } > >>>> > >>>> Optionally, user may use guest queue index gqidx, which is kind of an > >>>> ephemeral ID and F_MQ negotiation depended, to query the stat on a > >>>> specific guest queue: > >>>> > >>>> $ vdpa -jp dev vstats show vdpa-a gqidx 2 > >>>> { > >>>> "vstats": { > >>>> "vdpa-a": { > >>>> "queue_stats": [{ > >>>> "queue_index_device": 8, > >>>> "queue_index_driver": 2, > >>>> "queue_type": "control_vq", > >>>> "stat_name": [ "received_desc","completed_desc" ], > >>>> "stat_value": [ 417776,417775 ], > >>>> }] > >>>> } > >>>> } > >>>> } > >>>> > >>>> Thanks, > >>>> -Siwei > >>>> > >>>>>> Thanks, > >>>>>> -Siwei > >>>>>> > >>>>>>>> Regards, > >>>>>>>> -Siwei > >>>>>>>> > >>>>>>>>>>>> Looks to me there are still some loose end I don't quite yet > >>>>>>>>>>>> understand. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>>>>> "queue_index": 0, > >>>>>>>>>>> I think this can be removed since the command is for a specific index. > >>>>>>>>>>> > >>>>>>>>>>>>>>> "name": "received_desc", > >>>>>>>>>>>>>>> "value": 417776, > >>>>>>>>>>>>>>> "name": "completed_desc", > >>>>>>>>>>>>>>> "value": 417548 > >>>>>>>>>>>>>> Not for this kernel patch, but IMHO it's the best to put the name > >>>>>>>>>>>>>> & value pairs in an array instead of flat entries in json's > >>>>>>>>>>>>>> hash/dictionary. The hash entries can be re-ordered deliberately > >>>>>>>>>>>>>> by external json parsing tool, ending up with inconsistent stat values. > >>>>>>>>>>>> This comment is missed for some reason. Please change the example > >>>>>>>>>>>> in the log if you agree to address it in vdpa tool. Or justify why > >>>>>>>>>>>> keeping the order for json hash/dictionary is fine. > >>>>>>>>>>> Sorry for skipping this comment. > >>>>>>>>>>> Do you mean to present the information like: > >>>>>>>>>>> "received_desc": 417776, > >>>>>>>>>>> "completed_desc": 417548, > >>>>>>>>>> I mean the following presentation: > >>>>>>>>>> > >>>>>>>>>> $ vdpa -jp dev vstats show vdpa-a qidx 0 { > >>>>>>>>>> "vstats": { > >>>>>>>>>> "vdpa-a": { > >>>>>>>>>> "queue_stats": [{ > >>>>>>>>>> "queue_index": 0, > >>>>>>>>>> "queue_type": "rx", > >>>>>>>>>> "stat_name": [ "received_desc","completed_desc" ], > >>>>>>>>>> "stat_value": [ 417776,417548 ], > >>>>>>>>>> }] > >>>>>>>>>> } > >>>>>>>>>> } > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> I think Parav had similar suggestion, too. 
> >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> -Siwei > >>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> -Siwei > >>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>> -Siwei > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Signed-off-by: Eli Cohen <elic at nvidia.com> > >>>>>>>>>>>>>>> --- > >>>>>>>>>>>>>>> drivers/vdpa/vdpa.c | 129 > >>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++ > >>>>>>>>>>>>>>> include/linux/vdpa.h | 5 ++ > >>>>>>>>>>>>>>> include/uapi/linux/vdpa.h | 7 +++ > >>>>>>>>>>>>>>> 3 files changed, 141 insertions(+) > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index > >>>>>>>>>>>>>>> 9846c9de4bfa..d0ff671baf88 100644 > >>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c > >>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c > >>>>>>>>>>>>>>> @@ -909,6 +909,74 @@ vdpa_dev_config_fill(struct vdpa_device > >>>>>>>>>>>>>>> *vdev, > >>>>>>>>>>>> struct sk_buff *msg, u32 portid, > >>>>>>>>>>>>>>> return err; > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> +static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct > >>>>>>>>>>>>>>> +sk_buff > >>>>>>>>>>>> *msg, > >>>>>>>>>>>>>>> + struct genl_info *info, u32 index) { > >>>>>>>>>>>>>>> + int err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + if (nla_put_u32(msg, VDPA_ATTR_DEV_QUEUE_INDEX, index)) > >>>>>>>>>>>>>>> + return -EMSGSIZE; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + err = vdev->config->get_vendor_vq_stats(vdev, index, msg, > >>>>>>>>>>>>>>> +info- > >>>>>>>>>>>>> extack); > >>>>>>>>>>>>>>> + if (err) > >>>>>>>>>>>>>>> + return err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + return 0; > >>>>>>>>>>>>>>> +} > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> +static int vendor_stats_fill(struct vdpa_device *vdev, struct > >>>>>>>>>>>>>>> +sk_buff > >>>>>>>>>> *msg, > >>>>>>>>>>>>>>> + struct genl_info *info, u32 index) { > >>>>>>>>>>>>>>> + int err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + if (!vdev->config->get_vendor_vq_stats) > >>>>>>>>>>>>>>> + return -EOPNOTSUPP; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + err = vdpa_fill_stats_rec(vdev, msg, info, index); > >>>>>>>>>>>>>>> + if (err) > >>>>>>>>>>>>>>> + return err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + return 0; > >>>>>>>>>>>>>>> +} > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> +static int vdpa_dev_vendor_stats_fill(struct vdpa_device *vdev, > >>>>>>>>>>>>>>> + struct sk_buff *msg, > >>>>>>>>>>>>>>> + struct genl_info *info, u32 index) { > >>>>>>>>>>>>>>> + u32 device_id; > >>>>>>>>>>>>>>> + void *hdr; > >>>>>>>>>>>>>>> + int err; > >>>>>>>>>>>>>>> + u32 portid = info->snd_portid; > >>>>>>>>>>>>>>> + u32 seq = info->snd_seq; > >>>>>>>>>>>>>>> + u32 flags = 0; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, > >>>>>>>>>>>>>>> + VDPA_CMD_DEV_VSTATS_GET); > >>>>>>>>>>>>>>> + if (!hdr) > >>>>>>>>>>>>>>> + return -EMSGSIZE; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, > >>>>>>>> dev_name(&vdev- > >>>>>>>>>>>>> dev))) { > >>>>>>>>>>>>>>> + err = -EMSGSIZE; > >>>>>>>>>>>>>>> + goto undo_msg; > >>>>>>>>>>>>>>> + } > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + device_id = vdev->config->get_device_id(vdev); > >>>>>>>>>>>>>>> + if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id)) { > >>>>>>>>>>>>>>> + err = -EMSGSIZE; > >>>>>>>>>>>>>>> + goto undo_msg; > >>>>>>>>>>>>>>> + } > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + err = vendor_stats_fill(vdev, msg, info, index); > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + genlmsg_end(msg, hdr); > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + 
return err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> +undo_msg: > >>>>>>>>>>>>>>> + genlmsg_cancel(msg, hdr); > >>>>>>>>>>>>>>> + return err; > >>>>>>>>>>>>>>> +} > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> static int vdpa_nl_cmd_dev_config_get_doit(struct sk_buff > >>>>>>>>>>>>>>> *skb, struct > >>>>>>>>>>>> genl_info *info) > >>>>>>>>>>>>>>> { > >>>>>>>>>>>>>>> struct vdpa_device *vdev; > >>>>>>>>>>>>>>> @@ -990,6 +1058,60 @@ > >>>>>>>> vdpa_nl_cmd_dev_config_get_dumpit(struct > >>>>>>>>>>>> sk_buff *msg, struct netlink_callback * > >>>>>>>>>>>>>>> return msg->len; > >>>>>>>>>>>>>>> } > >>>>>>>>>>>>>>> +static int vdpa_nl_cmd_dev_stats_get_doit(struct sk_buff *skb, > >>>>>>>>>>>>>>> + struct genl_info *info) > >>>>>>>>>>>>>>> +{ > >>>>>>>>>>>>>>> + struct vdpa_device *vdev; > >>>>>>>>>>>>>>> + struct sk_buff *msg; > >>>>>>>>>>>>>>> + const char *devname; > >>>>>>>>>>>>>>> + struct device *dev; > >>>>>>>>>>>>>>> + u32 index; > >>>>>>>>>>>>>>> + int err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + if (!info->attrs[VDPA_ATTR_DEV_NAME]) > >>>>>>>>>>>>>>> + return -EINVAL; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + if (!info->attrs[VDPA_ATTR_DEV_QUEUE_INDEX]) > >>>>>>>>>>>>>>> + return -EINVAL; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]); > >>>>>>>>>>>>>>> + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); > >>>>>>>>>>>>>>> + if (!msg) > >>>>>>>>>>>>>>> + return -ENOMEM; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + index = nla_get_u32(info- > >>>>>>>>> attrs[VDPA_ATTR_DEV_QUEUE_INDEX]); > >>>>>>>>>>>>>>> + mutex_lock(&vdpa_dev_mutex); > >>>>>>>>>>>>>>> + dev = bus_find_device(&vdpa_bus, NULL, devname, > >>>>>>>>>>>> vdpa_name_match); > >>>>>>>>>>>>>>> + if (!dev) { > >>>>>>>>>>>>>>> + NL_SET_ERR_MSG_MOD(info->extack, "device not > >>>>>>>> found"); > >>>>>>>>>>>>>>> + err = -ENODEV; > >>>>>>>>>>>>>>> + goto dev_err; > >>>>>>>>>>>>>>> + } > >>>>>>>>>>>>>>> + vdev = container_of(dev, struct vdpa_device, dev); > >>>>>>>>>>>>>>> + if (!vdev->mdev) { > >>>>>>>>>>>>>>> + NL_SET_ERR_MSG_MOD(info->extack, "unmanaged > >>>>>>>> vdpa > >>>>>>>>>>>> device"); > >>>>>>>>>>>>>>> + err = -EINVAL; > >>>>>>>>>>>>>>> + goto mdev_err; > >>>>>>>>>>>>>>> + } > >>>>>>>>>>>>>>> + err = vdpa_dev_vendor_stats_fill(vdev, msg, info, index); > >>>>>>>>>>>>>>> + if (!err) > >>>>>>>>>>>>>>> + err = genlmsg_reply(msg, info); > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + put_device(dev); > >>>>>>>>>>>>>>> + mutex_unlock(&vdpa_dev_mutex); > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + if (err) > >>>>>>>>>>>>>>> + nlmsg_free(msg); > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + return err; > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> +mdev_err: > >>>>>>>>>>>>>>> + put_device(dev); > >>>>>>>>>>>>>>> +dev_err: > >>>>>>>>>>>>>>> + mutex_unlock(&vdpa_dev_mutex); > >>>>>>>>>>>>>>> + return err; > >>>>>>>>>>>>>>> +} > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] > >>>>>>>> = { > >>>>>>>>>>>>>>> [VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type > >>>>>>>>>> NLA_NUL_STRING }, > >>>>>>>>>>>>>>> [VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING > >>>>>>>>>> }, @@ - > >>>>>>>>>>>> 997,6 > >>>>>>>>>>>>>>> +1119,7 @@ static const struct nla_policy > >>>>>>>>>>>> vdpa_nl_policy[VDPA_ATTR_MAX + 1] = { > >>>>>>>>>>>>>>> [VDPA_ATTR_DEV_NET_CFG_MACADDR] > >>>>>>>>>> NLA_POLICY_ETH_ADDR, > >>>>>>>>>>>>>>> /* virtio spec 1.1 section 5.1.4.1 for valid MTU range */ > >>>>>>>>>>>>>>> [VDPA_ATTR_DEV_NET_CFG_MTU] > >>>>>>>>>> NLA_POLICY_MIN(NLA_U16, 68), > >>>>>>>>>>>>>>> + [VDPA_ATTR_DEV_QUEUE_INDEX] > 
>>>>>>>> NLA_POLICY_RANGE(NLA_U32, 0, > >>>>>>>>>>>> 65535), > >>>>>>>>>>>>>>> }; > >>>>>>>>>>>>>>> static const struct genl_ops vdpa_nl_ops[] = { @@ -1030,6 > >>>>>>>>>>>>>>> +1153,12 @@ static const struct genl_ops vdpa_nl_ops[] = { > >>>>>>>>>>>>>>> .doit = vdpa_nl_cmd_dev_config_get_doit, > >>>>>>>>>>>>>>> .dumpit = vdpa_nl_cmd_dev_config_get_dumpit, > >>>>>>>>>>>>>>> }, > >>>>>>>>>>>>>>> + { > >>>>>>>>>>>>>>> + .cmd = VDPA_CMD_DEV_VSTATS_GET, > >>>>>>>>>>>>>>> + .validate = GENL_DONT_VALIDATE_STRICT | > >>>>>>>>>>>> GENL_DONT_VALIDATE_DUMP, > >>>>>>>>>>>>>>> + .doit = vdpa_nl_cmd_dev_stats_get_doit, > >>>>>>>>>>>>>>> + .flags = GENL_ADMIN_PERM, > >>>>>>>>>>>>>>> + }, > >>>>>>>>>>>>>>> }; > >>>>>>>>>>>>>>> static struct genl_family vdpa_nl_family __ro_after_init > >>>>>>>>>>>>>>> { diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index > >>>>>>>>>>>>>>> 2de442ececae..274203845cfc 100644 > >>>>>>>>>>>>>>> --- a/include/linux/vdpa.h > >>>>>>>>>>>>>>> +++ b/include/linux/vdpa.h > >>>>>>>>>>>>>>> @@ -275,6 +275,9 @@ struct vdpa_config_ops { > >>>>>>>>>>>>>>> const struct vdpa_vq_state *state); > >>>>>>>>>>>>>>> int (*get_vq_state)(struct vdpa_device *vdev, u16 idx, > >>>>>>>>>>>>>>> struct vdpa_vq_state *state); > >>>>>>>>>>>>>>> + int (*get_vendor_vq_stats)(struct vdpa_device *vdev, u16 idx, > >>>>>>>>>>>>>>> + struct sk_buff *msg, > >>>>>>>>>>>>>>> + struct netlink_ext_ack *extack); > >>>>>>>>>>>>>>> struct vdpa_notification_area > >>>>>>>>>>>>>>> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx); > >>>>>>>>>>>>>>> /* vq irq is not expected to be changed once DRIVER_OK is > >>>>>>>>>>>>>>> set */ @@ -466,4 +469,6 @@ struct vdpa_mgmt_dev { > >>>>>>>>>>>>>>> int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev); > >>>>>>>>>>>>>>> void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev); > >>>>>>>>>>>>>>> +#define VDPA_INVAL_QUEUE_INDEX 0xffff > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> #endif /* _LINUX_VDPA_H */ > >>>>>>>>>>>>>>> diff --git a/include/uapi/linux/vdpa.h > >>>>>>>>>>>>>>> b/include/uapi/linux/vdpa.h index 1061d8d2d09d..c5f229a41dc2 > >>>>>>>>>>>>>>> 100644 > >>>>>>>>>>>>>>> --- a/include/uapi/linux/vdpa.h > >>>>>>>>>>>>>>> +++ b/include/uapi/linux/vdpa.h > >>>>>>>>>>>>>>> @@ -18,6 +18,7 @@ enum vdpa_command { > >>>>>>>>>>>>>>> VDPA_CMD_DEV_DEL, > >>>>>>>>>>>>>>> VDPA_CMD_DEV_GET, /* can dump */ > >>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET, /* can dump */ > >>>>>>>>>>>>>>> + VDPA_CMD_DEV_VSTATS_GET, > >>>>>>>>>>>>>>> }; > >>>>>>>>>>>>>>> enum vdpa_attr { > >>>>>>>>>>>>>>> @@ -46,6 +47,12 @@ enum vdpa_attr { > >>>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, /* u64 */ > >>>>>>>>>>>>>>> VDPA_ATTR_DEV_MGMTDEV_MAX_VQS, /* > >>>>>>>>>> u32 */ > >>>>>>>>>>>>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES, /* u64 */ > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> + VDPA_ATTR_DEV_QUEUE_INDEX, /* u16 */ > >>>>>>>>>>>>>>> + VDPA_ATTR_DEV_QUEUE_TYPE, /* string */ > >>>>>>>>>>>>>>> + VDPA_ATTR_DEV_VENDOR_ATTR_NAME, /* > >>>>>>>> string */ > >>>>>>>>>>>>>>> + VDPA_ATTR_DEV_VENDOR_ATTR_VALUE, /* u64 */ > >>>>>>>>>>>>>>> + > >>>>>>>>>>>>>>> /* new attributes must be added above here */ > >>>>>>>>>>>>>>> VDPA_ATTR_MAX, > >>>>>>>>>>>>>>> }; >
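For concreteness, the piggybacking agreed on above would amount to one extra
attribute in the VSTATS_GET reply, filled from vdpa_dev_vendor_stats_fill()
before vendor_stats_fill(). A rough sketch against the patch in this thread;
the helper name is illustrative, and the get_driver_features() config op and
the VDPA_ATTR_PAD pad attribute are assumed to exist in the target tree:

/* Sketch only: report the negotiated features alongside the per-queue
 * stats, so userspace can interpret the queue index atomically with the
 * same netlink reply. Assumes the get_driver_features() config op and a
 * VDPA_ATTR_PAD pad attribute; neither is part of this patch.
 */
static int vdpa_put_negotiated_features(struct vdpa_device *vdev,
					struct sk_buff *msg)
{
	u64 features = vdev->config->get_driver_features(vdev);

	return nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES,
				 features, VDPA_ATTR_PAD);
}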
Si-Wei Liu
2022-Mar-18 00:58 UTC
[PATCH v1 1/2] vdpa: Add support for querying vendor statistics
On 3/16/2022 7:32 PM, Jason Wang wrote:
> On Thu, Mar 17, 2022 at 6:00 AM Si-Wei Liu <si-wei.liu at oracle.com> wrote:
>>
>>
>> On 3/16/2022 12:10 AM, Eli Cohen wrote:
>>>> From: Si-Wei Liu <si-wei.liu at oracle.com>
>>>> Sent: Wednesday, March 16, 2022 8:52 AM
>>>> To: Eli Cohen <elic at nvidia.com>
>>>> Cc: mst at redhat.com; jasowang at redhat.com; virtualization at lists.linux-foundation.org; eperezma at redhat.com; amorenoz at redhat.com;
>>>> lvivier at redhat.com; sgarzare at redhat.com; Parav Pandit <parav at nvidia.com>
>>>> Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor statistics
>>>>
>>>>
>>>> On 3/15/2022 2:10 AM, Eli Cohen wrote:
>>>>
>>>> <...snip...>
>>>>
>>>>>> Say you got a vdpa net device created with 4 data queue pairs and a
>>>>>> control vq. On boot some guest firmware may support just F_CTRL_VQ but
>>>>>> not F_MQ, then the index for the control vq in guest ends up with 2, as
>>>>>> in this case there's only a single queue pair enabled for rx (index 0)
>>>>>> and tx (index 1). From the host driver (e.g. mlx5_vdpa) perspective, the
>>>>>> control vq is the last vq following 8
>>>>> If the host sees F_MQ was not negotiated but F_CTRL_VQ was, then it knows
>>>>> that control VQ index is 2
>>>> Right, but I don't see this feature negotiation info getting returned
>>>> from your vdpa_dev_vendor_stats_fill(), or did I miss something? How do
>>>> you plan for host user to get this info? If you meant another "vdpa dev
>>>> show" command to query negotiated features ahead, this won't get the
>>>> same lock protected as the time you run the stat query. It's very easy
>>>> to miss that ephemeral queue index.
>>> Right, so I suggested to include the negotiated features in the netlink message
>>> for the statistics. That would save us from using two system calls to get the
>>> information required and it answers your concern with respect to locking.
>>> I think Jason was reluctant to adding this attribute to the message but can't
>>> find where he explained the reasoning.
>> Maybe Jason can clarify and correct me, but I just did not get the same
>> impression as what you said? I just skimmed through all of the emails in
>> the thread, only finding that he didn't want device specific attribute
>> such as queue type to get returned by the vdpa core, which I agree. I'm
>> not sure if he's explicitly against piggyback negotiated features to aid
>> userspace parsing the index.
> I think we need to piggyback the negotiated features; otherwise, as you
> mentioned, we will probably get an inconsistency.

Great. Thanks for confirming it.

>
> But a question remains for the "host queue index", as mentioned before:
> it's something that is not defined in the spec, so technically a vendor
> can do any mapping between it and the index the guest can see. I feel
> like we need to clarify it in the spec first.

I have been thinking about this for a while today. Actually, I am not
against exposing the host queue index to the spec, as we know it's
somewhat implicitly defined in the QEMU device model for multiqueue. The
thing is, I'm not sure if there's extra benefit beyond this minor
requirement (*), given that all of the other vDPA kAPIs take the guest
queue index rather than the host queue index. It works for mlx5_vdpa, as
the control vq is implemented in software, so it can map to whatever
guest qindex it wishes to.
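A minimal sketch of the mapping under discussion, for reference: the control
vq's guest-visible index as a function of the negotiated features, per the
virtio-net spec. The helper name is illustrative, and max_vq_pairs would come
from the device config space:

#include <linux/bits.h>
#include <linux/virtio_net.h>

/* Guest-visible index of the control virtqueue per the virtio-net spec:
 * without F_MQ only one data queue pair is active (rx = 0, tx = 1), so
 * the control vq sits at index 2; with F_MQ it follows all data queues
 * at 2 * max_virtqueue_pairs. Returns -1 if no control vq was negotiated.
 * Illustrative helper, not part of this patch.
 */
static int ctrl_vq_guest_index(u64 negotiated_features, u16 max_vq_pairs)
{
	if (!(negotiated_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)))
		return -1;

	if (!(negotiated_features & BIT_ULL(VIRTIO_NET_F_MQ)))
		return 2;

	return 2 * max_vq_pairs;
}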
But would it cause extra trouble for some other emulated vDPA device, or
for another vendor's vDPA such as ifcvf, to fabricate a fake mapping
between the host queue index and the one the guest can see? I should
also send a heads-up that the current vhost-vdpa mq implementation in
upstream QEMU has an issue in mapping the host qindex to the guest one.
This would become a problem with an MQ-enabled vdpa device and a
non-MQ-supporting guest, e.g. OVMF, for which I'm about to share some
RFC patches shortly to demonstrate the issue. If exposing the host queue
index to the spec turns out to be essential to resolving this issue, and
maybe helps the software virtio QEMU implementation too, I won't
hesitate to expose this important implementation detail to the spec.

(*) Another means that may somehow address my use case is to use some
magic keyword, e.g. "ctrlvq", to identify the control vq.
Implementation-wise, we can pass -1 to the get_vq_vstat() API to
indicate the last guest qindex, given that we know for sure the ctrlvq
is the last queue in the array when the relevant features are present.
Since the negotiated features are piggybacked, it's not hard for the
vdpa tool to tell whether the last queue is a control vq or not. I'd
also welcome other ideas that can make virtqueue identification easier
and more predictable from the CLI.

Thanks,
-Siwei

>
> Thanks
>
>> Another way around, vdpa tool may pass down -1 to get_vq_vstat() to
>> represent the queue index for the control queue - but that's less
>> favorable as the vdpa core needs to maintain device specific knowledge.
>>
>>
>>
>>>>>> data vqs of all 4 pairs, hence got
>>>>>> the 8th index in the rank. Since F_MQ is not negotiated and only 1 data
>>>>>> queue pair enabled, in such event only host qindex 0,1 and 8 have vendor
>>>>>> stats available, and the rest of qindex would get invalid/empty stat.
>>>>>>
>>>>>> Later on say boot continues towards loading the Linux virtio driver,
>>>>>> then guest could successfully negotiate both F_CTRL_VQ and F_MQ
>>>>>> features. In this case, all 8 data virtqueues are fully enabled, the
>>>>>> index for the control vq ends up as 8, following tightly after all the 4
>>>>>> data queue pairs. Only until both features are negotiated, the guest and
>>>>>> host are able to see consistent view in identifying the control vq.
>>>>>> Since F_MQ is negotiated, all host queues, indexed from 0 through 8,
>>>>>> should have vendor stats available.
>>>>>>
>>>>>> That's why I said the guest qindex is ephemeral and hard to predict
>>>>>> subjected to negotiated features, but host qindex is reliable and more
>>>>>> eligible for command line identification purpose.
>>>>>>
>>>> <...snip...>
>>>>>>> So what are you actually proposing? Display received and completed descriptors
>>>>>>> per queue index without further interpretation?
>>>>>> I'd suggest using a more stable queue id i.e. the host queue index to
>>>>>> represent the qidx (which seems to be what you're doing now?), and
>>>>>> displaying both the host qindex (queue_index_device in the example
>>>>>> below), as well as the guest's (queue_index_driver as below) in the output:
>>>>>>
>>>>> Given that per vdpa device you can display statistics only after features have
>>>>> been negotiated, you can always know the correct queue index for the control
>>>>> VQ.
>>>> The stats can be displayed only after features are negotiated, and only
>>>> when the corresponding queue is enabled. If you know it from "vdpa dev
>>>> show" on day 1 that the control vq and mq features are negotiated, but
>>>> then on day2 you got nothing for the predicted control vq index, what
>>>> would you recommend the host admin to do to get the right qindex again?
>>>> Re-run the stat query on the same queue index, or check the "vdpa dev
>>>> show" output again on day 3? This CLI design makes cloud administrator
>>>> really challenging to follow the dynamics of guest activities were to
>>>> manage hundreds or thousands of virtual machines...
>>>>
>>>> It would be easier, in my opinion, to grasp some well-defined handle
>>>> that is easily predictable or fixed across the board, for looking up the
>>>> control virtqueue. This could be a constant host queue index, or a
>>>> special magic keyword like "qidx ctrlvq". If cloud admin runs vstat
>>>> query on the control vq using a determined handle but get nothing back,
>>>> then s/he knows *for sure* the control vq was not available for some
>>>> reason at the point when the stat was being collected. S/he doesn't even
>>>> need to care negotiated status via "vdpa dev show" at all. Why bother?
>>> So, per my suggestion above, passing the negotiated attribute in the netlink
>>> message would satisfy the requirements for atomicity, right?
>> Yes, it satisfied the atomicity requirement, though not sure how you
>> want to represent the queue index for control vq? Basically if cloud
>> admin wants to dump control queue stats explicitly with a fixed handle
>> or identifier, how that can be done with the negotiated attribute?
>>
>> Thanks,
>> -Siwei
>>>>> Do you still hold see your proposal required?
>>>> Yes, this is essential to any cloud admin that runs stat query on all of
>>>> the queues on periodic basis. You'd get some deterministic without
>>>> blindly guessing or bothering other irrelevant command.
>>>>
>>>>
>>>> Thanks,
>>>> -Siwei
>>>>>> $ vdpa -jp dev vstats show vdpa-a qidx 8
>>>>>> {
>>>>>>     "vstats": {
>>>>>>         "vdpa-a": {
>>>>>>             "queue_stats": [{
>>>>>>                 "queue_index_device": 8,
>>>>>>                 "queue_index_driver": 2,
>>>>>>                 "queue_type": "control_vq",
>>>>>>                 "stat_name": [ "received_desc","completed_desc" ],
>>>>>>                 "stat_value": [ 417776,417775 ],
>>>>>>             }]
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> Optionally, user may use guest queue index gqidx, which is kind of an
>>>>>> ephemeral ID and F_MQ negotiation depended, to query the stat on a
>>>>>> specific guest queue:
>>>>>>
>>>>>> $ vdpa -jp dev vstats show vdpa-a gqidx 2
>>>>>> {
>>>>>>     "vstats": {
>>>>>>         "vdpa-a": {
>>>>>>             "queue_stats": [{
>>>>>>                 "queue_index_device": 8,
>>>>>>                 "queue_index_driver": 2,
>>>>>>                 "queue_type": "control_vq",
>>>>>>                 "stat_name": [ "received_desc","completed_desc" ],
>>>>>>                 "stat_value": [ 417776,417775 ],
>>>>>>             }]
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> -Siwei
>>>>>>>>>>
>>>>>>>>>>>>>> Looks to me there are still some loose end I don't quite yet
>>>>>>>>>>>>>> understand.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "queue_index": 0,
>>>>>>>>>>>>> I think this can be removed since the command is for a specific index.
>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "name": "received_desc", >>>>>>>>>>>>>>>>> "value": 417776, >>>>>>>>>>>>>>>>> "name": "completed_desc", >>>>>>>>>>>>>>>>> "value": 417548 >>>>>>>>>>>>>>>> Not for this kernel patch, but IMHO it's the best to put the name >>>>>>>>>>>>>>>> & value pairs in an array instead of flat entries in json's >>>>>>>>>>>>>>>> hash/dictionary. The hash entries can be re-ordered deliberately >>>>>>>>>>>>>>>> by external json parsing tool, ending up with inconsistent stat values. >>>>>>>>>>>>>> This comment is missed for some reason. Please change the example >>>>>>>>>>>>>> in the log if you agree to address it in vdpa tool. Or justify why >>>>>>>>>>>>>> keeping the order for json hash/dictionary is fine. >>>>>>>>>>>>> Sorry for skipping this comment. >>>>>>>>>>>>> Do you mean to present the information like: >>>>>>>>>>>>> "received_desc": 417776, >>>>>>>>>>>>> "completed_desc": 417548, >>>>>>>>>>>> I mean the following presentation: >>>>>>>>>>>> >>>>>>>>>>>> $ vdpa -jp dev vstats show vdpa-a qidx 0 { >>>>>>>>>>>> "vstats": { >>>>>>>>>>>> "vdpa-a": { >>>>>>>>>>>> "queue_stats": [{ >>>>>>>>>>>> "queue_index": 0, >>>>>>>>>>>> "queue_type": "rx", >>>>>>>>>>>> "stat_name": [ "received_desc","completed_desc" ], >>>>>>>>>>>> "stat_value": [ 417776,417548 ], >>>>>>>>>>>> }] >>>>>>>>>>>> } >>>>>>>>>>>> } >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> I think Parav had similar suggestion, too. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -Siwei >>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> -Siwei >>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> -Siwei >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Signed-off-by: Eli Cohen <elic at nvidia.com> >>>>>>>>>>>>>>>>> --- >>>>>>>>>>>>>>>>> drivers/vdpa/vdpa.c | 129 >>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++ >>>>>>>>>>>>>>>>> include/linux/vdpa.h | 5 ++ >>>>>>>>>>>>>>>>> include/uapi/linux/vdpa.h | 7 +++ >>>>>>>>>>>>>>>>> 3 files changed, 141 insertions(+) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index >>>>>>>>>>>>>>>>> 9846c9de4bfa..d0ff671baf88 100644 >>>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c >>>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c >>>>>>>>>>>>>>>>> @@ -909,6 +909,74 @@ vdpa_dev_config_fill(struct vdpa_device >>>>>>>>>>>>>>>>> *vdev, >>>>>>>>>>>>>> struct sk_buff *msg, u32 portid, >>>>>>>>>>>>>>>>> return err; >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> +static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct >>>>>>>>>>>>>>>>> +sk_buff >>>>>>>>>>>>>> *msg, >>>>>>>>>>>>>>>>> + struct genl_info *info, u32 index) { >>>>>>>>>>>>>>>>> + int err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + if (nla_put_u32(msg, VDPA_ATTR_DEV_QUEUE_INDEX, index)) >>>>>>>>>>>>>>>>> + return -EMSGSIZE; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + err = vdev->config->get_vendor_vq_stats(vdev, index, msg, >>>>>>>>>>>>>>>>> +info- >>>>>>>>>>>>>>> extack); >>>>>>>>>>>>>>>>> + if (err) >>>>>>>>>>>>>>>>> + return err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + return 0; >>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> +static int vendor_stats_fill(struct vdpa_device *vdev, struct >>>>>>>>>>>>>>>>> +sk_buff >>>>>>>>>>>> *msg, >>>>>>>>>>>>>>>>> + struct genl_info *info, u32 index) { >>>>>>>>>>>>>>>>> + int err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + if (!vdev->config->get_vendor_vq_stats) >>>>>>>>>>>>>>>>> + return -EOPNOTSUPP; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + err = vdpa_fill_stats_rec(vdev, msg, info, index); >>>>>>>>>>>>>>>>> + if (err) 
>>>>>>>>>>>>>>>>> + return err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + return 0; >>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> +static int vdpa_dev_vendor_stats_fill(struct vdpa_device *vdev, >>>>>>>>>>>>>>>>> + struct sk_buff *msg, >>>>>>>>>>>>>>>>> + struct genl_info *info, u32 index) { >>>>>>>>>>>>>>>>> + u32 device_id; >>>>>>>>>>>>>>>>> + void *hdr; >>>>>>>>>>>>>>>>> + int err; >>>>>>>>>>>>>>>>> + u32 portid = info->snd_portid; >>>>>>>>>>>>>>>>> + u32 seq = info->snd_seq; >>>>>>>>>>>>>>>>> + u32 flags = 0; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags, >>>>>>>>>>>>>>>>> + VDPA_CMD_DEV_VSTATS_GET); >>>>>>>>>>>>>>>>> + if (!hdr) >>>>>>>>>>>>>>>>> + return -EMSGSIZE; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, >>>>>>>>>> dev_name(&vdev- >>>>>>>>>>>>>>> dev))) { >>>>>>>>>>>>>>>>> + err = -EMSGSIZE; >>>>>>>>>>>>>>>>> + goto undo_msg; >>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + device_id = vdev->config->get_device_id(vdev); >>>>>>>>>>>>>>>>> + if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id)) { >>>>>>>>>>>>>>>>> + err = -EMSGSIZE; >>>>>>>>>>>>>>>>> + goto undo_msg; >>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + err = vendor_stats_fill(vdev, msg, info, index); >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + genlmsg_end(msg, hdr); >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + return err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> +undo_msg: >>>>>>>>>>>>>>>>> + genlmsg_cancel(msg, hdr); >>>>>>>>>>>>>>>>> + return err; >>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> static int vdpa_nl_cmd_dev_config_get_doit(struct sk_buff >>>>>>>>>>>>>>>>> *skb, struct >>>>>>>>>>>>>> genl_info *info) >>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>> struct vdpa_device *vdev; >>>>>>>>>>>>>>>>> @@ -990,6 +1058,60 @@ >>>>>>>>>> vdpa_nl_cmd_dev_config_get_dumpit(struct >>>>>>>>>>>>>> sk_buff *msg, struct netlink_callback * >>>>>>>>>>>>>>>>> return msg->len; >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> +static int vdpa_nl_cmd_dev_stats_get_doit(struct sk_buff *skb, >>>>>>>>>>>>>>>>> + struct genl_info *info) >>>>>>>>>>>>>>>>> +{ >>>>>>>>>>>>>>>>> + struct vdpa_device *vdev; >>>>>>>>>>>>>>>>> + struct sk_buff *msg; >>>>>>>>>>>>>>>>> + const char *devname; >>>>>>>>>>>>>>>>> + struct device *dev; >>>>>>>>>>>>>>>>> + u32 index; >>>>>>>>>>>>>>>>> + int err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + if (!info->attrs[VDPA_ATTR_DEV_NAME]) >>>>>>>>>>>>>>>>> + return -EINVAL; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + if (!info->attrs[VDPA_ATTR_DEV_QUEUE_INDEX]) >>>>>>>>>>>>>>>>> + return -EINVAL; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]); >>>>>>>>>>>>>>>>> + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); >>>>>>>>>>>>>>>>> + if (!msg) >>>>>>>>>>>>>>>>> + return -ENOMEM; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + index = nla_get_u32(info- >>>>>>>>>>> attrs[VDPA_ATTR_DEV_QUEUE_INDEX]); >>>>>>>>>>>>>>>>> + mutex_lock(&vdpa_dev_mutex); >>>>>>>>>>>>>>>>> + dev = bus_find_device(&vdpa_bus, NULL, devname, >>>>>>>>>>>>>> vdpa_name_match); >>>>>>>>>>>>>>>>> + if (!dev) { >>>>>>>>>>>>>>>>> + NL_SET_ERR_MSG_MOD(info->extack, "device not >>>>>>>>>> found"); >>>>>>>>>>>>>>>>> + err = -ENODEV; >>>>>>>>>>>>>>>>> + goto dev_err; >>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>> + vdev = container_of(dev, struct vdpa_device, dev); >>>>>>>>>>>>>>>>> + if (!vdev->mdev) { >>>>>>>>>>>>>>>>> + NL_SET_ERR_MSG_MOD(info->extack, "unmanaged >>>>>>>>>> vdpa >>>>>>>>>>>>>> device"); >>>>>>>>>>>>>>>>> + err = 
-EINVAL; >>>>>>>>>>>>>>>>> + goto mdev_err; >>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>> + err = vdpa_dev_vendor_stats_fill(vdev, msg, info, index); >>>>>>>>>>>>>>>>> + if (!err) >>>>>>>>>>>>>>>>> + err = genlmsg_reply(msg, info); >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + put_device(dev); >>>>>>>>>>>>>>>>> + mutex_unlock(&vdpa_dev_mutex); >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + if (err) >>>>>>>>>>>>>>>>> + nlmsg_free(msg); >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + return err; >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> +mdev_err: >>>>>>>>>>>>>>>>> + put_device(dev); >>>>>>>>>>>>>>>>> +dev_err: >>>>>>>>>>>>>>>>> + mutex_unlock(&vdpa_dev_mutex); >>>>>>>>>>>>>>>>> + return err; >>>>>>>>>>>>>>>>> +} >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] >>>>>>>>>> = { >>>>>>>>>>>>>>>>> [VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type >>>>>>>>>>>> NLA_NUL_STRING }, >>>>>>>>>>>>>>>>> [VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING >>>>>>>>>>>> }, @@ - >>>>>>>>>>>>>> 997,6 >>>>>>>>>>>>>>>>> +1119,7 @@ static const struct nla_policy >>>>>>>>>>>>>> vdpa_nl_policy[VDPA_ATTR_MAX + 1] = { >>>>>>>>>>>>>>>>> [VDPA_ATTR_DEV_NET_CFG_MACADDR] >>>>>>>>>>>> NLA_POLICY_ETH_ADDR, >>>>>>>>>>>>>>>>> /* virtio spec 1.1 section 5.1.4.1 for valid MTU range */ >>>>>>>>>>>>>>>>> [VDPA_ATTR_DEV_NET_CFG_MTU] >>>>>>>>>>>> NLA_POLICY_MIN(NLA_U16, 68), >>>>>>>>>>>>>>>>> + [VDPA_ATTR_DEV_QUEUE_INDEX] >>>>>>>>>> NLA_POLICY_RANGE(NLA_U32, 0, >>>>>>>>>>>>>> 65535), >>>>>>>>>>>>>>>>> }; >>>>>>>>>>>>>>>>> static const struct genl_ops vdpa_nl_ops[] = { @@ -1030,6 >>>>>>>>>>>>>>>>> +1153,12 @@ static const struct genl_ops vdpa_nl_ops[] = { >>>>>>>>>>>>>>>>> .doit = vdpa_nl_cmd_dev_config_get_doit, >>>>>>>>>>>>>>>>> .dumpit = vdpa_nl_cmd_dev_config_get_dumpit, >>>>>>>>>>>>>>>>> }, >>>>>>>>>>>>>>>>> + { >>>>>>>>>>>>>>>>> + .cmd = VDPA_CMD_DEV_VSTATS_GET, >>>>>>>>>>>>>>>>> + .validate = GENL_DONT_VALIDATE_STRICT | >>>>>>>>>>>>>> GENL_DONT_VALIDATE_DUMP, >>>>>>>>>>>>>>>>> + .doit = vdpa_nl_cmd_dev_stats_get_doit, >>>>>>>>>>>>>>>>> + .flags = GENL_ADMIN_PERM, >>>>>>>>>>>>>>>>> + }, >>>>>>>>>>>>>>>>> }; >>>>>>>>>>>>>>>>> static struct genl_family vdpa_nl_family __ro_after_init >>>>>>>>>>>>>>>>> { diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index >>>>>>>>>>>>>>>>> 2de442ececae..274203845cfc 100644 >>>>>>>>>>>>>>>>> --- a/include/linux/vdpa.h >>>>>>>>>>>>>>>>> +++ b/include/linux/vdpa.h >>>>>>>>>>>>>>>>> @@ -275,6 +275,9 @@ struct vdpa_config_ops { >>>>>>>>>>>>>>>>> const struct vdpa_vq_state *state); >>>>>>>>>>>>>>>>> int (*get_vq_state)(struct vdpa_device *vdev, u16 idx, >>>>>>>>>>>>>>>>> struct vdpa_vq_state *state); >>>>>>>>>>>>>>>>> + int (*get_vendor_vq_stats)(struct vdpa_device *vdev, u16 idx, >>>>>>>>>>>>>>>>> + struct sk_buff *msg, >>>>>>>>>>>>>>>>> + struct netlink_ext_ack *extack); >>>>>>>>>>>>>>>>> struct vdpa_notification_area >>>>>>>>>>>>>>>>> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx); >>>>>>>>>>>>>>>>> /* vq irq is not expected to be changed once DRIVER_OK is >>>>>>>>>>>>>>>>> set */ @@ -466,4 +469,6 @@ struct vdpa_mgmt_dev { >>>>>>>>>>>>>>>>> int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev); >>>>>>>>>>>>>>>>> void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev); >>>>>>>>>>>>>>>>> +#define VDPA_INVAL_QUEUE_INDEX 0xffff >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> #endif /* _LINUX_VDPA_H */ >>>>>>>>>>>>>>>>> diff --git a/include/uapi/linux/vdpa.h >>>>>>>>>>>>>>>>> b/include/uapi/linux/vdpa.h index 1061d8d2d09d..c5f229a41dc2 >>>>>>>>>>>>>>>>> 100644 >>>>>>>>>>>>>>>>> 
--- a/include/uapi/linux/vdpa.h >>>>>>>>>>>>>>>>> +++ b/include/uapi/linux/vdpa.h >>>>>>>>>>>>>>>>> @@ -18,6 +18,7 @@ enum vdpa_command { >>>>>>>>>>>>>>>>> VDPA_CMD_DEV_DEL, >>>>>>>>>>>>>>>>> VDPA_CMD_DEV_GET, /* can dump */ >>>>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET, /* can dump */ >>>>>>>>>>>>>>>>> + VDPA_CMD_DEV_VSTATS_GET, >>>>>>>>>>>>>>>>> }; >>>>>>>>>>>>>>>>> enum vdpa_attr { >>>>>>>>>>>>>>>>> @@ -46,6 +47,12 @@ enum vdpa_attr { >>>>>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, /* u64 */ >>>>>>>>>>>>>>>>> VDPA_ATTR_DEV_MGMTDEV_MAX_VQS, /* >>>>>>>>>>>> u32 */ >>>>>>>>>>>>>>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES, /* u64 */ >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> + VDPA_ATTR_DEV_QUEUE_INDEX, /* u16 */ >>>>>>>>>>>>>>>>> + VDPA_ATTR_DEV_QUEUE_TYPE, /* string */ >>>>>>>>>>>>>>>>> + VDPA_ATTR_DEV_VENDOR_ATTR_NAME, /* >>>>>>>>>> string */ >>>>>>>>>>>>>>>>> + VDPA_ATTR_DEV_VENDOR_ATTR_VALUE, /* u64 */ >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> /* new attributes must be added above here */ >>>>>>>>>>>>>>>>> VDPA_ATTR_MAX, >>>>>>>>>>>>>>>>> };
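As a rough illustration of the "(*)" idea above, where the vdpa tool accepts a
magic "ctrlvq" keyword and passes a sentinel index down for the driver to
resolve, the userspace side could look like the sketch below. The sentinel
value and helper name are assumptions, not part of this patch
(VDPA_INVAL_QUEUE_INDEX from the patch is one plausible kernel-side
counterpart):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed sentinel meaning "the control vq, whatever its index is". */
#define VDPA_QIDX_CTRLVQ UINT32_MAX

/* Parse the qidx argument of "vdpa dev vstats show": either a numeric
 * queue index or the magic keyword "ctrlvq". Returns 0 on success,
 * -1 on a malformed argument.
 */
static int vdpa_parse_qidx(const char *arg, uint32_t *index)
{
	unsigned long val;
	char *end;

	if (strcmp(arg, "ctrlvq") == 0) {
		*index = VDPA_QIDX_CTRLVQ;
		return 0;
	}

	val = strtoul(arg, &end, 0);
	if (end == arg || *end != '\0' || val >= VDPA_QIDX_CTRLVQ)
		return -1;

	*index = (uint32_t)val;
	return 0;
}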