Si-Wei Liu
2022-Aug-19 00:04 UTC
[PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
On 8/17/2022 11:52 PM, Zhu, Lingshan wrote:> > > On 8/18/2022 2:48 AM, Si-Wei Liu wrote: >> >> >> On 8/16/2022 11:23 PM, Zhu, Lingshan wrote: >>> >>> >>> On 8/17/2022 2:14 PM, Michael S. Tsirkin wrote: >>>> On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote: >>>>> >>>>> On 8/17/2022 6:48 AM, Si-Wei Liu wrote: >>>>> >>>>> >>>>> >>>>> ???? On 8/16/2022 1:29 AM, Zhu, Lingshan wrote: >>>>> >>>>> >>>>> >>>>> ???????? On 8/16/2022 3:41 PM, Si-Wei Liu wrote: >>>>> >>>>> ???????????? Hi Michael, >>>>> >>>>> ???????????? I just noticed this patch got pulled to linux-next >>>>> prematurely >>>>> ???????????? without getting consensus on code review, am not sure >>>>> why. Hope it >>>>> ???????????? was just an oversight. >>>>> >>>>> ???????????? Unfortunately this introduced functionality >>>>> regression to at least >>>>> ???????????? two cases so far as I see: >>>>> >>>>> ???????????? 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are >>>>> inadvertently >>>>> ???????????? exposed and displayed in "vdpa dev config show" >>>>> before feature >>>>> ???????????? negotiation is done. Noted the corresponding features >>>>> name shown in >>>>> ???????????? vdpa tool is called "negotiated_features" rather than >>>>> ???????????? "driver_features". I see in no way the intended >>>>> change of the patch >>>>> ???????????? should break this user level expectation regardless >>>>> of any spec >>>>> ???????????? requirement. Do you agree on this point? >>>>> >>>>> ???????? I will post a patch for iptour2, doing: >>>>> ???????? 1) if iprout2 does not get driver_features from the >>>>> kernel, then don't >>>>> ???????? show negotiated features in the command output >>>>> >>>>> ???? This won't work as the vdpa userspace tool won't know *when* >>>>> features are >>>>> ???? negotiated. There's no guarantee in the kernel to assume 0 >>>>> will be returned >>>>> ???? from vendor driver during negotiation. On the other hand, >>>>> with the supposed >>>>> ???? change, userspace can't tell if there's really none of >>>>> features negotiated, >>>>> ???? or the feature negotiation is over. Before the change the >>>>> userspace either >>>>> ???? gets all the attributes when feature negotiation is over, or >>>>> it gets >>>>> ???? nothing when it's ongoing, so there was a distinction.This >>>>> expectation of >>>>> ???? what "negotiated_features" represents is established from day >>>>> one, I see no >>>>> ???? reason the intended kernel change to show other attributes >>>>> should break >>>>> ???? userspace behavior and user's expectation. >>>>> >>>>> User space can only read valid *driver_features* after the >>>>> features negotiation >>>>> is done, *device_features* does not require the negotiation. >>>>> >>>>> If you want to prevent random values read from driver_features, >>>>> here I propose >>>>> a fix: only read driver_features when the negotiation is done, >>>>> this means to >>>>> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the >>>>> driver_features. >>>>> Sounds good? >>>>> >>>>> @MST, if this is OK, I can include this change in my next version >>>>> patch series. >>>>> >>>>> Thanks, >>>>> Zhu Lingshan >>>> Sorry I don't get it. Is there going to be a new version? Do you >>>> want me >>>> to revert this one and then apply a new one? It's ok if yes. >>> Not a new version, it is a new patch, though I still didn't get the >>> race condition, but I believe it >>> is reasonable to block reading the *driver_features* before >>> FEATURES_OK. >>> >>> So, I added code to check whether _FEATURES_OK is set: >>> >>> ?861???????? /* only read driver features after the feature >>> negotiation is done */ >>> ?862???????? status = vdev->config->get_status(vdev); >>> ?863???????? if (status & VIRTIO_CONFIG_S_FEATURES_OK) { >>> ?864???????????????? features_driver = >>> vdev->config->get_driver_features(vdev); >>> ?865???????????????? if (nla_put_u64_64bit(msg, >>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver, >>> ?866?????????????????????????????????????? VDPA_ATTR_PAD)) >>> ?867???????????????? return -EMSGSIZE; >>> ?868???????? } >>> >>> If this solution looks good, I will add this patch in my V2 series. >> This solves the 1st issue in my report, but without a fix for the 2nd >> issue ( vdpa_dev_net_config_fill and vdpa_set_features race) >> addressed I don't think it covers all incurred regressions. >> >> I've replied the way to reproduce the race. For me it's very obvious >> to see in my setup. > Though I still did not see any errors in my run, but I guess you mean > the race condition in set_features(), right?Yes.> > The spec says: The device MUST allow reading of any device-specific > configuration field before FEATURES_OK is set by the driver. > > So there is no need to check whether driver_features is set in > vdpa_get_config_unlocked(). We should remove the code checks > feature_valid and the code set_features to zero. This conflicts with > the spec. And so no race conditions.I don't disagree with the removal, you can try once more. This proposal had been attempted but rejected. -Siwei> > Thanks, > Zhu Lingshan >>> >>> > So what is the race? >>> You'll see the race if you keep 'vdpa dev config show ...' running >>> in a tight loop while launching a VM with the vDPA device under query. >>> >> -Siwei >> >>> >>> Thanks >>> Zhu Lingshan >>> >>>> >>>> >>>>> ???????? 2) process and decoding the device features. >>>>> >>>>> >>>>> ???????????? 2. There was also another implicit assumption that is >>>>> broken by >>>>> ???????????? this patch. There could be a vdpa tool query of >>>>> config via >>>>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races >>>>> ???????????? with the first vdpa_set_features() call from VMM e.g. >>>>> QEMU. Since >>>>> ???????????? the S_FEATURES_OK blocking condition is removed, if >>>>> the vdpa tool >>>>> ???????????? query occurs earlier than the first >>>>> set_driver_features() call from >>>>> ???????????? VMM, the following code will treat the guest as >>>>> legacy and then >>>>> ???????????? trigger an erroneous vdpa_set_features_unlocked(... , >>>>> 0) call to >>>>> ???????????? the vdpa driver: >>>>> >>>>> ???????????? ?374???????? /* >>>>> ???????????? ?375????????? * Config accesses aren't supposed to >>>>> trigger before >>>>> ???????????? features are set. >>>>> ???????????? ?376????????? * If it does happen we assume a legacy >>>>> guest. >>>>> ???????????? ?377????????? */ >>>>> ???????????? ?378???????? if (!vdev->features_valid) >>>>> ???????????? ?379 vdpa_set_features_unlocked(vdev, 0); >>>>> ???????????? ?380???????? ops->get_config(vdev, offset, buf, len); >>>>> >>>>> ???????????? Depending on vendor driver's implementation, L380 may >>>>> either return >>>>> ???????????? invalid config data (or invalid endianness if on BE) >>>>> or only config >>>>> ???????????? fields that are valid in legacy layout. What's more >>>>> severe is that, >>>>> ???????????? vdpa tool query in theory shouldn't affect feature >>>>> negotiation at >>>>> ???????????? all by making confusing calls to the device, but now >>>>> it is possible >>>>> ???????????? with the patch. Fixing this would require more >>>>> delicate work on the >>>>> ???????????? other paths involving the cf_lock reader/write >>>>> semaphore. >>>>> >>>>> ???????????? Not sure what you plan to do next, post the fixes for >>>>> both issues >>>>> ???????????? and get the community review? Or simply revert the >>>>> patch in >>>>> ???????????? question? Let us know. >>>>> >>>>> ???????? The spec says: >>>>> ???????? The device MUST allow reading of any device-specific >>>>> configuration >>>>> ???????? field before FEATURES_OK is set by >>>>> ???????? the driver. This includes fields which are conditional on >>>>> feature bits, >>>>> ???????? as long as those feature bits are offered >>>>> ???????? by the device. >>>>> >>>>> ???????? so whether FEATURES_OK should not block reading the >>>>> device config >>>>> ???????? space. vdpa_get_config_unlocked() will read the features, >>>>> I don't know >>>>> ???????? why it has a comment: >>>>> ???????? ??????? /* >>>>> ???????? ???????? * Config accesses aren't supposed to trigger >>>>> before features >>>>> ???????? are set. >>>>> ???????? ???????? * If it does happen we assume a legacy guest. >>>>> ???????? ???????? */ >>>>> >>>>> ???????? This conflicts with the spec. >>>>> >>>>> ???????? vdpa_get_config_unlocked() checks vdev->features_valid, >>>>> if not valid, >>>>> ???????? it will set the drivers_features 0, I think this intends >>>>> to prevent >>>>> ???????? reading random driver_features. This function does not >>>>> hold any locks, >>>>> ???????? and didn't change anything. >>>>> >>>>> ???????? So what is the race? >>>>> ??? ???? You'll see the race if you keep 'vdpa dev config show >>>>> ...' running in a >>>>> ???? tight loop while launching a VM with the vDPA device under >>>>> query. >>>>> >>>>> ???? -Siwei >>>>> >>>>> >>>>> >>>>> ??????? ???????? Thanks >>>>> >>>>> >>>>> ???????????? Thanks, >>>>> ???????????? -Siwei >>>>> >>>>> >>>>> ???????????? On 8/12/2022 3:44 AM, Zhu Lingshan wrote: >>>>> >>>>> ???????????????? Users may want to query the config space of a >>>>> vDPA device, >>>>> ???????????????? to choose a appropriate one for a certain guest. >>>>> This means the >>>>> ???????????????? users need to read the config space before >>>>> FEATURES_OK, and >>>>> ???????????????? the existence of config space contents does not >>>>> depend on >>>>> ???????????????? FEATURES_OK. >>>>> >>>>> ???????????????? The spec says: >>>>> ???????????????? The device MUST allow reading of any device-specific >>>>> ???????????????? configuration >>>>> ???????????????? field before FEATURES_OK is set by the driver. >>>>> This includes >>>>> ???????????????? fields which are conditional on feature bits, as >>>>> long as those >>>>> ???????????????? feature bits are offered by the device. >>>>> >>>>> ???????????????? Signed-off-by: Zhu Lingshan <lingshan.zhu at intel.com> >>>>> ???????????????? --- >>>>> ???????????????? ? drivers/vdpa/vdpa.c | 8 -------- >>>>> ???????????????? ? 1 file changed, 8 deletions(-) >>>>> >>>>> ???????????????? diff --git a/drivers/vdpa/vdpa.c >>>>> b/drivers/vdpa/vdpa.c >>>>> ???????????????? index 6eb3d972d802..bf312d9c59ab 100644 >>>>> ???????????????? --- a/drivers/vdpa/vdpa.c >>>>> ???????????????? +++ b/drivers/vdpa/vdpa.c >>>>> ???????????????? @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct >>>>> vdpa_device >>>>> ???????????????? *vdev, struct sk_buff *msg, u32 portid, >>>>> ???????????????? ? { >>>>> ???????????????? ????? u32 device_id; >>>>> ???????????????? ????? void *hdr; >>>>> ???????????????? -??? u8 status; >>>>> ???????????????? ????? int err; >>>>> down_read(&vdev->cf_lock); >>>>> ???????????????? -??? status = vdev->config->get_status(vdev); >>>>> ???????????????? -??? if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { >>>>> ???????????????? -??????? NL_SET_ERR_MSG_MOD(extack, "Features >>>>> negotiation not >>>>> ???????????????? completed"); >>>>> ???????????????? -??????? err = -EAGAIN; >>>>> ???????????????? -??????? goto out; >>>>> ???????????????? -??? } >>>>> ???????????????? - >>>>> ???????????????? ????? hdr = genlmsg_put(msg, portid, seq, >>>>> &vdpa_nl_family, >>>>> ???????????????? flags, >>>>> VDPA_CMD_DEV_CONFIG_GET); >>>>> ???????????????? ????? if (!hdr) { >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20220818/11dc6ba9/attachment-0001.html>