Si-Wei Liu
2022-Aug-17 18:48 UTC
[PATCH V5 4/6] vDPA: !FEATURES_OK should not block querying device config space
On 8/16/2022 11:23 PM, Zhu, Lingshan wrote:> > > On 8/17/2022 2:14 PM, Michael S. Tsirkin wrote: >> On Wed, Aug 17, 2022 at 10:11:36AM +0800, Zhu, Lingshan wrote: >>> >>> On 8/17/2022 6:48 AM, Si-Wei Liu wrote: >>> >>> >>> >>> ???? On 8/16/2022 1:29 AM, Zhu, Lingshan wrote: >>> >>> >>> >>> ???????? On 8/16/2022 3:41 PM, Si-Wei Liu wrote: >>> >>> ???????????? Hi Michael, >>> >>> ???????????? I just noticed this patch got pulled to linux-next >>> prematurely >>> ???????????? without getting consensus on code review, am not sure >>> why. Hope it >>> ???????????? was just an oversight. >>> >>> ???????????? Unfortunately this introduced functionality regression >>> to at least >>> ???????????? two cases so far as I see: >>> >>> ???????????? 1. (bogus) VDPA_ATTR_DEV_NEGOTIATED_FEATURES are >>> inadvertently >>> ???????????? exposed and displayed in "vdpa dev config show" before >>> feature >>> ???????????? negotiation is done. Noted the corresponding features >>> name shown in >>> ???????????? vdpa tool is called "negotiated_features" rather than >>> ???????????? "driver_features". I see in no way the intended change >>> of the patch >>> ???????????? should break this user level expectation regardless of >>> any spec >>> ???????????? requirement. Do you agree on this point? >>> >>> ???????? I will post a patch for iptour2, doing: >>> ???????? 1) if iprout2 does not get driver_features from the kernel, >>> then don't >>> ???????? show negotiated features in the command output >>> >>> ???? This won't work as the vdpa userspace tool won't know *when* >>> features are >>> ???? negotiated. There's no guarantee in the kernel to assume 0 will >>> be returned >>> ???? from vendor driver during negotiation. On the other hand, with >>> the supposed >>> ???? change, userspace can't tell if there's really none of features >>> negotiated, >>> ???? or the feature negotiation is over. Before the change the >>> userspace either >>> ???? gets all the attributes when feature negotiation is over, or it >>> gets >>> ???? nothing when it's ongoing, so there was a distinction.This >>> expectation of >>> ???? what "negotiated_features" represents is established from day >>> one, I see no >>> ???? reason the intended kernel change to show other attributes >>> should break >>> ???? userspace behavior and user's expectation. >>> >>> User space can only read valid *driver_features* after the features >>> negotiation >>> is done, *device_features* does not require the negotiation. >>> >>> If you want to prevent random values read from driver_features, here >>> I propose >>> a fix: only read driver_features when the negotiation is done, this >>> means to >>> check (status & VIRTIO_CONFIG_S_FEATURES_OK) before reading the >>> driver_features. >>> Sounds good? >>> >>> @MST, if this is OK, I can include this change in my next version >>> patch series. >>> >>> Thanks, >>> Zhu Lingshan >> Sorry I don't get it. Is there going to be a new version? Do you want me >> to revert this one and then apply a new one? It's ok if yes. > Not a new version, it is a new patch, though I still didn't get the > race condition, but I believe it > is reasonable to block reading the *driver_features* before FEATURES_OK. > > So, I added code to check whether _FEATURES_OK is set: > > ?861???????? /* only read driver features after the feature > negotiation is done */ > ?862???????? status = vdev->config->get_status(vdev); > ?863???????? if (status & VIRTIO_CONFIG_S_FEATURES_OK) { > ?864???????????????? features_driver = > vdev->config->get_driver_features(vdev); > ?865???????????????? if (nla_put_u64_64bit(msg, > VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver, > ?866?????????????????????????????????????? VDPA_ATTR_PAD)) > ?867???????????????? return -EMSGSIZE; > ?868???????? } > > If this solution looks good, I will add this patch in my V2 series.This solves the 1st issue in my report, but without a fix for the 2nd issue ( vdpa_dev_net_config_fill and vdpa_set_features race) addressed I don't think it covers all incurred regressions. I've replied the way to reproduce the race. For me it's very obvious to see in my setup.> > > So what is the race? > You'll see the race if you keep 'vdpa dev config show ...' running in > a tight loop while launching a VM with the vDPA device under query. >-Siwei> > Thanks > Zhu Lingshan > >> >> >>> ???????? 2) process and decoding the device features. >>> >>> >>> ???????????? 2. There was also another implicit assumption that is >>> broken by >>> ???????????? this patch. There could be a vdpa tool query of config via >>> vdpa_dev_net_config_fill()->vdpa_get_config_unlocked() that races >>> ???????????? with the first vdpa_set_features() call from VMM e.g. >>> QEMU. Since >>> ???????????? the S_FEATURES_OK blocking condition is removed, if the >>> vdpa tool >>> ???????????? query occurs earlier than the first >>> set_driver_features() call from >>> ???????????? VMM, the following code will treat the guest as legacy >>> and then >>> ???????????? trigger an erroneous vdpa_set_features_unlocked(... , >>> 0) call to >>> ???????????? the vdpa driver: >>> >>> ???????????? ?374???????? /* >>> ???????????? ?375????????? * Config accesses aren't supposed to >>> trigger before >>> ???????????? features are set. >>> ???????????? ?376????????? * If it does happen we assume a legacy >>> guest. >>> ???????????? ?377????????? */ >>> ???????????? ?378???????? if (!vdev->features_valid) >>> ???????????? ?379 vdpa_set_features_unlocked(vdev, 0); >>> ???????????? ?380???????? ops->get_config(vdev, offset, buf, len); >>> >>> ???????????? Depending on vendor driver's implementation, L380 may >>> either return >>> ???????????? invalid config data (or invalid endianness if on BE) or >>> only config >>> ???????????? fields that are valid in legacy layout. What's more >>> severe is that, >>> ???????????? vdpa tool query in theory shouldn't affect feature >>> negotiation at >>> ???????????? all by making confusing calls to the device, but now it >>> is possible >>> ???????????? with the patch. Fixing this would require more delicate >>> work on the >>> ???????????? other paths involving the cf_lock reader/write semaphore. >>> >>> ???????????? Not sure what you plan to do next, post the fixes for >>> both issues >>> ???????????? and get the community review? Or simply revert the >>> patch in >>> ???????????? question? Let us know. >>> >>> ???????? The spec says: >>> ???????? The device MUST allow reading of any device-specific >>> configuration >>> ???????? field before FEATURES_OK is set by >>> ???????? the driver. This includes fields which are conditional on >>> feature bits, >>> ???????? as long as those feature bits are offered >>> ???????? by the device. >>> >>> ???????? so whether FEATURES_OK should not block reading the device >>> config >>> ???????? space. vdpa_get_config_unlocked() will read the features, I >>> don't know >>> ???????? why it has a comment: >>> ???????? ??????? /* >>> ???????? ???????? * Config accesses aren't supposed to trigger >>> before features >>> ???????? are set. >>> ???????? ???????? * If it does happen we assume a legacy guest. >>> ???????? ???????? */ >>> >>> ???????? This conflicts with the spec. >>> >>> ???????? vdpa_get_config_unlocked() checks vdev->features_valid, if >>> not valid, >>> ???????? it will set the drivers_features 0, I think this intends to >>> prevent >>> ???????? reading random driver_features. This function does not hold >>> any locks, >>> ???????? and didn't change anything. >>> >>> ???????? So what is the race? >>> ??? ???? You'll see the race if you keep 'vdpa dev config show ...' >>> running in a >>> ???? tight loop while launching a VM with the vDPA device under query. >>> >>> ???? -Siwei >>> >>> >>> >>> ??????? ???????? Thanks >>> >>> >>> ???????????? Thanks, >>> ???????????? -Siwei >>> >>> >>> ???????????? On 8/12/2022 3:44 AM, Zhu Lingshan wrote: >>> >>> ???????????????? Users may want to query the config space of a vDPA >>> device, >>> ???????????????? to choose a appropriate one for a certain guest. >>> This means the >>> ???????????????? users need to read the config space before >>> FEATURES_OK, and >>> ???????????????? the existence of config space contents does not >>> depend on >>> ???????????????? FEATURES_OK. >>> >>> ???????????????? The spec says: >>> ???????????????? The device MUST allow reading of any device-specific >>> ???????????????? configuration >>> ???????????????? field before FEATURES_OK is set by the driver. This >>> includes >>> ???????????????? fields which are conditional on feature bits, as >>> long as those >>> ???????????????? feature bits are offered by the device. >>> >>> ???????????????? Signed-off-by: Zhu Lingshan <lingshan.zhu at intel.com> >>> ???????????????? --- >>> ???????????????? ? drivers/vdpa/vdpa.c | 8 -------- >>> ???????????????? ? 1 file changed, 8 deletions(-) >>> >>> ???????????????? diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c >>> ???????????????? index 6eb3d972d802..bf312d9c59ab 100644 >>> ???????????????? --- a/drivers/vdpa/vdpa.c >>> ???????????????? +++ b/drivers/vdpa/vdpa.c >>> ???????????????? @@ -855,17 +855,9 @@ vdpa_dev_config_fill(struct >>> vdpa_device >>> ???????????????? *vdev, struct sk_buff *msg, u32 portid, >>> ???????????????? ? { >>> ???????????????? ????? u32 device_id; >>> ???????????????? ????? void *hdr; >>> ???????????????? -??? u8 status; >>> ???????????????? ????? int err; >>> ???????????????? ? ????? down_read(&vdev->cf_lock); >>> ???????????????? -??? status = vdev->config->get_status(vdev); >>> ???????????????? -??? if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { >>> ???????????????? -??????? NL_SET_ERR_MSG_MOD(extack, "Features >>> negotiation not >>> ???????????????? completed"); >>> ???????????????? -??????? err = -EAGAIN; >>> ???????????????? -??????? goto out; >>> ???????????????? -??? } >>> ???????????????? - >>> ???????????????? ????? hdr = genlmsg_put(msg, portid, seq, >>> &vdpa_nl_family, >>> ???????????????? flags, >>> ???????????????? ??????????????? VDPA_CMD_DEV_CONFIG_GET); >>> ???????????????? ????? if (!hdr) { >>> >>> >>> >>> >>> >>> >>> >>> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20220817/09682660/attachment-0001.html>