Jason Wang
2021-May-31 04:38 UTC
[PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
On 2021/5/31 at 12:27, Yongji Xie wrote:
> On Fri, May 28, 2021 at 10:31 AM Jason Wang <jasowang at redhat.com> wrote:
>>
>> On 2021/5/27 at 9:17, Yongji Xie wrote:
>>> On Thu, May 27, 2021 at 4:41 PM Jason Wang <jasowang at redhat.com> wrote:
>>>> On 2021/5/27 at 3:34, Yongji Xie wrote:
>>>>> On Thu, May 27, 2021 at 1:40 PM Jason Wang <jasowang at redhat.com> wrote:
>>>>>> On 2021/5/27 at 1:08, Yongji Xie wrote:
>>>>>>> On Thu, May 27, 2021 at 1:00 PM Jason Wang <jasowang at redhat.com> wrote:
>>>>>>>> On 2021/5/27 at 12:57, Yongji Xie wrote:
>>>>>>>>> On Thu, May 27, 2021 at 12:13 PM Jason Wang <jasowang at redhat.com> wrote:
>>>>>>>>>> On 2021/5/17 at 5:55, Xie Yongji wrote:
>>>>>>>>>>> +
>>>>>>>>>>> +static int vduse_dev_msg_sync(struct vduse_dev *dev,
>>>>>>>>>>> +			      struct vduse_dev_msg *msg)
>>>>>>>>>>> +{
>>>>>>>>>>> +	init_waitqueue_head(&msg->waitq);
>>>>>>>>>>> +	spin_lock(&dev->msg_lock);
>>>>>>>>>>> +	vduse_enqueue_msg(&dev->send_list, msg);
>>>>>>>>>>> +	wake_up(&dev->waitq);
>>>>>>>>>>> +	spin_unlock(&dev->msg_lock);
>>>>>>>>>>> +	wait_event_killable(msg->waitq, msg->completed);
>>>>>>>>>> What happens if a (malicious) userspace never gives a response?
>>>>>>>>>>
>>>>>>>>>> It looks like a DoS. If yes, we need to consider a way to fix that.
>>>>>>>>>>
>>>>>>>>> How about using wait_event_killable_timeout() instead?
>>>>>>>> Probably, and then we need to choose a suitable timeout and, more
>>>>>>>> importantly, need to report the failure to virtio.
>>>>>>>>
>>>>>>> Makes sense to me. But it looks like some
>>>>>>> vdpa_config_ops/virtio_config_ops such as set_status() don't have a
>>>>>>> return value. For now I add a WARN_ON() for the failure. Do you mean we
>>>>>>> need to change the virtio core to handle the failure?
>>>>>> Maybe, but I'm not sure how hard it would be to do that.
>>>>>>
>>>>> We would need to change all virtio device drivers in this way.
>>>> Probably.
>>>>
>>>>
>>>>>> We have NEEDS_RESET but it looks like we don't implement it.
>>>>>>
>>>>> Could it handle the failure of get_feature() and get/set_config()?
>>>> It looks like it can't:
>>>>
>>>> "
>>>>
>>>> The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
>>>> that a reset is needed. If DRIVER_OK is set, after it sets
>>>> DEVICE_NEEDS_RESET, the device MUST send a device configuration change
>>>> notification to the driver.
>>>>
>>>> "
>>>>
>>>> This seems to imply that NEEDS_RESET may only work after the device is
>>>> probed. But in the current design, even reset() is not reliable.
>>>>
>>>>
>>>>>> Or a rough idea is that we may need to relax things so the kernel is
>>>>>> coupled loosely with userspace. E.g. the device (control path) is
>>>>>> implemented in the kernel but the datapath is implemented in userspace,
>>>>>> like TUN/TAP.
>>>>>>
>>>>> I think it can work for most cases. One problem is that set_config()
>>>>> might change the behavior of the data path at runtime, e.g.
>>>>> virtnet_set_mac_address() in the virtio-net driver and
>>>>> cache_type_store() in the virtio-blk driver. Not sure if this path is
>>>>> able to return before the datapath is aware of this change.
>>>> Good point.
>>>>
>>>> But set_config() should be rare:
>>>>
>>>> E.g. in the case of virtio-net with VERSION_1, the config space is read
>>>> only, and it is set via the control vq.
>>>>
>>>> For block, we can
>>>>
>>>> 1) start without WCE or
>>>> 2) add a config change notification to userspace or
>>> I prefer this way. And I think we also need to do similar things for
>>> set/get_vq_state().
>>
>> Yes, I agree.
>>
> Hi Jason,
>
> Now I'm working on this. But I found the config change notification
> must be synchronous in the virtio-blk case, which means the kernel
> still needs to wait for the response from userspace in set_config().
> Otherwise, some I/Os might still run the old way after we change the
> cache_type in sysfs.
>
> The simple ways to solve this problem are:
>
> 1. Only support read-only config space, disable WCE as you suggested
> 2. Add a return value to set_config() and handle the failure only in
>    the virtio-blk driver
> 3. Print some warnings after a timeout since it only affects the
>    dataplane, which is under userspace's control
>
> Any suggestions?

Let's go without WCE first and make VDUSE work first. We can then think
of a solution for WCE on top.

Thanks

>
> Thanks,
> Yongji
>
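
The bounded-wait idea discussed in this thread (wait for the userspace reply with a timeout, then report the failure to the caller instead of blocking forever) can be illustrated with a small userspace analogue. This is only a sketch under stated assumptions, not VDUSE or kernel code: it uses POSIX poll() on a pipe in place of wait_event_killable_timeout(), and the names demo_wait_reply() and demo_set_config() are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for a one-byte reply on
 * reply_fd. Returns 0 if a reply arrived, -ETIMEDOUT if the peer stayed
 * silent, or another negative errno value on error.
 */
int demo_wait_reply(int reply_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = reply_fd, .events = POLLIN };
	char ack;
	int rc;

	/* A negative timeout would make poll() block forever. */
	assert(timeout_ms >= 0);

	rc = poll(&pfd, 1, timeout_ms);
	if (rc == 0)
		return -ETIMEDOUT;	/* peer never answered in time */
	if (rc < 0)
		return -errno;		/* includes EINTR, i.e. interrupted */

	/* Consume the reply byte so the next wait starts clean. */
	if (read(reply_fd, &ack, 1) != 1)
		return -EIO;
	return 0;
}

/*
 * Hypothetical set_config-style caller that returns the failure to its
 * own caller rather than being a void function that can only warn.
 */
int demo_set_config(int reply_fd, int timeout_ms)
{
	int rc = demo_wait_reply(reply_fd, timeout_ms);

	if (rc)
		fprintf(stderr, "set_config: no reply (%d)\n", rc);
	return rc;
}
```

With a pipe as the reply channel, demo_set_config(fds[0], 20) returns -ETIMEDOUT while nothing has been written to fds[1], and 0 once a byte has been written. Choosing the timeout value and deciding what the caller does with the error are exactly the open questions raised above.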