Jason Wang
2021-May-28 06:38 UTC
[PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
On 2021/5/28 11:54, Yongji Xie wrote:
> On Fri, May 28, 2021 at 9:33 AM Jason Wang <jasowang at redhat.com> wrote:
>>
>> On 2021/5/27 6:14, Yongji Xie wrote:
>>> On Thu, May 27, 2021 at 4:43 PM Jason Wang <jasowang at redhat.com> wrote:
>>>> On 2021/5/27 4:41, Jason Wang wrote:
>>>>> On 2021/5/27 3:34, Yongji Xie wrote:
>>>>>> On Thu, May 27, 2021 at 1:40 PM Jason Wang <jasowang at redhat.com> wrote:
>>>>>>> On 2021/5/27 1:08, Yongji Xie wrote:
>>>>>>>> On Thu, May 27, 2021 at 1:00 PM Jason Wang <jasowang at redhat.com>
>>>>>>>> wrote:
>>>>>>>>> On 2021/5/27 12:57, Yongji Xie wrote:
>>>>>>>>>> On Thu, May 27, 2021 at 12:13 PM Jason Wang <jasowang at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> On 2021/5/17 5:55, Xie Yongji wrote:
>>>>>>>>>>>> +
>>>>>>>>>>>> +static int vduse_dev_msg_sync(struct vduse_dev *dev,
>>>>>>>>>>>> +                              struct vduse_dev_msg *msg)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +    init_waitqueue_head(&msg->waitq);
>>>>>>>>>>>> +    spin_lock(&dev->msg_lock);
>>>>>>>>>>>> +    vduse_enqueue_msg(&dev->send_list, msg);
>>>>>>>>>>>> +    wake_up(&dev->waitq);
>>>>>>>>>>>> +    spin_unlock(&dev->msg_lock);
>>>>>>>>>>>> +    wait_event_killable(msg->waitq, msg->completed);
>>>>>>>>>>> What happens if the userspace (malicious) doesn't give a response
>>>>>>>>>>> forever?
>>>>>>>>>>>
>>>>>>>>>>> It looks like a DOS. If yes, we need to consider a way to fix that.
>>>>>>>>>>>
>>>>>>>>>> How about using wait_event_killable_timeout() instead?
>>>>>>>>> Probably, and then we need choose a suitable timeout and more
>>>>>>>>> important,
>>>>>>>>> need to report the failure to virtio.
>>>>>>>>>
>>>>>>>> Makes sense to me. But it looks like some
>>>>>>>> vdpa_config_ops/virtio_config_ops such as set_status() didn't have a
>>>>>>>> return value. Now I add a WARN_ON() for the failure. Do you mean we
>>>>>>>> need to add some change for virtio core to handle the failure?
>>>>>>> Maybe, but I'm not sure how hard we can do that.
>>>>>>>
>>>>>> We need to change all virtio device drivers in this way.
>>>>> Probably.
>>>>>
>>>>>
>>>>>>> We had NEEDS_RESET but it looks we don't implement it.
>>>>>>>
>>>>>> Could it handle the failure of get_feature() and get/set_config()?
>>>>> Looks not:
>>>>>
>>>>> "
>>>>>
>>>>> The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
>>>>> that a reset is needed. If DRIVER_OK is set, after it sets
>>>>> DEVICE_NEEDS_RESET, the device MUST send a device configuration change
>>>>> notification to the driver.
>>>>>
>>>>> "
>>>>>
>>>>> This looks implies that NEEDS_RESET may only work after device is
>>>>> probed. But in the current design, even the reset() is not reliable.
>>>>>
>>>>>
>>>>>>> Or a rough idea is that maybe need some relaxing to be coupled loosely
>>>>>>> with userspace. E.g the device (control path) is implemented in the
>>>>>>> kernel but the datapath is implemented in the userspace like TUN/TAP.
>>>>>>>
>>>>>> I think it can work for most cases. One problem is that the set_config
>>>>>> might change the behavior of the data path at runtime, e.g.
>>>>>> virtnet_set_mac_address() in the virtio-net driver and
>>>>>> cache_type_store() in the virtio-blk driver. Not sure if this path is
>>>>>> able to return before the datapath is aware of this change.
>>>>> Good point.
>>>>>
>>>>> But set_config() should be rare:
>>>>>
>>>>> E.g in the case of virtio-net with VERSION_1, config space is read
>>>>> only, and it was set via control vq.
>>>>>
>>>>> For block, we can
>>>>>
>>>>> 1) start from without WCE or
>>>>> 2) we add a config change notification to userspace or
>>>>> 3) extend the spec to use vq instead of config space
>>>>>
>>>>> Thanks
>>>> Another thing if we want to go this way:
>>>>
>>>> We need find a way to terminate the data path from the kernel side, to
>>>> implement to reset semantic.
>>>>
>>> Do you mean terminate the data path in vdpa_reset().
>>
>> Yes.
>>
>>
>>> Is it ok to just
>>> notify userspace to stop data path asynchronously?
>>
>> For well-behaved userspace, yes but no for buggy or malicious ones.
>>
> But the buggy or malicious daemons can't do anything if my
> understanding is correct.

You're right. I originally thought there can still have bouncing. But
consider we don't do that during fault. It should be safe.

>
>> I had an idea, how about terminate IOTLB in this case? Then we're in
>> fact turn datapath off.
>>
> Sorry, I didn't get your point here. What do you mean by terminating
> IOTLB?

I meant terminate the bouncing but it looks safe after a second thought :)

Thanks

> Remove iotlb mapping? But userspace can still access the mapped
> region.
>
> Thanks,
> Yongji
>
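
For reference, the DEVICE_NEEDS_RESET behaviour quoted from the spec earlier in the thread could be sketched in plain C roughly as follows. This is only a userspace model under stated assumptions: struct demo_dev, demo_dev_mark_broken() and demo_dev_request() are hypothetical names, not part of VDUSE or the virtio core, and the replied_in_time flag merely stands in for the outcome of a timed wait such as the wait_event_killable_timeout() suggested above. Only the two status bit values are taken from the virtio definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device status bits, with the values used by the virtio spec. */
#define VIRTIO_CONFIG_S_DRIVER_OK   0x04
#define VIRTIO_CONFIG_S_NEEDS_RESET 0x40

/* Hypothetical device model; not the VDUSE structures from the patch. */
struct demo_dev {
    uint8_t status;                    /* virtio device status byte */
    unsigned int config_notifications; /* config change notifications sent */
};

/*
 * Hypothetical helper: enter the error state described in the quoted
 * spec text. NEEDS_RESET is always set; a configuration change
 * notification is only sent if the driver had already set DRIVER_OK.
 */
static void demo_dev_mark_broken(struct demo_dev *dev)
{
    assert(dev != NULL);

    if (dev->status & VIRTIO_CONFIG_S_NEEDS_RESET)
        return; /* already in the error state, do not notify again */

    dev->status |= VIRTIO_CONFIG_S_NEEDS_RESET;
    if (dev->status & VIRTIO_CONFIG_S_DRIVER_OK)
        dev->config_notifications++;
}

/*
 * Hypothetical request path: replied_in_time models whether the
 * userspace daemon answered before the timeout expired.
 * Returns 0 on success and -1 when the reply never arrived.
 */
static int demo_dev_request(struct demo_dev *dev, bool replied_in_time)
{
    if (replied_in_time)
        return 0;

    demo_dev_mark_broken(dev);
    return -1;
}
```

In this model, a missed reply after DRIVER_OK sets NEEDS_RESET and records one configuration change notification, while a missed reply before DRIVER_OK only changes the status byte. That second case mirrors the concern raised in the thread that NEEDS_RESET may only be useful once the device has been probed.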