Jason Wang
2021-Apr-16 05:39 UTC
[PATCH v6 10/10] Documentation: Add documentation for VDUSE
On 2021/4/16 11:19 AM, Yongji Xie wrote:
> On Fri, Apr 16, 2021 at 10:24 AM Jason Wang <jasowang at redhat.com> wrote:
>>
>> On 2021/4/15 10:38 PM, Stefan Hajnoczi wrote:
>>> On Thu, Apr 15, 2021 at 04:36:35PM +0800, Jason Wang wrote:
>>>> On 2021/4/15 3:19 PM, Stefan Hajnoczi wrote:
>>>>> On Thu, Apr 15, 2021 at 01:38:37PM +0800, Yongji Xie wrote:
>>>>>> On Wed, Apr 14, 2021 at 10:15 PM Stefan Hajnoczi <stefanha at redhat.com> wrote:
>>>>>>> On Wed, Mar 31, 2021 at 04:05:19PM +0800, Xie Yongji wrote:
>>>>>>>> VDUSE (vDPA Device in Userspace) is a framework to support
>>>>>>>> implementing software-emulated vDPA devices in userspace. This
>>>>>>>> document is intended to clarify the VDUSE design and usage.
>>>>>>>>
>>>>>>>> Signed-off-by: Xie Yongji <xieyongji at bytedance.com>
>>>>>>>> ---
>>>>>>>> Documentation/userspace-api/index.rst | 1 +
>>>>>>>> Documentation/userspace-api/vduse.rst | 212 ++++++++++++++++++++++++++++++++++
>>>>>>>> 2 files changed, 213 insertions(+)
>>>>>>>> create mode 100644 Documentation/userspace-api/vduse.rst
>>>>>>> Just looking over the documentation briefly (I haven't studied the code
>>>>>>> yet)...
>>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>>>> +How VDUSE works
>>>>>>>> +------------
>>>>>>>> +Each userspace vDPA device is created by the VDUSE_CREATE_DEV ioctl on
>>>>>>>> +the character device (/dev/vduse/control). Then a device file with the
>>>>>>>> +specified name (/dev/vduse/$NAME) will appear, which can be used to
>>>>>>>> +implement the userspace vDPA device's control path and data path.
>>>>>>> These steps are taken after sending the VDPA_CMD_DEV_NEW netlink
>>>>>>> message? (Please consider reordering the documentation to make it clear
>>>>>>> what the sequence of steps are.)
>>>>>>>
>>>>>> No, VDUSE devices should be created before sending the
>>>>>> VDPA_CMD_DEV_NEW netlink messages which might produce I/Os to VDUSE.
>>>>> I see. Please include an overview of the steps before going into detail.
>>>>> Something like:
>>>>>
>>>>> VDUSE devices are started as follows:
>>>>>
>>>>> 1. Create a new VDUSE instance with ioctl(VDUSE_CREATE_DEV) on
>>>>>    /dev/vduse/control.
>>>>>
>>>>> 2. Begin processing VDUSE messages from /dev/vduse/$NAME. The first
>>>>>    messages will arrive while attaching the VDUSE instance to vDPA.
>>>>>
>>>>> 3. Send the VDPA_CMD_DEV_NEW netlink message to attach the VDUSE
>>>>>    instance to vDPA.
>>>>>
>>>>> VDUSE devices are stopped as follows:
>>>>>
>>>>> ...
>>>>>
>>>>>>>> + static int netlink_add_vduse(const char *name, int device_id)
>>>>>>>> + {
>>>>>>>> +         struct nl_sock *nlsock;
>>>>>>>> +         struct nl_msg *msg;
>>>>>>>> +         int famid;
>>>>>>>> +
>>>>>>>> +         nlsock = nl_socket_alloc();
>>>>>>>> +         if (!nlsock)
>>>>>>>> +                 return -ENOMEM;
>>>>>>>> +
>>>>>>>> +         if (genl_connect(nlsock))
>>>>>>>> +                 goto free_sock;
>>>>>>>> +
>>>>>>>> +         famid = genl_ctrl_resolve(nlsock, VDPA_GENL_NAME);
>>>>>>>> +         if (famid < 0)
>>>>>>>> +                 goto close_sock;
>>>>>>>> +
>>>>>>>> +         msg = nlmsg_alloc();
>>>>>>>> +         if (!msg)
>>>>>>>> +                 goto close_sock;
>>>>>>>> +
>>>>>>>> +         if (!genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, famid, 0, 0,
>>>>>>>> +                          VDPA_CMD_DEV_NEW, 0))
>>>>>>>> +                 goto nla_put_failure;
>>>>>>>> +
>>>>>>>> +         NLA_PUT_STRING(msg, VDPA_ATTR_DEV_NAME, name);
>>>>>>>> +         NLA_PUT_STRING(msg, VDPA_ATTR_MGMTDEV_DEV_NAME, "vduse");
>>>>>>>> +         NLA_PUT_U32(msg, VDPA_ATTR_DEV_ID, device_id);
>>>>>>> What are the permission/capability requirements for VDUSE?
>>>>>>>
>>>>>> Now I think we need privileged permission (root user). Because
>>>>>> userspace daemon is able to access avail vring, used vring, descriptor
>>>>>> table in kernel driver directly.
>>>>> Please state this explicitly at the start of the document. Existing
>>>>> interfaces like FUSE are designed to avoid trusting userspace.
>>>> There are some subtle differences here. VDUSE presents a device to the
>>>> kernel, which means the IOMMU is probably the only thing that can stop a
>>>> malicious device.
>>>>
>>>>
>>>>> Therefore
>>>>> people might think the same is the case here. It's critical that people
>>>>> are aware of this before deploying VDUSE with virtio-vdpa.
>>>>>
>>>>> We should probably pause here and think about whether it's possible to
>>>>> avoid trusting userspace. Even if it takes some effort and costs some
>>>>> performance it would probably be worthwhile.
>>>> Since the bounce buffer is used, the only attack surface is the coherent
>>>> area. If we want to enforce stronger isolation, we need to use a shadow
>>>> virtqueue (which I proposed in an earlier version) in this case. But I'm
>>>> not sure it's worth doing that.
>>> The security situation needs to be clear before merging this feature.
>>
>> +1
>>
>>
>>> I think the IOMMU and vring can be made secure. What is more concerning
>>> is the kernel code that runs on top: VIRTIO device drivers, network
>>> stack, file systems, etc. They trust devices to an extent.
>>>
>>> Since virtio-vdpa is a big reason for doing VDUSE in the first place I
>>> don't think it makes sense to disable virtio-vdpa with VDUSE. A solution
>>> is needed.
>>
>> Yes, so the case of VDUSE is similar to the case of e.g. SEV.
>>
>> Neither case trusts the device, and both use some kind of software IOTLB.
>>
>> That means we need to protect at both the IOTLB and the virtio drivers.
>>
>> Let me post patches for virtio first.
>>
> Looking forward to your patches.
>
> Thanks.
> Yongji
>

Fortunately, the packed ring already does this, since the descriptor table
is expected to be rewritten by the device. I just need to convert the split
ring.

Thanks
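
To make the idea concrete, here is a minimal, self-contained sketch of one
way a driver could avoid trusting a descriptor table that the device may
rewrite: it keeps a private copy of every field it needs later and never
reads those fields back from shared memory. This is only an illustration of
the principle, not the actual virtio_ring code or the patches mentioned
above. All names (shared_desc, desc_extra, ring_fill, ring_reclaim,
RING_SIZE, DESC_F_NEXT) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE   4   /* hypothetical, tiny ring for illustration */
#define DESC_F_NEXT 1   /* "chain continues" flag, hypothetical value */

/* Descriptor as it sits in memory shared with the untrusted device. */
struct shared_desc {
        uint64_t addr;
        uint32_t len;
        uint16_t flags;
        uint16_t next;
};

/* Driver-private copy of the same fields; the device cannot write it. */
struct desc_extra {
        uint64_t addr;
        uint32_t len;
        uint16_t flags;
        uint16_t next;
};

struct ring {
        struct shared_desc desc[RING_SIZE];   /* device-writable */
        struct desc_extra  extra[RING_SIZE];  /* driver-only */
};

/* Publish one descriptor and remember exactly what was written. */
static void ring_fill(struct ring *r, uint16_t i, uint64_t addr,
                      uint32_t len, uint16_t flags, uint16_t next)
{
        assert(i < RING_SIZE);
        r->desc[i]  = (struct shared_desc){ addr, len, flags, next };
        r->extra[i] = (struct desc_extra){ addr, len, flags, next };
}

/*
 * Walk a completed chain using only the private state. Returns the
 * number of descriptors in the chain, or -1 if the head index is out
 * of range or the chain is longer than the ring (a loop).
 */
static int ring_reclaim(const struct ring *r, uint16_t head)
{
        uint16_t i = head;
        int n = 0;

        for (;;) {
                if (i >= RING_SIZE || n >= RING_SIZE)
                        return -1;
                n++;
                if (!(r->extra[i].flags & DESC_F_NEXT))
                        return n;
                i = r->extra[i].next;
        }
}
```

With this layout, a device that overwrites desc[].next or desc[].flags
after the driver has published a chain cannot change how the driver walks
or frees that chain, because ring_reclaim() never reads the shared copy.
The head index still comes from the device (via the used ring in a real
implementation), so it is range-checked before use.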