Jiang Wang .
2021-Mar-26 23:40 UTC
[External] Re: [RFC PATCH] virtio-vsock: add description for datagram type
Hi Michael and Stefan,

I thought about this and discussed it with my colleague Cong Wang. One idea is to make the current asynchronous send_pkt flow synchronous: then, if the virtqueue is full, the function can return ENOMEM all the way back to the caller, and the caller can check the return value of sendmsg and slow down when necessary.

In the spec, we can put something like: if the virtqueue is full, the caller should be notified with an error, etc.

In terms of implementation, that means we will remove the current send_pkt_work for both stream and dgram sockets. Currently, the code path uses RCU and a work queue, then grabs a mutex in the work queue function. Since we cannot grab a mutex in an RCU critical section, we would have to change RCU to a normal reference counting mechanism. I think this is doable. The drawback is that reference counting in general spends a few more cycles than RCU, so there is a small price to pay. Another option is to use sleepable RCU (SRCU) and remove the work queue.

What do you guys think?

btw, I will also add some SENDBUF restrictions for the dgram sockets, but I don't think it needs to be in the spec.

Regards,

Jiang

On Tue, Mar 23, 2021 at 1:53 AM Stefan Hajnoczi <stefanha at redhat.com> wrote:
>
> On Mon, Mar 22, 2021 at 07:23:14PM -0700, Jiang Wang . wrote:
> > Got it. Will do.
>
> You could look at udp_sendmsg() to see how sockets compete when
> transmitting to the same net device.
>
> I'm not very familiar with this but I guess that the qdisc (like
> fq_codel) decides which packets to place into the device's tx queue. I
> guess sk_buffs waiting to be put onto the device's tx queue are
> accounted for against the socket's sndbuf. Further sendmsg calls will
> fail with -ENOBUFS when the sndbuf limit is reached.
>
> It's not clear to me how much of the existing code can be reused since
> vsock does not use sk_buff or netdev :(.
>
> Stefan
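[Editor's note: the synchronous-send idea above can be sketched as a toy userspace model. All names here are hypothetical, not the kernel's actual API: a fixed-capacity "virtqueue" plus a synchronous send that returns -ENOMEM immediately when the queue is full, instead of parking the packet in send_pkt_work.]

```c
#include <assert.h>
#include <errno.h>

/* Toy model (hypothetical names): a bounded "virtqueue" and a
 * synchronous send. On a full queue, -ENOMEM propagates back so the
 * sendmsg() caller can see the failure and slow down. */
#define MODEL_VQ_CAPACITY 4

struct model_vq {
    int in_flight;   /* buffers added but not yet used by the device */
};

static int model_send_pkt_sync(struct model_vq *vq)
{
    if (vq->in_flight >= MODEL_VQ_CAPACITY)
        return -ENOMEM;          /* virtqueue full: tell the caller */
    vq->in_flight++;             /* buffer added to the virtqueue */
    return 0;
}

/* Device consumed a buffer, freeing a slot for the next send. */
static void model_tx_used(struct model_vq *vq)
{
    vq->in_flight--;
}
```

The point of the model is only the control flow: the error surfaces at the call site rather than in a deferred worker, which is what removes the need for send_pkt_work.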
Stefan Hajnoczi
2021-Mar-29 09:25 UTC
[External] Re: [RFC PATCH] virtio-vsock: add description for datagram type
On Fri, Mar 26, 2021 at 04:40:09PM -0700, Jiang Wang . wrote:
> I thought about this and discussed it with my colleague Cong Wang.
> One idea is to make the current asynchronous send_pkt flow synchronous:
> then, if the virtqueue is full, the function can return ENOMEM all the
> way back to the caller, and the caller can check the return value of
> sendmsg and slow down when necessary.
>
> In the spec, we can put something like: if the virtqueue is full, the
> caller should be notified with an error, etc.
>
> In terms of implementation, that means we will remove the current
> send_pkt_work for both stream and dgram sockets. Currently, the
> code path uses RCU and a work queue, then grabs a mutex in the
> work queue function. Since we cannot grab a mutex in an RCU
> critical section, we would have to change RCU to a normal reference
> counting mechanism. I think this is doable. The drawback is
> that reference counting in general spends a few more cycles
> than RCU, so there is a small price to pay. Another option is
> to use sleepable RCU (SRCU) and remove the work queue.
>
> What do you guys think?

I think the tx code path is like this because of reliable delivery. Maybe a separate datagram rx/tx code path would be simpler?

Take the datagram tx virtqueue lock, try to add the packet into the virtqueue, and return -ENOBUFS if the virtqueue is full. Then use the datagram socket's sndbuf accounting to prevent queuing up too many packets. When a datagram tx virtqueue buffer is used by the device, select queued packets for transmission.

Unlike the stream tx/rx code path, there is no dependency between tx and rx because we don't have the credit mechanism.

> btw, I will also add some SENDBUF restrictions for the dgram
> sockets, but I don't think it needs to be in the spec.

Yes, the spec doesn't need to explain implementation-specific issues.
If there are common implementation issues then the spec can explain them in general terms (not referring to Linux internals) to help implementors.

Stefan
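[Editor's note: the sndbuf accounting Stefan describes can be sketched as a toy userspace model, with all names hypothetical: bytes queued toward the datagram tx virtqueue count against the socket's sndbuf, sendmsg() fails with -ENOBUFS once the limit is hit, and the budget is released when the device marks the tx buffer used.]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model (hypothetical names) of per-socket sndbuf accounting for
 * datagram sends. */
struct model_dgram_sock {
    size_t sndbuf;   /* per-socket limit, cf. SO_SNDBUF */
    size_t queued;   /* bytes queued, waiting for a tx virtqueue slot */
};

static int model_dgram_queue(struct model_dgram_sock *sk, size_t len)
{
    if (sk->queued + len > sk->sndbuf)
        return -ENOBUFS;         /* sndbuf exhausted: fail the sendmsg */
    sk->queued += len;
    return 0;
}

/* Device used the tx buffer: return the bytes to the socket's budget
 * so further sendmsg() calls can proceed. */
static void model_dgram_tx_done(struct model_dgram_sock *sk, size_t len)
{
    sk->queued -= len;
}
```

This mirrors the udp_sendmsg() behavior mentioned earlier in the thread, where sk_buffs awaiting the device tx queue are charged against the socket's sndbuf, without needing sk_buff or netdev.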
Stefano Garzarella
2021-Mar-30 15:32 UTC
[External] Re: [RFC PATCH] virtio-vsock: add description for datagram type
Hi Jiang,

On Fri, Mar 26, 2021 at 04:40:09PM -0700, Jiang Wang . wrote:
> Hi Michael and Stefan,
>
> I thought about this and discussed it with my colleague Cong Wang.
> One idea is to make the current asynchronous send_pkt flow synchronous:
> then, if the virtqueue is full, the function can return ENOMEM all the
> way back to the caller, and the caller can check the return value of
> sendmsg and slow down when necessary.
>
> In the spec, we can put something like: if the virtqueue is full, the
> caller should be notified with an error, etc.
>
> In terms of implementation, that means we will remove the current
> send_pkt_work for both stream and dgram sockets. Currently, the
> code path uses RCU and a work queue, then grabs a mutex in the
> work queue function. Since we cannot grab a mutex in an RCU
> critical section, we would have to change RCU to a normal reference
> counting mechanism. I think this is doable. The drawback is
> that reference counting in general spends a few more cycles
> than RCU, so there is a small price to pay. Another option is
> to use sleepable RCU (SRCU) and remove the work queue.
>
> What do you guys think?

Another thing that came to mind, not related to the spec but to the Linux implementation, is multi-transport support.

When we discussed the initial proposals [1][2], we decided to take a shortcut for DGRAM, since the only transport with DGRAM support was vmci. So for now only a single transport with VSOCK_TRANSPORT_F_DGRAM set can be registered.

Since virtio-transport and vhost-transport will also support DGRAM, we need to find a way to allow multiple transports that support DGRAM to be registered together, in order to support nested VMs. Have you already thought about how to solve this problem?

We should definitely allow the registration of more transports with VSOCK_TRANSPORT_F_DGRAM, and dynamically choose which one to use when sending a packet.
Thanks,
Stefano

[1] https://www.spinics.net/lists/netdev/msg570274.html
[2] https://www.spinics.net/lists/netdev/msg575792.html
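[Editor's note: the per-packet transport selection Stefano asks for can be sketched as a toy userspace model. All names are hypothetical: several DGRAM-capable transports register, and the one to use is chosen at send time from the destination CID, analogous to the existing g2h/h2g split for stream sockets in nested-VM setups.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical names) of dynamic transport selection for
 * datagram packets. */
#define MODEL_CID_HOST 2u        /* cf. VMADDR_CID_HOST */

struct model_transport {
    const char *name;
    bool (*handles)(unsigned int dst_cid);
};

/* Guest-to-host transport covers the host and well-known CIDs;
 * host-to-guest covers everything above them. */
static bool g2h_handles(unsigned int cid) { return cid <= MODEL_CID_HOST; }
static bool h2g_handles(unsigned int cid) { return cid > MODEL_CID_HOST; }

/* Multiple registered transports, all with the DGRAM capability. */
static const struct model_transport dgram_transports[] = {
    { "virtio (guest-to-host)", g2h_handles },
    { "vhost (host-to-guest)",  h2g_handles },
};

/* Pick a transport per packet, at send time, by destination CID. */
static const struct model_transport *model_dgram_select(unsigned int dst_cid)
{
    for (size_t i = 0;
         i < sizeof(dgram_transports) / sizeof(dgram_transports[0]); i++)
        if (dgram_transports[i].handles(dst_cid))
            return &dgram_transports[i];
    return NULL;                 /* no registered transport handles this CID */
}
```

The design point is that the transport is resolved per packet rather than bound once per socket, which is what lets more than one VSOCK_TRANSPORT_F_DGRAM transport coexist.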