Stefano Garzarella
2021-Apr-09 14:27 UTC
[virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] Re: [MASSMAIL KLMS] Re: [virtio-comment] [RFC PATCH v4 2/2] virtio-vsock: SOCK_SEQPACKET description
On Sat, Apr 03, 2021 at 12:45:54PM +0300, Arseny Krasnov wrote:> >On 31.03.2021 17:48, Stefan Hajnoczi wrote: >> On Tue, Mar 30, 2021 at 05:24:19PM +0300, Arseny Krasnov wrote: >>> On 30.03.2021 16:57, Stefan Hajnoczi wrote: >>>> On Tue, Mar 30, 2021 at 12:50:06PM +0300, Arseny Krasnov wrote: >>>>> On 30.03.2021 11:55, Stefan Hajnoczi wrote: >>>>>> On Tue, Mar 30, 2021 at 09:15:39AM +0300, Arseny Krasnov wrote: >>>>>>> On 30.03.2021 00:28, Stefano Garzarella wrote: >>>>>>>> On Mon, Mar 29, 2021 at 08:33:27PM +0300, Arseny Krasnov wrote: >>>>>>>>> On 29.03.2021 19:11, Stefan Hajnoczi wrote: >>>>>>>>>> On Fri, Mar 26, 2021 at 12:02:50PM +0300, Arseny Krasnov wrote: >>>>>>>>>>> @@ -98,6 +102,10 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op >>>>>>>>>>> #define VIRTIO_VSOCK_OP_CREDIT_UPDATE 6 >>>>>>>>>>> /* Request the peer to send the credit info to us */ >>>>>>>>>>> #define VIRTIO_VSOCK_OP_CREDIT_REQUEST 7 >>>>>>>>>>> +/* Message begin for SOCK_SEQPACKET */ >>>>>>>>>>> +#define VIRTIO_VSOCK_OP_SEQ_BEGIN 8 >>>>>>>>>>> +/* Message end for SOCK_SEQPACKET */ >>>>>>>>>>> +#define VIRTIO_VSOCK_OP_SEQ_END 9 >>>>>>>>>> The struct virtio_vsock_hdr->flags field is le32 and currently unused. >>>>>>>>>> Could 24 bits be used for a unique message id and 8 bits for flags? 1 >>>>>>>>>> flag bit could be used for end-of-message and the remaining 7 bits could >>>>>>>>>> be reserved. That way SEQ_BEGIN and SEQ_END are not necessary. >>>>>>>>>> Pressure >>>>>>>>>> on the virtqueue would be reduced and performance should be comparable >>>>>>>>>> to SOCK_STREAM. >>>>>>>>> Well, my first versions of SOCK_SEQPACKET implementation, worked >>>>>>>>> something like this: i used flags field of header as length of whole >>>>>>>>> message. I discussed it with Stefano Garzarella, and he told that it >>>>>>>>> will >>>>>>>>> be better to use special "header" in packet's payload, to keep some >>>>>>>>> SOCK_SEQPACKET specific data, instead of reusing packet's header >>>>>>>>> fields. >>>>>>>> IIRC in the first implementation SEQ_BEGIN was an empty message and we >>>>>>>> didn't added the msg_id yet. So since we needed to carry both id and >>>>>>>> total length, I suggested to use the payload to put these extra >>>>>>>> information. >>>>>>>> >>>>>>>> IIUC what Stefan is suggesting is a bit different and I think it should >>>>>>>> be cool to implement: we can remove the boundary packets, use only 8 >>>>>>>> bits for the flags, and add a new field to reuse the 24 unused bits, >>>>>>>> maybe also 16 bits would be enough. >>>>>>>> At that point we will only use the EOR flag to know the last packet. >>>>>>>> >>>>>>>> The main difference will be that the receiver will know the total size >>>>>>>> only when the last packet is received. >>>>>>>> >>>>>>>> Do you see any issue on that approach? >>>>>>> It will work, except we can't check that any packet of message, >>>>>>> >>>>>>> except last(with EOR bit) was dropped, since receiver don't know >>>>>>> >>>>>>> real length of message. If it is ok, then i can implement it. >>>>>> The credit mechanism ensures that packets are not dropped, so it's not >>>>>> necessary to check for this condition. >>>>>> >>>>>> By the way, is a unique message ID needed? My understanding is: >>>>>> >>>>>> If two messages are being sent on a socket at the same time either their >>>>>> order is serialized (whichever message began first) or it is undefined >>>>>> (whichever message completes first). >>>>> If we are talking about case, when two threads writes to one socket at the same time, >>>>> >>>>> in Linux it is possible that two message will interleave(for vsock). But as i know, for example >>>>> >>>>> when TCP socket is used, both 'write()' calls will be serialized. May be it is bug of vsock: when >>>>> >>>>> first writer goes out of space, it will sleep. Then second writer tries to send something, but >>>>> >>>>> as free space is over, it will sleep too. Then, credit update is received from peer. Both sender's >>>>> >>>>> will be woken up, but sender which grab socket lock first will continue to send it's message. >>>>> >>>>> So may be we can add something like semaphore to new/vmw_vsock/af_vsock.c which will >>>>> >>>>> serialize two 'write()' calls: second sender enters 'write()' path, only when first left this path. >>>>> >>>>> My implementation doesn't care about that, because i wanted to add semaphore later, by another >>>>> >>>>> patch. >>>> Yes, that is definitely an issue that the driver needs to take care of >>>> if we don't have unique message IDs. Thanks for explaining! >>> So may I? include patch with serializer to next version of my patchset? >> Sounds good! > >There is small problem with approach, when we remove SEQ_BEGIN/SEQ_END > >bounds of messages in flags field(EOR bit ) of packet header: consider case, when vhost sends data > >to guest(let it be 64kb). In vhost vsock driver, big buffer of packet(for example 64kb) will > >be splitted to small buffers, to fit guests's virtio rx descriptors(in current implementation of > >Linux it is 4kb). All fields of new headers in rx queue will be copied from fields of header > >of big packet(except len field).? Thus we get 16 headers with? EOR bit set. >set. > > >May be it is possible to: > >1) Handle such bit in vhost driver(set EOR bit in flags only for last small buffer of guests rx queue) > >OR > >2) I can remove SEQ_BEGIN, msg_len and msg_id. But keep SEQ_END op. This will be special > >packet which marks end of message without any payload. In this case, such packet will be > >processed by vhost "as is". > > >What do You think? >IMHO option 1 is the best and should not be too complicated to implement. Do you see a specific issue? Thanks, Stefano