Stefano Garzarella
2021-Mar-25 10:52 UTC
[RFC PATCH v7 00/22] virtio/vsock: introduce SOCK_SEQPACKET support
Hi Arseny,

On Tue, Mar 23, 2021 at 04:07:13PM +0300, Arseny Krasnov wrote:
> This patchset implements support of SOCK_SEQPACKET for virtio
> transport.
>
> As SOCK_SEQPACKET guarantees to save record boundaries, two new
> packet operations were added to do it: first for start of record and
> second to mark end of record (SEQ_BEGIN and SEQ_END later). Also,
> both operations carry metadata to maintain boundaries and payload
> integrity. Metadata is introduced by adding special header with two
> fields - message id and message length:
>
>	struct virtio_vsock_seq_hdr {
>		__le32  msg_id;
>		__le32  msg_len;
>	} __attribute__((packed));
>
> This header is transmitted as payload of SEQ_BEGIN and SEQ_END
> packets (buffer of second virtio descriptor in chain) in the same way
> as data transmitted in RW packets. Payload was chosen as buffer for
> this header to avoid touching first virtio buffer which carries header
> of packet, because someone could check that size of this buffer is
> equal to size of packet header. To send record, packet with start
> marker is sent first (its header carries length of record and id),
> then all data is sent as usual 'RW' packets and finally SEQ_END is
> sent (it carries id of message, which is equal to id of SEQ_BEGIN);
> after sending SEQ_END the id is incremented. On receiver's side, size
> of record is known from packet with start record marker. To check that
> no packets were dropped by transport, 'msg_id's of two sequential
> SEQ_BEGIN and SEQ_END are checked to be equal and length of data
> between two markers is compared to the length in SEQ_BEGIN header.
>
> Now as packets of one socket are not reordered on either vsock or
> vhost transport layers, such markers allow to restore original record
> on receiver's side. If user's buffer is smaller than record length,
> then all out of size data is dropped.
>
> Maximum length of datagram is not limited as in stream socket,
> because same credit logic is used. Difference with stream socket is
> that user is not woken up until whole record is received or error
> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC'
> flags.
>
> Tests also implemented.
>
> Thanks to stsp2 at yandex.ru for encouragements and initial design
> recommendations.
>
> Arseny Krasnov (22):
>   af_vsock: update functions for connectible socket
>   af_vsock: separate wait data loop
>   af_vsock: separate receive data loop
>   af_vsock: implement SEQPACKET receive loop
>   af_vsock: separate wait space loop
>   af_vsock: implement send logic for SEQPACKET
>   af_vsock: rest of SEQPACKET support
>   af_vsock: update comments for stream sockets
>   virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
>   virtio/vsock: simplify credit update function API
>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>   virtio/vsock: fetch length for SEQPACKET record
>   virtio/vsock: add SEQPACKET receive logic
>   virtio/vsock: rest of SOCK_SEQPACKET support
>   virtio/vsock: SEQPACKET support feature bit
>   virtio/vsock: setup SEQPACKET ops for transport
>   vhost/vsock: setup SEQPACKET ops for transport
>   vsock/loopback: setup SEQPACKET ops for transport
>   vhost/vsock: SEQPACKET feature bit support
>   virtio/vsock: SEQPACKET feature bit support
>   vsock_test: add SOCK_SEQPACKET tests
>   virtio/vsock: update trace event for SEQPACKET
>
>  drivers/vhost/vsock.c                      |  21 +-
>  include/linux/virtio_vsock.h               |  21 +
>  include/net/af_vsock.h                     |   9 +
>  .../events/vsock_virtio_transport_common.h |  48 +-
>  include/uapi/linux/virtio_vsock.h          |  19 +
>  net/vmw_vsock/af_vsock.c                   | 581 +++++++++++------
>  net/vmw_vsock/virtio_transport.c           |  17 +
>  net/vmw_vsock/virtio_transport_common.c    | 379 +++++++++--
>  net/vmw_vsock/vsock_loopback.c             |  12 +
>  tools/testing/vsock/util.c                 |  32 +-
>  tools/testing/vsock/util.h                 |   3 +
>  tools/testing/vsock/vsock_test.c           | 126 ++++
>  12 files changed, 1015 insertions(+), 253 deletions(-)
>
> v6 -> v7:
>  General changelog:
>  - virtio transport callback for message length now removed
>    from transport. Length of record is returned by dequeue
>    callback.
>
>  - function which tries to get message length now returns 0
>    when rx queue is empty. Also length of current message in
>    progress is set to 0, when message processed or error
>    happens.
>
>  - patches for virtio feature bit moved after patches with
>    transport ops.
>
>  Per patch changelog:
>   see every patch after '---' line.

I reviewed the series and left some comments. I think we are at a good
point, but we should have the specification accepted before merging this
series, to avoid having to change the implementation later.

What do you think?

Thanks,
Stefano
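
For readers following the record-boundary scheme described in the quoted
cover letter, the receiver-side check can be sketched in userspace C. This
is only an illustration under stated assumptions, not code from the series:
struct seq_hdr, seq_record_ok() and seq_copy_len() are hypothetical names,
and the fields are plain uint32_t instead of __le32 so that the sketch
builds outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct virtio_vsock_seq_hdr from the cover
 * letter. Assumption: uint32_t replaces __le32, so byte order handling
 * is left out of this sketch. */
struct seq_hdr {
	uint32_t msg_id;
	uint32_t msg_len;
} __attribute__((packed));

/* The header is two packed 32-bit fields, so it must be 8 bytes. */
_Static_assert(sizeof(struct seq_hdr) == 8, "unexpected header size");

/* Hypothetical helper: a record is considered intact when the ids of
 * SEQ_BEGIN and SEQ_END match and the number of bytes seen in 'RW'
 * packets between them equals the length announced in SEQ_BEGIN. */
static inline bool seq_record_ok(const struct seq_hdr *begin,
				 const struct seq_hdr *end,
				 size_t rw_bytes)
{
	assert(begin != NULL && end != NULL);
	return begin->msg_id == end->msg_id && rw_bytes == begin->msg_len;
}

/* Hypothetical helper: number of bytes copied to the user's buffer.
 * Anything beyond the buffer size is dropped, as the cover letter
 * describes for a buffer smaller than the record. */
static inline size_t seq_copy_len(size_t record_len, size_t user_buf_len)
{
	return record_len < user_buf_len ? record_len : user_buf_len;
}
```

With a 100-byte record and a 64-byte user buffer, seq_copy_len() gives 64,
and the remaining 36 bytes correspond to the "out of size" data that the
cover letter says is dropped.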