Michael S. Tsirkin
2023-Sep-02 08:35 UTC
[PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
On Sat, Sep 02, 2023 at 04:56:42AM +0000, Bobby Eshleman wrote:> On Fri, Sep 01, 2023 at 02:45:14PM +0200, Stefano Garzarella wrote: > > On Tue, Aug 29, 2023 at 09:29:45PM +0000, Bobby Eshleman wrote: > > > This adds support for datagrams to the virtio-vsock device. > > > > > > virtio-vsock already supports stream and seqpacket types. The existing > > > message types and header fields are extended to support datagrams. > > > Semantic differences between the flow types are stated, as well as any > > > additional requirements for devices and drivers implementing this > > > feature. > > > > > > Signed-off-by: Bobby Eshleman <bobby.eshleman at bytedance.com> > > > --- > > > device-types/vsock/description.tex | 95 +++++++++++++++++++++++++++--- > > > 1 file changed, 88 insertions(+), 7 deletions(-) > > > > > > diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex > > > index 7d91d159872f..638dca8e5da1 100644 > > > --- a/device-types/vsock/description.tex > > > +++ b/device-types/vsock/description.tex > > > @@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} > > > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported. > > > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported. > > > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied. > > > +\item[VIRTIO_VSOCK_F_DGRAM (3)] datagram socket type is supported. > > > \end{description} > > > > > > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} > > > @@ -167,17 +168,22 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera > > > consists of a (cid, port number) tuple. The header fields used for this are > > > \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}. > > > > > > -Currently stream and seqpacket sockets are supported. \field{type} is 1 (VIRTIO_VSOCK_TYPE_STREAM) > > > -for stream socket types, and 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for seqpacket socket types. > > > + > > > +Currently stream, seqpacket, and datagram sockets are supported. \field{type} is > > > +1 (VIRTIO_VSOCK_TYPE_STREAM) for stream socket types, 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for > > > +seqpacket socket types, and 3 (VIRTIO_VSOCK_TYPE_DGRAM) for datagram socket types. > > > > > > \begin{lstlisting} > > > #define VIRTIO_VSOCK_TYPE_STREAM 1 > > > #define VIRTIO_VSOCK_TYPE_SEQPACKET 2 > > > +#define VIRTIO_VSOCK_TYPE_DGRAM 3 > > > \end{lstlisting} > > > > > > Stream sockets provide in-order, guaranteed, connection-oriented delivery > > > without message boundaries. Seqpacket sockets provide in-order, guaranteed, > > > -connection-oriented delivery with message and record boundaries. > > > +connection-oriented delivery with message and record boundaries. Datagram > > > +sockets provide connection-less, best-effort delivery of messages, with no > > > +order or reliability guarantees. > > > > > > \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management} > > > \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of > > > @@ -203,16 +209,19 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / > > > previously receiving a VIRTIO_VSOCK_OP_CREDIT_REQUEST packet. This allows > > > communicating updates any time a change in buffer space occurs. > > > > > > +\field{buf_alloc} and \field{fwd_cnt} are reserved for future use by datagram > > > +sockets. These fields are not used for datagram buffer space management. > > > + > > > \drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management} > > > -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has > > > -sufficient free buffer space for the payload. > > > +For stream and seqpacket flows, VIRTIO_VSOCK_OP_RW data packets MUST only be > > > +transmitted when the peer has sufficient free buffer space for the payload. > > > > > > All packets associated with a stream flow MUST contain valid information in > > > \field{buf_alloc} and \field{fwd_cnt} fields. > > > > > > \devicenormative{\paragraph}{Device Operation: Buffer Space > > > Management}{Device Types / Socket Device / Device Operation / Buffer > > > Space Management} > > > -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has > > > -sufficient free buffer space for the payload. > > > +For stream and seqpacket flows, VIRTIO_VSOCK_OP_RW data packets MUST only be > > > +transmitted when the peer has sufficient free buffer space for the payload. > > > > > > All packets associated with a stream flow MUST contain valid information in > > > \field{buf_alloc} and \field{fwd_cnt} fields. > > > @@ -299,6 +308,78 @@ \subsubsection{Seqpacket Sockets}\label{sec:Device Types / Socket Device / Devic > > > #define VIRTIO_VSOCK_SEQ_EOR (1 << 1) > > > \end{lstlisting} > > > > > > +\subsubsection{Datagram Sockets}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets} > > > + > > > +\drivernormative{\paragraph}{Device Operation: Packet Fragmentation}{Device Types / Socket Device / Datagram Sockets / Fragmentation} > > > + > > > +Drivers MAY disassemble packets into smaller fragments. If drivers fragment a > > > +packet, they MUST follow the fragmentation rules described in section > > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}. > > > + > > > +Drivers MUST support assembly of received packet fragments according to the > > > +fragmentation rules described in section > > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram > > > Sockets / Fragmentation}. > > > + > > > +\devicenormative{\paragraph}{Device Operation: Packet Fragmentation}{Device Types / Socket Device / Datagram Sockets / Fragmentation} > > > + > > > +Devices MAY disassemble packets into smaller fragments. If devices fragment a > > > +packet, they MUST follow the fragmentation rules described in section > > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}. > > > + > > > +Devices MUST support assembly of received packet fragments according to the > > > +fragmentation rules described in section > > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}. > > > + > > > +\drivernormative{\paragraph}{Device Operation: Packet Dropping}{Device Types / Socket Device / Datagram Sockets / Dropping} > > > + > > > +The driver MAY drop received packets with no notification to the device. This > > > +can happen if, for example, there are insufficient resources or no socket > > > +exists for the destination address. > > > + > > > +\devicenormative{\paragraph}{Device Operation: Packet Dropping}{Device Types / Socket Device / Datagram Sockets / Dropping} > > > + > > > +The device MAY drop received packets with no notification to the driver. This > > > +can happen if, for example, there are insufficient resources or no socket > > > +exists for the destination address. > > > > Should we provide some notification if the socket does not exist at the > > destination? > > > > Yes, I think so. I believe a start/stop congestion notification scheme > actually manages this issue well. > > For example, the source begins sending packets to a destination. > > The destination finds that there exists no socket for that destination > address. The destination sends a "stop" notification to the source that > contains the address in question. Meanwhile, packets are still coming in > but they are being dropped. > > The source receives the "stop" notification with the address and adds it > to the "stopped destinations" list. Any new packet destination address > will be compared to that list. Any matches will be dropped before > sending (and ideally, before wasting time allocating the packet). > > Only when a socket is bound to an address that matches a "stopped" > address does the destination send a "start" notification to any source > it has previusly sent a "stop" notification to. > > Once "start" is received, flow may resume as normal.Again, dropping as control flow tactic has a bunch of problems. Blocking senders sounds more reasonable.> > > + > > > +\paragraph{Datagram Fragmentation}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation} > > > + > > > +\field{flags} may have the following bit set: > > > + > > > +\begin{lstlisting} > > > +#define VIRTIO_VSOCK_DGRAM_EOM (1 << 0) > > > +\end{lstlisting} > > > + > > > +When the header \field{flags} field bit VIRTIO_VSOCK_DGRAM_EOM (bit 0) is set, > > > +it indicates that the current payload is the end of a datagram fragment > > > OR that > > > +the current payload is an entire datagram packet. > > > > In the destination, if we discard some fragments, then could we > > reconstruct a different datagram from the one sent? > > > > Is that anything acceptable? > > > > Dropping fragments should be explicitly disallowed. The sender is > explicitly disallowed from NOT placing fragments on the virtqueue, but I > see that I am missing the piece that states that they may not be dropped > on the receive side. > > I think it is worth mentioning that implicit in this spec is that > socket-to-socket dgram communication is unreliable, but device-to-driver > (and vice versa) is still reliable. That is, we can rely at least on the > virtqueues to work... and if they fail then the device/driver can simply > requeue (think send_pkt_queue in Linux)... so there is some reliability > at the lowest layer.Well you have this weird timeout thing for some reason.> > Thanks, > > Stefano > > > > Thanks! > Bobby
Stefano Garzarella
2023-Sep-06 14:28 UTC
[PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
On Sat, Sep 02, 2023 at 04:35:25AM -0400, Michael S. Tsirkin wrote:>On Sat, Sep 02, 2023 at 04:56:42AM +0000, Bobby Eshleman wrote: >> On Fri, Sep 01, 2023 at 02:45:14PM +0200, Stefano Garzarella wrote: >> > On Tue, Aug 29, 2023 at 09:29:45PM +0000, Bobby Eshleman wrote: >> > > This adds support for datagrams to the virtio-vsock device. >> > > >> > > virtio-vsock already supports stream and seqpacket types. The existing >> > > message types and header fields are extended to support datagrams. >> > > Semantic differences between the flow types are stated, as well as any >> > > additional requirements for devices and drivers implementing this >> > > feature. >> > > >> > > Signed-off-by: Bobby Eshleman <bobby.eshleman at bytedance.com> >> > > --- >> > > device-types/vsock/description.tex | 95 +++++++++++++++++++++++++++--- >> > > 1 file changed, 88 insertions(+), 7 deletions(-) >> > > >> > > diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex >> > > index 7d91d159872f..638dca8e5da1 100644 >> > > --- a/device-types/vsock/description.tex >> > > +++ b/device-types/vsock/description.tex >> > > @@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} >> > > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported. >> > > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported. >> > > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied. >> > > +\item[VIRTIO_VSOCK_F_DGRAM (3)] datagram socket type is supported. >> > > \end{description} >> > > >> > > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits} >> > > @@ -167,17 +168,22 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera >> > > consists of a (cid, port number) tuple. The header fields used for this are >> > > \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}. >> > > >> > > -Currently stream and seqpacket sockets are supported. \field{type} is 1 (VIRTIO_VSOCK_TYPE_STREAM) >> > > -for stream socket types, and 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for seqpacket socket types. >> > > + >> > > +Currently stream, seqpacket, and datagram sockets are supported. \field{type} is >> > > +1 (VIRTIO_VSOCK_TYPE_STREAM) for stream socket types, 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for >> > > +seqpacket socket types, and 3 (VIRTIO_VSOCK_TYPE_DGRAM) for datagram socket types. >> > > >> > > \begin{lstlisting} >> > > #define VIRTIO_VSOCK_TYPE_STREAM 1 >> > > #define VIRTIO_VSOCK_TYPE_SEQPACKET 2 >> > > +#define VIRTIO_VSOCK_TYPE_DGRAM 3 >> > > \end{lstlisting} >> > > >> > > Stream sockets provide in-order, guaranteed, connection-oriented delivery >> > > without message boundaries. Seqpacket sockets provide in-order, guaranteed, >> > > -connection-oriented delivery with message and record boundaries. >> > > +connection-oriented delivery with message and record boundaries. Datagram >> > > +sockets provide connection-less, best-effort delivery of messages, with no >> > > +order or reliability guarantees. >> > > >> > > \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management} >> > > \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of >> > > @@ -203,16 +209,19 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / >> > > previously receiving a VIRTIO_VSOCK_OP_CREDIT_REQUEST packet. This allows >> > > communicating updates any time a change in buffer space occurs. >> > > >> > > +\field{buf_alloc} and \field{fwd_cnt} are reserved for future use by datagram >> > > +sockets. These fields are not used for datagram buffer space management. >> > > + >> > > \drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management} >> > > -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has >> > > -sufficient free buffer space for the payload. >> > > +For stream and seqpacket flows, VIRTIO_VSOCK_OP_RW data packets MUST only be >> > > +transmitted when the peer has sufficient free buffer space for the payload. >> > > >> > > All packets associated with a stream flow MUST contain valid information in >> > > \field{buf_alloc} and \field{fwd_cnt} fields. >> > > >> > > \devicenormative{\paragraph}{Device Operation: Buffer Space >> > > Management}{Device Types / Socket Device / Device Operation / Buffer >> > > Space Management} >> > > -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has >> > > -sufficient free buffer space for the payload. >> > > +For stream and seqpacket flows, VIRTIO_VSOCK_OP_RW data packets MUST only be >> > > +transmitted when the peer has sufficient free buffer space for the payload. >> > > >> > > All packets associated with a stream flow MUST contain valid information in >> > > \field{buf_alloc} and \field{fwd_cnt} fields. >> > > @@ -299,6 +308,78 @@ \subsubsection{Seqpacket Sockets}\label{sec:Device Types / Socket Device / Devic >> > > #define VIRTIO_VSOCK_SEQ_EOR (1 << 1) >> > > \end{lstlisting} >> > > >> > > +\subsubsection{Datagram Sockets}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets} >> > > + >> > > +\drivernormative{\paragraph}{Device Operation: Packet Fragmentation}{Device Types / Socket Device / Datagram Sockets / Fragmentation} >> > > + >> > > +Drivers MAY disassemble packets into smaller fragments. If drivers fragment a >> > > +packet, they MUST follow the fragmentation rules described in section >> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}. >> > > + >> > > +Drivers MUST support assembly of received packet fragments according to the >> > > +fragmentation rules described in section >> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram >> > > Sockets / Fragmentation}. >> > > + >> > > +\devicenormative{\paragraph}{Device Operation: Packet Fragmentation}{Device Types / Socket Device / Datagram Sockets / Fragmentation} >> > > + >> > > +Devices MAY disassemble packets into smaller fragments. If devices fragment a >> > > +packet, they MUST follow the fragmentation rules described in section >> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}. >> > > + >> > > +Devices MUST support assembly of received packet fragments according to the >> > > +fragmentation rules described in section >> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}. >> > > + >> > > +\drivernormative{\paragraph}{Device Operation: Packet Dropping}{Device Types / Socket Device / Datagram Sockets / Dropping} >> > > + >> > > +The driver MAY drop received packets with no notification to the device. This >> > > +can happen if, for example, there are insufficient resources or no socket >> > > +exists for the destination address. >> > > + >> > > +\devicenormative{\paragraph}{Device Operation: Packet Dropping}{Device Types / Socket Device / Datagram Sockets / Dropping} >> > > + >> > > +The device MAY drop received packets with no notification to the driver. This >> > > +can happen if, for example, there are insufficient resources or no socket >> > > +exists for the destination address. >> > >> > Should we provide some notification if the socket does not exist at the >> > destination? >> > >> >> Yes, I think so. I believe a start/stop congestion notification scheme >> actually manages this issue well. >> >> For example, the source begins sending packets to a destination. >> >> The destination finds that there exists no socket for that destination >> address. The destination sends a "stop" notification to the source that >> contains the address in question. Meanwhile, packets are still coming in >> but they are being dropped. >> >> The source receives the "stop" notification with the address and adds it >> to the "stopped destinations" list. Any new packet destination address >> will be compared to that list. Any matches will be dropped before >> sending (and ideally, before wasting time allocating the packet). >> >> Only when a socket is bound to an address that matches a "stopped" >> address does the destination send a "start" notification to any source >> it has previusly sent a "stop" notification to.mmm, keeping the state forever could facilitate a DoS, perhaps we should provide a timeout after which to try again>> >> Once "start" is received, flow may resume as normal. > >Again, dropping as control flow tactic has a bunch of problems. >Blocking senders sounds more reasonable.Yep, I think so.> > >> > > + >> > > +\paragraph{Datagram Fragmentation}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation} >> > > + >> > > +\field{flags} may have the following bit set: >> > > + >> > > +\begin{lstlisting} >> > > +#define VIRTIO_VSOCK_DGRAM_EOM (1 << 0) >> > > +\end{lstlisting} >> > > + >> > > +When the header \field{flags} field bit VIRTIO_VSOCK_DGRAM_EOM (bit 0) is set, >> > > +it indicates that the current payload is the end of a datagram fragment >> > > OR that >> > > +the current payload is an entire datagram packet. >> > >> > In the destination, if we discard some fragments, then could we >> > reconstruct a different datagram from the one sent? >> > >> > Is that anything acceptable? >> > >> >> Dropping fragments should be explicitly disallowed. The sender is >> explicitly disallowed from NOT placing fragments on the virtqueue, but I >> see that I am missing the piece that states that they may not be dropped >> on the receive side. >> >> I think it is worth mentioning that implicit in this spec is that >> socket-to-socket dgram communication is unreliable, but device-to-driver >> (and vice versa) is still reliable. That is, we can rely at least on the >> virtqueues to work... and if they fail then the device/driver can simply >> requeue (think send_pkt_queue in Linux)... so there is some reliability >> at the lowest layer.Yep, we should clarify this better. Thanks, Stefano
Reasonably Related Threads
- [PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
- [PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
- [PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
- [PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
- [PATCH v2 2/2] virtio-net: add default_mtu configuration field