On Mon, 22 Nov 2021 11:51:09 +0800
Jason Wang <jasowang at redhat.com> wrote:

> On Fri, Nov 19, 2021 at 11:10 PM Halil Pasic <pasic at linux.ibm.com> wrote:
> >
> > On Wed, 27 Oct 2021 10:21:04 +0800
> > Jason Wang <jasowang at redhat.com> wrote:
> >
> > > This patch validate the used buffer length provided by the device
> > > before trying to use it. This is done by record the in buffer length
> > > in a new field in desc_state structure during virtqueue_add(), then we
> > > can fail the virtqueue_get_buf() when we find the device is trying to
> > > give us a used buffer length which is greater than the in buffer
> > > length.
> > >
> > > Since some drivers have already done the validation by themselves,
> > > this patch tries to makes the core validation optional. For the driver
> > > that doesn't want the validation, it can set the
> > > suppress_used_validation to be true (which could be overridden by
> > > force_used_validation module parameter). To be more efficient, a
> > > dedicate array is used for storing the validate used length, this
> > > helps to eliminate the cache stress if validation is done by the
> > > driver.
> > >
> > > Signed-off-by: Jason Wang <jasowang at redhat.com>
> >
> > Hi Jason!
> >
> > Our CI has detected, that virtio-vsock became unusable with this
> > patch on s390x. I didn't test on x86 yet. The guest kernel says
> > something like:
> > vmw_vsock_virtio_transport virtio1: tx: used len 44 is larger than in buflen 0
> >
> > Did you, or anybody else, see something like this on platforms other than
> > s390x?
>
> Adding Stefan and Stefano.
>
> I think it should be a common issue, looking at
> vhost_vsock_handle_tx_kick(), it did:
>
> len += sizeof(pkt->hdr);
> vhost_add_used(vq, head, len);
>
> which looks like a violation of the spec since it's TX.

I'm not sure the lines above look like a violation of the spec. If you
examine vhost_vsock_alloc_pkt() I believe that you will agree that:
len == pkt->len == pkt->hdr.len
which makes sense since according to the spec both tx and rx messages
are hdr+payload. And I believe hdr.len is the size of the payload,
although that does not seem to be properly documented by the spec.

On the other hand tx messages are stated to be device read-only (in the
spec) so if the device writes stuff, that is certainly wrong.

If that is what happens.

Looking at virtqueue_get_buf_ctx_split() I'm not sure that is what
happens. My hypothesis is that we just assume the last descriptor is an
'in' type descriptor (i.e. a device writable one). For tx that assumption
would be wrong.

I will have another look at this today and send a fix patch if my
suspicion is confirmed.

> >
> > I had a quick look at this code, and I speculate that it probably
> > uncovers a pre-existing bug, rather than introducing a new one.
>
> I agree.
>

:) I'm not so sure any more myself.

> >
> > If somebody is already working on this please reach out to me.
>
> AFAIK, no.

Thanks for the info! Then I will dig a little deeper. I asked in order
to avoid doing the debugging and fixing just to see that somebody was
faster :D

> I think the plan is to fix both the device and drive side
> (but I'm not sure we need a new feature for this if we stick to the
> validation).
>
> Thanks
>

Thank you!

Regards,
Halil
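
To make the numbers in the reported message concrete, here is a minimal user-space sketch (not kernel code). It assumes the packed vsock header layout from the virtio-vsock UAPI header, and the names vsock_hdr_model, model_tx_used_len() and model_used_len_ok() are hypothetical, made up for this illustration. It shows that a header-only tx packet reported as payload length plus header size yields a used length of 44, which a check against an "in" buffer length of 0 would reject.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model of the vsock packet header (assumed to match the
 * packed layout of struct virtio_vsock_hdr; 44 bytes in total).
 */
struct vsock_hdr_model {
	uint64_t src_cid;
	uint64_t dst_cid;
	uint32_t src_port;
	uint32_t dst_port;
	uint32_t len;		/* payload length, header not included */
	uint16_t type;
	uint16_t op;
	uint32_t flags;
	uint32_t buf_alloc;
	uint32_t fwd_cnt;
} __attribute__((packed));

/*
 * Hypothetical helper: the used length a device would report if it adds
 * the header size to the payload length, as in the two quoted lines
 * from vhost_vsock_handle_tx_kick().
 */
static uint32_t model_tx_used_len(uint32_t payload_len)
{
	return payload_len + (uint32_t)sizeof(struct vsock_hdr_model);
}

/*
 * Hypothetical helper modelling the core check: the used length must not
 * exceed the device-writable ("in") length recorded when the buffer was
 * added. A tx buffer has no device-writable part, so in_buflen is 0.
 */
static bool model_used_len_ok(uint32_t used_len, uint32_t in_buflen)
{
	return used_len <= in_buflen;
}
```

Under these assumptions, model_used_len_ok(model_tx_used_len(0), 0) is false, which is consistent with "used len 44 is larger than in buflen 0" for a packet without payload.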
On Mon, 22 Nov 2021 06:35:18 +0100
Halil Pasic <pasic at linux.ibm.com> wrote:

> > I think it should be a common issue, looking at
> > vhost_vsock_handle_tx_kick(), it did:
> >
> > len += sizeof(pkt->hdr);
> > vhost_add_used(vq, head, len);
> >
> > which looks like a violation of the spec since it's TX.
>
> I'm not sure the lines above look like a violation of the spec. If you
> examine vhost_vsock_alloc_pkt() I believe that you will agree that:
> len == pkt->len == pkt->hdr.len
> which makes sense since according to the spec both tx and rx messages
> are hdr+payload. And I believe hdr.len is the size of the payload,
> although that does not seem to be properly documented by the spec.
>
> On the other hand tx messages are stated to be device read-only (in the
> spec) so if the device writes stuff, that is certainly wrong.
>
> If that is what happens.
>
> Looking at virtqueue_get_buf_ctx_split() I'm not sure that is what
> happens. My hypothesis is that we just assume the last descriptor is an
> 'in' type descriptor (i.e. a device writable one). For tx that assumption
> would be wrong.
>
> I will have another look at this today and send a fix patch if my
> suspicion is confirmed.

If my suspicion is right something like:

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 00f64f2f8b72..efb57898920b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -764,6 +764,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
 	struct vring_virtqueue *vq = to_vvq(_vq);
 	void *ret;
 	unsigned int i;
+	bool has_in;
 	u16 last_used;
 
 	START_USE(vq);
@@ -787,6 +788,9 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
 			vq->split.vring.used->ring[last_used].id);
 	*len = virtio32_to_cpu(_vq->vdev,
 			vq->split.vring.used->ring[last_used].len);
+	has_in = virtio16_to_cpu(_vq->vdev,
+			vq->split.vring.used->ring[last_used].flags)
+				& VRING_DESC_F_WRITE;
 
 	if (unlikely(i >= vq->split.vring.num)) {
 		BAD_RING(vq, "id %u out of range\n", i);
@@ -796,7 +800,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
 		BAD_RING(vq, "id %u is not a head!\n", i);
 		return NULL;
 	}
-	if (vq->buflen && unlikely(*len > vq->buflen[i])) {
+	if (has_in && vq->buflen && unlikely(*len > vq->buflen[i])) {
 		BAD_RING(vq, "used len %d is larger than in buflen %u\n",
 			*len, vq->buflen[i]);
 		return NULL;

would fix the problem for split. I will try that out and let you know
later.

Regards,
Halil
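
The idea behind the diff, validating the used length only for buffers that have a device-writable part, can be sketched in user space as follows. This is an illustration under stated assumptions, not the proposed kernel change: the diff takes has_in from a flags value read at the used ring entry, whereas this sketch derives it by walking a descriptor chain, which is an assumption made for the example. struct model_desc, model_chain_has_in() and model_used_len_valid() are hypothetical names; the two flag values mirror VRING_DESC_F_NEXT and VRING_DESC_F_WRITE.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_DESC_F_NEXT	1u	/* same value as VRING_DESC_F_NEXT */
#define MODEL_DESC_F_WRITE	2u	/* same value as VRING_DESC_F_WRITE */

/* Simplified descriptor: length, flags and index of the next descriptor. */
struct model_desc {
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/*
 * Return true if any descriptor in the chain starting at head is
 * device-writable. The walk is bounded by num steps so that a
 * malformed (looping) chain cannot spin forever.
 */
static bool model_chain_has_in(const struct model_desc *table,
			       uint16_t num, uint16_t head)
{
	uint16_t i = head;
	uint16_t steps;

	for (steps = 0; steps < num && i < num; steps++) {
		if (table[i].flags & MODEL_DESC_F_WRITE)
			return true;
		if (!(table[i].flags & MODEL_DESC_F_NEXT))
			break;
		i = table[i].next;
	}
	return false;
}

/*
 * Same shape as the condition in the diff: the length check only
 * applies when the buffer has an 'in' part.
 */
static bool model_used_len_valid(bool has_in, uint32_t used_len,
				 uint32_t in_buflen)
{
	return !has_in || used_len <= in_buflen;
}
```

With a tx chain made only of device-readable descriptors, model_chain_has_in() returns false and a used length of 44 against an in buffer length of 0 is no longer rejected, while an rx buffer is still checked against its recorded length.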