Michael S. Tsirkin
2022-Jun-13 14:07 UTC
[PATCH] virtio_ring : fix vring_packed_desc memory out of bounds bug
On Mon, Jun 13, 2022 at 09:47:55PM +0800, ?? wrote:> Michael S. Tsirkin <mst at redhat.com> ?2022?6?13??? 16:55??? > > > > On Mon, Jun 13, 2022 at 04:44:03PM +0800, ?? wrote: > > > Michael S. Tsirkin <mst at redhat.com> ?2022?6?12??? 22:13??? > > > > > > > > On Sun, Jun 12, 2022 at 07:02:25PM +0800, ?? wrote: > > > > > Michael S. Tsirkin <mst at redhat.com> ?2022?6?11??? 08:35??? > > > > > > > > > > > > On Sat, Jun 11, 2022 at 12:38:10AM +0800, ?? wrote: > > > > > > > > This pattern was always iffy, but I don't think the patch > > > > > > > > improves the situation much. last_used_idx and vq->packed.used_wrap_counter > > > > > > > > can still get out of sync. > > > > > > > > > > > > > > Yes? You are absolutely correct, thanks for pointing out this issue, I > > > > > > > didn't take that into consideration, > > > > > > > how about disabling interrupts before this code below: > > > > > > > > > > > > > > > vq->last_used_idx += vq->packed.desc_state[id].num; > > > > > > > > if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > > > > > > vq->last_used_idx -= vq->packed.vring.num; > > > > > > > > vq->packed.used_wrap_counter ^= 1; > > > > > > > > } > > > > > > > > > > > > > > it seems to be fine to just turn off the interrupts of the current vring. > > > > > > > > > > > > > > BR > > > > > > > > > > > > That would make datapath significantly slower. > > > > > > > > > > > > > > > > > > > > Michael S. Tsirkin <mst at redhat.com> ?2022?6?10??? 22:50??? > > > > > > > > > > > > > > > > On Fri, Jun 10, 2022 at 06:33:14PM +0800, huangjie.albert wrote: > > > > > > > > > ksoftirqd may consume the packet and it will call: > > > > > > > > > virtnet_poll > > > > > > > > > -->virtnet_receive > > > > > > > > > -->virtqueue_get_buf_ctx > > > > > > > > > -->virtqueue_get_buf_ctx_packed > > > > > > > > > and in virtqueue_get_buf_ctx_packed: > > > > > > > > > > > > > > > > > > vq->last_used_idx += vq->packed.desc_state[id].num; > > > > > > > > > if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > > > > > > > vq->last_used_idx -= vq->packed.vring.num; > > > > > > > > > vq->packed.used_wrap_counter ^= 1; > > > > > > > > > } > > > > > > > > > > > > > > > > > > if at the same time, there comes a vring interrupt?in vring_interrupt: > > > > > > > > > we will call: > > > > > > > > > vring_interrupt > > > > > > > > > -->more_used > > > > > > > > > -->more_used_packed > > > > > > > > > -->is_used_desc_packed > > > > > > > > > in is_used_desc_packed, the last_used_idx maybe >= vq->packed.vring.num. > > > > > > > > > so this could case a memory out of bounds bug. > > > > > > > > > > > > > > > > > > this patch is to fix this. > > > > > > > > > > > > > > > > > > Signed-off-by: huangjie.albert <huangjie.albert at bytedance.com> > > > > > > > > > > > > > > > > > > > > > > > > This pattern was always iffy, but I don't think the patch > > > > > > > > improves the situation much. last_used_idx and vq->packed.used_wrap_counter > > > > > > > > can still get out of sync. > > > > > > > > > > > > > > > > Maybe refactor code to keep everything in vq->last_used_idx? > > > > > > > > > > > > > > > > Jason what is your take? > > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > drivers/virtio/virtio_ring.c | 3 +++ > > > > > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > > > > > > > index 13a7348cedff..d2abbb3a8187 100644 > > > > > > > > > --- a/drivers/virtio/virtio_ring.c > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c > > > > > > > > > @@ -1397,6 +1397,9 @@ static inline bool is_used_desc_packed(const struct vring_virtqueue *vq, > > > > > > > > > bool avail, used; > > > > > > > > > u16 flags; > > > > > > > > > > > > > > > > > > + if (idx >= vq->packed.vring.num) > > > > > > > > > + return false; > > > > > > > > > + > > > > > > > > > flags = le16_to_cpu(vq->packed.vring.desc[idx].flags); > > > > > > > > > avail = !!(flags & (1 << VRING_PACKED_DESC_F_AVAIL)); > > > > > > > > > used = !!(flags & (1 << VRING_PACKED_DESC_F_USED)); > > > > > > > > > -- > > > > > > > > > 2.27.0 > > > > > > > > > > > > > > > > > > > > > > > > Michael S , thanks for your correction, there may be another simple > > > > > solution here: > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > > > > index 13a7348cedff..4db4db19f94a 100644 > > > > > --- a/drivers/virtio/virtio_ring.c > > > > > +++ b/drivers/virtio/virtio_ring.c > > > > > @@ -1397,6 +1397,9 @@ static inline bool is_used_desc_packed(const > > > > > struct vring_virtqueue *vq, > > > > > bool avail, used; > > > > > u16 flags; > > > > > > > > > > + if (idx >= vq->packed.vring.num) > > > > > + return false; > > > > > + > > > > > flags = le16_to_cpu(vq->packed.vring.desc[idx].flags); > > > > > avail = !!(flags & (1 << VRING_PACKED_DESC_F_AVAIL)); > > > > > used = !!(flags & (1 << VRING_PACKED_DESC_F_USED)); > > > > > @@ -1453,8 +1456,9 @@ static void *virtqueue_get_buf_ctx_packed(struct > > > > > virtqueue *_vq, > > > > > > > > > > vq->last_used_idx += vq->packed.desc_state[id].num; > > > > > if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > > > - vq->last_used_idx -= vq->packed.vring.num; > > > > > vq->packed.used_wrap_counter ^= 1; > > > > > + barrier(); > > > > > + vq->last_used_idx -= vq->packed.vring.num; > > > > > } > > > > > > > > > > /* > > > > > > > > > > vq->packed.used_wrap_counter and vq->last_used_idx only increased > > > > > by the virtqueue_get_buf_ctx_packed, and > > > > > so we can add a memory barrier and Changing the order in which > > > > > last_used_idx and used_wrap_counter are assigned > > > > > should temporarily solve the problem. But as you said, the code may > > > > > need to be refactored to fully address these kinds of issues. > > > > > > > > > > BR > > > > > > > > this might solve the OOB access but not the problem that interrupt > > > > might use an incorrect value to check for the used index. > > > > > > > > > > Yes, thanks for that, but it seems that it can not solve the problem > > > that interrupt > > > might use an incorrect value to check for the used index if we do not > > > disable irq in softirqd. > > > the following code in virtqueue_get_buf_ctx_packed: > > > > > > >1453 /* detach_buf_packed clears data, so grab it now. */ > > > >1454 ret = vq->packed.desc_state[id].data; > > > >1455 detach_buf_packed(vq, id, ctx); > > > >1456 > > > >1457 vq->last_used_idx += vq->packed.desc_state[id].num; > > > >1458 if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) { > > > >1459 vq->last_used_idx -= vq->packed.vring.num; > > > >1460 vq->packed.used_wrap_counter ^= 1; > > > >1461 } > > > > > > after call line 1455, the real last_used_idx should add > > > vq->packed.desc_state[id].num, but it > > > add it in line 1457. if the interrupt comes before 1457, we also get > > > the incorrect last_used_idx. > > > do you have any good comments? This problem exists even if > > > last_used_idx and used_wrap_counter are merged and their operations > > > are atomic. > > > > > > BR > > > > The assumption is that we do not need to worry about > > the case where ring already had used buffers: these > > are going to be polled by the driver. > > We do need to worry about any buffers added after > > driver drains the vq. > > In other words the job of the callback is to signal > > to driver the no used buffers -> some used buffers > > condition. > > This ignores virtqueue_enable_cb_delayed. > > > > Many thanks for you . but I'm sorry I didn't understand the meaning of > the last sentence? > > This ignores virtqueue_enable_cb_delayed. > > virtqueue_enable_cb_delayed is to enable the device > notification?interrupt?delayed(if VIRTIO_RING_F_EVENT_IDX enabled). > How does this relate to our previous discussion?I said: the job of the callback is to signal to driver the no used buffers -> some used buffers This does not apply to virtqueue_enable_cb_delayed - for that one the job of the callback is to signal the no used buffers -> lots of used buffers condition.