thr3ads.net - Virtualization - [virtio-dev] packed ring layout proposal v2 [Feb 2017]

If this information is useful, please help other people find it:
Share via:

Michael S. Tsirkin

2017-Feb-09 18:24 UTC

[virtio-dev] packed ring layout proposal v2

On Thu, Feb 09, 2017 at 04:48:53PM +0100, Paolo Bonzini
wrote:> 
> 
> On 08/02/2017 20:59, Michael S. Tsirkin wrote:
> > We couldn't decide what's better for everyone in 1.0 days and
I doubt
> > we'll be able to now, but yes, benchmarking is needed to make
> > sire it's required. Very easy to remove or not to use/support in
> > drivers/devices though.
> 
> Fair enough, but of course then we must specify that devices MUST
> support either VRING_DESC_F_NEXT or VRING_DESC_F_INDIRECT, and drivers
> SHOULD support both (or use neither).
> 
> The drivers' part adds to the implementation burden, which is why I
> wanted to remove it.  Alternatively we can say that indirect is
> mandatory for both devices and drivers (and save a feature bit), while
> VRING_DESC_F_NEXT is optional.
> 
> >>> * Non power-of-2 ring sizes
> >>>
> >>> As the ring simply wraps around, there's no reason to
> >>> require ring size to be power of two.
> >>> It can be made a separate feature though.
> >>
> >> Power of 2 ring sizes are required in order to ignore the high
bits of
> >> the indices.  With non-power-of-2 sizes you are forced to keep the
> >> indices less than the ring size.
> > 
> > Right. So
> > 
> > 	if (unlikely(idx++ > size))
> > 		idx = 0;
> > 
> > OTOH ring size that's twice larger than necessary
> > because of power of two requirements wastes cache.
> 
> I don't know.  Power of 2 ring size is pretty standard, I'd rather
avoid
> the complication and the gratuitous difference with 1.0.
I thought originally there's a reason 1.0 rings had to be powers of two
but now I don't see why. OK, we can make it a feature flag later if we
want to.
> If batching is mostly advisory (with exceptions such as mrg-rxbuf) I
> don't have any problem with it.
> 
> Paolo

Paolo Bonzini

2017-Feb-10 11:32 UTC

head link

[virtio-dev] packed ring layout proposal v2

On 09/02/2017 19:24, Michael S. Tsirkin wrote:>> I don't know.  Power of 2 ring size is pretty standard, I'd
rather avoid
>> the complication and the gratuitous difference with 1.0.
>
> I thought originally there's a reason 1.0 rings had to be powers of two
> but now I don't see why. OK, we can make it a feature flag later if we
> want to.
The reason is that it allows indices to be free running.  This is an 
example of QEMU code that requires that:

            nheads = vring_avail_idx(&vdev->vq[i]) -
vdev->vq[i].last_avail_idx;
            /* Check it isn't doing strange things with descriptor numbers.
*/
            if (nheads > vdev->vq[i].vring.num) {
                error_report("VQ %d size 0x%x Guest index 0x%x "
                             "inconsistent with Host index 0x%x: delta
0x%x",
                             i, vdev->vq[i].vring.num,
                             vring_avail_idx(&vdev->vq[i]),
                             vdev->vq[i].last_avail_idx, nheads);
                return -1;
            }

Paolo

Michael S. Tsirkin

2017-Feb-10 15:20 UTC

head link

[virtio-dev] packed ring layout proposal v2

On Fri, Feb 10, 2017 at 12:32:49PM +0100, Paolo Bonzini
wrote:> 
> 
> On 09/02/2017 19:24, Michael S. Tsirkin wrote:
> >> I don't know.  Power of 2 ring size is pretty standard,
I'd rather avoid
> >> the complication and the gratuitous difference with 1.0.
> >
> > I thought originally there's a reason 1.0 rings had to be powers
of two
> > but now I don't see why. OK, we can make it a feature flag later
if we
> > want to.
> 
> The reason is that it allows indices to be free running.
Well what I meant is that with qsize not a power of 2 you can still do
this but have to do everything mod N*qsize as opposed to mod 2^16. So
you need a branch there - easiest to do if you do signed math.

int nheads = avail - last_avail;
/*Check and handle index wrap-around */
if (unlikely(nheads < 0)) {
	nheads += N_qsize;
}

if (nheads < 0 || nheads > vdev->vq[i].vring.num) {
	error_report(...);
	return -1;
}

This can only catch bugs if N > 1
>  This is an 
> example of QEMU code that requires that:
> 
>             nheads = vring_avail_idx(&vdev->vq[i]) -
vdev->vq[i].last_avail_idx;
>             /* Check it isn't doing strange things with descriptor
numbers. */
>             if (nheads > vdev->vq[i].vring.num) {
>                 error_report("VQ %d size 0x%x Guest index 0x%x "
>                              "inconsistent with Host index 0x%x: delta
0x%x",
>                              i, vdev->vq[i].vring.num,
>                              vring_avail_idx(&vdev->vq[i]),
>                              vdev->vq[i].last_avail_idx, nheads);
>                 return -1;
>             }
> 
> Paolo
Same thing here, this never triggers if vring.num == 2^16

-- 
MST

Maybe Matching Threads

Search for more maybe matching threads

Virtualization - Feb 2017 - [virtio-dev] packed ring layout proposal v2

[virtio-dev] packed ring layout proposal v2

[virtio-dev] packed ring layout proposal v2

[virtio-dev] packed ring layout proposal v2

Maybe Matching Threads