On Thu, Feb 09, 2017 at 04:48:53PM +0100, Paolo Bonzini wrote:> > > On 08/02/2017 20:59, Michael S. Tsirkin wrote: > > We couldn't decide what's better for everyone in 1.0 days and I doubt > > we'll be able to now, but yes, benchmarking is needed to make > > sire it's required. Very easy to remove or not to use/support in > > drivers/devices though. > > Fair enough, but of course then we must specify that devices MUST > support either VRING_DESC_F_NEXT or VRING_DESC_F_INDIRECT, and drivers > SHOULD support both (or use neither). > > The drivers' part adds to the implementation burden, which is why I > wanted to remove it. Alternatively we can say that indirect is > mandatory for both devices and drivers (and save a feature bit), while > VRING_DESC_F_NEXT is optional. > > >>> * Non power-of-2 ring sizes > >>> > >>> As the ring simply wraps around, there's no reason to > >>> require ring size to be power of two. > >>> It can be made a separate feature though. > >> > >> Power of 2 ring sizes are required in order to ignore the high bits of > >> the indices. With non-power-of-2 sizes you are forced to keep the > >> indices less than the ring size. > > > > Right. So > > > > if (unlikely(idx++ > size)) > > idx = 0; > > > > OTOH ring size that's twice larger than necessary > > because of power of two requirements wastes cache. > > I don't know. Power of 2 ring size is pretty standard, I'd rather avoid > the complication and the gratuitous difference with 1.0.I thought originally there's a reason 1.0 rings had to be powers of two but now I don't see why. OK, we can make it a feature flag later if we want to.> If batching is mostly advisory (with exceptions such as mrg-rxbuf) I > don't have any problem with it. > > Paolo
On 09/02/2017 19:24, Michael S. Tsirkin wrote:>> I don't know. Power of 2 ring size is pretty standard, I'd rather avoid >> the complication and the gratuitous difference with 1.0. > > I thought originally there's a reason 1.0 rings had to be powers of two > but now I don't see why. OK, we can make it a feature flag later if we > want to.The reason is that it allows indices to be free running. This is an example of QEMU code that requires that: nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx; /* Check it isn't doing strange things with descriptor numbers. */ if (nheads > vdev->vq[i].vring.num) { error_report("VQ %d size 0x%x Guest index 0x%x " "inconsistent with Host index 0x%x: delta 0x%x", i, vdev->vq[i].vring.num, vring_avail_idx(&vdev->vq[i]), vdev->vq[i].last_avail_idx, nheads); return -1; } Paolo
On Fri, Feb 10, 2017 at 12:32:49PM +0100, Paolo Bonzini wrote:> > > On 09/02/2017 19:24, Michael S. Tsirkin wrote: > >> I don't know. Power of 2 ring size is pretty standard, I'd rather avoid > >> the complication and the gratuitous difference with 1.0. > > > > I thought originally there's a reason 1.0 rings had to be powers of two > > but now I don't see why. OK, we can make it a feature flag later if we > > want to. > > The reason is that it allows indices to be free running.Well what I meant is that with qsize not a power of 2 you can still do this but have to do everything mod N*qsize as opposed to mod 2^16. So you need a branch there - easiest to do if you do signed math. int nheads = avail - last_avail; /*Check and handle index wrap-around */ if (unlikely(nheads < 0)) { nheads += N_qsize; } if (nheads < 0 || nheads > vdev->vq[i].vring.num) { error_report(...); return -1; } This can only catch bugs if N > 1> This is an > example of QEMU code that requires that: > > nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx; > /* Check it isn't doing strange things with descriptor numbers. */ > if (nheads > vdev->vq[i].vring.num) { > error_report("VQ %d size 0x%x Guest index 0x%x " > "inconsistent with Host index 0x%x: delta 0x%x", > i, vdev->vq[i].vring.num, > vring_avail_idx(&vdev->vq[i]), > vdev->vq[i].last_avail_idx, nheads); > return -1; > } > > PaoloSame thing here, this never triggers if vring.num == 2^16 -- MST