Leon Romanovsky
2021-Sep-27 18:23 UTC
[PATCH 2/2] virtio-blk: set NUMA affinity for a tagset
On Mon, Sep 27, 2021 at 08:25:09PM +0300, Max Gurtovoy wrote:
>
> On 9/27/2021 2:34 PM, Leon Romanovsky wrote:
> > On Sun, Sep 26, 2021 at 05:55:18PM +0300, Max Gurtovoy wrote:
> > > To optimize performance, set the affinity of the block device tagset
> > > according to the virtio device affinity.
> > >
> > > Signed-off-by: Max Gurtovoy <mgurtovoy at nvidia.com>
> > > ---
> > >  drivers/block/virtio_blk.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > > index 9b3bd083b411..1c68c3e0ebf9 100644
> > > --- a/drivers/block/virtio_blk.c
> > > +++ b/drivers/block/virtio_blk.c
> > > @@ -774,7 +774,7 @@ static int virtblk_probe(struct virtio_device *vdev)
> > >  	memset(&vblk->tag_set, 0, sizeof(vblk->tag_set));
> > >  	vblk->tag_set.ops = &virtio_mq_ops;
> > >  	vblk->tag_set.queue_depth = queue_depth;
> > > -	vblk->tag_set.numa_node = NUMA_NO_NODE;
> > > +	vblk->tag_set.numa_node = virtio_dev_to_node(vdev);
> >
> > I'm afraid that by doing this, you will increase the chances of hitting OOM,
> > because with NUMA_NO_NODE, MM will try to allocate memory across the whole
> > system, while in the latter mode it will only allocate on a specific NUMA
> > node, which can be depleted.
>
> This is a common methodology we use in the block layer and in the NVMe
> subsystem, and we are not afraid of the OOM issue you raised.

There are many reasons for that, but we are talking about virtio here, not about NVMe.

> This is not new, and I guess that the kernel MM will (or should) handle
> the fallback you raised.

I'm afraid it does not. Can you point me to the place where such a fallback is implemented? (A sketch of the allocation semantics in question follows below.)

> Anyway, if we're doing this in NVMe, I don't see a reason to be afraid of
> doing it in virtio-blk.

Still, it would be nice to have some empirical data to support this copy/paste. There are too many myths around optimizations, so it would be good to finally get some supporting data.

Thanks
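
For reference, a minimal sketch contrasting the two allocation modes under discussion, using the standard slab API. demo_node_alloc() is a hypothetical helper for illustration only, not code from the patch or from blk-mq:

#include <linux/slab.h>
#include <linux/numa.h>

/*
 * Illustrative only: contrast a node-preferred allocation with a
 * node-pinned one.
 */
static void demo_node_alloc(size_t size, int node)
{
	/*
	 * Here @node is a preference: the page allocator starts at that
	 * node's zonelist and may fall back to other nodes if it is
	 * depleted. Passing NUMA_NO_NODE means "no preference at all".
	 */
	void *pref = kzalloc_node(size, GFP_KERNEL, node);

	/*
	 * __GFP_THISNODE pins the allocation to @node: it fails rather
	 * than spilling to a remote node, which is where the depletion
	 * concern raised above really bites.
	 */
	void *pinned = kzalloc_node(size, GFP_KERNEL | __GFP_THISNODE, node);

	kfree(pinned);
	kfree(pref);
}

blk-mq passes tag_set.numa_node as the node argument for its internal allocations, so which of the two behaviors the patch actually gets depends on the GFP flags blk-mq uses on each allocation path.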