David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 0/4] virtio-mem: memory unplug/offlining related cleanups
Some cleanups+optimizations primarily around offline_and_remove_memory(). Patch #1 drops the "unsafe unplug" feature where we might get stuck in offline_and_remove_memory() forever. Patch #2 handles unexpected errors from offline_and_remove_memory() a bit nicer. Patch #3 handles the case where offline_and_remove_memory() failed and we want to retry later to remove a completely unplugged Linux memory block, to not have them waste memory forever. Patch #4 something I had lying around for longer, which reacts faster on config changes when unplugging memory. Cc: "Michael S. Tsirkin" <mst at redhat.com> Cc: Jason Wang <jasowang at redhat.com> Cc: Xuan Zhuo <xuanzhuo at linux.alibaba.com> David Hildenbrand (4): virtio-mem: remove unsafe unplug in Big Block Mode (BBM) virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub Block Mode (SBM) virtio-mem: check if the config changed before fake offlining memory drivers/virtio/virtio_mem.c | 168 ++++++++++++++++++++++++------------ 1 file changed, 112 insertions(+), 56 deletions(-) base-commit: 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c -- 2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 1/4] virtio-mem: remove unsafe unplug in Big Block Mode (BBM)
When "unsafe unplug" is enabled, we don't fake-offline all memory ahead of actual memory offlining using alloc_contig_range(). Instead, we rely on offline_pages() to also perform actual page migration, which might fail or take a very long time. In that case, it's possible to easily run into endless loops that cannot be aborted anymore (as offlining is triggered by a workqueue then): For example, a single (accidentally) permanently unmovable page in ZONE_MOVABLE results in an endless loop. For ZONE_NORMAL, races between isolating the pageblock (and checking for unmovable pages) and concurrent page allocation are possible and similarly result in endless loops. The idea of the unsafe unplug mode was to make it possible to more reliably unplug large memory blocks. However, (a) we really should be tackling that differently, by extending the alloc_contig_range()-based mechanism; and (b) this mode is not the default and as far as I know, it's unused either way. So let's simply get rid of it. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 51 +++++++++++++++---------------------- 1 file changed, 20 insertions(+), 31 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 835f6cc2fb66..ed15d2a4bd96 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -38,11 +38,6 @@ module_param(bbm_block_size, ulong, 0444); MODULE_PARM_DESC(bbm_block_size, "Big Block size in bytes. Default is 0 (auto-detection)."); -static bool bbm_safe_unplug = true; -module_param(bbm_safe_unplug, bool, 0444); -MODULE_PARM_DESC(bbm_safe_unplug, - "Use a safe unplug mechanism in BBM, avoiding long/endless loops"); - /* * virtio-mem currently supports the following modes of operation: * @@ -2111,38 +2106,32 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm, VIRTIO_MEM_BBM_BB_ADDED)) return -EINVAL; - if (bbm_safe_unplug) { - /* - * Start by fake-offlining all memory. Once we marked the device - * block as fake-offline, all newly onlined memory will - * automatically be kept fake-offline. Protect from concurrent - * onlining/offlining until we have a consistent state. - */ - mutex_lock(&vm->hotplug_mutex); - virtio_mem_bbm_set_bb_state(vm, bb_id, - VIRTIO_MEM_BBM_BB_FAKE_OFFLINE); + /* + * Start by fake-offlining all memory. Once we marked the device + * block as fake-offline, all newly onlined memory will + * automatically be kept fake-offline. Protect from concurrent + * onlining/offlining until we have a consistent state. + */ + mutex_lock(&vm->hotplug_mutex); + virtio_mem_bbm_set_bb_state(vm, bb_id, VIRTIO_MEM_BBM_BB_FAKE_OFFLINE); - for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) { - page = pfn_to_online_page(pfn); - if (!page) - continue; + for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) { + page = pfn_to_online_page(pfn); + if (!page) + continue; - rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION); - if (rc) { - end_pfn = pfn; - goto rollback_safe_unplug; - } + rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION); + if (rc) { + end_pfn = pfn; + goto rollback; } - mutex_unlock(&vm->hotplug_mutex); } + mutex_unlock(&vm->hotplug_mutex); rc = virtio_mem_bbm_offline_and_remove_bb(vm, bb_id); if (rc) { - if (bbm_safe_unplug) { - mutex_lock(&vm->hotplug_mutex); - goto rollback_safe_unplug; - } - return rc; + mutex_lock(&vm->hotplug_mutex); + goto rollback; } rc = virtio_mem_bbm_unplug_bb(vm, bb_id); @@ -2154,7 +2143,7 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm, VIRTIO_MEM_BBM_BB_UNUSED); return rc; -rollback_safe_unplug: +rollback: for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) { page = pfn_to_online_page(pfn); if (!page) -- 2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 2/4] virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
Just like we do with alloc_contig_range(), let's convert all unknown errors to -EBUSY, but WARN so we can look into the issue. For example, offline_pages() could fail with -EINTR, which would be unexpected in our case. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index ed15d2a4bd96..1a76ba2bc118 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -741,11 +741,15 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm, * immediately instead of waiting. */ virtio_mem_retry(vm); - } else { - dev_dbg(&vm->vdev->dev, - "offlining and removing memory failed: %d\n", rc); + return 0; } - return rc; + dev_dbg(&vm->vdev->dev, "offlining and removing memory failed: %d\n", rc); + /* + * We don't really expect this to fail, because we fake-offlined all + * memory already. But it could fail in corner cases. + */ + WARN_ON_ONCE(rc != -ENOMEM && rc != -EBUSY); + return rc == -ENOMEM ? -ENOMEM : -EBUSY; } /* -- 2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 3/4] virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub Block Mode (SBM)
In case offline_and_remove_memory() fails in SBM, we leave a completely unplugged Linux memory block stick around until we try plugging memory again. We won't try removing that memory block again. offline_and_remove_memory() may, for example, fail if we're racing with another alloc_contig_range() user, if allocating temporary memory fails, or if some memory notifier rejected the offlining request. Let's handle that case better, by simple retrying to offline and remove such memory. Tested using CONFIG_MEMORY_NOTIFIER_ERROR_INJECT. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 92 +++++++++++++++++++++++++++++-------- 1 file changed, 73 insertions(+), 19 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 1a76ba2bc118..a5cf92e3e5af 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -168,6 +168,13 @@ struct virtio_mem { /* The number of subblocks per Linux memory block. */ uint32_t sbs_per_mb; + /* + * Some of the Linux memory blocks tracked as "partially + * plugged" are completely unplugged and can be offlined + * and removed -- which previously failed. + */ + bool have_unplugged_mb; + /* Summary of all memory block states. */ unsigned long mb_count[VIRTIO_MEM_SBM_MB_COUNT]; @@ -765,6 +772,34 @@ static int virtio_mem_sbm_offline_and_remove_mb(struct virtio_mem *vm, return virtio_mem_offline_and_remove_memory(vm, addr, size); } +/* + * Try (offlining and) removing memory from Linux in case all subblocks are + * unplugged. Can be called on online and offline memory blocks. + * + * May modify the state of memory blocks in virtio-mem. + */ +static int virtio_mem_sbm_try_remove_unplugged_mb(struct virtio_mem *vm, + unsigned long mb_id) +{ + int rc; + + /* + * Once all subblocks of a memory block were unplugged, offline and + * remove it. + */ + if (!virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) + return 0; + + /* offline_and_remove_memory() works for online and offline memory. */ + mutex_unlock(&vm->hotplug_mutex); + rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id); + mutex_lock(&vm->hotplug_mutex); + if (!rc) + virtio_mem_sbm_set_mb_state(vm, mb_id, + VIRTIO_MEM_SBM_MB_UNUSED); + return rc; +} + /* * See virtio_mem_offline_and_remove_memory(): Try to offline and remove a * all Linux memory blocks covered by the big block. @@ -1988,20 +2023,10 @@ static int virtio_mem_sbm_unplug_any_sb_online(struct virtio_mem *vm, } unplugged: - /* - * Once all subblocks of a memory block were unplugged, offline and - * remove it. This will usually not fail, as no memory is in use - * anymore - however some other notifiers might NACK the request. - */ - if (virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) { - mutex_unlock(&vm->hotplug_mutex); - rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id); - mutex_lock(&vm->hotplug_mutex); - if (!rc) - virtio_mem_sbm_set_mb_state(vm, mb_id, - VIRTIO_MEM_SBM_MB_UNUSED); - } - + rc = virtio_mem_sbm_try_remove_unplugged_mb(vm, mb_id); + if (rc) + vm->sbm.have_unplugged_mb = 1; + /* Ignore errors, this is not critical. We'll retry later. */ return 0; } @@ -2253,12 +2278,13 @@ static int virtio_mem_unplug_request(struct virtio_mem *vm, uint64_t diff) /* * Try to unplug all blocks that couldn't be unplugged before, for example, - * because the hypervisor was busy. + * because the hypervisor was busy. Further, offline and remove any memory + * blocks where we previously failed. */ -static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm) +static int virtio_mem_cleanup_pending_mb(struct virtio_mem *vm) { unsigned long id; - int rc; + int rc = 0; if (!vm->in_sbm) { virtio_mem_bbm_for_each_bb(vm, id, @@ -2280,6 +2306,27 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm) VIRTIO_MEM_SBM_MB_UNUSED); } + if (!vm->sbm.have_unplugged_mb) + return 0; + + /* + * Let's retry (offlining and) removing completely unplugged Linux + * memory blocks. + */ + vm->sbm.have_unplugged_mb = false; + + mutex_lock(&vm->hotplug_mutex); + virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL) + rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id); + virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL) + rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id); + virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL) + rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id); + mutex_unlock(&vm->hotplug_mutex); + + if (rc) + vm->sbm.have_unplugged_mb = true; + /* Ignore errors, this is not critical. We'll retry later. */ return 0; } @@ -2361,9 +2408,9 @@ static void virtio_mem_run_wq(struct work_struct *work) virtio_mem_refresh_config(vm); } - /* Unplug any leftovers from previous runs */ + /* Cleanup any leftovers from previous runs */ if (!rc) - rc = virtio_mem_unplug_pending_mb(vm); + rc = virtio_mem_cleanup_pending_mb(vm); if (!rc && vm->requested_size != vm->plugged_size) { if (vm->requested_size > vm->plugged_size) { @@ -2375,6 +2422,13 @@ static void virtio_mem_run_wq(struct work_struct *work) } } + /* + * Keep retrying to offline and remove completely unplugged Linux + * memory blocks. + */ + if (!rc && vm->in_sbm && vm->sbm.have_unplugged_mb) + rc = -EBUSY; + switch (rc) { case 0: vm->retry_timer_ms = VIRTIO_MEM_RETRY_TIMER_MIN_MS; -- 2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 4/4] virtio-mem: check if the config changed before fake offlining memory
If we repeatedly fail to fake offline memory to unplug it, we won't be sending any unplug requests to the device. However, we only check if the config changed when sending such (un)plug requests. We could end up trying for a long time to unplug memory, even though the config changed already and we're not supposed to unplug memory anymore. For example, the hypervisor might detect a low-memory situation while unplugging memory and decide to replug some memory. Continuing trying to unplug memory in that case can be problematic. So let's check on a more regular basis. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index a5cf92e3e5af..fa5226c198cc 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -1189,7 +1189,8 @@ static void virtio_mem_fake_online(unsigned long pfn, unsigned long nr_pages) * Try to allocate a range, marking pages fake-offline, effectively * fake-offlining them. */ -static int virtio_mem_fake_offline(unsigned long pfn, unsigned long nr_pages) +static int virtio_mem_fake_offline(struct virtio_mem *vm, unsigned long pfn, + unsigned long nr_pages) { const bool is_movable = is_zone_movable_page(pfn_to_page(pfn)); int rc, retry_count; @@ -1202,6 +1203,14 @@ static int virtio_mem_fake_offline(unsigned long pfn, unsigned long nr_pages) * some guarantees. */ for (retry_count = 0; retry_count < 5; retry_count++) { + /* + * If the config changed, stop immediately and go back to the + * main loop: avoid trying to keep unplugging if the device + * might have decided to not remove any more memory. + */ + if (atomic_read(&vm->config_changed)) + return -EAGAIN; + rc = alloc_contig_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE, GFP_KERNEL); if (rc == -ENOMEM) @@ -1951,7 +1960,7 @@ static int virtio_mem_sbm_unplug_sb_online(struct virtio_mem *vm, start_pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) + sb_id * vm->sbm.sb_size); - rc = virtio_mem_fake_offline(start_pfn, nr_pages); + rc = virtio_mem_fake_offline(vm, start_pfn, nr_pages); if (rc) return rc; @@ -2149,7 +2158,7 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm, if (!page) continue; - rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION); + rc = virtio_mem_fake_offline(vm, pfn, PAGES_PER_SECTION); if (rc) { end_pfn = pfn; goto rollback; -- 2.41.0
Michael S. Tsirkin
2023-Jul-13 15:03 UTC
[PATCH v1 0/4] virtio-mem: memory unplug/offlining related cleanups
On Thu, Jul 13, 2023 at 04:55:47PM +0200, David Hildenbrand wrote:> Some cleanups+optimizations primarily around offline_and_remove_memory(). > > Patch #1 drops the "unsafe unplug" feature where we might get stuck in > offline_and_remove_memory() forever. > > Patch #2 handles unexpected errors from offline_and_remove_memory() a bit > nicer. > > Patch #3 handles the case where offline_and_remove_memory() failed and > we want to retry later to remove a completely unplugged Linux memory > block, to not have them waste memory forever. > > Patch #4 something I had lying around for longer, which reacts faster > on config changes when unplugging memory. > > Cc: "Michael S. Tsirkin" <mst at redhat.com> > Cc: Jason Wang <jasowang at redhat.com> > Cc: Xuan Zhuo <xuanzhuo at linux.alibaba.com>This looks like something that's reasonable to put in this linux, right? These are fixes even though they are for theoretical issues.> David Hildenbrand (4): > virtio-mem: remove unsafe unplug in Big Block Mode (BBM) > virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY > virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub > Block Mode (SBM) > virtio-mem: check if the config changed before fake offlining memory > > drivers/virtio/virtio_mem.c | 168 ++++++++++++++++++++++++------------ > 1 file changed, 112 insertions(+), 56 deletions(-) > > > base-commit: 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c > -- > 2.41.0