David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 0/4] virtio-mem: memory unplug/offlining related cleanups
Some cleanups+optimizations primarily around offline_and_remove_memory().
Patch #1 drops the "unsafe unplug" feature where we might get stuck in
offline_and_remove_memory() forever.
Patch #2 handles unexpected errors from offline_and_remove_memory() a bit
nicer.
Patch #3 handles the case where offline_and_remove_memory() failed and
we want to retry later to remove a completely unplugged Linux memory
block, to not have them waste memory forever.
Patch #4 something I had lying around for longer, which reacts faster
on config changes when unplugging memory.
Cc: "Michael S. Tsirkin" <mst at redhat.com>
Cc: Jason Wang <jasowang at redhat.com>
Cc: Xuan Zhuo <xuanzhuo at linux.alibaba.com>
David Hildenbrand (4):
virtio-mem: remove unsafe unplug in Big Block Mode (BBM)
virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub
Block Mode (SBM)
virtio-mem: check if the config changed before fake offlining memory
drivers/virtio/virtio_mem.c | 168 ++++++++++++++++++++++++------------
1 file changed, 112 insertions(+), 56 deletions(-)
base-commit: 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c
--
2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 1/4] virtio-mem: remove unsafe unplug in Big Block Mode (BBM)
When "unsafe unplug" is enabled, we don't fake-offline all memory
ahead of
actual memory offlining using alloc_contig_range(). Instead, we rely on
offline_pages() to also perform actual page migration, which might fail
or take a very long time.
In that case, it's possible to easily run into endless loops that cannot be
aborted anymore (as offlining is triggered by a workqueue then): For
example, a single (accidentally) permanently unmovable page in
ZONE_MOVABLE results in an endless loop. For ZONE_NORMAL, races between
isolating the pageblock (and checking for unmovable pages) and
concurrent page allocation are possible and similarly result in endless
loops.
The idea of the unsafe unplug mode was to make it possible to more
reliably unplug large memory blocks. However, (a) we really should be
tackling that differently, by extending the alloc_contig_range()-based
mechanism; and (b) this mode is not the default and as far as I know,
it's unused either way.
So let's simply get rid of it.
Signed-off-by: David Hildenbrand <david at redhat.com>
---
drivers/virtio/virtio_mem.c | 51 +++++++++++++++----------------------
1 file changed, 20 insertions(+), 31 deletions(-)
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 835f6cc2fb66..ed15d2a4bd96 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -38,11 +38,6 @@ module_param(bbm_block_size, ulong, 0444);
MODULE_PARM_DESC(bbm_block_size,
"Big Block size in bytes. Default is 0 (auto-detection).");
-static bool bbm_safe_unplug = true;
-module_param(bbm_safe_unplug, bool, 0444);
-MODULE_PARM_DESC(bbm_safe_unplug,
- "Use a safe unplug mechanism in BBM, avoiding long/endless
loops");
-
/*
* virtio-mem currently supports the following modes of operation:
*
@@ -2111,38 +2106,32 @@ static int
virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
VIRTIO_MEM_BBM_BB_ADDED))
return -EINVAL;
- if (bbm_safe_unplug) {
- /*
- * Start by fake-offlining all memory. Once we marked the device
- * block as fake-offline, all newly onlined memory will
- * automatically be kept fake-offline. Protect from concurrent
- * onlining/offlining until we have a consistent state.
- */
- mutex_lock(&vm->hotplug_mutex);
- virtio_mem_bbm_set_bb_state(vm, bb_id,
- VIRTIO_MEM_BBM_BB_FAKE_OFFLINE);
+ /*
+ * Start by fake-offlining all memory. Once we marked the device
+ * block as fake-offline, all newly onlined memory will
+ * automatically be kept fake-offline. Protect from concurrent
+ * onlining/offlining until we have a consistent state.
+ */
+ mutex_lock(&vm->hotplug_mutex);
+ virtio_mem_bbm_set_bb_state(vm, bb_id, VIRTIO_MEM_BBM_BB_FAKE_OFFLINE);
- for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
- page = pfn_to_online_page(pfn);
- if (!page)
- continue;
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ page = pfn_to_online_page(pfn);
+ if (!page)
+ continue;
- rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION);
- if (rc) {
- end_pfn = pfn;
- goto rollback_safe_unplug;
- }
+ rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION);
+ if (rc) {
+ end_pfn = pfn;
+ goto rollback;
}
- mutex_unlock(&vm->hotplug_mutex);
}
+ mutex_unlock(&vm->hotplug_mutex);
rc = virtio_mem_bbm_offline_and_remove_bb(vm, bb_id);
if (rc) {
- if (bbm_safe_unplug) {
- mutex_lock(&vm->hotplug_mutex);
- goto rollback_safe_unplug;
- }
- return rc;
+ mutex_lock(&vm->hotplug_mutex);
+ goto rollback;
}
rc = virtio_mem_bbm_unplug_bb(vm, bb_id);
@@ -2154,7 +2143,7 @@ static int
virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
VIRTIO_MEM_BBM_BB_UNUSED);
return rc;
-rollback_safe_unplug:
+rollback:
for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
page = pfn_to_online_page(pfn);
if (!page)
--
2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 2/4] virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
Just like we do with alloc_contig_range(), let's convert all unknown
errors to -EBUSY, but WARN so we can look into the issue. For example,
offline_pages() could fail with -EINTR, which would be unexpected in our
case.
Signed-off-by: David Hildenbrand <david at redhat.com>
---
drivers/virtio/virtio_mem.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index ed15d2a4bd96..1a76ba2bc118 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -741,11 +741,15 @@ static int virtio_mem_offline_and_remove_memory(struct
virtio_mem *vm,
* immediately instead of waiting.
*/
virtio_mem_retry(vm);
- } else {
- dev_dbg(&vm->vdev->dev,
- "offlining and removing memory failed: %d\n", rc);
+ return 0;
}
- return rc;
+ dev_dbg(&vm->vdev->dev, "offlining and removing memory failed:
%d\n", rc);
+ /*
+ * We don't really expect this to fail, because we fake-offlined all
+ * memory already. But it could fail in corner cases.
+ */
+ WARN_ON_ONCE(rc != -ENOMEM && rc != -EBUSY);
+ return rc == -ENOMEM ? -ENOMEM : -EBUSY;
}
/*
--
2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 3/4] virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub Block Mode (SBM)
In case offline_and_remove_memory() fails in SBM, we leave a completely
unplugged Linux memory block stick around until we try plugging memory
again. We won't try removing that memory block again.
offline_and_remove_memory() may, for example, fail if we're racing with
another alloc_contig_range() user, if allocating temporary memory fails,
or if some memory notifier rejected the offlining request.
Let's handle that case better, by simple retrying to offline and remove
such memory.
Tested using CONFIG_MEMORY_NOTIFIER_ERROR_INJECT.
Signed-off-by: David Hildenbrand <david at redhat.com>
---
drivers/virtio/virtio_mem.c | 92 +++++++++++++++++++++++++++++--------
1 file changed, 73 insertions(+), 19 deletions(-)
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 1a76ba2bc118..a5cf92e3e5af 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -168,6 +168,13 @@ struct virtio_mem {
/* The number of subblocks per Linux memory block. */
uint32_t sbs_per_mb;
+ /*
+ * Some of the Linux memory blocks tracked as "partially
+ * plugged" are completely unplugged and can be offlined
+ * and removed -- which previously failed.
+ */
+ bool have_unplugged_mb;
+
/* Summary of all memory block states. */
unsigned long mb_count[VIRTIO_MEM_SBM_MB_COUNT];
@@ -765,6 +772,34 @@ static int virtio_mem_sbm_offline_and_remove_mb(struct
virtio_mem *vm,
return virtio_mem_offline_and_remove_memory(vm, addr, size);
}
+/*
+ * Try (offlining and) removing memory from Linux in case all subblocks are
+ * unplugged. Can be called on online and offline memory blocks.
+ *
+ * May modify the state of memory blocks in virtio-mem.
+ */
+static int virtio_mem_sbm_try_remove_unplugged_mb(struct virtio_mem *vm,
+ unsigned long mb_id)
+{
+ int rc;
+
+ /*
+ * Once all subblocks of a memory block were unplugged, offline and
+ * remove it.
+ */
+ if (!virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb))
+ return 0;
+
+ /* offline_and_remove_memory() works for online and offline memory. */
+ mutex_unlock(&vm->hotplug_mutex);
+ rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id);
+ mutex_lock(&vm->hotplug_mutex);
+ if (!rc)
+ virtio_mem_sbm_set_mb_state(vm, mb_id,
+ VIRTIO_MEM_SBM_MB_UNUSED);
+ return rc;
+}
+
/*
* See virtio_mem_offline_and_remove_memory(): Try to offline and remove a
* all Linux memory blocks covered by the big block.
@@ -1988,20 +2023,10 @@ static int virtio_mem_sbm_unplug_any_sb_online(struct
virtio_mem *vm,
}
unplugged:
- /*
- * Once all subblocks of a memory block were unplugged, offline and
- * remove it. This will usually not fail, as no memory is in use
- * anymore - however some other notifiers might NACK the request.
- */
- if (virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) {
- mutex_unlock(&vm->hotplug_mutex);
- rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id);
- mutex_lock(&vm->hotplug_mutex);
- if (!rc)
- virtio_mem_sbm_set_mb_state(vm, mb_id,
- VIRTIO_MEM_SBM_MB_UNUSED);
- }
-
+ rc = virtio_mem_sbm_try_remove_unplugged_mb(vm, mb_id);
+ if (rc)
+ vm->sbm.have_unplugged_mb = 1;
+ /* Ignore errors, this is not critical. We'll retry later. */
return 0;
}
@@ -2253,12 +2278,13 @@ static int virtio_mem_unplug_request(struct virtio_mem
*vm, uint64_t diff)
/*
* Try to unplug all blocks that couldn't be unplugged before, for example,
- * because the hypervisor was busy.
+ * because the hypervisor was busy. Further, offline and remove any memory
+ * blocks where we previously failed.
*/
-static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
+static int virtio_mem_cleanup_pending_mb(struct virtio_mem *vm)
{
unsigned long id;
- int rc;
+ int rc = 0;
if (!vm->in_sbm) {
virtio_mem_bbm_for_each_bb(vm, id,
@@ -2280,6 +2306,27 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem
*vm)
VIRTIO_MEM_SBM_MB_UNUSED);
}
+ if (!vm->sbm.have_unplugged_mb)
+ return 0;
+
+ /*
+ * Let's retry (offlining and) removing completely unplugged Linux
+ * memory blocks.
+ */
+ vm->sbm.have_unplugged_mb = false;
+
+ mutex_lock(&vm->hotplug_mutex);
+ virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL)
+ rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
+ virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL)
+ rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
+ virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL)
+ rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
+ mutex_unlock(&vm->hotplug_mutex);
+
+ if (rc)
+ vm->sbm.have_unplugged_mb = true;
+ /* Ignore errors, this is not critical. We'll retry later. */
return 0;
}
@@ -2361,9 +2408,9 @@ static void virtio_mem_run_wq(struct work_struct *work)
virtio_mem_refresh_config(vm);
}
- /* Unplug any leftovers from previous runs */
+ /* Cleanup any leftovers from previous runs */
if (!rc)
- rc = virtio_mem_unplug_pending_mb(vm);
+ rc = virtio_mem_cleanup_pending_mb(vm);
if (!rc && vm->requested_size != vm->plugged_size) {
if (vm->requested_size > vm->plugged_size) {
@@ -2375,6 +2422,13 @@ static void virtio_mem_run_wq(struct work_struct *work)
}
}
+ /*
+ * Keep retrying to offline and remove completely unplugged Linux
+ * memory blocks.
+ */
+ if (!rc && vm->in_sbm && vm->sbm.have_unplugged_mb)
+ rc = -EBUSY;
+
switch (rc) {
case 0:
vm->retry_timer_ms = VIRTIO_MEM_RETRY_TIMER_MIN_MS;
--
2.41.0
David Hildenbrand
2023-Jul-13 14:55 UTC
[PATCH v1 4/4] virtio-mem: check if the config changed before fake offlining memory
If we repeatedly fail to fake offline memory to unplug it, we won't be
sending any unplug requests to the device. However, we only check if the
config changed when sending such (un)plug requests.
We could end up trying for a long time to unplug memory, even though
the config changed already and we're not supposed to unplug memory
anymore. For example, the hypervisor might detect a low-memory situation
while unplugging memory and decide to replug some memory. Continuing
trying to unplug memory in that case can be problematic.
So let's check on a more regular basis.
Signed-off-by: David Hildenbrand <david at redhat.com>
---
drivers/virtio/virtio_mem.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index a5cf92e3e5af..fa5226c198cc 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1189,7 +1189,8 @@ static void virtio_mem_fake_online(unsigned long pfn,
unsigned long nr_pages)
* Try to allocate a range, marking pages fake-offline, effectively
* fake-offlining them.
*/
-static int virtio_mem_fake_offline(unsigned long pfn, unsigned long nr_pages)
+static int virtio_mem_fake_offline(struct virtio_mem *vm, unsigned long pfn,
+ unsigned long nr_pages)
{
const bool is_movable = is_zone_movable_page(pfn_to_page(pfn));
int rc, retry_count;
@@ -1202,6 +1203,14 @@ static int virtio_mem_fake_offline(unsigned long pfn,
unsigned long nr_pages)
* some guarantees.
*/
for (retry_count = 0; retry_count < 5; retry_count++) {
+ /*
+ * If the config changed, stop immediately and go back to the
+ * main loop: avoid trying to keep unplugging if the device
+ * might have decided to not remove any more memory.
+ */
+ if (atomic_read(&vm->config_changed))
+ return -EAGAIN;
+
rc = alloc_contig_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE,
GFP_KERNEL);
if (rc == -ENOMEM)
@@ -1951,7 +1960,7 @@ static int virtio_mem_sbm_unplug_sb_online(struct
virtio_mem *vm,
start_pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
sb_id * vm->sbm.sb_size);
- rc = virtio_mem_fake_offline(start_pfn, nr_pages);
+ rc = virtio_mem_fake_offline(vm, start_pfn, nr_pages);
if (rc)
return rc;
@@ -2149,7 +2158,7 @@ static int
virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
if (!page)
continue;
- rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION);
+ rc = virtio_mem_fake_offline(vm, pfn, PAGES_PER_SECTION);
if (rc) {
end_pfn = pfn;
goto rollback;
--
2.41.0
Michael S. Tsirkin
2023-Jul-13 15:03 UTC
[PATCH v1 0/4] virtio-mem: memory unplug/offlining related cleanups
On Thu, Jul 13, 2023 at 04:55:47PM +0200, David Hildenbrand wrote:> Some cleanups+optimizations primarily around offline_and_remove_memory(). > > Patch #1 drops the "unsafe unplug" feature where we might get stuck in > offline_and_remove_memory() forever. > > Patch #2 handles unexpected errors from offline_and_remove_memory() a bit > nicer. > > Patch #3 handles the case where offline_and_remove_memory() failed and > we want to retry later to remove a completely unplugged Linux memory > block, to not have them waste memory forever. > > Patch #4 something I had lying around for longer, which reacts faster > on config changes when unplugging memory. > > Cc: "Michael S. Tsirkin" <mst at redhat.com> > Cc: Jason Wang <jasowang at redhat.com> > Cc: Xuan Zhuo <xuanzhuo at linux.alibaba.com>This looks like something that's reasonable to put in this linux, right? These are fixes even though they are for theoretical issues.> David Hildenbrand (4): > virtio-mem: remove unsafe unplug in Big Block Mode (BBM) > virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY > virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub > Block Mode (SBM) > virtio-mem: check if the config changed before fake offlining memory > > drivers/virtio/virtio_mem.c | 168 ++++++++++++++++++++++++------------ > 1 file changed, 112 insertions(+), 56 deletions(-) > > > base-commit: 3f01e9fed8454dcd89727016c3e5b2fbb8f8e50c > -- > 2.41.0