David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 0/7] virtio-mem: prioritize unplug from ZONE_MOVABLE
Until now, memory provided by a single virtio-mem device was usually either onlined completely to ZONE_MOVABLE (online_movable) or to ZONE_NORMAL (online_kernel), so we didn't actually care about "what" we are unplugging; however, that will change in the future when we will have memory blocks in different zones within a single virtio-mem device. There are two reasons why we want to track to which zone a memory block belongs to and prioritize ZONE_MOVABLE blocks: 1) Memory managed by ZONE_MOVABLE can more likely get unplugged, therefore, resulting in a faster memory hotunplug process. Further, we can more reliably unplug and remove complete memory blocks, removing metadata allocated for the whole memory block. 2) We want to avoid corner cases where unplugging with the current scheme (highest to lowest address) could result in accidential zone imbalances, whereby we remove too much ZONE_NORMAL memory for ZONE_MOVABLE memory of the same device. This series unplugs ZONE_MOVABLE memory blocks first, before falling back to ZONE_NORMAL ones. Patch #1 is an unrelated fix, previously sent in other context that didn't get picked up yet. Patch #2-#4 and #6 are cleanups. Patch #5 and #7 implement ZONE_MOVABLE aware handling in Sub Block Mode and Big Block Mode respectively. Cc: "Michael S. Tsirkin" <mst at redhat.com> Cc: Jason Wang <jasowang at redhat.com> Cc: Marek Kedzierski <mkedzier at redhat.com> Cc: Hui Zhu <teawater at gmail.com> Cc: Pankaj Gupta <pankaj.gupta.linux at gmail.com> Cc: Wei Yang <richard.weiyang at linux.alibaba.com> Cc: Oscar Salvador <osalvador at suse.de> Cc: Michal Hocko <mhocko at kernel.org> Cc: virtualization at lists.linux-foundation.org Cc: linux-mm at kvack.org David Hildenbrand (7): virtio-mem: don't read big block size in Sub Block Mode virtio-mem: use page_zonenum() in virtio_mem_fake_offline() virtio-mem: simplify high-level plug handling in Sub Block Mode virtio-mem: simplify high-level unplug handling in Sub Block Mode virtio-mem: prioritize unplug from ZONE_MOVABLE in Sub Block Mode virtio-mem: simplify high-level unplug handling in Big Block Mode virtio-mem: prioritize unplug from ZONE_MOVABLE in Big Block Mode drivers/virtio/virtio_mem.c | 338 +++++++++++++++++++----------------- 1 file changed, 174 insertions(+), 164 deletions(-) base-commit: 8124c8a6b35386f73523d27eacb71b5364a68c4c -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 1/7] virtio-mem: don't read big block size in Sub Block Mode
We are reading a Big Block Mode value while in Sub Block Mode when initializing. Fortunately, vm->bbm.bb_size maps to some counter in the vm->sbm.mb_count array, which is 0 at that point in time. No harm done; still, this was unintended and is not future-proof. Fixes: 4ba50cd3355d ("virtio-mem: Big Block Mode (BBM) memory hotplug") Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 10ec60d81e84..3bf08b5bb359 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -2420,6 +2420,10 @@ static int virtio_mem_init(struct virtio_mem *vm) dev_warn(&vm->vdev->dev, "Some device memory is not addressable/pluggable. This can make some memory unusable.\n"); + /* Prepare the offline threshold - make sure we can add two blocks. */ + vm->offline_threshold = max_t(uint64_t, 2 * memory_block_size_bytes(), + VIRTIO_MEM_DEFAULT_OFFLINE_THRESHOLD); + /* * We want subblocks to span at least MAX_ORDER_NR_PAGES and * pageblock_nr_pages pages. This: @@ -2466,14 +2470,11 @@ static int virtio_mem_init(struct virtio_mem *vm) vm->bbm.bb_size - 1; vm->bbm.first_bb_id = virtio_mem_phys_to_bb_id(vm, addr); vm->bbm.next_bb_id = vm->bbm.first_bb_id; - } - /* Prepare the offline threshold - make sure we can add two blocks. */ - vm->offline_threshold = max_t(uint64_t, 2 * memory_block_size_bytes(), - VIRTIO_MEM_DEFAULT_OFFLINE_THRESHOLD); - /* In BBM, we also want at least two big blocks. */ - vm->offline_threshold = max_t(uint64_t, 2 * vm->bbm.bb_size, - vm->offline_threshold); + /* Make sure we can add two big blocks. */ + vm->offline_threshold = max_t(uint64_t, 2 * vm->bbm.bb_size, + vm->offline_threshold); + } dev_info(&vm->vdev->dev, "start address: 0x%llx", vm->addr); dev_info(&vm->vdev->dev, "region size: 0x%llx", vm->region_size); -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 2/7] virtio-mem: use page_zonenum() in virtio_mem_fake_offline()
Let's use page_zonenum() instead of zone_idx(page_zone()). Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 3bf08b5bb359..1d4b1e25ac8b 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -1135,7 +1135,7 @@ static void virtio_mem_fake_online(unsigned long pfn, unsigned long nr_pages) */ static int virtio_mem_fake_offline(unsigned long pfn, unsigned long nr_pages) { - const bool is_movable = zone_idx(page_zone(pfn_to_page(pfn))) =+ const bool is_movable = page_zonenum(pfn_to_page(pfn)) = ZONE_MOVABLE; int rc, retry_count; -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 3/7] virtio-mem: simplify high-level plug handling in Sub Block Mode
Let's simplify high-level memory block selection when plugging in Sub Block Mode. No need for two separate loops when selecting memory blocks for plugging memory. Avoid passing the "online" state by simply obtaining the state in virtio_mem_sbm_plug_any_sb(). Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 45 ++++++++++++++----------------------- 1 file changed, 17 insertions(+), 28 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 1d4b1e25ac8b..c0e6ea6244e4 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -1583,9 +1583,9 @@ static int virtio_mem_sbm_plug_and_add_mb(struct virtio_mem *vm, * Note: Can fail after some subblocks were successfully plugged. */ static int virtio_mem_sbm_plug_any_sb(struct virtio_mem *vm, - unsigned long mb_id, uint64_t *nb_sb, - bool online) + unsigned long mb_id, uint64_t *nb_sb) { + const int old_state = virtio_mem_sbm_get_mb_state(vm, mb_id); unsigned long pfn, nr_pages; int sb_id, count; int rc; @@ -1607,7 +1607,7 @@ static int virtio_mem_sbm_plug_any_sb(struct virtio_mem *vm, if (rc) return rc; *nb_sb -= count; - if (!online) + if (old_state == VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL) continue; /* fake-online the pages if the memory block is online */ @@ -1617,23 +1617,21 @@ static int virtio_mem_sbm_plug_any_sb(struct virtio_mem *vm, virtio_mem_fake_online(pfn, nr_pages); } - if (virtio_mem_sbm_test_sb_plugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) { - if (online) - virtio_mem_sbm_set_mb_state(vm, mb_id, - VIRTIO_MEM_SBM_MB_ONLINE); - else - virtio_mem_sbm_set_mb_state(vm, mb_id, - VIRTIO_MEM_SBM_MB_OFFLINE); - } + if (virtio_mem_sbm_test_sb_plugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) + virtio_mem_sbm_set_mb_state(vm, mb_id, old_state - 1); return 0; } static int virtio_mem_sbm_plug_request(struct virtio_mem *vm, uint64_t diff) { + const int mb_states[] = { + VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL, + VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL, + }; uint64_t nb_sb = diff / vm->sbm.sb_size; unsigned long mb_id; - int rc; + int rc, i; if (!nb_sb) return 0; @@ -1641,22 +1639,13 @@ static int virtio_mem_sbm_plug_request(struct virtio_mem *vm, uint64_t diff) /* Don't race with onlining/offlining */ mutex_lock(&vm->hotplug_mutex); - /* Try to plug subblocks of partially plugged online blocks. */ - virtio_mem_sbm_for_each_mb(vm, mb_id, - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL) { - rc = virtio_mem_sbm_plug_any_sb(vm, mb_id, &nb_sb, true); - if (rc || !nb_sb) - goto out_unlock; - cond_resched(); - } - - /* Try to plug subblocks of partially plugged offline blocks. */ - virtio_mem_sbm_for_each_mb(vm, mb_id, - VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL) { - rc = virtio_mem_sbm_plug_any_sb(vm, mb_id, &nb_sb, false); - if (rc || !nb_sb) - goto out_unlock; - cond_resched(); + for (i = 0; i < ARRAY_SIZE(mb_states); i++) { + virtio_mem_sbm_for_each_mb(vm, mb_id, mb_states[i]) { + rc = virtio_mem_sbm_plug_any_sb(vm, mb_id, &nb_sb); + if (rc || !nb_sb) + goto out_unlock; + cond_resched(); + } } /* -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 4/7] virtio-mem: simplify high-level unplug handling in Sub Block Mode
Let's simplify by introducing a new virtio_mem_sbm_unplug_any_sb(), similar to virtio_mem_sbm_plug_any_sb(), to simplify high-level memory block selection when unplugging in Sub Block Mode. Rename existing virtio_mem_sbm_unplug_any_sb() to virtio_mem_sbm_unplug_any_sb_raw(). The only change is that we now temporarily unlock the hotplug mutex around cond_resched() when processing offline memory blocks, which doesn't make a real difference as we already have to temporarily unlock in virtio_mem_sbm_unplug_any_sb_offline() when removing a memory block. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 103 ++++++++++++++++++++---------------- 1 file changed, 57 insertions(+), 46 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index c0e6ea6244e4..d54bb34a7ed8 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -1453,8 +1453,8 @@ static int virtio_mem_bbm_plug_bb(struct virtio_mem *vm, unsigned long bb_id) * * Note: can fail after some subblocks were unplugged. */ -static int virtio_mem_sbm_unplug_any_sb(struct virtio_mem *vm, - unsigned long mb_id, uint64_t *nb_sb) +static int virtio_mem_sbm_unplug_any_sb_raw(struct virtio_mem *vm, + unsigned long mb_id, uint64_t *nb_sb) { int sb_id, count; int rc; @@ -1496,7 +1496,7 @@ static int virtio_mem_sbm_unplug_mb(struct virtio_mem *vm, unsigned long mb_id) { uint64_t nb_sb = vm->sbm.sbs_per_mb; - return virtio_mem_sbm_unplug_any_sb(vm, mb_id, &nb_sb); + return virtio_mem_sbm_unplug_any_sb_raw(vm, mb_id, &nb_sb); } /* @@ -1806,7 +1806,7 @@ static int virtio_mem_sbm_unplug_any_sb_offline(struct virtio_mem *vm, { int rc; - rc = virtio_mem_sbm_unplug_any_sb(vm, mb_id, nb_sb); + rc = virtio_mem_sbm_unplug_any_sb_raw(vm, mb_id, nb_sb); /* some subblocks might have been unplugged even on failure */ if (!virtio_mem_sbm_test_sb_plugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) @@ -1929,11 +1929,46 @@ static int virtio_mem_sbm_unplug_any_sb_online(struct virtio_mem *vm, return 0; } +/* + * Unplug the desired number of plugged subblocks of a memory block that is + * already added to Linux. Will skip subblock of online memory blocks that are + * busy (by the OS). Will fail if any subblock that's not busy cannot get + * unplugged. + * + * Will modify the state of the memory block. Might temporarily drop the + * hotplug_mutex. + * + * Note: Can fail after some subblocks were successfully unplugged. Can + * return 0 even if subblocks were busy and could not get unplugged. + */ +static int virtio_mem_sbm_unplug_any_sb(struct virtio_mem *vm, + unsigned long mb_id, + uint64_t *nb_sb) +{ + const int old_state = virtio_mem_sbm_get_mb_state(vm, mb_id); + + switch (old_state) { + case VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL: + case VIRTIO_MEM_SBM_MB_ONLINE: + return virtio_mem_sbm_unplug_any_sb_online(vm, mb_id, nb_sb); + case VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL: + case VIRTIO_MEM_SBM_MB_OFFLINE: + return virtio_mem_sbm_unplug_any_sb_offline(vm, mb_id, nb_sb); + } + return -EINVAL; +} + static int virtio_mem_sbm_unplug_request(struct virtio_mem *vm, uint64_t diff) { + const int mb_states[] = { + VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL, + VIRTIO_MEM_SBM_MB_OFFLINE, + VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL, + VIRTIO_MEM_SBM_MB_ONLINE, + }; uint64_t nb_sb = diff / vm->sbm.sb_size; unsigned long mb_id; - int rc; + int rc, i; if (!nb_sb) return 0; @@ -1945,47 +1980,23 @@ static int virtio_mem_sbm_unplug_request(struct virtio_mem *vm, uint64_t diff) */ mutex_lock(&vm->hotplug_mutex); - /* Try to unplug subblocks of partially plugged offline blocks. */ - virtio_mem_sbm_for_each_mb_rev(vm, mb_id, - VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL) { - rc = virtio_mem_sbm_unplug_any_sb_offline(vm, mb_id, &nb_sb); - if (rc || !nb_sb) - goto out_unlock; - cond_resched(); - } - - /* Try to unplug subblocks of plugged offline blocks. */ - virtio_mem_sbm_for_each_mb_rev(vm, mb_id, VIRTIO_MEM_SBM_MB_OFFLINE) { - rc = virtio_mem_sbm_unplug_any_sb_offline(vm, mb_id, &nb_sb); - if (rc || !nb_sb) - goto out_unlock; - cond_resched(); - } - - if (!unplug_online) { - mutex_unlock(&vm->hotplug_mutex); - return 0; - } - - /* Try to unplug subblocks of partially plugged online blocks. */ - virtio_mem_sbm_for_each_mb_rev(vm, mb_id, - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL) { - rc = virtio_mem_sbm_unplug_any_sb_online(vm, mb_id, &nb_sb); - if (rc || !nb_sb) - goto out_unlock; - mutex_unlock(&vm->hotplug_mutex); - cond_resched(); - mutex_lock(&vm->hotplug_mutex); - } - - /* Try to unplug subblocks of plugged online blocks. */ - virtio_mem_sbm_for_each_mb_rev(vm, mb_id, VIRTIO_MEM_SBM_MB_ONLINE) { - rc = virtio_mem_sbm_unplug_any_sb_online(vm, mb_id, &nb_sb); - if (rc || !nb_sb) - goto out_unlock; - mutex_unlock(&vm->hotplug_mutex); - cond_resched(); - mutex_lock(&vm->hotplug_mutex); + /* + * We try unplug from partially plugged blocks first, to try removing + * whole memory blocks along with metadata. + */ + for (i = 0; i < ARRAY_SIZE(mb_states); i++) { + virtio_mem_sbm_for_each_mb_rev(vm, mb_id, mb_states[i]) { + rc = virtio_mem_sbm_unplug_any_sb(vm, mb_id, &nb_sb); + if (rc || !nb_sb) + goto out_unlock; + mutex_unlock(&vm->hotplug_mutex); + cond_resched(); + mutex_lock(&vm->hotplug_mutex); + } + if (!unplug_online && i == 1) { + mutex_unlock(&vm->hotplug_mutex); + return 0; + } } mutex_unlock(&vm->hotplug_mutex); -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 5/7] virtio-mem: prioritize unplug from ZONE_MOVABLE in Sub Block Mode
Until now, memory provided by a single virtio-mem device was usually either onlined completely to ZONE_MOVABLE (online_movable) or to ZONE_NORMAL (online_kernel); however, that will change in the future. There are two reasons why we want to track to which zone a memory blocks belongs to and prioritize ZONE_MOVABLE blocks: 1) Memory managed by ZONE_MOVABLE can more likely get unplugged, therefore, resulting in a faster memory hotunplug process. Further, we can more reliably unplug and remove complete memory blocks, removing metadata allocated for the whole memory block. 2) We want to avoid corner cases where unplugging with the current scheme (highest to lowest address) could result in accidential zone imbalances, whereby we remove too much ZONE_NORMAL memory for ZONE_MOVABLE memory of the same device. Let's track the zone via memory block states and try unplug from ZONE_MOVABLE first. Rename VIRTIO_MEM_SBM_MB_ONLINE* to VIRTIO_MEM_SBM_MB_KERNEL* to avoid even longer state names. In commit 27f852795a06 ("virtio-mem: don't special-case ZONE_MOVABLE"), we removed slightly similar tracking for fully plugged memory blocks to support unplugging from ZONE_MOVABLE at all -- as we didn't allow partially plugged memory blocks in ZONE_MOVABLE before that. That commit already mentioned "In the future, we might want to remember the zone again and use the information when (un)plugging memory." Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 72 ++++++++++++++++++++++++++----------- 1 file changed, 52 insertions(+), 20 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index d54bb34a7ed8..156a79ceb9fc 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -75,10 +75,14 @@ enum virtio_mem_sbm_mb_state { VIRTIO_MEM_SBM_MB_OFFLINE, /* Partially plugged, fully added to Linux, offline. */ VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL, - /* Fully plugged, fully added to Linux, online. */ - VIRTIO_MEM_SBM_MB_ONLINE, - /* Partially plugged, fully added to Linux, online. */ - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL, + /* Fully plugged, fully added to Linux, onlined to a kernel zone. */ + VIRTIO_MEM_SBM_MB_KERNEL, + /* Partially plugged, fully added to Linux, online to a kernel zone */ + VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL, + /* Fully plugged, fully added to Linux, onlined to ZONE_MOVABLE. */ + VIRTIO_MEM_SBM_MB_MOVABLE, + /* Partially plugged, fully added to Linux, onlined to ZONE_MOVABLE. */ + VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL, VIRTIO_MEM_SBM_MB_COUNT }; @@ -832,11 +836,13 @@ static void virtio_mem_sbm_notify_offline(struct virtio_mem *vm, unsigned long mb_id) { switch (virtio_mem_sbm_get_mb_state(vm, mb_id)) { - case VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL: + case VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL: + case VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL: virtio_mem_sbm_set_mb_state(vm, mb_id, VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL); break; - case VIRTIO_MEM_SBM_MB_ONLINE: + case VIRTIO_MEM_SBM_MB_KERNEL: + case VIRTIO_MEM_SBM_MB_MOVABLE: virtio_mem_sbm_set_mb_state(vm, mb_id, VIRTIO_MEM_SBM_MB_OFFLINE); break; @@ -847,21 +853,29 @@ static void virtio_mem_sbm_notify_offline(struct virtio_mem *vm, } static void virtio_mem_sbm_notify_online(struct virtio_mem *vm, - unsigned long mb_id) + unsigned long mb_id, + unsigned long start_pfn) { + const bool is_movable = page_zonenum(pfn_to_page(start_pfn)) =+ ZONE_MOVABLE; + int new_state; + switch (virtio_mem_sbm_get_mb_state(vm, mb_id)) { case VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL: - virtio_mem_sbm_set_mb_state(vm, mb_id, - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL); + new_state = VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL; + if (is_movable) + new_state = VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL; break; case VIRTIO_MEM_SBM_MB_OFFLINE: - virtio_mem_sbm_set_mb_state(vm, mb_id, - VIRTIO_MEM_SBM_MB_ONLINE); + new_state = VIRTIO_MEM_SBM_MB_KERNEL; + if (is_movable) + new_state = VIRTIO_MEM_SBM_MB_MOVABLE; break; default: BUG(); break; } + virtio_mem_sbm_set_mb_state(vm, mb_id, new_state); } static void virtio_mem_sbm_notify_going_offline(struct virtio_mem *vm, @@ -1015,7 +1029,7 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb, break; case MEM_ONLINE: if (vm->in_sbm) - virtio_mem_sbm_notify_online(vm, id); + virtio_mem_sbm_notify_online(vm, id, mhp->start_pfn); atomic64_sub(size, &vm->offline_size); /* @@ -1626,7 +1640,8 @@ static int virtio_mem_sbm_plug_any_sb(struct virtio_mem *vm, static int virtio_mem_sbm_plug_request(struct virtio_mem *vm, uint64_t diff) { const int mb_states[] = { - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL, + VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL, + VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL, VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL, }; uint64_t nb_sb = diff / vm->sbm.sb_size; @@ -1843,6 +1858,7 @@ static int virtio_mem_sbm_unplug_sb_online(struct virtio_mem *vm, int count) { const unsigned long nr_pages = PFN_DOWN(vm->sbm.sb_size) * count; + const int old_state = virtio_mem_sbm_get_mb_state(vm, mb_id); unsigned long start_pfn; int rc; @@ -1861,8 +1877,17 @@ static int virtio_mem_sbm_unplug_sb_online(struct virtio_mem *vm, return rc; } - virtio_mem_sbm_set_mb_state(vm, mb_id, - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL); + switch (old_state) { + case VIRTIO_MEM_SBM_MB_KERNEL: + virtio_mem_sbm_set_mb_state(vm, mb_id, + VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL); + break; + case VIRTIO_MEM_SBM_MB_MOVABLE: + virtio_mem_sbm_set_mb_state(vm, mb_id, + VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL); + break; + } + return 0; } @@ -1948,8 +1973,10 @@ static int virtio_mem_sbm_unplug_any_sb(struct virtio_mem *vm, const int old_state = virtio_mem_sbm_get_mb_state(vm, mb_id); switch (old_state) { - case VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL: - case VIRTIO_MEM_SBM_MB_ONLINE: + case VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL: + case VIRTIO_MEM_SBM_MB_KERNEL: + case VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL: + case VIRTIO_MEM_SBM_MB_MOVABLE: return virtio_mem_sbm_unplug_any_sb_online(vm, mb_id, nb_sb); case VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL: case VIRTIO_MEM_SBM_MB_OFFLINE: @@ -1963,8 +1990,10 @@ static int virtio_mem_sbm_unplug_request(struct virtio_mem *vm, uint64_t diff) const int mb_states[] = { VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL, VIRTIO_MEM_SBM_MB_OFFLINE, - VIRTIO_MEM_SBM_MB_ONLINE_PARTIAL, - VIRTIO_MEM_SBM_MB_ONLINE, + VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL, + VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL, + VIRTIO_MEM_SBM_MB_MOVABLE, + VIRTIO_MEM_SBM_MB_KERNEL, }; uint64_t nb_sb = diff / vm->sbm.sb_size; unsigned long mb_id; @@ -1982,7 +2011,10 @@ static int virtio_mem_sbm_unplug_request(struct virtio_mem *vm, uint64_t diff) /* * We try unplug from partially plugged blocks first, to try removing - * whole memory blocks along with metadata. + * whole memory blocks along with metadata. We prioritize ZONE_MOVABLE + * as it's more reliable to unplug memory and remove whole memory + * blocks, and we don't want to trigger a zone imbalances by + * accidentially removing too much kernel memory. */ for (i = 0; i < ARRAY_SIZE(mb_states); i++) { virtio_mem_sbm_for_each_mb_rev(vm, mb_id, mb_states[i]) { -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 6/7] virtio-mem: simplify high-level unplug handling in Big Block Mode
Let's simplify high-level big block selection when unplugging in Big Block Mode. Combine handling of offline and online blocks. We can get rid of virtio_mem_bbm_bb_is_offline() and simply use virtio_mem_bbm_offline_remove_and_unplug_bb(), as that already tolerates offline parts. We can race with concurrent onlining/offlining either way, so we don;t have to be super correct by failing if an offline big block we'd like to unplug just got (partially) onlined. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 96 ++++++++++--------------------------- 1 file changed, 24 insertions(+), 72 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 156a79ceb9fc..43199389c414 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -702,18 +702,6 @@ static int virtio_mem_sbm_remove_mb(struct virtio_mem *vm, unsigned long mb_id) return virtio_mem_remove_memory(vm, addr, size); } -/* - * See virtio_mem_remove_memory(): Try to remove all Linux memory blocks covered - * by the big block. - */ -static int virtio_mem_bbm_remove_bb(struct virtio_mem *vm, unsigned long bb_id) -{ - const uint64_t addr = virtio_mem_bb_id_to_phys(vm, bb_id); - const uint64_t size = vm->bbm.bb_size; - - return virtio_mem_remove_memory(vm, addr, size); -} - /* * Try offlining and removing memory from Linux. * @@ -2114,35 +2102,6 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm, return rc; } -/* - * Try to remove a big block from Linux and unplug it. Will fail with - * -EBUSY if some memory is online. - * - * Will modify the state of the memory block. - */ -static int virtio_mem_bbm_remove_and_unplug_bb(struct virtio_mem *vm, - unsigned long bb_id) -{ - int rc; - - if (WARN_ON_ONCE(virtio_mem_bbm_get_bb_state(vm, bb_id) !- VIRTIO_MEM_BBM_BB_ADDED)) - return -EINVAL; - - rc = virtio_mem_bbm_remove_bb(vm, bb_id); - if (rc) - return -EBUSY; - - rc = virtio_mem_bbm_unplug_bb(vm, bb_id); - if (rc) - virtio_mem_bbm_set_bb_state(vm, bb_id, - VIRTIO_MEM_BBM_BB_PLUGGED); - else - virtio_mem_bbm_set_bb_state(vm, bb_id, - VIRTIO_MEM_BBM_BB_UNUSED); - return rc; -} - /* * Test if a big block is completely offline. */ @@ -2166,42 +2125,35 @@ static int virtio_mem_bbm_unplug_request(struct virtio_mem *vm, uint64_t diff) { uint64_t nb_bb = diff / vm->bbm.bb_size; uint64_t bb_id; - int rc; + int rc, i; if (!nb_bb) return 0; - /* Try to unplug completely offline big blocks first. */ - virtio_mem_bbm_for_each_bb_rev(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED) { - cond_resched(); - /* - * As we're holding no locks, this check is racy as memory - * can get onlined in the meantime - but we'll fail gracefully. - */ - if (!virtio_mem_bbm_bb_is_offline(vm, bb_id)) - continue; - rc = virtio_mem_bbm_remove_and_unplug_bb(vm, bb_id); - if (rc == -EBUSY) - continue; - if (!rc) - nb_bb--; - if (rc || !nb_bb) - return rc; - } - - if (!unplug_online) - return 0; + /* + * Try to unplug big blocks. Similar to SBM, start with offline + * big blocks. + */ + for (i = 0; i < 2; i++) { + virtio_mem_bbm_for_each_bb_rev(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED) { + cond_resched(); - /* Try to unplug any big blocks. */ - virtio_mem_bbm_for_each_bb_rev(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED) { - cond_resched(); - rc = virtio_mem_bbm_offline_remove_and_unplug_bb(vm, bb_id); - if (rc == -EBUSY) - continue; - if (!rc) - nb_bb--; - if (rc || !nb_bb) - return rc; + /* + * As we're holding no locks, these checks are racy, + * but we don't care. + */ + if (i == 0 && !virtio_mem_bbm_bb_is_offline(vm, bb_id)) + continue; + rc = virtio_mem_bbm_offline_remove_and_unplug_bb(vm, bb_id); + if (rc == -EBUSY) + continue; + if (!rc) + nb_bb--; + if (rc || !nb_bb) + return rc; + } + if (i == 0 && !unplug_online) + return 0; } return nb_bb ? -EBUSY : 0; -- 2.31.1
David Hildenbrand
2021-Jun-02 18:57 UTC
[PATCH v1 7/7] virtio-mem: prioritize unplug from ZONE_MOVABLE in Big Block Mode
Let's handle unplug in Big Block Mode similar to Sub Block Mode -- prioritize memory blocks onlined to ZONE_MOVABLE. We won't care further about big blocks with mixed zones, as it's rather a corner case that won't matter in practice. Signed-off-by: David Hildenbrand <david at redhat.com> --- drivers/virtio/virtio_mem.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 43199389c414..d3e874b6b50b 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -2121,6 +2121,29 @@ static bool virtio_mem_bbm_bb_is_offline(struct virtio_mem *vm, return true; } +/* + * Test if a big block is completely onlined to ZONE_MOVABLE (or offline). + */ +static bool virtio_mem_bbm_bb_is_movable(struct virtio_mem *vm, + unsigned long bb_id) +{ + const unsigned long start_pfn = PFN_DOWN(virtio_mem_bb_id_to_phys(vm, bb_id)); + const unsigned long nr_pages = PFN_DOWN(vm->bbm.bb_size); + struct page *page; + unsigned long pfn; + + for (pfn = start_pfn; pfn < start_pfn + nr_pages; + pfn += PAGES_PER_SECTION) { + page = pfn_to_online_page(pfn); + if (!page) + continue; + if (page_zonenum(page) != ZONE_MOVABLE) + return false; + } + + return true; +} + static int virtio_mem_bbm_unplug_request(struct virtio_mem *vm, uint64_t diff) { uint64_t nb_bb = diff / vm->bbm.bb_size; @@ -2134,7 +2157,7 @@ static int virtio_mem_bbm_unplug_request(struct virtio_mem *vm, uint64_t diff) * Try to unplug big blocks. Similar to SBM, start with offline * big blocks. */ - for (i = 0; i < 2; i++) { + for (i = 0; i < 3; i++) { virtio_mem_bbm_for_each_bb_rev(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED) { cond_resched(); @@ -2144,6 +2167,8 @@ static int virtio_mem_bbm_unplug_request(struct virtio_mem *vm, uint64_t diff) */ if (i == 0 && !virtio_mem_bbm_bb_is_offline(vm, bb_id)) continue; + if (i == 1 && !virtio_mem_bbm_bb_is_movable(vm, bb_id)) + continue; rc = virtio_mem_bbm_offline_remove_and_unplug_bb(vm, bb_id); if (rc == -EBUSY) continue; -- 2.31.1