On 5/25/22 03:35, Sean Christopherson wrote:> On Fri, May 20, 2022, zhenwei pi wrote: >> @@ -59,6 +60,12 @@ enum virtio_balloon_config_read { >> VIRTIO_BALLOON_CONFIG_READ_CMD_ID = 0, >> }; >> >> +/* the request body to commucate with host side */ >> +struct __virtio_balloon_recover { >> + struct virtio_balloon_recover vbr; >> + __virtio32 pfns[VIRTIO_BALLOON_PAGES_PER_PAGE]; > > I assume this is copied from virtio_balloon.pfns, which also uses __virtio32, but > isn't that horribly broken? PFNs are 'unsigned long', i.e. 64 bits on 64-bit kernels. > x86-64 at least most definitely generates 64-bit PFNs. Unless there's magic I'm > missing, page_to_balloon_pfn() will truncate PFNs and feed the host bad info. >Yes, I also noticed this point, I suppose the balloon device can not work on a virtual machine which has physical address larger than 16T. I still let the recover VQ keep aligned with the inflate VQ and deflate VQ. I prefer the recover VQ to be workable/unworkable with inflate/deflate VQ together. So I leave this to the virtio balloon maintainer to decide ...>> @@ -494,6 +511,198 @@ static void update_balloon_size_func(struct work_struct *work) >> queue_work(system_freezable_wq, work); >> } >> >> +/* >> + * virtballoon_memory_failure - notified by memory failure, try to fix the >> + * corrupted page. >> + * The memory failure notifier is designed to call back when the kernel handled >> + * successfully only, WARN_ON_ONCE on the unlikely condition to find out any >> + * error(memory error handling is a best effort, not 100% coverd). >> + */ >> +static int virtballoon_memory_failure(struct notifier_block *notifier, >> + unsigned long pfn, void *parm) >> +{ >> + struct virtio_balloon *vb = container_of(notifier, struct virtio_balloon, >> + memory_failure_nb); >> + struct page *page; >> + struct __virtio_balloon_recover *out_vbr; >> + struct scatterlist sg; >> + unsigned long flags; >> + int err; >> + >> + page = pfn_to_online_page(pfn); >> + if (WARN_ON_ONCE(!page)) >> + return NOTIFY_DONE; >> + >> + if (PageHuge(page)) >> + return NOTIFY_DONE; >> + >> + if (WARN_ON_ONCE(!PageHWPoison(page))) >> + return NOTIFY_DONE; >> + >> + if (WARN_ON_ONCE(page_count(page) != 1)) >> + return NOTIFY_DONE; >> + >> + get_page(page); /* balloon reference */ >> + >> + out_vbr = kzalloc(sizeof(*out_vbr), GFP_KERNEL); >> + if (WARN_ON_ONCE(!out_vbr)) >> + return NOTIFY_BAD; > > Not that it truly matters, but won't failure at this point leak the poisoned page?I'll fix this, thanks! -- zhenwei pi
David Hildenbrand
2022-May-30 07:53 UTC
[PATCH 3/3] virtio_balloon: Introduce memory recover
On 25.05.22 01:32, zhenwei pi wrote:> > > On 5/25/22 03:35, Sean Christopherson wrote: >> On Fri, May 20, 2022, zhenwei pi wrote: >>> @@ -59,6 +60,12 @@ enum virtio_balloon_config_read { >>> VIRTIO_BALLOON_CONFIG_READ_CMD_ID = 0, >>> }; >>> >>> +/* the request body to commucate with host side */ >>> +struct __virtio_balloon_recover { >>> + struct virtio_balloon_recover vbr; >>> + __virtio32 pfns[VIRTIO_BALLOON_PAGES_PER_PAGE]; >> >> I assume this is copied from virtio_balloon.pfns, which also uses __virtio32, but >> isn't that horribly broken? PFNs are 'unsigned long', i.e. 64 bits on 64-bit kernels. >> x86-64 at least most definitely generates 64-bit PFNs. Unless there's magic I'm >> missing, page_to_balloon_pfn() will truncate PFNs and feed the host bad info. >> > > Yes, I also noticed this point, I suppose the balloon device can not > work on a virtual machine which has physical address larger than 16T.Yes, that's a historical artifact and we never ran into it in practice -- because 16TB VMs are still rare, especially when paired with virtio-balloon inflation/deflation. Most probably the guest should just stop inflating when hitting such a big PFN. In the future, we might want a proper sg interface instead. -- Thanks, David / dhildenb