Dave Hansen
2016-Jul-27 16:40 UTC
[PATCH v2 repost 6/7] mm: add the related functions to get free page info
On 07/26/2016 06:23 PM, Liang Li wrote:
> +    for_each_migratetype_order(order, t) {
> +        list_for_each(curr, &zone->free_area[order].free_list[t]) {
> +            pfn = page_to_pfn(list_entry(curr, struct page, lru));
> +            if (pfn >= start_pfn && pfn <= end_pfn) {
> +                page_num = 1UL << order;
> +                if (pfn + page_num > end_pfn)
> +                    page_num = end_pfn - pfn;
> +                bitmap_set(bitmap, pfn - start_pfn, page_num);
> +            }
> +        }
> +    }

Nit: The 'page_num' nomenclature really confused me here. It is the number of bits being set in the bitmap. Seems like calling it nr_pages or num_pages would be more appropriate.

Isn't this bitmap out of date by the time it's sent up to the hypervisor? Is there something that makes the inaccuracy OK here?
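To make the naming nit concrete, here is a minimal user-space sketch of the inner step of the quoted loop with the count renamed to nr_pages. It is not the patch's code: mark_free_block(), sketch_bitmap_set() and sketch_test_bit() are hypothetical stand-ins written for this illustration, and the sketch assumes end_pfn is exclusive, which is how the clamp in the quoted code treats it (the quoted range check itself uses `<=`).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical user-space stand-in for the kernel's bitmap_set(). */
static void sketch_bitmap_set(unsigned long *map, unsigned long start,
                              unsigned long nbits)
{
    for (unsigned long i = start; i < start + nbits; i++)
        map[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* Hypothetical helper: returns 1 if the given bit is set, else 0. */
static int sketch_test_bit(const unsigned long *map, unsigned long bit)
{
    return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/*
 * Mark the pages of one free block in the bitmap. The block starts at
 * pfn and spans 2^order pages. Only blocks that start inside
 * [start_pfn, end_pfn) are recorded, and a block that runs past
 * end_pfn is clamped. Returns the number of bits set, which is the
 * quantity the review suggests calling nr_pages.
 */
static unsigned long mark_free_block(unsigned long *bitmap,
                                     unsigned long start_pfn,
                                     unsigned long end_pfn,
                                     unsigned long pfn,
                                     unsigned int order)
{
    unsigned long nr_pages;

    assert(start_pfn <= end_pfn);

    if (pfn < start_pfn || pfn >= end_pfn)
        return 0;

    nr_pages = 1UL << order;
    if (pfn + nr_pages > end_pfn)
        nr_pages = end_pfn - pfn;   /* clamp to the end of the range */

    sketch_bitmap_set(bitmap, pfn - start_pfn, nr_pages);
    return nr_pages;
}
```

With the range [100, 110), an order-2 block at pfn 108 is clamped to two pages, so the returned count reads naturally as a number of pages rather than a page number.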
Michael S. Tsirkin
2016-Jul-27 22:05 UTC
[PATCH v2 repost 6/7] mm: add the related functions to get free page info
On Wed, Jul 27, 2016 at 09:40:56AM -0700, Dave Hansen wrote:
> On 07/26/2016 06:23 PM, Liang Li wrote:
> > +    for_each_migratetype_order(order, t) {
> > +        list_for_each(curr, &zone->free_area[order].free_list[t]) {
> > +            pfn = page_to_pfn(list_entry(curr, struct page, lru));
> > +            if (pfn >= start_pfn && pfn <= end_pfn) {
> > +                page_num = 1UL << order;
> > +                if (pfn + page_num > end_pfn)
> > +                    page_num = end_pfn - pfn;
> > +                bitmap_set(bitmap, pfn - start_pfn, page_num);
> > +            }
> > +        }
> > +    }
>
> Nit: The 'page_num' nomenclature really confused me here. It is the
> number of bits being set in the bitmap. Seems like calling it nr_pages
> or num_pages would be more appropriate.
>
> Isn't this bitmap out of date by the time it's sent up to the
> hypervisor? Is there something that makes the inaccuracy OK here?

Yes. Calling these free pages is unfortunate. It's likely to confuse people into thinking they can just discard these pages.

The hypervisor sends a request. We respond with this list of pages, and the guarantee the hypervisor needs is that these were free sometime between request and response, so they are safe to free if they are unmodified since the request. The hypervisor can detect modifications itself and does not need guest help.

Maybe just call these "free if unmodified" and reflect this everywhere - verbose but hey. Better naming suggestions would be welcome.

--
MST
Dave Hansen
2016-Jul-27 22:16 UTC
[PATCH v2 repost 6/7] mm: add the related functions to get free page info
On 07/27/2016 03:05 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2016 at 09:40:56AM -0700, Dave Hansen wrote:
>> On 07/26/2016 06:23 PM, Liang Li wrote:
>>> +    for_each_migratetype_order(order, t) {
>>> +        list_for_each(curr, &zone->free_area[order].free_list[t]) {
>>> +            pfn = page_to_pfn(list_entry(curr, struct page, lru));
>>> +            if (pfn >= start_pfn && pfn <= end_pfn) {
>>> +                page_num = 1UL << order;
>>> +                if (pfn + page_num > end_pfn)
>>> +                    page_num = end_pfn - pfn;
>>> +                bitmap_set(bitmap, pfn - start_pfn, page_num);
>>> +            }
>>> +        }
>>> +    }
>>
>> Nit: The 'page_num' nomenclature really confused me here. It is the
>> number of bits being set in the bitmap. Seems like calling it nr_pages
>> or num_pages would be more appropriate.
>>
>> Isn't this bitmap out of date by the time it's sent up to the
>> hypervisor? Is there something that makes the inaccuracy OK here?
>
> Yes. Calling these free pages is unfortunate. It's likely to confuse
> people into thinking they can just discard these pages.
>
> The hypervisor sends a request. We respond with this list of pages, and
> the guarantee the hypervisor needs is that these were free sometime
> between request and response, so they are safe to free if they are
> unmodified since the request. The hypervisor can detect modifications
> itself and does not need guest help.

Ahh, that makes sense. So the hypervisor is trying to figure out: "Which pages do I move?". It wants to know which pages the guest thinks have good data and need to move. But the list of free pages is (likely) smaller than the list of pages with good data, so it asks for that instead.

A write to a page means that it has valuable data, regardless of whether it was in the free list or not. The hypervisor only skips moving pages that were free *and* were never written to. So we never lose data, even if this "get free page info" stuff is totally out of date.

The patch description and code comments are, um, a _bit_ light for this level of subtlety. :)
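The rule described above (skip a page only if it was reported free *and* was never written to) can be written down in a few lines. This is only an illustrative sketch of that rule, not hypervisor code: pages_to_move() and its per-page boolean arrays are hypothetical names chosen for the example, whereas a real implementation would work on bitmaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide, page by page, what has to be moved.
 *
 *   reported_free[i]  the guest's (possibly stale) "free" hint
 *   dirtied[i]        the page was written after dirty logging started
 *
 * A page may be skipped only when it was reported free AND was never
 * written; every other page must be moved. Returns the number of
 * pages that must be moved.
 */
static size_t pages_to_move(const bool *reported_free, const bool *dirtied,
                            bool *must_move, size_t nr_pages)
{
    size_t count = 0;

    assert(reported_free && dirtied && must_move);

    for (size_t i = 0; i < nr_pages; i++) {
        must_move[i] = !reported_free[i] || dirtied[i];
        if (must_move[i])
            count++;
    }
    return count;
}
```

A stale hint can only cost efficiency here: a page that was reported free but written afterwards shows up as dirtied and is moved anyway, so no data is lost.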
Li, Liang Z
2016-Jul-28 00:10 UTC
[PATCH v2 repost 6/7] mm: add the related functions to get free page info
> Subject: Re: [PATCH v2 repost 6/7] mm: add the related functions to get free
> page info
>
> On 07/26/2016 06:23 PM, Liang Li wrote:
> > +    for_each_migratetype_order(order, t) {
> > +        list_for_each(curr, &zone->free_area[order].free_list[t]) {
> > +            pfn = page_to_pfn(list_entry(curr, struct page, lru));
> > +            if (pfn >= start_pfn && pfn <= end_pfn) {
> > +                page_num = 1UL << order;
> > +                if (pfn + page_num > end_pfn)
> > +                    page_num = end_pfn - pfn;
> > +                bitmap_set(bitmap, pfn - start_pfn, page_num);
> > +            }
> > +        }
> > +    }
>
> Nit: The 'page_num' nomenclature really confused me here. It is the
> number of bits being set in the bitmap. Seems like calling it nr_pages or
> num_pages would be more appropriate.
>

You are right, will change.

> Isn't this bitmap out of date by the time it's sent up to the hypervisor? Is
> there something that makes the inaccuracy OK here?

Yes. Dirty page logging will be used to correct the inaccuracy. Dirty page logging should be started before getting the free page bitmap; then, if some of the free pages stop being free because they are written to, those pages will be tracked by the dirty page logging mechanism.

Thanks!
Liang
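The ordering requirement (start dirty logging first, fetch the free page bitmap second) can be illustrated with a toy model. Everything below is an assumption made for the illustration: struct toy_guest and the helper functions are hypothetical and do not correspond to any kernel or hypervisor interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_PAGES 8

/* Toy model of a guest: which pages are free, and which were logged dirty. */
struct toy_guest {
    bool is_free[NR_PAGES];
    bool logging;            /* dirty logging currently enabled? */
    bool dirty[NR_PAGES];    /* pages written while logging was enabled */
};

/* Step 1: enable dirty logging with an empty log. */
static void start_dirty_logging(struct toy_guest *g)
{
    memset(g->dirty, 0, sizeof(g->dirty));
    g->logging = true;
}

/* Step 2: take the (soon to be stale) snapshot of free pages. */
static void snapshot_free(const struct toy_guest *g, bool *snap)
{
    memcpy(snap, g->is_free, sizeof(g->is_free));
}

/* A guest write: the page stops being free and, if logging, is recorded. */
static void guest_write(struct toy_guest *g, unsigned int page)
{
    assert(page < NR_PAGES);
    g->is_free[page] = false;
    if (g->logging)
        g->dirty[page] = true;
}

/* A page can be skipped only if the snapshot says free and it is not dirty. */
static bool can_skip(const struct toy_guest *g, const bool *snap,
                     unsigned int page)
{
    assert(page < NR_PAGES);
    return snap[page] && !g->dirty[page];
}
```

If the snapshot were taken before logging started, a write landing between the two steps would leave the page marked free in the snapshot without ever being logged as dirty, and it would wrongly be skipped. Starting logging first closes that window.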
Michael S. Tsirkin
2016-Jul-28 00:17 UTC
[PATCH v2 repost 6/7] mm: add the related functions to get free page info
On Thu, Jul 28, 2016 at 12:10:16AM +0000, Li, Liang Z wrote:
> > Subject: Re: [PATCH v2 repost 6/7] mm: add the related functions to get free
> > page info
> >
> > On 07/26/2016 06:23 PM, Liang Li wrote:
> > > +    for_each_migratetype_order(order, t) {
> > > +        list_for_each(curr, &zone->free_area[order].free_list[t]) {
> > > +            pfn = page_to_pfn(list_entry(curr, struct page, lru));
> > > +            if (pfn >= start_pfn && pfn <= end_pfn) {
> > > +                page_num = 1UL << order;
> > > +                if (pfn + page_num > end_pfn)
> > > +                    page_num = end_pfn - pfn;
> > > +                bitmap_set(bitmap, pfn - start_pfn, page_num);
> > > +            }
> > > +        }
> > > +    }
> >
> > Nit: The 'page_num' nomenclature really confused me here. It is the
> > number of bits being set in the bitmap. Seems like calling it nr_pages or
> > num_pages would be more appropriate.
> >
>
> You are right, will change.
>
> > Isn't this bitmap out of date by the time it's send up to the hypervisor? Is
> > there something that makes the inaccuracy OK here?
>
> Yes. The dirty page logging will be used to correct the inaccuracy.
> The dirty page logging should be started before getting the free page bitmap, then if some of the free pages become no free for writing, these pages will be tracked by the dirty page logging mechanism.
>
> Thanks!
> Liang

Right but this should be clear from code and naming.