Stephen C. Tweedie
2006-Oct-05 13:11 UTC
[Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
Hi all,

blkbk and blktap do not load reliably as modules. As soon as you hit a
boot where a fs needs to be fscked, memory gets fragmented, and the
order-10 (yes, *TEN*) kmalloc in balloon_alloc_empty_page_range() fails.

But is there any need at all for these pages to be contiguous? All we
do with the start of the array is:

    for (i = 0; i < mmap_pages; i++) {
        pending_vaddrs[i] = mmap_vstart + (i << PAGE_SHIFT);
        pending_grant_handles[i] = BLKBACK_INVALID_HANDLE;
    }

and thereafter we only ever look at individual page addresses indexed
through pending_vaddrs[]. Given that we had problems recently with
block IO failing when straddling a page boundary, we certainly don't
have any requirements for contiguity during normal operations.

The only thing I can see is

    struct page *balloon_alloc_empty_page_range(unsigned long nr_pages)
    {
        ...
        balloon_lock(flags);
        if (xen_feature(XENFEAT_auto_translated_physmap)) {
            unsigned long gmfn = __pa(vstart) >> PAGE_SHIFT;
            ...
            set_xen_guest_handle(reservation.extent_start, &gmfn);
            ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation,
                                       &reservation);

which is performing reservations on the whole extent. Would there be
any real problem simply falling back to multiple reservations of lower
orders if we fail these calls?

--Stephen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
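For illustration, the "fall back to lower orders" idea in the message
above can be sketched in user-space C. This is only a sketch under
stated assumptions: alloc_with_fallback(), try_alloc_fn and the two
sample callbacks are hypothetical names, not balloon driver code, and
4 KiB pages are assumed for the size arithmetic (so order-10 means
1024 pages, or 4 MiB). In a real driver the callback would be the page
allocator, and each chunk that succeeds would presumably need its own
reservation call.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical callback: nonzero if a block of 2^order pages is available. */
typedef int (*try_alloc_fn)(unsigned int order);

/* Size in bytes of one allocation of the given order. */
unsigned long order_to_bytes(unsigned int order)
{
    return (1UL << order) << PAGE_SHIFT;
}

/*
 * Cover nr_pages with chunks, starting at max_order and dropping to a
 * lower order whenever an allocation fails. The order of each chunk is
 * recorded in chunk_orders[]. Returns the number of chunks used, or -1
 * if even an order-0 allocation fails or chunk_orders[] is too small.
 */
int alloc_with_fallback(unsigned long nr_pages, unsigned int max_order,
                        try_alloc_fn try_alloc,
                        unsigned int *chunk_orders, size_t max_chunks)
{
    size_t n = 0;
    unsigned int order = max_order;

    assert(try_alloc != NULL);

    while (nr_pages > 0) {
        /* Never take a chunk larger than what is still needed. */
        while (order > 0 && (1UL << order) > nr_pages)
            order--;

        if (try_alloc(order)) {
            if (n == max_chunks)
                return -1;
            chunk_orders[n++] = order;
            nr_pages -= 1UL << order;
        } else if (order == 0) {
            return -1;          /* nothing smaller left to try */
        } else {
            order--;            /* fragmented: retry with a smaller chunk */
        }
    }
    return (int)n;
}

/* Sample callback: unfragmented memory, every order succeeds. */
int always_ok(unsigned int order)
{
    (void)order;
    return 1;
}

/* Sample callback: fragmented memory, nothing above order-2 is free. */
int fragmented_order2(unsigned int order)
{
    return order <= 2;
}
```

With fragmented_order2 as the callback, a 1024-page request is met by
order-2 chunks instead of failing outright, which is the behaviour the
question above is asking about.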
Aron Griffis
2006-Oct-05 15:38 UTC
Re: [Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
Stephen C. Tweedie wrote: [Thu Oct 05 2006, 09:11:10AM EDT]
> blkbk and blktap do not load reliably as modules.

Also netbk.

> As soon as you hit a boot where a fs needs to be fscked, memory gets
> fragmented, and the order-10 (yes, *TEN*) kmalloc in
> balloon_alloc_empty_page_range() fails.

On ia64, we don't need an fsck for the failure to occur. The modules
almost always fail to load. This is really screwing up xen/ia64 on
Fedora right now.
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202971

> But is there any need at all for these pages to be contiguous?

I've been looking at the exact same question, but don't know the
answer. If you come up with a patch before I do, I'd be thrilled to
test it.

Aron
Keir Fraser
2006-Oct-05 15:47 UTC
Re: [Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
On 5/10/06 16:38, "Aron Griffis" <aron@hp.com> wrote:
>> But is there any need at all for these pages to be contiguous?
>
> I've been looking at the exact same question, but don't know the
> answer. If you come up with a patch before I do, I'd be thrilled to
> test it.

This is easily fixable. Anywhere we use the virtual address to compute
an offset into a state structure, we can instead store the appropriate
'slot index' in a spare field in the appropriate 'struct page'. That'll
get rid of any arithmetic on the virtual addresses that depends on them
being contiguous. Then the drivers can simply grab a bag of order-0
allocations.

That really only leaves the question of how much of this can be put in
a helper function (probably in balloon driver again) for use of all our
drivers.

-- Keir
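The 'slot index' suggestion above can be illustrated with a small
user-space sketch. Everything here is a stand-in: struct mock_page
models only one spare field of the kernel's struct page, and
slot_table[], bind_slots() and slot_state() are hypothetical names,
not code from any of the backend drivers. The point is only that once
the index lives in the page descriptor, the lookup needs no address
arithmetic, so the pages can come from anywhere.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 8

/* Stand-in for the kernel's struct page; only the field the sketch needs. */
struct mock_page {
    unsigned long index;    /* spare field reused to hold the slot index */
};

/* Driver-internal per-slot state (hypothetical). */
struct slot_state {
    int in_use;
};

static struct slot_state slot_table[NR_SLOTS];

/*
 * Record each page's slot number in the page descriptor itself.
 * pages[] may point at descriptors scattered anywhere in memory.
 */
void bind_slots(struct mock_page *pages[], size_t n)
{
    size_t i;

    assert(n <= NR_SLOTS);
    for (i = 0; i < n; i++)
        pages[i]->index = i;
}

/*
 * Map a page back to its driver state using the stored index,
 * instead of computing (vaddr - vstart) >> PAGE_SHIFT.
 */
struct slot_state *slot_state(const struct mock_page *pg)
{
    assert(pg->index < NR_SLOTS);
    return &slot_table[pg->index];
}
```

The contiguous-range version of slot_state() would have subtracted a
base address; here the only requirement is that bind_slots() ran
before the first lookup.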
Stephen C. Tweedie
2006-Oct-05 16:10 UTC
Re: [Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
Hi,

On Thu, 2006-10-05 at 16:47 +0100, Keir Fraser wrote:
> > I've been looking at the exact same question, but don't know the
> > answer. If you come up with a patch before I do, I'd be thrilled to
> > test it.
>
> This is easily fixable. Anywhere we use the virtual address to compute
> an offset into a state structure, we can instead store the appropriate
> 'slot index' in a spare field in the appropriate 'struct page'.

As far as I can tell, nothing uses the VA in any way --- it can't,
because the start of the order-10 kmalloc area is not actually used
anywhere after the initial mmap setup. After making the variable
mmap_vstart local to the __init function, everything still compiles, so
I don't think there's anything lurking in header files that implicitly
relies on it.

There is a

    ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);

call on the entire range, though, if
xen_feature(XENFEAT_auto_translated_physmap) is true. Those
reservations are one-off requests, aren't they? I don't think we're
adding overhead by doing this for 1024 separate pages rather than a
single order-10 chunk, but I'd appreciate a second opinion there.

> That really only leaves the question of how much of this can be put in
> a helper function (probably in balloon driver again) for use of all
> our drivers.

Right --- a helper which returns a vm-detached page vector rather than
a vaddr is exactly what I had in mind.

--Stephen
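A rough user-space analogue of such a helper might look like the
following. It is a sketch only: alloc_page_vector() and
free_page_vector() are hypothetical names, malloc() stands in for an
order-0 page allocation, and the Xen-specific step of handing each
page's backing memory back to the hypervisor is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL    /* assumption: 4 KiB pages */

/* Free the first nr_pages entries of a page vector, then the vector. */
void free_page_vector(void **pagevec, size_t nr_pages)
{
    size_t i;

    if (pagevec == NULL)
        return;
    for (i = 0; i < nr_pages; i++)
        free(pagevec[i]);
    free(pagevec);
}

/*
 * Return an array of nr_pages independently allocated pages rather
 * than one contiguous range. Each entry is a separate single-page
 * allocation, so no high-order allocation is ever requested.
 * Returns NULL on failure, with any partial allocation undone.
 */
void **alloc_page_vector(size_t nr_pages)
{
    void **pagevec;
    size_t i;

    assert(nr_pages > 0);

    pagevec = calloc(nr_pages, sizeof(*pagevec));
    if (pagevec == NULL)
        return NULL;

    for (i = 0; i < nr_pages; i++) {
        pagevec[i] = malloc(PAGE_SIZE);
        if (pagevec[i] == NULL) {
            free_page_vector(pagevec, i);   /* unwind pages 0..i-1 */
            return NULL;
        }
    }
    return pagevec;
}
```

A caller would then index pagevec[i] wherever the contiguous version
computed vstart + (i << PAGE_SHIFT).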
Keir Fraser
2006-Oct-05 16:19 UTC
Re: [Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
On 5/10/06 17:10, "Stephen C. Tweedie" <sct@redhat.com> wrote:
>> This is easily fixable. Anywhere we use the virtual address to compute
>> an offset into a state structure, we can instead store the appropriate
>> 'slot index' in a spare field in the appropriate 'struct page'.
>
> As far as I can tell, nothing uses the VA in any way --- it can't,
> because the start of the order-10 kmalloc area is not actually used
> anywhere after the initial mmap setup. After making the variable
> mmap_vstart local to the __init function, everything still compiles, so
> I don't think there's anything lurking in header files that implicitly
> relies on it.

I'm thinking of netback. That's the one driver where I think we use a
virtual address (actually page-struct pointer) as a handle to
driver-internal state.

-- Keir
Stephen C. Tweedie
2006-Oct-05 16:40 UTC
Re: [Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
Hi,

On Thu, 2006-10-05 at 17:19 +0100, Keir Fraser wrote:
> >> This is easily fixable. Anywhere we use the virtual address to
> >> compute an offset into a state structure, we can instead store the
> >> appropriate 'slot index' in a spare field in the appropriate
> >> 'struct page'.
>
> I'm thinking of netback. That's the one driver where I think we use a
> virtual address (actually page-struct pointer) as a handle to
> driver-internal state.

OK, I was looking at blkback in this case. Indexing is easy, we already
have page->index for that purpose, and I don't think there's anything
else using that for these pages once they are detached from the main
VM.

Is there any reason we're not using memory in the vmalloc area for
these things? That memory is *supposed* to be allocated out for virtual
use on demand.

--Stephen
Keir Fraser
2006-Oct-05 16:43 UTC
Re: [Xen-devel] Insane contiguous physical memory requirements in blkbk/blktap
On 5/10/06 17:40, "Stephen C. Tweedie" <sct@redhat.com> wrote:
> OK, I was looking at blkback in this case. Indexing is easy, we already
> have page->index for that purpose, and I don't think there's anything
> else using that for these pages once they are detached from the main
> VM.
>
> Is there any reason we're not using memory in the vmalloc area for
> these things? That memory is *supposed* to be allocated out for virtual
> use on demand.

In most cases we want the 'struct page' as well as the virtual address
space. For example, to stuff into skbuff fragment lists or block-device
scatter-gather lists. kmalloc() gets us both those things -- it's just
the underlying mapped RAM we don't want. ;-)

-- Keir