Mukesh Rathor
2010-Jun-26 01:40 UTC
[Xen-devel] dom0 boot failure: dma_reserve in reserve_bootmem_generic()
Hi,

I've been debugging an interesting dom0 boot failure which happens only on certain machines when dom0_mem=830M: a SCSI driver fails to allocate 512 bytes in GFP_DMA. However, the system boots fine with 500M or 930M. The root cause is in reserve_bootmem_generic():

    ....
    if (phys+len <= MAX_DMA_PFN*PAGE_SIZE)
        dma_reserve += len / PAGE_SIZE;      <---

In the 830M case, phys+len is just big enough to set dma_reserve to a value where the DMA memory zone is then 'holed' out. With less dom0_mem, it still leaves a few pages for GFP_DMA. With more, phys+len is larger than MAX_DMA_PFN*PAGE_SIZE, so setting dma_reserve is skipped and the DMA zone is good again.

So, does anyone know the point of setting dma_reserve? Obviously things are implied OK without it, so would it be safe to just remove the if stmt completely?

thanks a lot,
Mukesh
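The 'holed out' effect comes from how dma_reserve later feeds into zone sizing: everything accumulated in it is counted as an extra hole in ZONE_DMA. A minimal sketch of that step, paraphrased from memory from the 2.6.18-era arch/x86_64/mm/init.c (not a verbatim quote; exact names and placement may differ in the kernel at hand):

    /* Paraphrase, not verbatim: pages accounted in dma_reserve are added to
     * the ZONE_DMA hole count, so they never show up as free pages of that
     * zone. With dma_reserve close to the zone size, almost nothing is left
     * for GFP_DMA allocations. */
    holes[ZONE_DMA] += dma_reserve;           /* kernel image, initrd, p2m, ... */
    if (holes[ZONE_DMA] >= zones[ZONE_DMA]) {
            printk(KERN_WARNING "Kernel too large and filling up ZONE_DMA?\n");
            holes[ZONE_DMA] = zones[ZONE_DMA]; /* zone left with ~0 free pages */
    }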
Jan Beulich
2010-Jun-28 09:21 UTC
[Xen-devel] dom0 boot failure: dma_reserve in reserve_bootmem_generic()
>>> On 26.06.10 at 03:40, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> So, does anyone know the point of setting dma_reserve? Obviously things

I think the comment immediately before set_dma_reserve() explains it quite well:

 * The per-cpu batchsize and zone watermarks are determined by present_pages.
 * In the DMA zone, a significant percentage may be consumed by kernel image
 * and other unfreeable allocations which can skew the watermarks badly. This
 * function may optionally be used to account for unfreeable pages in the
 * first zone (e.g., ZONE_DMA). The effect will be lower watermarks and
 * smaller per-cpu batchsize.

> are implied OK without it, so would it be safe to just remove the
> if stmt completely?

In all our post-2.6.18 kernels we indeed have this disabled, and didn't have any issue with it so far. Nevertheless I'm not convinced we're really doing a good thing by disabling it after the change (a pretty long while ago) to no longer put all memory in the DMA zone.

For your issue, I rather wonder why dma_reserve reaches this high a value only with the particular dom0_mem= you're stating. Did you check where those reservations come from, and how they differ from when using smaller or larger dom0_mem= values?

Jan
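Concretely, the value handed to set_dma_reserve() is subtracted from the first zone's page count when the zones are initialised, and the watermarks and per-cpu batch size are then derived from the smaller figure. A rough sketch of that handling, paraphrased from mainline mm/page_alloc.c (variable names differ between kernel versions):

    /* Rough paraphrase of the free_area_init_core() handling, not verbatim:
     * the first zone's usable size is shrunk by dma_reserve before the
     * watermarks and per-cpu batch sizes are computed from it. */
    if (zone_idx == 0 && realsize > dma_reserve) {
            realsize -= dma_reserve;
            printk(KERN_DEBUG "%s zone: %lu pages reserved\n",
                   zone_names[0], dma_reserve);
    }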
Mukesh Rathor
2010-Jun-29 03:19 UTC
Re: [Xen-devel] dom0 boot failure: dma_reserve in reserve_bootmem_generic()
On Mon, 28 Jun 2010 10:21:09 +0100 "Jan Beulich" <JBeulich@novell.com> wrote:

> I think the comment immediately before set_dma_reserve() explains
> it quite well:

I'm actually looking at the 2.6.18-164* kernel; it looks like set_dma_reserve() and its comment were added later.

> In all our post-2.6.18 kernels we indeed have this disabled, and
> didn't have any issue with it so far. Nevertheless I'm not convinced
> we're really doing a good thing by disabling it after the change (a
> pretty long while ago) to no longer put all memory in the DMA zone.

I may also just disable it for now. I'm not sure I understand the reason behind putting it all in the DMA zone.

> For your issue, I rather wonder why dma_reserve reaches this high
> a value only with the particular dom0_mem= you're stating. Did
> you check where those reservations come from, and how they
> differ from when using smaller or larger dom0_mem= values?

Yeah, I checked two values which boot fine:

dom0_mem = 500M
    reserve_bootmem_generic(phys = 0, len = e91000)
        if (phys+len <= MAX_DMA_PFN*PAGE_SIZE)
            dma_reserve += len / PAGE_SIZE;

dom0_mem = 930M
    reserve_bootmem_generic(phys = 0, len = 1040000)

With dom0_mem = 830M, which fails to boot:
    reserve_bootmem_generic(phys = 0, len = fdb000)

Add to that the statically allocated pages, and with 500M it appears a few pages are still left in the DMA zone. With 930M the accounting is skipped altogether, so there is no problem with the driver allocating from GFP_DMA later.

The start_pfn in dom0 is 0xe34, resulting in table_end == fdb000:

(XEN) Dom0 alloc.:   000000021d000000->000000021e000000
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff80000000->ffffffff80531eec
(XEN)  Init. ramdisk: ffffffff80532000->ffffffff80c88200
(XEN)  Phys-Mach map: ffffffff80c89000->ffffffff80e28000
(XEN)  Start info:    ffffffff80e28000->ffffffff80e284b4
(XEN)  Page tables:   ffffffff80e29000->ffffffff80e34000
(XEN)  Boot stack:    ffffffff80e34000->ffffffff80e35000
(XEN)  TOTAL:         ffffffff80000000->ffffffff81000000
(XEN)  ENTRY ADDRESS: ffffffff80000000

So, now that I've stumbled on this, I'm confused why the PAGE_OFFSET+ VAs, i.e. gpfns 0 - 16M, are not mapped to MFNs below 16M. Would this not be needed for ISA DMA?

thanks a lot Jan,
Mukesh
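To spell out the arithmetic behind those three cases, here is a small standalone check (my own illustration, not kernel code; it assumes 4K pages and the classic 16MB ZONE_DMA boundary, and the "left" figures ignore the e820 holes, mem_map and statically allocated pages that shrink the zone further):

    #include <stdio.h>

    /* Quick arithmetic check of the three dom0_mem cases reported above,
     * mirroring the phys+len <= MAX_DMA_PFN*PAGE_SIZE test (phys == 0 here). */
    int main(void)
    {
        const unsigned long page_size = 0x1000;                 /* 4K   */
        const unsigned long dma_limit = 0x1000000;              /* 16MB */
        const unsigned long dma_pages = dma_limit / page_size;  /* 4096 */
        const char *cases[] = { "500M", "830M", "930M" };
        const unsigned long len[] = { 0xe91000, 0xfdb000, 0x1040000 };

        for (int i = 0; i < 3; i++) {
            if (len[i] <= dma_limit)
                printf("dom0_mem=%s: dma_reserve = %lu pages, %lu left\n",
                       cases[i], len[i] / page_size,
                       dma_pages - len[i] / page_size);
            else
                printf("dom0_mem=%s: accounting skipped (len above 16MB)\n",
                       cases[i]);
        }
        return 0;
    }

This prints 3729 pages reserved (367 left) for 500M, 4059 pages reserved (only 37 left) for 830M, and no accounting at all for 930M, which matches the observed boot behaviour.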
Jan Beulich
2010-Jun-29 06:57 UTC
Re: [Xen-devel] dom0 boot failure: dma_reserve in reserve_bootmem_generic()
>>> On 29.06.10 at 05:19, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Mon, 28 Jun 2010 10:21:09 +0100
> "Jan Beulich" <JBeulich@novell.com> wrote:
>> For your issue, I rather wonder why dma_reserve reaches this high
>> a value only with the particular dom0_mem= you're stating. Did
>> you check where those reservations come from, and how they
>> differ from when using smaller or larger dom0_mem= values?
>
> Yeah, I checked two values which boot fine:
>
> dom0_mem = 500M
>     reserve_bootmem_generic(phys = 0, len = e91000)
>         if (phys+len <= MAX_DMA_PFN*PAGE_SIZE)
>             dma_reserve += len / PAGE_SIZE;
>
> dom0_mem = 930M
>     reserve_bootmem_generic(phys = 0, len = 1040000)
>
> With dom0_mem = 830M, which fails to boot:
>     reserve_bootmem_generic(phys = 0, len = fdb000)

So it's the kernel space reservation that's hitting you (and it may well be that this is also what triggered us to #ifdef out that code - it's just been too long ago to recall). Clearly the size of the initrd shouldn't get accounted to dma_reserve, nor should the p2m table's initial space. Accounting the kernel image (or at least the permanent portions thereof) would seem correct, on the other hand.

> So, now that I've stumbled on this, I'm confused why the PAGE_OFFSET+
> VAs, i.e. gpfns 0 - 16M, are not mapped to MFNs below 16M. Would this
> not be needed for ISA DMA?

There's no support for ISA DMA in Xen (CONFIG_ISA depends on !X86_XEN, and all DMA channels get absorbed close to the end of setup_arch()).

Jan
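As an illustration of the "#ifdef out" mentioned above, the accounting could be compiled out along these lines; this is purely a hypothetical sketch, not the actual change in the SUSE/Xen trees, and the real function signature and NUMA handling are omitted:

    /* Hypothetical sketch only: keep the bootmem reservation itself, but
     * skip the dma_reserve accounting on Xen kernels. */
    void __init reserve_bootmem_generic(unsigned long phys, unsigned long len)
    {
            reserve_bootmem(phys, len);   /* NUMA path omitted for brevity */
    #ifndef CONFIG_XEN
            if (phys + len <= MAX_DMA_PFN * PAGE_SIZE)
                    dma_reserve += len / PAGE_SIZE;
    #endif
    }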