The latest Fedora based 2.6.37 kernels have stopped booting for me under xen. They stopped working around -rc7 but I think the trigger is that various debug options were turned off. My hardware won't let me get serial output, so I have tried booting it within kvm, and got the attached output - the behaviour was similar to bare metal, though I don't see enough to know if it is exactly the same crash. The kernel used has no additional xen patches, though I am seeing similar behaviour for kernels with patches from xen-next-2.6.37. The crash looks like it is something to do with irq.

Michael Young
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Jan-05  15:43 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Tue, Jan 04, 2011 at 10:01:56PM +0000, M A Young wrote:
> The latest Fedora based 2.6.37 kernels have stopped booting for me
> under xen. They stopped working around -rc7 but I think the trigger
> is that various debug options were turned off. My hardware won't let
> me get serial output, so I have tried booting it within kvm, and got
> the attached output - the behaviour was similar to bare metal,
> though I don't see enough to know if it is exactly the same crash.
> The kernel used has no additional xen patches, though I am seeing
> similar behaviour for kernels with patches from xen-next-2.6.37. The
> crash looks like it is something to do with irq.

Ahh, I hit this. Can you try 'stable/bug-fixes' branch of mine? It has "xen/irq: Don't fall over when nr_irqs_gsi > nr_irqs." patch which will fix the below problem you are seeing.

But I am not sure if it fixes the problem you are having with hardware?

(git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git)

..
> [ 0.008220] ------------[ cut here ]------------
> [ 0.008999] WARNING: at drivers/xen/events.c:432 find_unbound_irq+0x88/0x9f()
> [ 0.008999] Hardware name: Bochs
> [ 0.008999] Modules linked in:
> [ 0.008999] Pid: 1, comm: swapper Not tainted 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [ 0.008999] Call Trace:
> [ 0.008999] [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [ 0.008999] [<ffffffff81050609>] warn_slowpath_null+0x1a/0x1c
> [ 0.008999] [<ffffffff812abfea>] find_unbound_irq+0x88/0x9f
> [ 0.008999] [<ffffffff812ac90e>] bind_ipi_to_irqhandler+0x64/0x153
> [ 0.008999] [<ffffffff81007979>] ? xen_reschedule_interrupt+0x0/0x18
> [ 0.008999] [<ffffffff81234511>] ? kasprintf+0x38/0x3b
> [ 0.008999] [<ffffffff81007b92>] xen_smp_intr_init+0x46/0x1f3
> [ 0.008999] [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [ 0.008999] [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [ 0.008999] [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [ 0.008999] [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [ 0.008999] [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [ 0.008999] [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [ 0.008999] ---[ end trace a7919e7f17c0a725 ]---
> [ 0.008999] ------------[ cut here ]------------
> [ 0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
> [ 0.008999] Hardware name: Bochs
> [ 0.008999] Trying to free already-free IRQ 0
> [ 0.008999] Modules linked in:
> [ 0.008999] Pid: 1, comm: swapper Tainted: G W 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [ 0.008999] Call Trace:
> [ 0.008999] [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [ 0.008999] [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
> [ 0.008999] [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [ 0.008999] [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
> [ 0.008999] [<ffffffff810aca41>] free_irq+0x38/0x50
> [ 0.008999] [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
> [ 0.008999] [<ffffffff81007cce>] xen_smp_intr_init+0x182/0x1f3
> [ 0.008999] [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [ 0.008999] [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [ 0.008999] [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [ 0.008999] [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [ 0.008999] [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [ 0.008999] [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [ 0.008999] ---[ end trace a7919e7f17c0a726 ]---
> [ 0.008999] ------------[ cut here ]------------
> [ 0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
> [ 0.008999] Hardware name: Bochs
> [ 0.008999] Trying to free already-free IRQ 0
> [ 0.008999] Modules linked in:
> [ 0.008999] Pid: 1, comm: swapper Tainted: G W 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [ 0.008999] Call Trace:
> [ 0.008999] [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [ 0.008999] [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
> [ 0.008999] [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [ 0.008999] [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
> [ 0.008999] [<ffffffff810aca41>] free_irq+0x38/0x50
> [ 0.008999] [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
> [ 0.008999] [<ffffffff81007cf0>] xen_smp_intr_init+0x1a4/0x1f3
> [ 0.008999] [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [ 0.008999] [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [ 0.008999] [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [ 0.008999] [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [ 0.008999] [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [ 0.008999] [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [ 0.008999] ---[ end trace a7919e7f17c0a727 ]---
> [ 0.008999] ------------[ cut here ]------------
> [ 0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
> [ 0.008999] Hardware name: Bochs
> [ 0.008999] Trying to free already-free IRQ 0
> [ 0.008999] Modules linked in:
> [ 0.008999] Pid: 1, comm: swapper Tainted: G W 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [ 0.008999] Call Trace:
> [ 0.008999] [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [ 0.008999] [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
> [ 0.008999] [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [ 0.008999] [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
> [ 0.008999] [<ffffffff810aca41>] free_irq+0x38/0x50
> [ 0.008999] [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
> [ 0.008999] [<ffffffff81007d34>] xen_smp_intr_init+0x1e8/0x1f3
> [ 0.008999] [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [ 0.008999] [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [ 0.008999] [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [ 0.008999] [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [ 0.008999] [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [ 0.008999] [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [ 0.008999] ---[ end trace a7919e7f17c0a728 ]---
> [ 0.009018] ------------[ cut here ]------------
> [ 0.009999] kernel BUG at arch/x86/xen/smp.c:217!
> [ 0.009999] invalid opcode: 0000 [#1] SMP
> [ 0.009999] last sysfs file:
> [ 0.009999] CPU 0
> [ 0.009999] Modules linked in:
> [ 0.009999]
> [ 0.009999] Pid: 1, comm: swapper Tainted: G W 2.6.37-0.rc8.git3.1.fc15.x86_64 #1 /Bochs
> [ 0.009999] RIP: e030:[<ffffffff81b5839e>] [<ffffffff81b5839e>] xen_smp_prepare_cpus+0x41/0x107
> [ 0.009999] RSP: e02b:ffff880033841eb0 EFLAGS: 00010286
> [ 0.009999] RAX: 00000000ffffffff RBX: ffffffff81c1c7b0 RCX: 0000000000000100
> [ 0.009999] RDX: ffff88003a410000 RSI: 0000000000000000 RDI: ffffffff81d64d50
> [ 0.009999] RBP: ffff880033841ed0 R08: 0000000000000002 R09: 00000000fffffffe
> [ 0.009999] R10: ffff880033841e50 R11: 0000000000000000 R12: 0000000000000100
> [ 0.009999] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 0.009999] FS: 0000000000000000(0000) GS:ffff88003b063000(0000) knlGS:0000000000000000
> [ 0.009999] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 0.009999] CR2: 0000000000000000 CR3: 0000000001a03000 CR4: 0000000000000660
> [ 0.009999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.009999] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 0.009999] Process swapper (pid: 1, threadinfo ffff880033840000, task ffff880033838000)
> [ 0.009999] Stack:
> [ 0.009999] ffff880033838000 ffffffff81c1c7b0 0000000000000000 0000000000000000
> [ 0.009999] ffff880033841f40 ffffffff81b53cf3 0000000000000001 0000000000000000
> [ 0.009999] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 0.009999] Call Trace:
> [ 0.009999] [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [ 0.009999] [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [ 0.009999] [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [ 0.009999] [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [ 0.009999] [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [ 0.009999] Code: ff 48 8b 15 25 b9 fd ff 31 ff 48 c7 c0 00 36 01 00 66 c7 84 10 c0 00 00 00 01 00 e8 3c 76 91 ff 31 ff e8 b2 f7 4a ff 85 c0 74 02 <0f> 0b 31 ff e8 a9 f5 4a ff 48 c7 c2 00 20 c3 81 b9 08 00 00 00
> [ 0.009999] RIP [<ffffffff81b5839e>] xen_smp_prepare_cpus+0x41/0x107
> [ 0.009999] RSP <ffff880033841eb0>
> [ 0.009999] ---[ end trace a7919e7f17c0a729 ]---
> [ 0.010021] Kernel panic - not syncing: Attempted to kill init!
> [ 0.010999] Pid: 1, comm: swapper Tainted: G D W 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [ 0.010999] Call Trace:
> [ 0.010999] [<ffffffff814759d5>] panic+0x91/0x1a4
> [ 0.010999] [<ffffffff810d6093>] ? perf_event_exit_task+0xb8/0x1c7
> [ 0.010999] [<ffffffff81053b89>] do_exit+0x7c/0x75d
> [ 0.010999] [<ffffffff8107d21f>] ? arch_local_irq_restore+0xb/0xd
> [ 0.010999] [<ffffffff8147795f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
> [ 0.010999] [<ffffffff8100022a>] ? _stext+0x9a/0xe70
> [ 0.010999] [<ffffffff81478c8b>] oops_end+0xbf/0xc7
> [ 0.010999] [<ffffffff8100022a>] ? _stext+0x9a/0xe70
> [ 0.010999] [<ffffffff8100022a>] ? _stext+0x9a/0xe70
> [ 0.010999] [<ffffffff8100e6ec>] die+0x5a/0x66
> [ 0.010999] [<ffffffff81478518>] do_trap+0x121/0x130
> [ 0.010999] [<ffffffff8100c06d>] do_invalid_op+0x98/0xa1
> [ 0.010999] [<ffffffff81b5839e>] ? xen_smp_prepare_cpus+0x41/0x107
> [ 0.010999] [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [ 0.010999] [<ffffffff8107d21f>] ? arch_local_irq_restore+0xb/0xd
> [ 0.010999] [<ffffffff8147795f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
> [ 0.010999] [<ffffffff810ac90d>] ? __free_irq+0xaf/0x1ab
> [ 0.010999] [<ffffffff8100b95b>] invalid_op+0x1b/0x20
> [ 0.010999] [<ffffffff81b5839e>] ? xen_smp_prepare_cpus+0x41/0x107
> [ 0.010999] [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [ 0.010999] [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [ 0.010999] [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [ 0.010999] [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [ 0.010999] [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
On Wed, 5 Jan 2011, Konrad Rzeszutek Wilk wrote:
> Ahh, I hit this. Can you try 'stable/bug-fixes' branch of mine?
> It has "xen/irq: Don't fall over when nr_irqs_gsi > nr_irqs." patch
> which will fix the below problem you are seeing.
>
> But I am not sure if it fixes the problem you are having with hardware?

That fixes the kvm boot, but unfortunately booting directly on the hardware doesn't. Incidentally it is definitely turning debug options off that triggers the crash, as I realized I was building a kernel-debug package as well as a kernel package from the same source RPM, and it boots with the debug kernel but not the ordinary kernel.

Michael Young
Konrad Rzeszutek Wilk
2011-Jan-06  14:56 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Wed, Jan 05, 2011 at 11:11:03PM +0000, M A Young wrote:
> On Wed, 5 Jan 2011, Konrad Rzeszutek Wilk wrote:
>
> >Ahh, I hit this. Can you try 'stable/bug-fixes' branch of mine?
> >It has "xen/irq: Don't fall over when nr_irqs_gsi > nr_irqs." patch
> >which will fix the below problem you are seeing.
> >
> >But I am not sure if it fixes the problem you are having with hardware?
>
> That fixes the kvm boot, but unfortunately booting directly on the
> hardware doesn't. Incidentally it is definitely turning debug
> options off that triggers the crash, as I realized I was building a
> kernel-debug package as well as a kernel package from the same
> source RPM, and it boots with the debug kernel but not the ordinary
> kernel.

Ok, I think we need a serial output. I don't remember if you said that your docking station has a serial port or not. If the docking station does not, this card ought to do the trick:

http://www.newegg.com/Product/Product.aspx?Item=N82E16839328018&Tpk=SDEXP15005

You can use it under Xen as a normal PCI type serial card. For details:

http://wiki.xensource.com/xenwiki/XenSerialConsole
On Thu, 6 Jan 2011, Konrad Rzeszutek Wilk wrote:
> Ok, I think we need a serial output. I don't remember if you said that
> your docking station has a serial port or not.

I don't have any good way of getting a serial port on this computer. I have however managed to get output on the screen and have a poor quality photo. The relevant lines look like

BUG unable to handle kernel NULL pointer dereference at
IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199

Michael Young
Konrad Rzeszutek Wilk
2011-Jan-07  19:18 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Fri, Jan 07, 2011 at 12:37:36AM +0000, M A Young wrote:
> On Thu, 6 Jan 2011, Konrad Rzeszutek Wilk wrote:
>
> >Ok, I think we need a serial output. I don't remember if you said that
> >your docking station has a serial port or not.
>
> I don't have any good way of getting a serial port on this computer.
> I have however managed to get output on the screen and have a poor
> quality photo. The relevant lines look like
> BUG unable to handle kernel NULL pointer dereference at
> IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199

Hmmm, I did see something similar to this in 2.6.37-rc1, but we fixed that quickly. It was triggered by having 4GB of memory or so and the work-around was to use dom0_mem=max:2GB.

Can you send the photo? Maybe the caller stack will shed some light.
[This email is either empty or too large to be displayed at this time]
Konrad Rzeszutek Wilk
2011-Jan-07  21:23 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Fri, Jan 07, 2011 at 08:34:43PM +0000, M A Young wrote:
> On Fri, 7 Jan 2011, Konrad Rzeszutek Wilk wrote:
>
> >>BUG unable to handle kernel NULL pointer dereference at
> >>IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199
>
> >Hmmm, I did see something similar to this in 2.6.37-rc1, but we fixed
> >that quickly. It was triggered by having 4GB of memory or so and
> >the work-around was to use dom0_mem=max:2GB.
> >
> >Can you send the photo? Maybe the caller stack will shed some light.
>
> Here are two photos of the output at different times. The context is
>
> 0xffffffff81b69b6d <setup_node_bootmem+326>:
>     callq 0xffffffff81475ec9 <printk>
> 0xffffffff81b69b72 <setup_node_bootmem+331>: movslq %ebx,%rdx
> 0xffffffff81b69b75 <setup_node_bootmem+334>: xor %eax,%eax
> 0xffffffff81b69b77 <setup_node_bootmem+336>: mov $0x4fc0,%ecx
> 0xffffffff81b69b7c <setup_node_bootmem+341>:
>     mov -0x7e4cb750(,%rdx,8),%rsi
> 0xffffffff81b69b84 <setup_node_bootmem+349>: shr $0xc,%r13
> 0xffffffff81b69b88 <setup_node_bootmem+353>: shr $0xc,%r12
> 0xffffffff81b69b8c <setup_node_bootmem+357>: sub %r13,%r12
> 0xffffffff81b69b8f <setup_node_bootmem+360>: mov %rsi,%rdi
> 0xffffffff81b69b92 <setup_node_bootmem+363>: rep stos %eax,%es:(%rdi)

That looks like:

memset(NODE_DATA(nodeid), 0, sizeof(pg_data_t));

From the photo, %eax is zero, and this is perfect code for copying values in.

> 0xffffffff81b69b94 <setup_node_bootmem+365>: mov %ebx,%edi
> 0xffffffff81b69b96 <setup_node_bootmem+367>:
>     mov -0x7e4cb750(,%rdx,8),%rax
>
> which is somewhere around line 224 in arch/x86/mm/numa_64.c
>
> if (nid != nodeid)
>     printk(KERN_INFO " NODE_DATA(%d) on node %d\n",
>            nodeid, nid);

Can you make sure that 419db274bed4269f475a8e78cbe9c917192cfe8b is in? That is the patch that fixed this issue last time.

However .. the more I look at the code the less it seems to be that and that is the last fix in that file.

Do you see any messages about 'Cannot find 20 bytes in node X' (where X I think is 0)?
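As a rough cross-check of the memset reading of that disassembly, the byte count can be worked out from the two instructions `mov $0x4fc0,%ecx` and `rep stos %eax,%es:(%rdi)`: the repeat count is 0x4fc0 and each iteration stores the 4 bytes of %eax. The sketch below is only a calculator for that arithmetic; the function names are made up for illustration and 4 KiB pages are assumed.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86-64


def rep_stos_bytes(count, operand_bytes):
    """Bytes written by `rep stos` for a given repeat count and store width."""
    return count * operand_bytes


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Number of whole pages required to hold nbytes, rounded up."""
    return (nbytes + page_size - 1) // page_size


# From the disassembly: mov $0x4fc0,%ecx ; rep stos %eax,%es:(%rdi)
# %eax is a 32-bit register, so every iteration stores 4 bytes.
size = rep_stos_bytes(0x4fc0, 4)
print(hex(size), size, pages_needed(size))
```

Under these assumptions the zeroed structure is 0x13f00 (81664) bytes, which is just under 20 pages.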
On Fri, 7 Jan 2011, Konrad Rzeszutek Wilk wrote:
> Can you make sure that 419db274bed4269f475a8e78cbe9c917192cfe8b is in? That
> is the patch that fixed this issue last time.

Yes it is.

> Do you see any messages about 'Cannot find 20 bytes in node X' (where X
> I think is 0)?

I haven't spotted any such message.

Michael Young
Konrad Rzeszutek Wilk
2011-Jan-10  18:42 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
> >Do you see any messages about 'Cannot find 20 bytes in node X' (where X
> >I think is 0)?
>
> I haven't spotted any such message.

Try fiddling with the dom0_mem.. to see at what point it starts failing. Is this happening only on this machine or do you see it on other boxes too?

Your E820 looks as so:

BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000120000000 (usable)

Which looks completely normal.. I am really at loss here. You could also sprinkle printk's around that code (or xen_raw_printk and inhibit the Linux kernel console output - that way you would only see the Xen and output from xen_raw_printk).

Let me bootup 2.6.37 on a 4GB machine just to see if I am seeing this.
On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:
> Try fiddling with the dom0_mem.. to see at what point it starts failing. Is
> this happening only on this machine or do you see it on other boxes too?

dom0_mem=max:3574MB boots, dom0_mem=max:3575MB doesn't. I haven't tried it on other boxes yet.

> Which looks completely normal.. I am really at loss here. You could
> also sprinkle printk's around that code (or xen_raw_printk and inhibit
> the Linux kernel console output - that way you would only see the Xen
> and output from xen_raw_printk).

I will think about where the printk's should go, but probably not tonight.

Michael Young
On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:
> Your E820 looks as so:
> BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
> BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
> BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
> BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
> BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
> BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
> BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
> BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
>
> Which looks completely normal.. I am really at loss here.

I have looked at this again and I am worried by the last section, which is a chunk from 4GB to 4.5GB. The problem is that I only have 4GB. My tests show that dom0_mem=max:3574MB boots, dom0_mem=max:3575MB doesn't. The first two "usable" chunks add up to a few KB over 3574MB so the problems come when it tries to use the final "usable" chunk which I interpret as being beyond the memory I have.

3574MB is a bit less than 3.5GB so I would guess that the final chunk is trying to make up the memory to 4GB. There are also gaps in these memory pieces which add up to about 445MB. Hence I think there are some issues with the memory allocation mechanism.

Michael Young
On 16/01/2011 20:48, "M A Young" <m.a.young@durham.ac.uk> wrote:
>> BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
>> BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
>> BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
>> BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
>> BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
>> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
>> BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
>> BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
>> BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
>> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
>> BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
>> BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
>>
>> Which looks completely normal.. I am really at loss here.
>
> I have looked at this again and I am worried by the last section, which is
> a chunk from 4GB to 4.5GB. The problem is that I only have 4GB. My tests
> show that dom0_mem=max:3574MB boots, dom0_mem=max:3575MB doesn't. The
> first two "usable" chunks add up to a few KB over 3574MB so the problems
> come when it tries to use the final "usable" chunk which I interpret as
> being beyond the memory I have.
>
> 3574MB is a bit less than 3.5GB so I would guess that the final chunk is
> trying to make up the memory to 4GB. There are also gaps in these memory
> pieces which add up to about 445MB. Hence I think there are some issues
> with the memory allocation mechanism.

Device memory gets mapped just below 4GB, so the last piece of your RAM gets re-mapped above 4GB by your BIOS, so that it can still be accessed. If you add up the size of all the usable regions in the list above, it will sum to a bit less than 4GB. The bug will be something in the kernel code that can't handle physical addresses wider than 32 bits (i.e., physical addresses 4GB and above).

 -- Keir
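The sums mentioned in this exchange can be checked directly from the E820 listing. The following is a minimal Python sketch, not anything the kernel or Xen runs; the table is transcribed from the BIOS-e820 lines quoted in the thread and the helper name is made up for illustration.

```python
MIB = 1 << 20
GIB = 1 << 30

# (start, end, type) transcribed from the BIOS-e820 map quoted in the thread.
E820 = [
    (0x0000000000000000, 0x000000000009f000, "usable"),
    (0x000000000009f000, 0x00000000000a0000, "reserved"),
    (0x0000000000100000, 0x00000000df66d800, "usable"),
    (0x00000000df66d800, 0x00000000e0000000, "reserved"),
    (0x00000000f8000000, 0x00000000fc000000, "reserved"),
    (0x00000000fec00000, 0x00000000fec10000, "reserved"),
    (0x00000000fed18000, 0x00000000fed1c000, "reserved"),
    (0x00000000fed20000, 0x00000000fed90000, "reserved"),
    (0x00000000feda0000, 0x00000000feda6000, "reserved"),
    (0x00000000fee00000, 0x00000000fee10000, "reserved"),
    (0x00000000ffe00000, 0x0000000100000000, "reserved"),
    (0x0000000100000000, 0x0000000120000000, "usable"),
]


def usable_bytes(regions):
    """Total size of all regions marked usable."""
    return sum(end - start for start, end, kind in regions if kind == "usable")


total = usable_bytes(E820)
below_4g = usable_bytes([r for r in E820 if r[1] <= 4 * GIB])

print("usable below 4GB: %d bytes = 3574MB + %d KB"
      % (below_4g, (below_4g - 3574 * MIB) // 1024))
print("usable in total:  %d bytes, %.2f MB short of 4GB"
      % (total, (4 * GIB - total) / MIB))
```

With these figures the two usable chunks below 4GB come to 3574MB plus 50KB, which lines up with dom0_mem=max:3574MB being the last value that boots, and all usable regions together fall roughly 10MB short of 4GB.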
On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:
>>> Do you see any messages about 'Cannot find 20 bytes in node X' (where X
>>> I think is 0)?
>>
>> I haven't spotted any such message.
>
> Try fiddling with the dom0_mem.. to see at what point it starts failing. Is
> this happening only on this machine or do you see it on other boxes too?
>
> Your E820 looks as so:
> BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
> BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
> BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
> BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
> BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
> BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
> BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
> BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
>
> Which looks completely normal.. I am really at loss here. You could
> also sprinkle printk's around that code (or xen_raw_printk and inhibit
> the Linux kernel console output - that way you would only see the Xen
> and output from xen_raw_printk).
>
> Let me bootup 2.6.37 on a 4GB machine just to see if I am seeing this.

My next theory is that the issue is an alignment issue. The NODE DATA is put in the range 00000000df659800 to 00000000df66d7ff (the top end of the second "usable" chunk) and the problem comes when it tries to write to the final 2K piece (00000000df66d000 to 00000000df66d800 - 00000000df66d000 occurs on the stack) which hasn't been initialized properly because it isn't a 4K piece.

Does this sound plausible?

Michael Young
On Tue, 18 Jan 2011, M A Young wrote:
> My next theory is that the issue is an alignment issue.
> The NODE DATA is put in the range 00000000df659800 to 00000000df66d7ff (the
> top end of the second "usable" chunk) and the problem comes when it tries to
> write to the final 2K piece (00000000df66d000 to 00000000df66d800 -
> 00000000df66d000 occurs on the stack) which hasn't been initialized properly
> because it isn't a 4K piece.
> Does this sound plausible?

Further experiments confirm that it is this 2K piece causing the problem - if I reserve the 2K chunk in the same way that NODE DATA is reserved (though without zeroing it) the system boots, if I reduce this to reserving only 1K then it doesn't.

Michael Young
Konrad Rzeszutek Wilk
2011-Jan-20  19:24 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Wed, Jan 19, 2011 at 10:54:00PM +0000, M A Young wrote:
> On Tue, 18 Jan 2011, M A Young wrote:
>
> >My next theory is that the issue is an alignment issue.
> >The NODE DATA is put in the range
> >00000000df659800 to 00000000df66d7ff (the top end of the second
> >"usable" chunk) and the problem comes when it tries to write to the
> >final 2K piece (00000000df66d000 to 00000000df66d800 -
> >00000000df66d000 occurs on the stack) which hasn't been
> >initialized properly because it isn't a 4K piece.
> >Does this sound plausible?
>
> Further experiments confirm that it is this 2K piece causing the
> problem - if I reserve the 2K chunk in the same way that NODE DATA
> is reserved (though without zeroing it) the system boots, if I
> reduce this to reserving only 1K then it doesn't.

I think my math is off here. The reserve call is made on the df659800 -> df66d7ff, that would be 20 pages of data. The last PFN df66d is where it dies b/c there is no PTE entry set for it?

What happens if you fudge the code so it allocates those pages to be page aligned. So df65a000->df66e000 ? We skip this way the region df659800->df659fff and start on a new PFN (and pte).
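The page arithmetic in this message can be sketched as follows. This is only an illustration of the numbers being discussed, assuming 4 KiB pages (PAGE_SHIFT of 12); the helper names are made up and none of this is kernel code.

```python
PAGE_SHIFT = 12                 # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def pfn_span(start, end_inclusive):
    """First and last page frame numbers touched by a byte range."""
    return start >> PAGE_SHIFT, end_inclusive >> PAGE_SHIFT


def align_up(addr):
    """Round an address up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


# The NODE DATA reservation reported in the thread.
start, end = 0xdf659800, 0xdf66d7ff
size = end - start + 1                  # 0x14000 bytes, i.e. 20 pages of data
first, last = pfn_span(start, end)      # unaligned, so it touches one extra frame
frames = last - first + 1

# The page-aligned placement suggested in the message.
aligned_start = align_up(start)
aligned_end = aligned_start + size      # note: this lies past df66d800

print("size: %#x bytes = %d pages" % (size, size // PAGE_SIZE))
print("frames touched: %#x..%#x (%d frames)" % (first, last, frames))
print("aligned range: %#x -> %#x" % (aligned_start, aligned_end))
```

Because the range starts half way into a page, 20 pages of data touch 21 page frames, and the last of them is df66d, the frame that holds the final 2K piece.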
On Thu, 20 Jan 2011, Konrad Rzeszutek Wilk wrote:
> I think my math is off here. The reserve call is made on the
> df659800 -> df66d7ff, that would be 20 pages of data. The last
> PFN df66d is where it dies b/c there is no PTE entry set for it?
>
> What happens if you fudge the code so it allocates those pages to be
> page aligned. So df65a000->df66e000 ? We skip this way the region
> df659800->df659fff and start on a new PFN (and pte).

I get (though the photo isn't clear in places) df659000->df66cfff and it crashes at find_range_array+0x4d/0x56 which traces back to the call of memblock_find_dma_reserve from setup_arch in arch/x86/kernel/setup.c . So it still crashes, but at a slightly later stage.

Michael Young
Konrad Rzeszutek Wilk
2011-Jan-21  15:27 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Thu, Jan 20, 2011 at 10:39:17PM +0000, M A Young wrote:
> On Thu, 20 Jan 2011, Konrad Rzeszutek Wilk wrote:
>
> >I think my math is off here. The reserve call is made on the
> >df659800 -> df66d7ff, that would be 20 pages of data. The last
> >PFN df66d is where it dies b/c there is no PTE entry set for it?
> >
> >What happens if you fudge the code so it allocates those pages to be
> >page aligned. So df65a000->df66e000 ? We skip this way the region
> >df659800->df659fff and start on a new PFN (and pte).
>
> I get (though the photo isn't clear in places) df659000->df66cfff
> and it crashes at find_range_array+0x4d/0x56 which traces back to
> the call of memblock_find_dma_reserve from setup_arch in
> arch/x86/kernel/setup.c . So it still crashes, but at a slightly
> later stage.

Ok, so we just pass the buck, so to say, to the next user of that PFN.

We should find out why that PTE is not being setup.... And I think this might be a missing entry in the MFN (thanks to Stefan Bader finding a bug there). Looking at your E820:

[ 0.000000] Xen: 0000000000100000 - 000000003b0e2000 (usable)

Your memory ends at 3b0e, which is not on a nice page boundary.

Can you try this patch (you will need to re-gigger as in 2.6.38-rc1 the p2m code moved out of xen/mmu.c to xen/p2m.c):

https://patchwork.kernel.org/patch/492011/

BTW, you are doing great detective work here. Thanks for being willing to dig in this.
On Fri, 21 Jan 2011, Konrad Rzeszutek Wilk wrote:
> We should find out why that PTE is not being set up. I think
> this might be a missing entry in the MFN (thanks to Stefan Bader
> for finding a bug there). Looking at your E820:
>
> [ 0.000000] Xen: 0000000000100000 - 000000003b0e2000 (usable)

Mine is

[ 0.000000] Xen: 0000000000100000 - 00000000df66d800 (usable)

> Your memory ends at 3b0e, which is not on a nice page boundary.

Mine isn't on a page boundary at all!

> Can you try this patch (you will need to re-jigger it, as in 2.6.38-rc1
> the p2m code moved out of xen/mmu.c to xen/p2m.c):

It doesn't help, and crashes at the same place as the unaltered kernel. My problem may not be happening in the xen code at all. From the boot logs of one of my hack attempts that actually booted I have

[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen: 0000000000000000 - 000000000009f000 (usable)
[ 0.000000] Xen: 000000000009f000 - 0000000000100000 (reserved)
[ 0.000000] Xen: 0000000000100000 - 00000000df66d800 (usable)
[ 0.000000] Xen: 00000000df66d800 - 00000000e0000000 (reserved)
[ 0.000000] Xen: 00000000f8000000 - 00000000fc000000 (reserved)
[ 0.000000] Xen: 00000000fec00000 - 00000000fec10000 (reserved)
[ 0.000000] Xen: 00000000fed18000 - 00000000fed1c000 (reserved)
[ 0.000000] Xen: 00000000fed20000 - 00000000fed90000 (reserved)
[ 0.000000] Xen: 00000000feda0000 - 00000000feda6000 (reserved)
[ 0.000000] Xen: 00000000fee00000 - 00000000fee10000 (reserved)
[ 0.000000] Xen: 00000000ffe00000 - 0000000100000000 (reserved)
[ 0.000000] Xen: 0000000100000000 - 00000001342cb000 (usable)
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI 2.4 present.
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x1342cb max_arch_pfn = 0x400000000
[ 0.000000] last_pfn = 0xdf66d max_arch_pfn = 0x400000000
[ 0.000000] init_memory_mapping: 0000000000000000-00000000df66d000
[ 0.000000] init_memory_mapping: 0000000100000000-00000001342cb000

The last_pfn figure above is actually one more than the last pfn that is initialized, and is obtained by right-shifting the end address (the start address of the memory piece plus its length). That is fine if the memory ends on a page boundary, but not if it doesn't, because the partial page doesn't get a pfn. That partial page is nevertheless available for early allocations such as the NODE_DATA chunk. Xen goes for the memory chunk just below the 4GB mark and hits this region; bare metal (2.6.35) starts the NODE_DATA at the 4GB mark and doesn't.

I am not sure if bare metal is clever enough not to try to use this partial page, or whether it could but misses it because of how it places the NODE_DATA (at the bottom end of a memory region rather than the top end).

Michael Young
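To make the last_pfn arithmetic described above concrete, here is a small sketch. It is not kernel code: the helper names are made up for illustration, and 4 KiB pages (PAGE_SHIFT of 12) are assumed, as on x86. It shows how right-shifting an end address silently drops a trailing partial page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Hypothetical helper: the "last_pfn" style figure for a region whose
 * exclusive end address is 'end'. The shift discards any partial page. */
static uint64_t end_to_last_pfn(uint64_t end)
{
	return end >> PAGE_SHIFT;
}

/* Hypothetical helper: number of bytes of the region that sit in a
 * trailing partial page (0 when 'end' is page aligned). */
static uint64_t partial_tail_bytes(uint64_t end)
{
	return end & (PAGE_SIZE - 1);
}

/* Self-check using the two usable regions from the boot log above. */
static void check_thread_numbers(void)
{
	/* Region ending at df66d800: 2048 bytes are left without a pfn. */
	assert(end_to_last_pfn(0xdf66d800ULL) == 0xdf66dULL);
	assert(partial_tail_bytes(0xdf66d800ULL) == 2048);

	/* Region ending at 1342cb000: page aligned, nothing left over. */
	assert(end_to_last_pfn(0x1342cb000ULL) == 0x1342cbULL);
	assert(partial_tail_bytes(0x1342cb000ULL) == 0);
}
```

The first case matches the log: last_pfn is 0xdf66d and init_memory_mapping stops at df66d000, so the bytes from df66d000 to df66d7ff are listed as usable but are not covered by the mapping.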
Konrad Rzeszutek Wilk
2011-Jan-24  14:14 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Fri, Jan 21, 2011 at 09:43:34PM +0000, M A Young wrote:
> On Fri, 21 Jan 2011, Konrad Rzeszutek Wilk wrote:
>
> > We should find out why that PTE is not being set up. I think
> > this might be a missing entry in the MFN (thanks to Stefan Bader
> > for finding a bug there). Looking at your E820:
> >
> > [ 0.000000] Xen: 0000000000100000 - 000000003b0e2000 (usable)
>
> Mine is
> [ 0.000000] Xen: 0000000000100000 - 00000000df66d800 (usable)
>
> > Your memory ends at 3b0e, which is not on a nice page boundary.
>
> Mine isn't on a page boundary at all!

Whoa.

> > Can you try this patch (you will need to re-jigger it, as in 2.6.38-rc1
> > the p2m code moved out of xen/mmu.c to xen/p2m.c):
>
> It doesn't help, and crashes at the same place as the unaltered
> kernel. My problem may not be happening in the xen code at all. From
> the boot logs of one of my hack attempts that actually booted I have
>
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] Xen: 0000000000000000 - 000000000009f000 (usable)
> [ 0.000000] Xen: 000000000009f000 - 0000000000100000 (reserved)
> [ 0.000000] Xen: 0000000000100000 - 00000000df66d800 (usable)
> [ 0.000000] Xen: 00000000df66d800 - 00000000e0000000 (reserved)
> [ 0.000000] Xen: 00000000f8000000 - 00000000fc000000 (reserved)
> [ 0.000000] Xen: 00000000fec00000 - 00000000fec10000 (reserved)
> [ 0.000000] Xen: 00000000fed18000 - 00000000fed1c000 (reserved)
> [ 0.000000] Xen: 00000000fed20000 - 00000000fed90000 (reserved)
> [ 0.000000] Xen: 00000000feda0000 - 00000000feda6000 (reserved)
> [ 0.000000] Xen: 00000000fee00000 - 00000000fee10000 (reserved)
> [ 0.000000] Xen: 00000000ffe00000 - 0000000100000000 (reserved)
> [ 0.000000] Xen: 0000000100000000 - 00000001342cb000 (usable)
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] DMI 2.4 present.
> [ 0.000000] No AGP bridge found
> [ 0.000000] last_pfn = 0x1342cb max_arch_pfn = 0x400000000
> [ 0.000000] last_pfn = 0xdf66d max_arch_pfn = 0x400000000
> [ 0.000000] init_memory_mapping: 0000000000000000-00000000df66d000
> [ 0.000000] init_memory_mapping: 0000000100000000-00000001342cb000
>
> The last_pfn figure above is actually one more than the last pfn
> that is initialized and is obtained by right-shifting the start
> memory address plus the length of the memory piece. That is fine if
> the memory ends on a page boundary, but not if it doesn't because
> the partial page doesn't get a pfn. Thus it is available for early

We can fix how the E820 is done. Look in arch/x86/xen/setup.c for the 'xen_memory_setup' function. Try making map[i].size be map[i].size & ~(PAGE_SIZE-1); that should trim off the last 2048 bytes.

> allocations such as the NODE DATA chunk. Xen goes for the memory
> chunk just below the 4GB mark and hits this region, bare metal
> (2.6.35) starts the NODE DATA at the 4GB mark and doesn't.

That should be generic and hit both cases, but I think this got fixed in 2.6.36-ish, where going for the region right underneath 4GB is no longer done (I don't remember the details, sadly).

> I am not sure if bare metal is clever enough not to try to use this
> partial page, or whether it could but misses it because of how it
> places the NODE_DATA (at the bottom end of a memory region rather
> than the top end).

If you leave in the instrumentation you added and boot with 'memblock=debug', that should give you a good idea of how it does it.

> Michael Young
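The "last 2048 bytes" figure can be checked with a few lines of standalone C. This is only a sketch of the suggestion above, not the eventual patch: struct region is a simplified stand-in for an E820 map entry, the function name is hypothetical, and 4 KiB pages are assumed. The assert inside documents an extra assumption of mine, namely that masking the size is only right when the start address is itself page aligned.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Simplified stand-in for an E820 map entry (illustration only). */
struct region {
	uint64_t addr;
	uint64_t size;
};

/* Round the size down to a whole number of pages and return how many
 * bytes were dropped. Only meaningful if r->addr is page aligned. */
static uint64_t trim_size_to_pages(struct region *r)
{
	uint64_t old = r->size;

	assert(r->addr % PAGE_SIZE == 0);
	r->size &= ~(PAGE_SIZE - 1);
	return old - r->size;
}
```

For the usable region 0000000000100000 - 00000000df66d800 from the log, the size is 0xdf56d800; masking gives 0xdf56d000, so 0x800 (2048) bytes are dropped and the region then ends at df66d000.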
Stefano Stabellini
2011-Jan-24  19:04 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
I have a work-in-progress patch that fixes a booting issue on one of my
testboxes. Could you please give it a try, passing dom0_mem=700M to the
Xen command line?
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 947f42a..ebc0221 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -291,10 +291,23 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 		 * located on different 2M pages. cleanup_highmap(), however,
 		 * can only consider _end when it runs, so destroy any
 		 * mappings beyond _brk_end here.
+		 * Be careful not to go over _end.
 		 */
 		pud = pud_offset(pgd_offset_k(_brk_end), _brk_end);
 		pmd = pmd_offset(pud, _brk_end - 1);
-		while (++pmd <= pmd_offset(pud, (unsigned long)_end - 1))
+		while (++pmd < pmd_offset(pud, (unsigned long)_end - 1))
+			pmd_clear(pmd);
+		if (((unsigned long)_end) & ~PMD_MASK) {
+			pte_t *pte;
+			unsigned long addr;
+			for (addr = ((unsigned long)_end) & PMD_MASK;
+					addr < ((unsigned long)_end);
+					addr += PAGE_SIZE) {
+				pte = pte_offset_map(pmd, addr);
+				pte_clear(&init_mm, addr, pte);
+				pte_unmap(pte);
+			}
+		} else
 			pmd_clear(pmd);
 	}
 #endif
On Mon, 24 Jan 2011, Konrad Rzeszutek Wilk wrote:
> We can fix how the E820 is done.
> Look in arch/x86/xen/setup.c for the 'xen_memory_setup' function.
> Try making map[i].size be map[i].size & ~(PAGE_SIZE-1);
> that should trim off the last 2048 bytes.

The attached patch works for me, though it does assume the memory region starts on a page boundary.

Michael Young
On Mon, 24 Jan 2011, Stefano Stabellini wrote:
> I have a work-in-progress patch that fixes a booting issue on one of my
> testboxes. Could you please give it a try, passing dom0_mem=700M to the
> Xen command line?

It wouldn't prove anything in my case as booting with dom0_mem=700M works.

Michael Young
Stefano Stabellini
2011-Jan-25  12:03 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Mon, 24 Jan 2011, M A Young wrote:
> On Mon, 24 Jan 2011, Konrad Rzeszutek Wilk wrote:
>
> > We can fix how the E820 is done.
> > Look in arch/x86/xen/setup.c for the 'xen_memory_setup' function.
> > Try making map[i].size be map[i].size & ~(PAGE_SIZE-1);
> > that should trim off the last 2048 bytes.
>
> The attached patch works for me, though it does assume the memory region
> starts on a page boundary.

It turns out that it is me having the same issue you have and not the other way around :)

Your patch (in addition to my previous patch) makes my testbox boot, no matter what dom0_mem parameter I choose.

Appended is a version of the patch that doesn't assume that the memory region starts on a page boundary.

---

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index b5a7f92..a3d28a1 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
 	e820.nr_map = 0;
 	xen_extra_mem_start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end = map[i].addr + map[i].size;
+		unsigned long long end;
+		if (map[i].type == E820_RAM)
+			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
+		end = map[i].addr + map[i].size;
 
 		if (map[i].type == E820_RAM && end > mem_end) {
 			/* RAM off the end - may be partially included */
On Tue, 2011-01-25 at 12:03 +0000, Stefano Stabellini wrote:
> On Mon, 24 Jan 2011, M A Young wrote:
> > On Mon, 24 Jan 2011, Konrad Rzeszutek Wilk wrote:
> >
> > > We can fix how the E820 is done.
> > > Look in arch/x86/xen/setup.c for the 'xen_memory_setup' function.
> > > Try making map[i].size be map[i].size & ~(PAGE_SIZE-1);
> > > that should trim off the last 2048 bytes.
> >
> > The attached patch works for me, though it does assume the memory region
> > starts on a page boundary.
>
> It turns out that it is me having the same issue you have and not the
> other way around :)
>
> Your patch (in addition to my previous patch) makes my testbox boot, no
> matter what dom0_mem parameter I choose.
>
> Appended is a version of the patch that doesn't assume that the memory
> region starts on a page boundary.
>
> ---
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index b5a7f92..a3d28a1 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
>  	e820.nr_map = 0;
>  	xen_extra_mem_start = mem_end;
>  	for (i = 0; i < memmap.nr_entries; i++) {
> -		unsigned long long end = map[i].addr + map[i].size;
> +		unsigned long long end;
> +		if (map[i].type == E820_RAM)
> +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;

The more normal idiom to round down to a page boundary in the kernel is:
	map[i].size &= ~(PAGE_SIZE-1);

Do you also need to page align map[i].addr upwards for maximum safety?

Ian.

> +		end = map[i].addr + map[i].size;
>
>  		if (map[i].type == E820_RAM && end > mem_end) {
>  			/* RAM off the end - may be partially included */
Stefano Stabellini
2011-Jan-25  13:31 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Tue, 25 Jan 2011, Ian Campbell wrote:
> > It turns out that it is me having the same issue you have and not the
> > other way around :)
> >
> > Your patch (in addition to my previous patch) makes my testbox boot, no
> > matter what dom0_mem parameter I choose.
> >
> > Appended is a version of the patch that doesn't assume that the memory
> > region starts on a page boundary.
> >
> > ---
> >
> > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> > index b5a7f92..a3d28a1 100644
> > --- a/arch/x86/xen/setup.c
> > +++ b/arch/x86/xen/setup.c
> > @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
> >  	e820.nr_map = 0;
> >  	xen_extra_mem_start = mem_end;
> >  	for (i = 0; i < memmap.nr_entries; i++) {
> > -		unsigned long long end = map[i].addr + map[i].size;
> > +		unsigned long long end;
> > +		if (map[i].type == E820_RAM)
> > +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
>
> The more normal idiom to round down to a page boundary in the kernel is:
> 	map[i].size &= ~(PAGE_SIZE-1);
>
> Do you also need to page align map[i].addr upwards for maximum safety?

Unless I am very confused,

	map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;

is not the same as

	map[i].size &= ~(PAGE_SIZE-1);

because it also takes into account the possibility that map[i].addr is not page aligned. It doesn't move map[i].addr upward, but it still makes sure that the region ends at a page boundary anyway.
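The difference between the two expressions is easy to see with an unaligned start address. The following is a standalone sketch, not the kernel code: the function names are invented for illustration, 4 KiB pages are assumed, and the unaligned example address 0x100400 is hypothetical rather than taken from anyone's E820 map.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Mask idiom: rounds the *size* down to a multiple of PAGE_SIZE. */
static uint64_t trim_by_mask(uint64_t addr, uint64_t size)
{
	(void)addr;	/* the start address plays no part here */
	return size & ~(PAGE_SIZE - 1);
}

/* Modulo form from the patch: rounds the *end* (addr + size) down to a
 * page boundary by shrinking the size. */
static uint64_t trim_by_end(uint64_t addr, uint64_t size)
{
	uint64_t tail = (addr + size) % PAGE_SIZE;

	/* Assumption for this sketch: the region reaches at least one
	 * page boundary. A region lying entirely inside a single page
	 * would otherwise make the subtraction wrap around. */
	assert(tail <= size);
	return size - tail;
}
```

With an aligned start (addr 0x100000, size 0xdf56d800) both return 0xdf56d000. With the hypothetical addr 0x100400 and size 0x3000, the mask leaves the size at 0x3000, so the region still ends at 0x103400, while the modulo form returns 0x2c00 and the region ends at 0x103000.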
On Tue, 2011-01-25 at 13:31 +0000, Stefano Stabellini wrote:
> On Tue, 25 Jan 2011, Ian Campbell wrote:
> > > It turns out that it is me having the same issue you have and not the
> > > other way around :)
> > >
> > > Your patch (in addition to my previous patch) makes my testbox boot, no
> > > matter what dom0_mem parameter I choose.
> > >
> > > Appended is a version of the patch that doesn't assume that the memory
> > > region starts on a page boundary.
> > >
> > > ---
> > >
> > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> > > index b5a7f92..a3d28a1 100644
> > > --- a/arch/x86/xen/setup.c
> > > +++ b/arch/x86/xen/setup.c
> > > @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
> > >  	e820.nr_map = 0;
> > >  	xen_extra_mem_start = mem_end;
> > >  	for (i = 0; i < memmap.nr_entries; i++) {
> > > -		unsigned long long end = map[i].addr + map[i].size;
> > > +		unsigned long long end;
> > > +		if (map[i].type == E820_RAM)
> > > +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
> >
> > The more normal idiom to round down to a page boundary in the kernel is:
> > 	map[i].size &= ~(PAGE_SIZE-1);
> >
> > Do you also need to page align map[i].addr upwards for maximum safety?
>
> unless I am very confused
>
> map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
>
> is not the same as:
>
> map[i].size &= ~(PAGE_SIZE-1)
>
> because it also takes into account the possibility that map[i].addr is
> not page aligned.

Oh yes, I didn't notice that aspect of it.

> It doesn't move map[i].addr upward but still makes sure that
> the region ends at a page boundary anyway.

Which returns to my second question ;-) Why do we not need to align addr too?

Ian.
Stefano Stabellini
2011-Jan-25  15:19 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Tue, 25 Jan 2011, Ian Campbell wrote:
> > unless I am very confused
> >
> > map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> >
> > is not the same as:
> >
> > map[i].size &= ~(PAGE_SIZE-1)
> >
> > because it also takes into account the possibility that map[i].addr is
> > not page aligned.
>
> Oh yes, I didn't notice that aspect of it.
>
> > It doesn't move map[i].addr upward but still makes sure that
> > the region ends at a page boundary anyway.
>
> Which returns to my second question ;-) Why do we not need to align addr
> too?

My machine can boot fine with a map[i].addr not page aligned.
Konrad Rzeszutek Wilk
2011-Jan-25  15:52 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Tue, Jan 25, 2011 at 03:19:22PM +0000, Stefano Stabellini wrote:
> On Tue, 25 Jan 2011, Ian Campbell wrote:
> > > unless I am very confused
> > >
> > > map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> > >
> > > is not the same as:
> > >
> > > map[i].size &= ~(PAGE_SIZE-1)
> > >
> > > because it also takes into account the possibility that map[i].addr is
> > > not page aligned.
> >
> > Oh yes, I didn't notice that aspect of it.
> >
> > > It doesn't move map[i].addr upward but still makes sure that
> > > the region ends at a page boundary anyway.
> >
> > Which returns to my second question ;-) Why do we not need to align addr
> > too?
>
> My machine can boot fine with a map[i].addr not page aligned.

OK, so then the patch that M A Young came up with ought to do it?
Stefano Stabellini
2011-Jan-25  15:56 UTC
Re: [Xen-devel] Crash on boot with 2.6.37-rc8-git3
On Tue, 25 Jan 2011, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 25, 2011 at 03:19:22PM +0000, Stefano Stabellini wrote:
> > On Tue, 25 Jan 2011, Ian Campbell wrote:
> > > > unless I am very confused
> > > >
> > > > map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> > > >
> > > > is not the same as:
> > > >
> > > > map[i].size &= ~(PAGE_SIZE-1)
> > > >
> > > > because it also takes into account the possibility that map[i].addr is
> > > > not page aligned.
> > >
> > > Oh yes, I didn't notice that aspect of it.
> > >
> > > > It doesn't move map[i].addr upward but still makes sure that
> > > > the region ends at a page boundary anyway.
> > >
> > > Which returns to my second question ;-) Why do we not need to align addr
> > > too?
> >
> > My machine can boot fine with a map[i].addr not page aligned.
>
> OK, so then the patch that M A Young came up with ought to do it?

I think you need the slightly improved version I posted before, which can handle map[i].addr not being page aligned (I silently added an s-o-b for Young, I hope he's OK with this).

---

commit b84683ad1e704c2a296d08ff0cbe29db936f94a7
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date:   Tue Jan 25 12:03:42 2011 +0000

    xen: make sure the e820 memory regions end at page boundary

    Signed-off-by: M A Young <m.a.young@durham.ac.uk>
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index b5a7f92..a3d28a1 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
 	e820.nr_map = 0;
 	xen_extra_mem_start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end = map[i].addr + map[i].size;
+		unsigned long long end;
+		if (map[i].type == E820_RAM)
+			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
+		end = map[i].addr + map[i].size;
 
 		if (map[i].type == E820_RAM && end > mem_end) {
 			/* RAM off the end - may be partially included */
On Tue, 25 Jan 2011, Stefano Stabellini wrote:
> I think you need the slightly improved version I posted before that can
> handle map[i].addr not page aligned (I silently added a s-o-b Young, I
> hope he's OK with this).

Yes and yes. My version doesn't work if map[i].addr is not page aligned. The aim is to make sure the end address is page aligned, and avoid ending with a partial page which won't have a PFN and might also require different treatment if there is reserved content in the rest of the page (which is true in my case).

Michael Young

> commit b84683ad1e704c2a296d08ff0cbe29db936f94a7
> Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Date:   Tue Jan 25 12:03:42 2011 +0000
>
>     xen: make sure the e820 memory regions end at page boundary
>
>     Signed-off-by: M A Young <m.a.young@durham.ac.uk>
>     Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index b5a7f92..a3d28a1 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
>  	e820.nr_map = 0;
>  	xen_extra_mem_start = mem_end;
>  	for (i = 0; i < memmap.nr_entries; i++) {
> -		unsigned long long end = map[i].addr + map[i].size;
> +		unsigned long long end;
> +		if (map[i].type == E820_RAM)
> +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
> +		end = map[i].addr + map[i].size;
>
>  		if (map[i].type == E820_RAM && end > mem_end) {
>  			/* RAM off the end - may be partially included */