Hi, Aron, Juan, all

We could not boot dom0 with kernel-xen-2.6.17-1.2488.fc6 on Tiger4.
Kouya and I debugged it and found that the kernel text area was being cleared.
Isaku made a patch which resolves the issue, and I tested booting dom0
with Isaku's patch. My test results are below.

I used the following as the dom0 kernel:
http://hg.et.redhat.com/kernel/linux-2.6-xen-fedora
+kernel-2.6.17-ia64-xen.config+rsvd_region.patch
And I used the latest xen-ia64-unstable.hg and xen-unstable.
With this dom0 kernel and this Xen, I can boot dom0.

Then I looked through the recent changesets and found the following mismatch.
http://hg.et.redhat.com/kernel/linux-2.6-xen-fedora has the vDSO
linux-2.6-sparse changes, but the hypervisor doesn't have them.

changeset:   10703:8d501f39286c
user:        awilliam@xenbuild.aw
date:        Mon Jul 24 13:43:35 2006 -0600
summary:     [IA64] vDSO paravirtualization: paravirtualize vDSO

changeset:   10702:614deef19299
user:        awilliam@xenbuild.aw
date:        Mon Jul 24 13:04:40 2006 -0600
summary:     [IA64] vDSO paravirtualization: import linux files

I think we need to update the hypervisor.
Is this possible?

Best Regards,

Akio Takebe
Akio Takebe wrote: [Thu Aug 03 2006, 03:41:40PM EDT]
> We could not boot dom0 with kernel-xen-2.6.17-1.2488.fc6 on Tiger4.
> Kouya and I debugged it and found that the kernel text area was being cleared.
> Isaku made a patch which resolves the issue.
...
> With this dom0 kernel and this Xen, I can boot dom0.

Woohoo! Thanks for debugging this!

> Then I looked through the recent changesets and found the following mismatch.
> http://hg.et.redhat.com/kernel/linux-2.6-xen-fedora has the vDSO
> linux-2.6-sparse changes, but the hypervisor doesn't have them.
>
> changeset:   10703:8d501f39286c
> user:        awilliam@xenbuild.aw
> date:        Mon Jul 24 13:43:35 2006 -0600
> summary:     [IA64] vDSO paravirtualization: paravirtualize vDSO
>
> changeset:   10702:614deef19299
> user:        awilliam@xenbuild.aw
> date:        Mon Jul 24 13:04:40 2006 -0600
> summary:     [IA64] vDSO paravirtualization: import linux files
>
> I think we need to update the hypervisor.
> Is this possible?

Yes, Juan is working on this now. Thanks again for figuring out the
problem. :-)

Aron
Akio Takebe
2006-Aug-03 20:34 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Fedora-xen-ia64 test status
Hi, Xen/IA64 people

When I boot dom0, I get some __might_sleep() messages.

BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1

Call Trace:
 [<a00000010001c7a0>] show_stack+0x40/0xa0
                                sp=e00000002f7dfbb0 bsp=e00000002f7d91a8
 [<a00000010001c830>] dump_stack+0x30/0x60
                                sp=e00000002f7dfd80 bsp=e00000002f7d9190
 [<a00000010006c920>] __might_sleep+0x2a0/0x2c0
                                sp=e00000002f7dfd80 bsp=e00000002f7d9168
 [<a0000001000bce80>] down_read+0x20/0x60
                                sp=e00000002f7dfd80 bsp=e00000002f7d9148
 [<a0000001005b14f0>] ia64_do_page_fault+0x110/0x9e0
                                sp=e00000002f7dfd80 bsp=e00000002f7d90f8
 [<a000000100064be0>] xen_leave_kernel+0x0/0x3b0
                                sp=e00000002f7dfe30 bsp=e00000002f7d90f8

Do we need CONFIG_DEBUG_SPINLOCK_SLEEP?
If we need CONFIG_DEBUG_SPINLOCK_SLEEP, we must modify __might_sleep().

    void __might_sleep(char *file, int line)
    {
    #if defined(in_atomic)
        static unsigned long prev_jiffy;    /* ratelimiting */

        if ((in_atomic() || irqs_disabled()) &&
            system_state == SYSTEM_RUNNING && !oops_in_progress) {
            if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy) /* <<<< this */
                return;
            prev_jiffy = jiffies;
            printk(KERN_ERR "Debug: sleeping function called from invalid"
                    " context at %s:%d\n", file, line);
            printk("in_atomic():%d, irqs_disabled():%d\n",
                    in_atomic(), irqs_disabled());
            dump_stack();
        }
    #endif
    }

Best Regards,

Akio Takebe
Dave Jones
2006-Aug-03 20:46 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Fedora-xen-ia64 test status
On Fri, Aug 04, 2006 at 05:34:50AM +0900, Akio Takebe wrote:

> Hi, Xen/IA64 people
>
> When I boot the dom0, I get some __might_sleep()'s messages.
>
> BUG: sleeping function called from invalid context at kernel/rwsem.c:20
> in_atomic():0, irqs_disabled():1
>
> Call Trace:
>  [<a00000010001c7a0>] show_stack+0x40/0xa0
>                                 sp=e00000002f7dfbb0 bsp=e00000002f7d91a8
>  [<a00000010001c830>] dump_stack+0x30/0x60
>                                 sp=e00000002f7dfd80 bsp=e00000002f7d9190
>  [<a00000010006c920>] __might_sleep+0x2a0/0x2c0
>                                 sp=e00000002f7dfd80 bsp=e00000002f7d9168
>  [<a0000001000bce80>] down_read+0x20/0x60
>                                 sp=e00000002f7dfd80 bsp=e00000002f7d9148
>  [<a0000001005b14f0>] ia64_do_page_fault+0x110/0x9e0
>                                 sp=e00000002f7dfd80 bsp=e00000002f7d90f8
>  [<a000000100064be0>] xen_leave_kernel+0x0/0x3b0
>                                 sp=e00000002f7dfe30 bsp=e00000002f7d90f8
>
>
> Do we need CONFIG_DEBUG_SPINLOCK_SLEEP?

Yes, because it highlights bugs like the above.

> If we need CONFIG_DEBUG_SPINLOCK_SLEEP, we must modify __might_sleep().

No, you must fix ia64 so that it doesn't call down_read with
interrupts disabled.

        Dave

--
http://www.codemonkey.org.uk
On Fri, 2006-08-04 at 04:41 +0900, Akio Takebe wrote:

Hi Akio

> I think we need to update the hypervisor.
> Is this possible?

Updating Hypervisor, thanks for finding it.

Later, Juan.
Akio Takebe
2006-Aug-03 21:09 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Fedora-xen-ia64 test status
Hi, Dave and Xen people

> > Do we need CONFIG_DEBUG_SPINLOCK_SLEEP?
>
> Yes, because it highlights bugs like the above.
>
> > If we need CONFIG_DEBUG_SPINLOCK_SLEEP, we must modify __might_sleep().
>
> No, you must fix ia64 so that it doesn't call down_read with
> interrupts disabled.

OK, but I think this is not a bug.
I think these messages are caused by the domain scheduler of Xen.
The time slice of the domain scheduler is longer than HZ (probably).
Am I right, Xen people? Please comment.

Best Regards,

Akio Takebe
Hi Juan,

Here is the patch cleaned up to apply to your current tree.
I omitted the #ifdef notyet parts because they're already fixed by the
patch I sent yesterday.
https://www.redhat.com/archives/fedora-xen/2006-August/msg00019.html

Regards,
Aron

# HG changeset patch
# User agriffis@cheo.zko.hp.com
# Node ID 8040a45ec900fda375346157f4028da57bb62e20
# Parent 33a2d3fb09120f3c9342ddeba2c20200571de75b
Compare both start and end when sorting rsvd_region list, then collapse overlapping regions.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Aron Griffis <aron@hp.com>

diff -r 33a2d3fb0912 -r 8040a45ec900 arch/ia64/kernel/setup.c
--- a/arch/ia64/kernel/setup.c	Wed Aug 02 22:25:15 2006 -0400
+++ b/arch/ia64/kernel/setup.c	Thu Aug 03 17:17:14 2006 -0400
@@ -196,21 +196,75 @@ filter_rsvd_memory (unsigned long start,
 	return 0;
 }
 
+static int __init
+rsvd_region_cmp(struct rsvd_region *lhs, struct rsvd_region *rhs)
+{
+	if (lhs->start > rhs->start)
+		return 1;
+	if (lhs->start < rhs->start)
+		return -1;
+
+	if (lhs->end > rhs->end)
+		return 1;
+	if (lhs->end < rhs->end)
+		return -1;
+
+	return 0;
+}
+
 static void __init
 sort_regions (struct rsvd_region *rsvd_region, int max)
 {
+	int num = max;
 	int j;
 
 	/* simple bubble sorting */
 	while (max--) {
 		for (j = 0; j < max; ++j) {
-			if (rsvd_region[j].start > rsvd_region[j+1].start) {
+			if (rsvd_region_cmp(&rsvd_region[j],
+					    &rsvd_region[j + 1]) > 0) {
 				struct rsvd_region tmp;
 				tmp = rsvd_region[j];
 				rsvd_region[j] = rsvd_region[j + 1];
 				rsvd_region[j + 1] = tmp;
 			}
 		}
+	}
+
+	for (j = 0; j < num; j++) {
+		printk("rsvd_region[%d]: [0x%016lx, 0x%06lx)\n",
+		       j, rsvd_region[j].start, rsvd_region[j].end);
+	}
+
+	for (j = 0; j < num - 1; j++) {
+		int k;
+		unsigned long start = rsvd_region[j].start;
+		unsigned long end = rsvd_region[j].end;
+		int collapsed;
+
+		for (k = j + 1; k < num; k++) {
+			BUG_ON(start > rsvd_region[k].start);
+			if (end < rsvd_region[k].start) {
+				k--;
+				break;
+			}
+			end = max(end, rsvd_region[k].end);
+		}
+		if (k == num) {
+			k--;
+		}
+		rsvd_region[j].end = end;
+		collapsed = k - j;
+		for (k = j + 1; k < j + 1 + collapsed; k++) {
+			rsvd_region[k] = rsvd_region[k + collapsed];
+		}
+		num -= collapsed;
+	}
+
+	num_rsvd_regions = num;
+	for (j = 0; j < num; j++) {
+		printk("rsvd_region[%d]: [0x%016lx, 0x%06lx)\n",
+		       j, rsvd_region[j].start, rsvd_region[j].end);
 	}
 }
On Thu, 2006-08-03 at 17:22 -0400, Aron Griffis wrote:
> Hi Juan,
>
> Here is the patch cleaned up to apply to your current tree.
> I omitted the #ifdef notyet parts because they're already fixed by the
> patch I sent yesterday.
> https://www.redhat.com/archives/fedora-xen/2006-August/msg00019.html

Added this patch. Actually updated HV to latest & same for kernel bits.

One of the machines here boots:

[root@frosty ~]# uname -a
Linux frosty.rhts.boston.redhat.com 2.6.17-1.2523xen #1 SMP Thu Aug 3 20:28:24 EDT 2006 ia64 ia64 ia64 GNU/Linux
[root@frosty ~]#

But still loads of:

BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1

Call Trace:
 [<a00000010001c8a0>] show_stack+0x40/0xa0
                                sp=e0000000165b7bb0 bsp=e0000000165b11b8
 [<a00000010001c930>] dump_stack+0x30/0x60
                                sp=e0000000165b7d80 bsp=e0000000165b11a0
 [<a000000100069280>] __might_sleep+0x2a0/0x2c0
                                sp=e0000000165b7d80 bsp=e0000000165b1178
 [<a0000001000b7080>] down_read+0x20/0x60
                                sp=e0000000165b7d80 bsp=e0000000165b1158
 [<a0000001005a8f30>] ia64_do_page_fault+0x110/0x9e0
                                sp=e0000000165b7d80 bsp=e0000000165b1108
 [<a000000100061520>] xen_leave_kernel+0x0/0x3b0
                                sp=e0000000165b7e30 bsp=e0000000165b1108

On the other machine (superdome), it still doesn't boot; it fails with:

(XEN) Dom0 max_vcpus=1
(XEN) Dom0: 0xf00000000425c080
(XEN) Domain0 EFI passthrough:enable lsapic entry: 0xf000070fffbc01cc
(XEN) DISABLE lsapic entry: 0xf000070fffbc01d8
(XEN) DISABLE lsapic entry: 0xf000070fffbc01e4
(XEN) DISABLE lsapic entry: 0xf000070fffbc01f0
(XEN) DISABLE lsapic entry: 0xf000070fffbc01fc
(XEN) DISABLE lsapic entry: 0xf000070fffbc0208
(XEN) DISABLE lsapic entry: 0xf000070fffbc0214
(XEN) DISABLE lsapic entry: 0xf000070fffbc0220
(XEN) DISABLE lsapic entry: 0xf000070fffbc022c
(XEN) ACPI 2.0=0x70fffbc0000 SMBIOS=0x1fffe000
(XEN) assign_new_domain_page: Can't alloc!!!! Aaaargh!
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) assign_new_domain0_page: can't allocate page for dom0****************************************
(XEN)
(XEN) Reboot in five seconds...

This is after I added (on Alex's advice) max_addr=32G; otherwise it stops
much earlier with a complaint that it is unable to get an order 8 page.

Later, Juan.
Isaku Yamahata
2006-Aug-04 02:30 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Re: Fedora-xen-ia64 test status
Hi Aron. Thank you for cleaning up.
However, there was a bug in it. Here is the updated one.

- removed debug print.
- bug fix: extents at the end could be discarded.
- coding style clean up

changeset:   34929:7ca328ac2929e1dfc19b298d7656e653a05d709b
tag:         tip
user:        yamahata@valinux.co.jp
date:        Fri Aug 04 11:17:17 2006 +0900
files:       arch/ia64/kernel/setup.c
description:
Compare both start and end when sorting rsvd_region list, then collapse overlapping regions.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Aron Griffis <aron@hp.com>

diff -r e677133a59a9c5313e6985e5ba63ff50aecec42c -r 7ca328ac2929e1dfc19b298d7656e653a05d709b arch/ia64/kernel/setup.c
--- a/arch/ia64/kernel/setup.c	Thu Aug 03 00:43:34 2006 +0200
+++ b/arch/ia64/kernel/setup.c	Fri Aug 04 11:17:17 2006 +0900
@@ -194,15 +194,33 @@ filter_rsvd_memory (unsigned long start,
 	return 0;
 }
 
+static int __init
+rsvd_region_cmp(struct rsvd_region *lhs, struct rsvd_region *rhs)
+{
+	if (lhs->start > rhs->start)
+		return 1;
+	if (lhs->start < rhs->start)
+		return -1;
+
+	if (lhs->end > rhs->end)
+		return 1;
+	if (lhs->end < rhs->end)
+		return -1;
+
+	return 0;
+}
+
 static void __init
 sort_regions (struct rsvd_region *rsvd_region, int max)
 {
+	int num = max;
 	int j;
 
 	/* simple bubble sorting */
 	while (max--) {
 		for (j = 0; j < max; ++j) {
-			if (rsvd_region[j].start > rsvd_region[j+1].start) {
+			if (rsvd_region_cmp(&rsvd_region[j],
+					    &rsvd_region[j + 1]) > 0) {
 				struct rsvd_region tmp;
 				tmp = rsvd_region[j];
 				rsvd_region[j] = rsvd_region[j + 1];
@@ -210,6 +228,31 @@ sort_regions (struct rsvd_region *rsvd_r
 			}
 		}
 	}
+
+	for (j = 0; j < num - 1; j++) {
+		int k;
+		unsigned long start = rsvd_region[j].start;
+		unsigned long end = rsvd_region[j].end;
+		int collapsed;
+
+		for (k = j + 1; k < num; k++) {
+			BUG_ON(start > rsvd_region[k].start);
+			if (end < rsvd_region[k].start) {
+				k--;
+				break;
+			}
+			end = max(end, rsvd_region[k].end);
+		}
+		if (k == num)
+			k--;
+		rsvd_region[j].end = end;
+		collapsed = k - j;
+		num -= collapsed;
+		for (k = j + 1; k < num; k++)
+			rsvd_region[k] = rsvd_region[k + collapsed];
+	}
+
+	num_rsvd_regions = num;
 }
 
 /*

--
yamahata
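For readers who want to try the idea outside the kernel, the "sort by start and end, then collapse overlapping regions" step can be sketched in user space roughly as follows. This is only an illustration, not the patch itself: `struct region`, `region_cmp` and `sort_and_collapse` are made-up names, `qsort()` replaces the bubble sort, and a single-pass merge with a write index replaces the patch's shift loop. Like the patch, it treats a region as separate only when the current end is strictly below the next start.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rsvd_region: [start, end). */
struct region {
	unsigned long start;
	unsigned long end;
};

/* Order by start, then by end, mirroring rsvd_region_cmp() in the patch. */
static int region_cmp(const void *a, const void *b)
{
	const struct region *lhs = a;
	const struct region *rhs = b;

	if (lhs->start != rhs->start)
		return lhs->start > rhs->start ? 1 : -1;
	if (lhs->end != rhs->end)
		return lhs->end > rhs->end ? 1 : -1;
	return 0;
}

/*
 * Sort r[0..num) and merge regions that overlap or touch.
 * Returns the number of regions left at the front of r[].
 */
static int sort_and_collapse(struct region *r, int num)
{
	int out = 0;	/* index of the merged region currently being built */
	int i;

	if (num <= 0)
		return 0;

	qsort(r, num, sizeof(*r), region_cmp);

	for (i = 1; i < num; i++) {
		/* Sorted input guarantees that starts never go backwards. */
		assert(r[out].start <= r[i].start);

		if (r[out].end < r[i].start) {
			/* Gap before r[i]: begin a new output region. */
			r[++out] = r[i];
		} else if (r[i].end > r[out].end) {
			/* Overlap or touch: extend the current region. */
			r[out].end = r[i].end;
		}
	}
	return out + 1;
}
```

Writing merged regions through a separate output index avoids shifting the tail of the array after every merge, which is the part of the patch that needed the follow-up fix.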
Rik van Riel
2006-Aug-04 03:48 UTC
Re: [Fedora-xen] Re: [Xen-ia64-devel] Fedora-xen-ia64 test status
Akio Takebe wrote:
> Hi, Dave and Xen people
>
>>> Do we need CONFIG_DEBUG_SPINLOCK_SLEEP?
>> Yes, because it highlights bugs like the above.
>>
>>> If we need CONFIG_DEBUG_SPINLOCK_SLEEP, we must modify __might_sleep().
>> No, you must fix ia64 so that it doesn't call down_read with
>> interrupts disabled.
> OK, but I think this is not a bug.
> I think these messages are caused by the domain scheduler of Xen.
> The time slice of the domain scheduler is longer than HZ (probably).
> Am I right, Xen people?

Nope, __might_sleep() is simply called from code paths that
might sleep. The only reason there's timer code in
__might_sleep() is for printk rate limiting.

You can not call down_read() with interrupts disabled, because
the mutex code might need to sleep. The code needs to be fixed.

--
"Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible,
you are, by definition, not smart enough to debug it." - Brian W. Kernighan
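To make the rate-limiting point concrete, here is a small user-space sketch of the same pattern as the `time_before(jiffies, prev_jiffy + HZ) && prev_jiffy` test quoted earlier in the thread. The names `tick_before` and `should_warn`, and the value of `HZ`, are assumptions for illustration only; the sketch just shows that the timer comparison decides whether a warning is printed again, not whether the invalid context is detected.

```c
#include <assert.h>

#define HZ 250UL	/* assumed tick rate, for illustration only */

/* Wraparound-safe "a is before b", in the style of the kernel's time_before(). */
static int tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Returns 1 if a warning should be printed at tick `now`, or 0 if it
 * should be suppressed because one was printed less than HZ ticks ago.
 */
static int should_warn(unsigned long now)
{
	static unsigned long prev_tick;	/* 0 means "never warned yet" */

	if (tick_before(now, prev_tick + HZ) && prev_tick)
		return 0;
	prev_tick = now;
	return 1;
}
```

A caller would run its own "is this context allowed to sleep?" check first and consult `should_warn()` only to decide whether to log, so suppressed calls are still invalid calls.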
Akio Takebe
2006-Aug-07 00:46 UTC
[Patch] fix vDSO paravirtualization for fedora-xen-ia64 (is Re: [Fedora-xen] Re: [Xen-ia64-devel] Fedora-xen-ia64 test status)
Hi,

The following error messages are caused by a bug in vDSO paravirtualization.
(Thanks, Isaku.)
Isaku's fix has been committed to xen-ia64-unstable,
but has not been committed to xen-unstable yet.
So this patch needs to be committed to fedora-kernel
until fedora-kernel catches up with the current xen-ia64-unstable.

===============================
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1

Call Trace:
 [<a00000010001c7a0>] show_stack+0x40/0xa0
                                sp=e00000002f7dfbb0 bsp=e00000002f7d91a8
 [<a00000010001c830>] dump_stack+0x30/0x60
                                sp=e00000002f7dfd80 bsp=e00000002f7d9190
 [<a00000010006c920>] __might_sleep+0x2a0/0x2c0
                                sp=e00000002f7dfd80 bsp=e00000002f7d9168
 [<a0000001000bce80>] down_read+0x20/0x60
                                sp=e00000002f7dfd80 bsp=e00000002f7d9148
 [<a0000001005b14f0>] ia64_do_page_fault+0x110/0x9e0
                                sp=e00000002f7dfd80 bsp=e00000002f7d90f8
 [<a000000100064be0>] xen_leave_kernel+0x0/0x3b0
                                sp=e00000002f7dfe30 bsp=e00000002f7d90f8
===============================

The link below is for your information:
http://lists.xensource.com/archives/html/xen-ia64-devel/2006-08/msg00042.html

Best Regards,

Akio Takebe
Juan Quintela
2006-Aug-11 15:11 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Re: Fedora-xen-ia64 test status
Hi Isaku

As I had already committed the old patch, could you confirm that this
patch on top of the old one fixes the problem?

Thanks, Juan.
Juan Quintela
2006-Aug-11 15:14 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Re: Fedora-xen-ia64 test status
On Fri, 2006-08-04 at 11:30 +0900, Isaku Yamahata wrote:

Hi Isaku

I had already committed Aron's old version of the patch.

> However a bug was there. Here is the updated one.
>
> - removed debug print.
> - bug fix. Extents at the end can be discarded.

I removed only one of the sets of printk's :)

Could you confirm that my patch on top of the old one does what you
intend? The only functional change that I found was the change of
position of the num -= collapsed line, and I agree with the change :)

Later, Juan.
Isaku Yamahata
2006-Aug-12 00:25 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Re: Fedora-xen-ia64 test status
On Fri, Aug 11, 2006 at 05:14:37PM +0200, Juan Quintela wrote:

> Could you confirm that my patch on top of the old one does what you
> intend? The only functional change that I found was the change of
> position of the num -= collapsed line, and I agree with the change :)

Hi Juan.

The change to the line which follows num -= collapsed is also necessary.
Without that, the rsvd_region[] entries at the end may be discarded.

-		for (k = j + 1; k < j + 1 + collapsed; k++) {
+		for (k = j + 1; k < num; k++)

Thanks.
--
yamahata
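The effect of the loop bound is easy to reproduce in a small user-space program. The sketch below follows the collapse loop from the patch but is not the kernel code: `struct region`, `collapse` and the `fixed_bound` switch are names invented for this illustration, and the extra `k + collapsed < total` check exists only so the demo never reads past its own array. Since the old bound does not depend on `num`, placing `num -= collapsed` before the shift does not change the old behaviour.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct rsvd_region: [start, end). */
struct region {
	unsigned long start;
	unsigned long end;
};

/*
 * Collapse overlapping entries of an already sorted r[0..num).
 * fixed_bound selects the bound of the shift loop:
 *   0: k < j + 1 + collapsed   (first version of the patch)
 *   1: k < num                 (updated version)
 * Returns the new number of regions.
 */
static int collapse(struct region *r, int num, int fixed_bound)
{
	int total = num;	/* physical array size, for the demo's bounds check */
	int j, k;

	for (j = 0; j < num - 1; j++) {
		unsigned long end = r[j].end;
		int collapsed, limit;

		/* Find the last entry that overlaps or touches r[j]. */
		for (k = j + 1; k < num; k++) {
			if (end < r[k].start) {
				k--;
				break;
			}
			if (r[k].end > end)
				end = r[k].end;
		}
		if (k == num)
			k--;
		r[j].end = end;
		collapsed = k - j;
		num -= collapsed;

		/* Shift the entries after the merged block down. */
		limit = fixed_bound ? num : j + 1 + collapsed;
		for (k = j + 1; k < limit && k + collapsed < total; k++)
			r[k] = r[k + collapsed];
	}
	return num;
}
```

With the sorted input [0,10), [5,15), [20,30), [40,50), the old bound moves only one entry after the first merge, so [20,30) ends up duplicated and [40,50) falls outside the reduced count; the updated bound moves the whole tail and keeps all three resulting regions.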
Juan Quintela
2006-Aug-12 00:35 UTC
[Fedora-xen] Re: [Xen-ia64-devel] Re: Fedora-xen-ia64 test status
On Sat, 2006-08-12 at 09:25 +0900, Isaku Yamahata wrote:
> On Fri, Aug 11, 2006 at 05:14:37PM +0200, Juan Quintela wrote:
>
> > Could you confirm that my patch on top of the old one does what you
> > intend? The only functional change that I found was the change of
> > position of the num -= collapsed line, and I agree with the change :)
>
> Hi Juan.
>
> The line which follows num -= collapsed is also necessary.
> Without that, the rsvd_region[] at the end may be discarded.
>
> -		for (k = j + 1; k < j + 1 + collapsed; k++) {
> +		for (k = j + 1; k < num; k++)

Thanks very much, fixed.

Later, Juan.