I am testing credit2 on a dual Xeon L5640 machine. I have an HVM Debian
Squeeze domU that reliably leads to a panic when it's run with the credit2
scheduler, but not with credit.

The reproduction steps on this machine are simple:

1. Fully boot up the machine.
2. Enter commands that cause dom0 to use 100% CPU. For instance:

screen -AmdS burn1 perl -e 'while(1) {}'
screen -AmdS burn2 perl -e 'while(1) {}'
screen -AmdS burn3 perl -e 'while(1) {}'
screen -AmdS burn4 perl -e 'while(1) {}'

3. Start up the prepared Squeeze domU (which is a stock install), with "xm"
("xl" doesn't work with debug=y because of a spurious assert, but has the
same problem with debug=n):

cd /servers/customers
xm create testvds4.cfg

The serial console then shows this:

(XEN) irq.c:324: Dom1 callback via changed to Direct Vector 0xe9
(XEN) Xen BUG at sched_credit2.c:1606
(XEN) ----[ Xen-4.1.1-rc1-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 12
(XEN) RIP: e008:[<ffff82c48011a383>] csched_schedule+0xdb/0xab1
(XEN) RFLAGS: 0000000000010082 CONTEXT: hypervisor
(XEN) rax: ffff830c2246c000 rbx: ffff830c2246bd10 rcx: 0000000000000000
(XEN) rdx: 0000000000000001 rsi: ffff82c480241680 rdi: ffff8300bf74c000
(XEN) rbp: ffff83043b28fe38 rsp: ffff83043b28fd58 r8: 0000000000000002
(XEN) r9: 000000000000003e r10: 0000000000000018 r11: 00000000000186a0
(XEN) r12: 0000000000000000 r13: ffff83043ffe02d0 r14: 000000000000000c
(XEN) r15: ffff83043ffe0010 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000c22436000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff83043b28fd58:
(XEN) ffff82c4801bc822 ffff83043b298040 0000000000000282 ffff83043b28fd88
(XEN) ffff82c48012248f ffff8300bf74c000 ffff83043b28fdb8 ffff82c4801b59bd
(XEN) 00000014c77c5137 ffff83043b28fe68 ffff82c480241680 0000000000000001
(XEN) 00007cfbc4d70217 ffff82c48014b2c0 ffff83043b298060 ffff83043b28fde8
(XEN) ffff82c480124345 ffff83043b298060 ffff83043b28fe38 0000000000000082
(XEN) 00000000000186a0 0000000000000082 000000000000000c ffff8300bf74c000
(XEN) 0000000000000000 ffff82c480241680 ffff83043b298040 ffff83043b298060
(XEN) ffff83043b28feb8 ffff82c48012061c ffff83043b28feb8 00000014c77c5137
(XEN) 0000000000000293 ffff8300bf74d868 ffff82c48012248f ffff8300bf74c000
(XEN) ffff83043b28fe98 ffff82c4801b19d0 ffff8300bf74c000 ffff82c4802a8e80
(XEN) 00000000ffffffff ffff82c4802a8880 ffff83043b28ff18 ffffffffffffffff
(XEN) ffff83043b28fef8 ffff82c480121caf 0440080000000001 ffff8300bf74c000
(XEN) 0000000000000046 ffff8800018501a0 ffffffff81311470 0000000000000092
(XEN) ffff83043b28ff08 ffff82c480121d0c ffff88000184c600 ffff82c4801bb3f1
(XEN) 0000000000000092 ffffffff81311470 ffff8800018501a0 0000000000000046
(XEN) ffff88000184c600 0000000000000001 00000000000186a0 0000000000000008
(XEN) 0000000000000200 0000000000000008 0000000000000000 0000000000000002
(XEN) 0000000000000000 0000000000000002 0000000000000007 0000beef0000beef
(XEN) ffffffff81009308 0000beef0000beef 0000000000000046 ffff880031e63e58
(XEN) 000000000000beef 000000000000beef 000000000000beef 000000000000beef
(XEN) Xen call trace:
(XEN) [<ffff82c48011a383>] csched_schedule+0xdb/0xab1
(XEN) [<ffff82c48012061c>] schedule+0x122/0x60c
(XEN) [<ffff82c480121caf>] __do_softirq+0x8d/0x9e
(XEN) [<ffff82c480121d0c>] do_softirq+0x4c/0x4e
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 12:
(XEN) Xen BUG at sched_credit2.c:1606
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Where do we go from here?

-John

For reference, xl info output:

dallas-dodec226-5 ~ # xl info
host : dallas-dodec226-5
release : 2.6.32.37-gbe57219
version : #1 SMP Tue Apr 19 00:14:46 CDT 2011
machine : x86_64
nr_cpus : 24
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2266
hw_caps : bfebfbff:2c100800:00000000:00003f40:009ee3fd:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 49143
free_memory : 47106
free_cpus : 0
xen_major : 4
xen_minor : 1
xen_extra : .1-rc1-pre
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit2
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : Thu Apr 07 15:26:58 2011 +0100 23025:dbf2ddf652dc
xen_commandline : dom0_mem=1500M dom0_max_vcpus=4 iommu=dom0-passthrough sched=credit2 loglvl=all guest_loglvl=all com2=115200,8n1 console=com2
cc_compiler : gcc version 4.4.5 (Gentoo 4.4.5 p1.2, pie-0.4.5)
cc_compile_by : root
cc_compile_domain : nuclearfallout.net
cc_compile_date : Tue Apr 19 14:26:02 CDT 2011
xend_config_format : 4

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Can you give the changeset number / hash of your tip? In the current
unstable tip, line 1606 is in the middle of a printk...

-George

On Tue, Apr 19, 2011 at 9:58 PM, John Weekes <lists.xen@nuclearfallout.net> wrote:
> I am testing credit2 on a dual Xeon L5640 machine. I have an HVM Debian
> Squeeze domU that reliably leads to a panic when it's run with the credit2
> scheduler, but not with credit.
> [...]
> (XEN) Xen BUG at sched_credit2.c:1606
> [...]

On 4/20/2011 2:36 AM, George Dunlap wrote:
> Can you give the changeset number / hash of your tip? In the current
> unstable tip, line 1606 is in the middle of a printk...

This is 4.1-testing. The version information was given in the "xl info"
output -- it's just the latest:

xen_major : 4
xen_minor : 1
xen_extra : .1-rc1-pre
xen_changeset : Thu Apr 07 15:26:58 2011 +0100 23025:dbf2ddf652dc

Here's what the line corresponds to:

BUG_ON(!is_idle_vcpu(scurr->vcpu) && scurr->rqd != rqd);

If you'd like me to repeat the test with xen-unstable, I can do that.

-John
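
For readers following the thread, the condition inside the BUG_ON quoted above can be read as a simple invariant. The following is only a minimal sketch under stated assumptions: `struct demo_runq`, `struct demo_svc`, and `runq_mismatch` are hypothetical stand-ins invented for illustration and are not the real Xen definitions, and `rqd` is presumed to be the runqueue of the CPU doing the scheduling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a credit2 runqueue. */
struct demo_runq {
    int id;
};

/* Hypothetical, simplified stand-in for the per-vcpu scheduler data. */
struct demo_svc {
    bool is_idle;            /* models is_idle_vcpu(scurr->vcpu) */
    struct demo_runq *rqd;   /* runqueue the vcpu is recorded as belonging to */
};

/*
 * Returns true exactly when the quoted BUG_ON condition would be true:
 * the current vcpu is not the idle vcpu AND its recorded runqueue
 * differs from the runqueue passed in.
 */
static bool runq_mismatch(const struct demo_svc *scurr,
                          const struct demo_runq *rqd)
{
    assert(scurr != NULL);
    return !scurr->is_idle && scurr->rqd != rqd;
}
```

Read this way, the panic at line 1606 would mean that a non-idle vcpu was running on CPU 12 while being recorded against a different runqueue; the idle vcpu is exempt from the check.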

On Wed, 2011-04-20 at 17:12 +0100, John Weekes wrote:
> Here's what the line corresponds to:
>
> BUG_ON(!is_idle_vcpu(scurr->vcpu) && scurr->rqd != rqd);
>
> If you'd like me to repeat the test with xen-unstable, I can do that.

Yes, please, if you have the time. xen-unstable has some debugging
output I put there after this failure was seen once in the xen.org
automated testing. I've been over that code a number of times, and have
no idea how that bug could be triggering. :-)

Alternately, if you find unstable too unstable (or if you don't want to
find out), you can revert xen-4.1-testing c/s 22977:6af8e01d3e4a, which
turns off the debugging code I'd put in for the 4.1 release.

Peace,
-George

On 4/20/2011 9:51 AM, George Dunlap wrote:
> Yes, please, if you have the time. xen-unstable has some debugging
> output I put there after this failure was seen once in the xen.org
> automated testing. I've been over that code a number of times, and have
> no idea how that bug could be triggering. :-)
>
> Alternately, if you find unstable too unstable (or if you don't want to
> find out), you can revert xen-4.1-testing c/s 22977:6af8e01d3e4a, which
> turns off the debugging code I'd put in for the 4.1 release.

Any version works for me, since this is just for testing purposes (this
isn't a production machine yet).

On xen-unstable, I'm seeing that the domU just hangs after it starts. The
stubdom chews CPU very slowly, but there is no activity on the main domain:

dallas-dodec226-5 customers # xl list
Name            ID   Mem VCPUs  State   Time(s)
Domain-0         0  1500     4  r-----   4061.2
testvds4         3   803     1  -b----      0.0
testvds4-dm      4    32     1  -b----      0.6

It's hanging after this:

(XEN) irq.c:258: Dom3 PCI link 1 changed 0 -> 10
(XEN) HVM3: PCI-ISA link 1 routed to IRQ10
(XEN) irq.c:258: Dom3 PCI link 2 changed 0 -> 11
(XEN) HVM3: PCI-ISA link 2 routed to IRQ11
(XEN) irq.c:258: Dom3 PCI link 3 changed 0 -> 5
(XEN) HVM3: PCI-ISA link 3 routed to IRQ5
(XEN) HVM3: pci dev 01:2 INTD->IRQ5
(XEN) HVM3: pci dev 01:3 INTA->IRQ10
(XEN) HVM3: pci dev 03:0 INTA->IRQ5
(XEN) HVM3: pci dev 04:0 INTA->IRQ5
(XEN) HVM3: pci dev 02:0 bar 10 size 02000000: f0000008
(XEN) HVM3: pci dev 03:0 bar 14 size 01000000: f2000008
(XEN) HVM3: pci dev 04:0 bar 10 size 00020000: f3000000
(XEN) HVM3: pci dev 02:0 bar 14 size 00001000: f3020000
(XEN) HVM3: pci dev 03:0 bar 10 size 00000100: 0000c001
(XEN) HVM3: pci dev 04:0 bar 14 size 00000040: 0000c101
(XEN) HVM3: pci dev 01:2 bar 20 size 00000020: 0000c141
(XEN) HVM3: pci dev 01:1 bar 20 size 00000010: 0000c161
(XEN) HVM3: Multiprocessor initialisation:
(XEN) HVM3: - CPU0 ... 40-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3: - CPU1 ... 40-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3: - CPU2 ... 40-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3: - CPU3 ... 40-bit phys ... fixed MTRRs ... var MTRRs [2/8] ... done.
(XEN) HVM3: Testing HVM environment:
(XEN) HVM3: - REP INSB across page boundaries ... passed
(XEN) HVM3: - GS base MSRs and SWAPGS ... passed
(XEN) HVM3: Passed 2 of 2 tests
(XEN) HVM3: Writing SMBIOS tables ...

Killing the domU works properly.

I'll try 4.1-testing again with the patch reversion.

-John

On 4/20/2011 10:33 AM, John Weekes wrote:
> I'll try 4.1-testing again with the patch reversion.

4.1-testing is now doing the same thing as unstable for me -- the domU
freezes after "(XEN) HVM5: Writing SMBIOS tables ..." -- and only when
credit2 is being used.

I did get it to "work" (panic) once, but then saw that it said debug=n.
After that, I rebooted, did a "make clean" + recompile, and repeated my
test, and it still said debug=n (this must be a separate build issue), so I
completely wiped, checked out, reverted, recompiled, and tried again,
leading to the current situation.

Here's what it said when it *did* crash. This is partial output because the
later reboot overwrote part of the screen in ipmitool:

(XEN) irq.c:324: Dom3 callback via changed to Direct Vector 0xe9
(XEN) Xen BUG at sched_credit2.c:811
(XEN) ----[ Xen-4.1.1-rc1-pre x86_64 debug=n Not tainted ]----
(XEN) CPU: 12

That means that it died at one of your special debug lines:

BUG_ON(test_bit(__CSFLAG_scheduled, &svc->flags));

I'm trying to find a way to reproduce that output.

-John

On 4/20/2011 12:10 PM, John Weekes wrote:
> I'm trying to find a way to reproduce that output.

Trying repeatedly, I did manage to repeat the output. Here's the
higher-detail version:

(XEN) irq.c:324: Dom7 callback via changed to Direct Vector 0xe9
(XEN) Xen BUG at sched_credit2.c:811
(XEN) ----[ Xen-4.1.1-rc1-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 2
(XEN) RIP: e008:[<ffff82c4801186b8>] __runq_deassign+0x25/0x6f
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
(XEN) rax: 0000000000000002 rbx: ffff830c2246be30 rcx: 0000000000000230
(XEN) rdx: ffff83043ffe0248 rsi: 0000000000000000 rdi: ffff830c2246be30
(XEN) rbp: ffff83043ff57ad8 rsp: ffff83043ff57ad0 r8: 0000000000000002
(XEN) r9: 000000000000003e r10: 0000000000000018 r11: 00000000000186a0
(XEN) r12: ffff83043ffe0010 r13: ffff830c2246be30 r14: 0000000000000080
(XEN) r15: ffff82c480241780 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000c222fd000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff83043ff57ad0:
(XEN) 000000000000000c ffff83043ff57ae8 ffff82c480118746 ffff83043ff57b58
(XEN) ffff82c480119e3b 000000013f0cf950 ffff83043ffe0010 ffff83043ffe02d4
(XEN) ffff8300bf74c000 000000000004cdfc ffff83043ffe0238 ffff83043ff57f18
(XEN) ffff83043ffe024c ffff83043ffe024c ffff82c4802ba7c0 0000000000000002
(XEN) ffff82c4802cbec0 ffff83043ff57be8 ffff82c48012122d ffff83043ff57b98
(XEN) ffff8300bf74c168 ffff82c4802cbec0 ffff82c4802cbec0 ffff82c4802cbec0
(XEN) 00ff82c4802ba7c0 ffff8300bf74c000 0000000000000282 0000000000000286
(XEN) ffff83043ffe024c ffff82c48012259f ffff8300bf74c000 ffff82c4802cbec0
(XEN) ffff82c4802ba7c0 ffff83043ffe024c ffff8300bf74c000 ffff83043ff57c18
(XEN) ffff82c480121739 ffff83043ff57f18 0000000000000002 0000000000000000
(XEN) ffff830c22276000 ffff83043ff57c98 ffff82c480103e8f ffff830c22276190
(XEN) 00000000ffffffff 0c93001800000000 00000000ffffffff 0c93001800000000
(XEN) 00000000ffffffff 0000000200000000 ffff82c400000009 ffff83043ff57ce8
(XEN) ffff83043ff57f18 0000000000000008 ffff8300bf74c000 0000000000000018
(XEN) ffff82c4801a4d59 ffff83043ff57ca8 ffff82c4801a4c9d ffff83043ff57d28
(XEN) ffff82c4801a4b2b ffff83043ff57cf8 ffff82c48011fbca 0000000000000000
(XEN) ffff880001855600 0000000000000000 00000000000000fd ffffffff0c930018
(XEN) 0000000000000000 0000000000000000 ffff8300bf74c000 ffff83043ff57f18
(XEN) 0000000000000012 0000000000000000 0000000000000000 ffff83043ff57f08
(XEN) ffff82c4801c1fa5 0000000000020007 ffff83043ff5e040 ffff83043ff57e38
(XEN) Xen call trace:
(XEN) [<ffff82c4801186b8>] __runq_deassign+0x25/0x6f
(XEN) [<ffff82c480118746>] runq_deassign+0x44/0x46
(XEN) [<ffff82c480119e3b>] csched_cpu_pick+0x273/0x284
(XEN) [<ffff82c48012122d>] vcpu_migrate+0x164/0x29d
(XEN) [<ffff82c480121739>] vcpu_force_reschedule+0xad/0xc2
(XEN) [<ffff82c480103e8f>] do_vcpu_op+0x2b7/0x421
(XEN) [<ffff82c4801a4c9d>] hvm_vcpu_op+0x18/0x1a
(XEN) [<ffff82c4801a4b2b>] hvm_do_hypercall+0x1a1/0x2d1
(XEN) [<ffff82c4801c1fa5>] vmx_vmexit_handler+0xc74/0x1c3a
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Xen BUG at sched_credit2.c:811
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Also, and possibly related, the domU takes a longish time to start, even at
its best. Often, the delay is a minute or so. Not so with "credit".

-John
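
The call trace above shows the BUG at line 811 being hit in `__runq_deassign`, reached from `csched_cpu_pick` during `vcpu_migrate`. Assuming line 811 is the `BUG_ON(test_bit(__CSFLAG_scheduled, &svc->flags))` check quoted earlier in the thread, the rule it enforces might be sketched as follows. Everything here is hypothetical and simplified for illustration: `struct demo_vcpu`, `DEMO_FLAG_SCHEDULED_BIT`, `demo_test_bit`, and `demo_runq_deassign` are invented names, the bit position is an assumption, and the function returns false where the real code would panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit position; the real __CSFLAG_scheduled value may differ. */
#define DEMO_FLAG_SCHEDULED_BIT 1

/* Hypothetical, simplified stand-in for the per-vcpu scheduler data. */
struct demo_vcpu {
    unsigned long flags;  /* bit DEMO_FLAG_SCHEDULED_BIT set while running */
    int runq_id;          /* index of the assigned runqueue, -1 if none */
};

/* Minimal, non-atomic model of a test_bit()-style helper. */
static bool demo_test_bit(int bit, const unsigned long *flags)
{
    return ((*flags >> bit) & 1UL) != 0;
}

/*
 * Detach a vcpu from its runqueue. Returns false and leaves the vcpu
 * untouched when it is still flagged as scheduled, which is the
 * situation the quoted BUG_ON treats as fatal.
 */
static bool demo_runq_deassign(struct demo_vcpu *svc)
{
    assert(svc != NULL);
    if (demo_test_bit(DEMO_FLAG_SCHEDULED_BIT, &svc->flags))
        return false;
    svc->runq_id = -1;
    return true;
}
```

Under these assumptions, the panic suggests that the migration path tried to move a vcpu off its runqueue while it was still marked as running somewhere.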
John,

Thanks for your help looking into this. Just to clarify, you're running squeeze in HVM mode (not PV mode)?

-George

On Tue, 2011-04-19 at 21:58 +0100, John Weekes wrote:
> I am testing credit2 on a dual Xeon L5640 machine. I have an HVM Debian
> Squeeze domU that reliably leads to a panic when it's run with the
> credit2 scheduler, but not with credit.
> [...]
On 4/26/2011 4:01 AM, George Dunlap wrote:
> John,
>
> Thanks for your help looking into this. Just to clarify, you're running
> squeeze in HVM mode (not PV mode)?

Yes, that's correct.

-John
OK -- it turns out on my box, I hit the bug as the machine boots, so repro was very easy. :-) (Can you tell I haven't been working on credit2 in a little while?)

Can you test the attached patch and see if it works for you?

-George

On Tue, 2011-04-26 at 17:45 +0100, John Weekes wrote:
> On 4/26/2011 4:01 AM, George Dunlap wrote:
> > John,
> >
> > Thanks for your help looking into this. Just to clarify, you're running
> > squeeze in HVM mode (not PV mode)?
>
> Yes, that's correct.
>
> -John
> [...]
On 4/26/2011 10:02 AM, George Dunlap wrote:
> OK -- it turns out on my box, I hit the bug as the machine boots, so
> repro was very easy. :-) (Can you tell I haven't been working on
> credit2 in a little while?)
>
> Can you test the attached patch and see if it works for you?

On 4.1, it doesn't seem to compile:

cc1: warnings being treated as errors
sched_credit2.c: In function 'csched_vcpu_migrate':
sched_credit2.c:1363: error: implicit declaration of function 'migrate'
make[4]: *** [sched_credit2.o] Error 1
make[4]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen/common'
make[3]: *** [/usr/src/xen-4.1-testing.hg/xen/common/built_in.o] Error 2
make[3]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen/arch/x86'
make[2]: *** [/usr/src/xen-4.1-testing.hg/xen/xen] Error 2
make[2]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen'
make: *** [install-xen] Error 2

I switched back over to -unstable and ran 10 tests:

Test 1: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
Test 2: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
Test 3: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
Test 4: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
Test 5: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
Test 6: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
Test 7: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
Test 8: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
Test 9: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
Test 10: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)

This would seem to confirm that the BUG_ON is being avoided now, but the frequent SMBIOS issue means that I won't be able to start testing credit2 with a subset of customers yet.

Note that when it started normally, there were also tons of "memory.c:196:d18 Bad page free for domain x" and "mm.c:2137:d18 Error pfn 0: rd=ffff8304420d7000, od=ffff830xxxxxxxxxx, caf=8000000000000001, taf=7400000000000001" warnings, which I imagine is a separate concern.

-John
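Editorial aside: repeated boot attempts like the ten above are easier to compare across configurations if they are tallied mechanically. The following is only a sketch, assuming each attempt is written as one "Test N: ..." line in a plain-text log; the tally_results name and the log file are hypothetical and not part of any Xen tool.

```shell
#!/bin/sh
# tally_results LOGFILE
# Counts boot attempts recorded as "Test N: ..." lines and reports how many
# of them mention a domU freeze. The log format is an assumption modelled on
# the hand-written results in the message above.
tally_results() {
    # Total number of recorded attempts.
    total=$(grep -c '^Test [0-9]*:' "$1" || true)
    # Attempts whose description mentions a freeze.
    freezes=$(grep '^Test [0-9]*:' "$1" | grep -c 'domU freeze' || true)
    echo "freezes=$freezes started=$((total - freezes)) total=$total"
}
```

Saving the result lines to a file and running "tally_results results.log" prints a single summary line, which makes it simpler to compare runs under credit and credit2, or with debug=y and debug=n.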
On Tue, Apr 26, 2011 at 9:15 PM, John Weekes <lists.xen@nuclearfallout.net> wrote:
> On 4/26/2011 10:02 AM, George Dunlap wrote:
>>
>> OK -- it turns out on my box, I hit the bug as the machine boots, so
>> repro was very easy. :-) (Can you tell I haven't been working on
>> credit2 in a little while?)
>>
>> Can you test the attached patch and see if it works for you?
>
> On 4.1, it doesn't seem to compile:
>
> cc1: warnings being treated as errors
> sched_credit2.c: In function 'csched_vcpu_migrate':
> sched_credit2.c:1363: error: implicit declaration of function 'migrate'
> make[4]: *** [sched_credit2.o] Error 1
> make[4]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen/common'
> make[3]: *** [/usr/src/xen-4.1-testing.hg/xen/common/built_in.o] Error 2
> make[3]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen/arch/x86'
> make[2]: *** [/usr/src/xen-4.1-testing.hg/xen/xen] Error 2
> make[2]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen'
> make[1]: *** [install] Error 2
> make[1]: Leaving directory `/usr/src/xen-4.1-testing.hg/xen'
> make: *** [install-xen] Error 2

Ah -- yes, you'll also need c/s 22982:591c459ee00a from xen-unstable if you want to backport it to 4.1.

> I switched back over to -unstable and ran 10 tests:
>
> Test 1: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
> Test 2: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
> Test 3: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
> Test 4: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
> Test 5: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
> Test 6: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
> Test 7: domU freeze after "(XEN) HVMx: Writing SMBIOS tables ..."
> Test 8: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
> Test 9: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
> Test 10: Seemed to start normally, although VNC didn't work (probably a separate bug in -unstable)
>
> This would seem to confirm that the BUG_ON is being avoided now, but the
> frequent SMBIOS issue means that I won't be able to start testing credit2
> with a subset of customers yet.

If you still have time, would you mind:
* Starting a new e-mail thread (since it's a different bug), and
* Taking a 30-second trace of the VM booting and hanging with the following command:
  # xentrace -D -e all -T 30 [filename] & xl create [configuration]
* Sending me the resulting trace (it will probably be 100-300MiB in size).

> Note that when it started normally, there were also tons of
> "memory.c:196:d18 Bad page free for domain x" and "mm.c:2137:d18 Error pfn
> 0: rd=ffff8304420d7000, od=ffff830xxxxxxxxxx, caf=8000000000000001,
> taf=7400000000000001" warnings, which I imagine is a separate concern.

Hmm -- that one is a bit strange, particularly for an HVM guest. I'm not familiar with those messages or what could be causing them.

-George
John Weekes
2011-Apr-27 17:24 UTC
[Xen-devel] credit2 domU freeze at "Writing SMBIOS tables ..."
On 4/27/2011 2:10 AM, George Dunlap wrote:
>> This would seem to confirm that the BUG_ON is being avoided now, but the
>> frequent SMBIOS issue means that I won't be able to start testing credit2
>> with a subset of customers yet.
> If you still have time, would you mind:
> * Starting a new e-mail thread (since it's a different bug), and

Done.

To recap here, my Debian Squeeze stubdom-based HVM domU is frequently not completely starting, freezing after the output "Writing SMBIOS tables ...".

> * Taking a 30-second trace of the VM booting and hanging with the
> following command:
> # xentrace -D -e all -T 30 [filename] & xl create [configuration]
> * Sending me the resulting trace (it will probably be 100-300MiB in size).

I am emailing you offlist with a link to the trace now. I had to use "xm" instead of "xl" because "xl" still doesn't seem to work right with stubdoms (a separate bug); it probably needs to wait a little longer for the main domU to be created before creating the stubdom.

# xl create testvds4-xl.cfg
Parsing config file testvds4-xl.cfg
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader: 0000000000100000->00000000001793f0
  TOTAL: 0000000000000000->0000000031800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x000000000000018b
  1GB PAGES: 0x0000000000000000
xl: libxl_create.c:292: libxl__domain_make: Assertion `!libxl_domid_valid_guest(*domid)' failed.
Aborted

-John
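Editorial aside: the trace command quoted above, with the xm substitution described in this message, could be wrapped in a small helper so the same capture can be repeated with either toolstack. This is a sketch under stated assumptions: the trace_boot name, its argument order, and the DRY_RUN switch are made up for illustration, and the only xentrace options used are the ones given in the thread (-D, -e all, -T 30).

```shell
#!/bin/sh
# trace_boot TRACE_FILE DOMAIN_CFG [TOOLSTACK]
# Starts a 30-second xentrace capture in the background, then creates the
# domain in the foreground. TOOLSTACK defaults to "xl"; pass "xm" when xl
# cannot start the guest. With DRY_RUN set, the command line is printed
# instead of being executed (hypothetical convenience, not a Xen feature).
trace_boot() {
    trace_file=$1
    cfg=$2
    toolstack=${3:-xl}

    if [ -n "${DRY_RUN:-}" ]; then
        echo "xentrace -D -e all -T 30 $trace_file & $toolstack create $cfg"
        return 0
    fi

    # Same options as the command quoted in the thread.
    xentrace -D -e all -T 30 "$trace_file" &
    "$toolstack" create "$cfg"
    wait    # let the background capture run to its 30-second limit
}
```

For example, "trace_boot /tmp/boot.trace testvds4.cfg xm" would mirror what was done here, with the trace file path chosen freely.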
George Dunlap
2011-Apr-28 13:10 UTC
Re: [Xen-devel] credit2 domU freeze at "Writing SMBIOS tables ..."
Is this just using credit2, or does it do the same thing under credit1?

On my box, stubdoms with a 4-vcpu guest on credit1 fail too, with this output on the console:

xs_read_watch() -> /local/domain/3/log-throttling /local/domain/3/log-throttling
xs_read(/local/domain/3/log-throttling): ENOENT
(XEN) HVM3: HVM Loader
xs_read_watch() -> /local/domain/0/backend/console/3 be:0x13fb66:3:0x154240
(XEN) HVM3: Detected Xen v4.2-unstable
Page fault at linear address 0x10, eip 0x49d9, regs 0x3cfb3c, sp 0x0, our_sp 0x3cfb18, code 0
(XEN) HVM3: Xenbus rings @0xfeffc000, event channel 5
Thread: main
(XEN) HVM3: System requested ROMBIOS
EIP: 49d9, EFLAGS 10202.
(XEN) HVM3: CPU speed is 1995 MHz
EBX: 00000000 ECX: 001589b0 EDX: 00000000
ESI: 0000000c EDI: 0000000c EBP: 003cfbb8 EAX: 00000000
DS: e021 ES: e021 orig_eax: ffffffff, eip: 000049d9
CS: e019 EFLAGS: 00010202 esp: 00000000 ss: 3cfba4
base is 0x3cfbb8 caller is 0x24bd3
base is 0x3cfbf8 caller is 0x9814
base is 0x3cfe88 caller is 0xe8e91
base is 0x3cfff0 caller is 0x31ad

3cfba0: 00 00 00 00 28 15 00 00 00 00 00 00 0a 00 00 00
3cfbb0: 67 96 13 00 e8 fb 3c 00 f8 fb 3c 00 d3 4b 02 00
3cfbc0: 00 00 00 00 51 e5 01 00 00 00 00 00 00 00 00 00
3cfbd0: 00 00 00 00 cb 96 13 00 78 94 00 82 03 00 00 00

49c0: c8 c1 e8 05 83 e1 1f 8b 44 85 ec d3 f8 a8 01 74
49d0: 0e 8b 43 14 89 04 24 ff d2 83 7b 10 00 75 23 8b
49e0: 53 0c 85 d2 74 1c 8b 0b 89 c8 c1 e8 05 83 e1 1f
49f0: 8b 44 85 e4 d3 f8 a8 01 74 08 8b 43 14 89 04 24
Pagetable walk from virt 10, base 371000:
 L3 = 00000000b074f027 (0x372000) [offset = 0]
 L2 = 00000000b074d067 (0x374000) [offset = 0]
 L1 = 0000000000000000 [offset = 0]

-George
John Weekes
2011-Apr-28 16:25 UTC
Re: [Xen-devel] credit2 domU freeze at "Writing SMBIOS tables ..."
On 4/28/2011 6:10 AM, George Dunlap wrote:
> Is this just using credit2, or does it do the same thing under credit1?

I just re-tested with credit1, and it seems to do the same thing with debug=y (on xen-unstable), you're right.

On 4.1 with debug=n and credit1, which is my normal configuration, it doesn't appear to happen, or at least didn't happen over the 10 attempts that I just made (it could be a rare problem on that setup). I may have to do some additional testing with 4.1 credit2, 4.1 debug=y, and unstable debug=n, but it's good that you're seeing something reproducible on your end, at least (and with debug=y).

> On my box, stubdoms with a 4-vcpu guest on credit1 fail too, with this
> output on the console:
>
> [...]
> Page fault at linear address 0x10, eip 0x49d9, regs 0x3cfb3c, sp 0x0,
> our_sp 0x3cfb18, code 0
> [...]

I'm not seeing that one yet.

-John
George Dunlap
2011-Apr-28 16:35 UTC
Re: [Xen-devel] credit2 domU freeze at "Writing SMBIOS tables ..."
Do you have "loglvl=all guest_loglvl=all" on your xen boot command-line?

-George
John Weekes
2011-Apr-28 19:56 UTC
Re: [Xen-devel] credit2 domU freeze at "Writing SMBIOS tables ..."
On 4/28/2011 9:35 AM, George Dunlap wrote:
> Do you have "loglvl=all guest_loglvl=all" on your xen boot command-line?

Yes. My line is:

kernel /boot/xen-4.2-unstable.gz dom0_mem=1500M dom0_max_vcpus=4 iommu=dom0-passthrough sched=credit loglvl=all guest_loglvl=all com2=115200,8n1 console=com2

(I've also seen an error very similar to what you're seeing with the extra logging turned off, but under different circumstances, so that one doesn't seem to require it.)

-John