Hello.

I have a dom0 working perfectly under Xen 3.3.x, with about 15 HVM domUs. When migrating to Xen 3.4.1, with the same dom0 kernel (2.6.27.37), everything seems to be fine and I can launch the various hosts, but 5 to 10 minutes later the host violently reboots... I can't find any trace in the logs. I have a second host with the same configuration and setup, and the result is similar. It seems to be linked to domU activity: without any domU, or without any domU with actual activity, there is no reboot. I had to roll back to Xen 3.3.0.

I already attempted such an upgrade to Xen 3.4.0 this summer, with exactly the same result.

It looks like either a hardware issue (though it doesn't appear with 3.3.0) or a crash in the hypervisor that syslog is unable to catch when it happens. How can I try to get a trace?

--
BOFH excuse #248: Too much radiation coming from the soil.
On Thu, Nov 19, 2009 at 07:06:56PM +0100, Guillaume Rousse wrote:
> Hello.
>
> I have a dom0 working perfectly under Xen 3.3.x, with about 15 HVM domUs.
> When migrating to Xen 3.4.1, with the same dom0 kernel (2.6.27.37),
> everything seems to be fine and I can launch the various hosts, but 5 to
> 10 minutes later the host violently reboots... I can't find any trace in
> the logs. I have a second host with the same configuration and setup,
> and the result is similar. It seems to be linked to domU activity:
> without any domU, or without any domU with actual activity, there is no
> reboot. I had to roll back to Xen 3.3.0.
>

Did you try the new Xen 3.4.2?

> I already attempted such an upgrade to Xen 3.4.0 this summer, with
> exactly the same result.
>

Ok..

> It looks like either a hardware issue (though it doesn't appear with
> 3.3.0) or a crash in the hypervisor that syslog is unable to catch when
> it happens. How can I try to get a trace?
>

You should set up a serial console, so you can capture and log the full console (Xen + dom0 kernel) output to another computer.

--
Pasi
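A sketch of such a setup in GRUB terms follows; the port, baud rate, and file names are illustrative and need adjusting to the machine (pvops dom0 kernels take console=hvc0 instead of ttyS0):

    # illustrative grub.conf entry -- adjust port/baud/paths to your setup
    title Xen 3.4.1 (serial console)
      root (hd0,0)
      kernel /boot/xen-3.4.1.gz com1=115200,8n1 console=com1,vga loglvl=all guest_loglvl=all
      module /boot/vmlinuz-2.6.27.37 console=ttyS0,115200 console=tty0
      module /boot/initrd-2.6.27.37.img

On the machine at the other end of the null-modem cable, something like 'screen /dev/ttyS0 115200' (or minicom) then records everything the hypervisor prints, including its last words before rebooting.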
Pasi Kärkkäinen wrote:
> On Thu, Nov 19, 2009 at 07:06:56PM +0100, Guillaume Rousse wrote:
>> Hello.
>>
>> I have a dom0 working perfectly under Xen 3.3.x, with about 15 HVM domUs.
>> When migrating to Xen 3.4.1, with the same dom0 kernel (2.6.27.37),
>> everything seems to be fine and I can launch the various hosts, but 5 to
>> 10 minutes later the host violently reboots... I can't find any trace in
>> the logs. I have a second host with the same configuration and setup,
>> and the result is similar. It seems to be linked to domU activity:
>> without any domU, or without any domU with actual activity, there is no
>> reboot. I had to roll back to Xen 3.3.0.
>>
> Did you try the new Xen 3.4.2?

I just did this morning. Without any changelog, it's a bit 'upgrade and pray'...

>> It looks like either a hardware issue (though it doesn't appear with
>> 3.3.0) or a crash in the hypervisor that syslog is unable to catch when
>> it happens. How can I try to get a trace?
>>
> You should set up a serial console, so you can capture and
> log the full console (Xen + dom0 kernel) output to another computer.

Indeed.

Here is the output. The first domU crash, caused by a memory ballooning issue, is not fatal. The second crash, however, is. I don't know whether that is because of incorrect state left after the initial crash, or because of additional domUs launched in the interim.

(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) Domain 1 (vcpu#0) crashed on cpu#3:
(XEN) ----[ Xen-3.4.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    3
(XEN) RIP:    0010:[<ffffffff811ed7ed>]
(XEN) RFLAGS: 0000000000010246   CONTEXT: hvm guest
(XEN) rax: 00000000007028b8   rbx: 0000000000001000   rcx: 0000000000000200
(XEN) rdx: 0000000000000000   rsi: 00000000007028b8   rdi: ffff8800123a0000
(XEN) rbp: ffff88001a119b68   rsp: ffff88001a119b50   r8:  ffffea00003fcb00
(XEN) r9:  000000000001050f   r10: 0000000000000000   r11: 0000000000000001
(XEN) r12: 0000000000001000   r13: 0000000000000000   r14: ffff88001796aea8
(XEN) r15: 0000000000001000   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 000000001a079000   cr2: 00007fc176c772e8
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0018   cs: 0010
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) Domain 2 reported crashed by domain 0 on cpu#0:
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory!
(XEN) domain_crash called from p2m.c:1091
(XEN) ----[ Xen-3.4.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff828c801aab29>] hash_foreach+0x59/0xe0
(XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff8284000c1780   rcx: 00000000000060bc
(XEN) rdx: ffff83041f98c000   rsi: 0000000000000336   rdi: ffff8300be7c0000
(XEN) rbp: 0000000000000336   rsp: ffff828c80257848   r8:  0000000000200c00
(XEN) r9:  0000000000000001   r10: ffff83041f98c000   r11: ffff828c801b10e0
(XEN) r12: 0000000000000001   r13: 0000000000000000   r14: 00000000000060bc
(XEN) r15: ffff828c80205f80   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 0000000021759000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff828c80257848:
(XEN)    0000000000000000 ffff8300be7c0000 ffff83041f98c000 ffff8284000c1780
(XEN)    ffff8300be7c0000 00000000000060bc 0000000000000000 00000000000144bc
(XEN)    ffff8300be7c0000 ffff828c801aae4d ffff828c80257960 00000000000060bc
(XEN)    ffff828c80257960 ffff83041f98c000 ffff83041f98c000 ffff828c801b13bf
(XEN)    00000000000144bc 0000000000200c00 ffff83041f4ed5e0 ffff83041f98d130
(XEN)    ffff828c80284d24 ffff83041f4ed5e0 ffff828c80257960 ffff828c80257968
(XEN)    ffff83041f98c000 00000000000144bc 0000000000000000 ffff828c801a96d4
(XEN)    0000000000000200 2000000000000000 ffff828c80257a80 000000061f98c000
(XEN)    0000000000000200 007fffffffffffff 0000000000000000 ffff83041f4ed000
(XEN)    000000000041f4ed 0000000000000001 0000000000000001 0000000000000200
(XEN)    00000000000144bc ffff83041f98c000 0000000000000006 ffff828c801a5991
(XEN)    ffff828c80257abc 0000000000000001 ffff828c80257ba8 007fffffffffffff
(XEN)    ffff828c802579f0 ffff83041f98c000 ffff828c80257a80 ffff828c801a6efb
(XEN)    0000000400000000 0000000000000000 ffff8300060bc000 ffff8300060bb000
(XEN)    ffff8300060ba000 ffff8300060b9000 ffff8300060b8000 ffff8300060b7000
(XEN)    ffff8300060b6000 ffff8300060b5000 ffff8300060b4000 ffff8300060b3000
(XEN)    ffff8300060b2000 ffff8300060b1000 ffff8300060b0000 ffff8300060af000
(XEN)    ffff8300060ae000 ffff828c801f16dc 0000000000000082 0000000100000001
(XEN)    0000000100000001 0000000100000001 0000000100000001 0000000100000001
(XEN)    0000000100000001 0000000100000001 0000000100000001 0000000000000286
(XEN) Xen call trace:
(XEN)    [<ffff828c801aab29>] hash_foreach+0x59/0xe0
(XEN)    [<ffff828c801aae4d>] sh_remove_all_mappings+0x8d/0x200
(XEN)    [<ffff828c801b13bf>] shadow_write_p2m_entry+0x2df/0x330
(XEN)    [<ffff828c801a96d4>] p2m_set_entry+0x344/0x430
(XEN)    [<ffff828c801a5991>] set_p2m_entry+0x71/0xa0
(XEN)    [<ffff828c801a6efb>] p2m_pod_zero_check+0x1db/0x310
(XEN)    [<ffff828c801a8a20>] p2m_pod_demand_populate+0x830/0xa40
(XEN)    [<ffff828c801a90b4>] p2m_gfn_to_mfn+0x224/0x260
(XEN)    [<ffff828c80151fd5>] mod_l1_entry+0x6e5/0x7b0
(XEN)    [<ffff828c80153067>] do_mmu_update+0x937/0x16e0
(XEN)    [<ffff828c8014df0b>] get_page_type+0xb/0x20
(XEN)    [<ffff828c801112b4>] do_multicall+0x164/0x370
(XEN)    [<ffff828c801c8169>] syscall_enter+0xa9/0xae
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 000000001cb48067 00000000003d6ca9
(XEN)  L3[0x000] = 000000000c58b067 00000000003e72ec
(XEN)  L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
My domUs all have this configuration:

memory = 256
maxmem = 512

Or different values, but always with the same ratio between memory and maxmem. Which seems to be quite useless for HVM domUs anyway, as memory ballooning is not supported AFAIK, unless using PV drivers (which I can't manage to build).

With identical values, the issue doesn't appear.

With Xen 3.4.2, the domUs still crash, but at least dom0 does not reboot. So it's just less bad :)

--
BOFH excuse #426: internet is needed to catch the etherbunny
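In other words, the workaround is to pin the two values together, so populate-on-demand never kicks in for the guest. A minimal sketch of such an HVM config fragment (values illustrative):

    # illustrative fragment: memory == maxmem means no populate-on-demand
    # and no ballooning, which is what triggers the crash above
    builder = "hvm"
    memory  = 512
    maxmem  = 512   # equal to memory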
On Fri, Nov 20, 2009 at 11:42:23AM +0100, Guillaume Rousse wrote:
> Pasi Kärkkäinen wrote:
>> On Thu, Nov 19, 2009 at 07:06:56PM +0100, Guillaume Rousse wrote:
>>> Hello.
>>>
>>> I have a dom0 working perfectly under Xen 3.3.x, with about 15 HVM domUs.
>>> When migrating to Xen 3.4.1, with the same dom0 kernel (2.6.27.37),
>>> everything seems to be fine and I can launch the various hosts, but 5
>>> to 10 minutes later the host violently reboots... I can't find any
>>> trace in the logs. I have a second host with the same configuration
>>> and setup, and the result is similar. It seems to be linked to domU
>>> activity: without any domU, or without any domU with actual activity,
>>> there is no reboot. I had to roll back to Xen 3.3.0.
>>>
>> Did you try the new Xen 3.4.2?
> I just did this morning. Without any changelog, it's a bit 'upgrade and
> pray'...
>

Changelog is here: http://xenbits.xen.org/xen-3.4-testing.hg

>>> It looks like either a hardware issue (though it doesn't appear with
>>> 3.3.0) or a crash in the hypervisor that syslog is unable to catch
>>> when it happens. How can I try to get a trace?
>>>
>> You should set up a serial console, so you can capture and
>> log the full console (Xen + dom0 kernel) output to another computer.
> Indeed.
>
> Here is the output. The first domU crash, caused by a memory ballooning
> issue, is not fatal. The second crash, however, is. I don't know whether
> that is because of incorrect state left after the initial crash, or
> because of additional domUs launched in the interim.
>

<snip>

> My domUs all have this configuration:
>
> memory = 256
> maxmem = 512
>
> Or different values, but always with the same ratio between memory and
> maxmem. Which seems to be quite useless for HVM domUs anyway, as memory
> ballooning is not supported AFAIK, unless using PV drivers (which I
> can't manage to build).
>
> With identical values, the issue doesn't appear.
>

Hmm.. so it's definitely related to ballooning.

> With Xen 3.4.2, the domUs still crash, but at least dom0 does not
> reboot. So it's just less bad :)
>

So 3.4.2 fixes the hypervisor crash. That's good.

--
Pasi
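For a readable one-line-per-change summary, Mercurial can extract it from that repository directly; the RELEASE-x.y.z tag names below are assumed from Xen's usual tagging scheme (verify with 'hg tags'):

    hg clone http://xenbits.xen.org/xen-3.4-testing.hg
    cd xen-3.4-testing.hg
    # first line of each commit message between the two releases
    hg log -r RELEASE-3.4.1:RELEASE-3.4.2 --template '{rev}: {desc|firstline}\n'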
On Fri, 2009-11-20 at 20:29 -0500, Ata E Husain wrote:
> Dear All,

Hi.

> I am trying to get details about the disk accesses (reads/writes) done
> by domUs by putting some monitoring code in dom0. One way to achieve
> this is to create all domUs on separate LVM volumes and use a disk
> monitoring utility such as iostat to get the details, but is it possible
> to get the details without running any utility program, e.g. by putting
> some monitoring code in the hypervisor?

The hypervisor is quite definitely the wrong place, as it doesn't do peripheral I/O virtualization.

> I am using a tap:aio configuration to mount my domU image, and am going
> through the blktap device driver files to get some details, but no luck
> so far.

You could hook into the dom0 kernel. There are ways to do so with or without Xen, but since I/O virtualization can forward block I/O requests to userspace, blktap is probably exactly what you want. Look into tools/blktap2 and the block-log driver. IIRC it's only logging writes right now (?). Anyway, it should get you started.

Daniel
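For orientation, the userspace data path Daniel refers to is the one already selected by a tap disk line in the domU config; the image path and device name below are illustrative:

    # illustrative domU disk line -- every request for this image goes
    # through the tapdisk process in dom0, where a driver such as
    # block-log (tools/blktap2/drivers/block-log.c) can observe it
    disk = [ 'tap:aio:/var/lib/xen/images/domU1.img,xvda,w' ]

Since tapdisk runs in dom0 userspace, adding counters or logging there avoids touching the hypervisor at all.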
Dear All,

I am trying to get details about the disk accesses (reads/writes) done by domUs by putting some monitoring code in dom0. One way to achieve this is to create all domUs on separate LVM volumes and use a disk monitoring utility such as iostat to get the details, but is it possible to get the details without running any utility program, e.g. by putting some monitoring code in the hypervisor?

I am using a tap:aio configuration to mount my domU image, and am going through the blktap device driver files to get some details, but no luck so far.

I have written mails to this community before but have never got a single reply. I am not sure whether I failed to put the question in a proper manner, or whether it simply doesn't fall within anybody's interest. If my questions are unclear, please do reply so I get a chance to rephrase them.

It would be a great help if someone could provide some useful pointers to perform the above-mentioned tasks.

Thanks!
Ata
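On the LVM route mentioned above, note that the counters iostat prints come straight from the dom0 kernel's /proc/diskstats, so "monitoring code in dom0" can be as small as sampling that file. A minimal sketch, assuming (hypothetically) that dm-0 and dm-1 are the logical volumes backing two domUs; check 'dmsetup ls' for the real names:

    #!/usr/bin/env python
    # Sample per-device read/write completion counters from dom0's
    # /proc/diskstats over a 2-second window. DEVICES holds the
    # (hypothetical) dm names of the LVs backing the domUs.
    import time

    DEVICES = ("dm-0", "dm-1")

    def read_counters():
        stats = {}
        for line in open("/proc/diskstats"):
            f = line.split()
            if f[2] in DEVICES:
                # f[3] = reads completed, f[7] = writes completed
                stats[f[2]] = (int(f[3]), int(f[7]))
        return stats

    before = read_counters()
    time.sleep(2)
    after = read_counters()
    for dev in DEVICES:
        reads = after[dev][0] - before[dev][0]
        writes = after[dev][1] - before[dev][1]
        print "%s: %.1f reads/s, %.1f writes/s" % (dev, reads / 2.0, writes / 2.0)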
Pasi Kärkkäinen wrote:
> On Fri, Nov 20, 2009 at 11:42:23AM +0100, Guillaume Rousse wrote:
>> Pasi Kärkkäinen wrote:
>>> On Thu, Nov 19, 2009 at 07:06:56PM +0100, Guillaume Rousse wrote:
>>>> Hello.
>>>>
>>>> I have a dom0 working perfectly under Xen 3.3.x, with about 15 HVM
>>>> domUs. When migrating to Xen 3.4.1, with the same dom0 kernel
>>>> (2.6.27.37), everything seems to be fine and I can launch the various
>>>> hosts, but 5 to 10 minutes later the host violently reboots... I
>>>> can't find any trace in the logs. I have a second host with the same
>>>> configuration and setup, and the result is similar. It seems to be
>>>> linked to domU activity: without any domU, or without any domU with
>>>> actual activity, there is no reboot. I had to roll back to Xen 3.3.0.
>>>>
>>> Did you try the new Xen 3.4.2?
>> I just did this morning. Without any changelog, it's a bit 'upgrade and
>> pray'...
>>
> Changelog is here: http://xenbits.xen.org/xen-3.4-testing.hg

The exhaustive list of all file modifications is totally useless for an admin deciding whether the risk of updating a working system is worth the attempt. What is missing from all Xen releases is a comprehensive, user-targeted list of fixed bugs and behavior changes.

[..]

>> With Xen 3.4.2, the domUs still crash, but at least dom0 does not
>> reboot. So it's just less bad :)
>>
> So 3.4.2 fixes the hypervisor crash. That's good.

Yes, that's the lesser evil :)

What is really strange, though, is the different behaviour exhibited by this new feature. On other systems, I just can't launch any HVM with different values for 'memory' and 'maxmem'.

--
BOFH excuse #179: multicasts on broken packets