On 29.11.2011 16:48, Roderick Colenbrander wrote:
> Your tests are heavily stressing DomU. During any of your tests, have
> you seen DomU crashing in such a nasty way that Dom0 went down? What
> about in production? Our servers use similar software to yours
> (initially Xen 4.0.1, but now Xen 4.1.1 and a Linux 2.6.32-pvops
> kernel), but a few percent of our servers go down in a very nasty way
> every day. Dom0 becomes unresponsive, it feels like it 'hung', and we
> have to force-reboot the boxes. Do issues like this sound familiar?

Not in this year's stability tests. This year I have only ever seen
crashes of domU; dom0 was always stable.

But last year I hunted a very serious problem that causes nasty
hangs/crashes in dom0 (which crashes domU as a consequence). See this
mailing list post:
http://lists.xen.org/archives/html/xen-devel/2010-09/msg00556.html

My tests clearly show that if you have a CPU without ARAT and you don't
have the patch from my post, your Xen 4.0.1 or 4.1.1 will crash under
load and/or after a while. What is your CPU?

Regards Andreas

On Tue, Nov 29, 2011 at 9:12 AM, Andreas Kinzler <ml-xen-devel@hfp.de> wrote:
> On 29.11.2011 16:48, Roderick Colenbrander wrote:
>>
>> Your tests are heavily stressing DomU. During any of your tests, have
>> you seen DomU crashing in such a nasty way that Dom0 went down? What
>> about in production? Our servers use similar software to yours
>> (initially Xen 4.0.1, but now Xen 4.1.1 and a Linux 2.6.32-pvops
>> kernel), but a few percent of our servers go down in a very nasty way
>> every day. Dom0 becomes unresponsive, it feels like it 'hung', and we
>> have to force-reboot the boxes. Do issues like this sound familiar?
>
> Not in this year's stability tests. This year I have only ever seen
> crashes of domU; dom0 was always stable.
>
> But last year I hunted a very serious problem that causes nasty
> hangs/crashes in dom0 (which crashes domU as a consequence). See this
> mailing list post:
> http://lists.xen.org/archives/html/xen-devel/2010-09/msg00556.html
>
> My tests clearly show that if you have a CPU without ARAT and you don't
> have the patch from my post, your Xen 4.0.1 or 4.1.1 will crash under
> load and/or after a while. What is your CPU?
>
> Regards Andreas

Most of our machines use i7 950 CPUs. They don't seem to have ARAT.
Some other machines use Xeon CPUs with ARAT support. We never had
issues on the Xeon systems, so we may actually be suffering from the
ARAT issue. Are you still using the patch you linked to in a production
environment? I wonder why a cleaned-up patch like that never made it
into core.

I'm going to do some testing (may take a while).

Thanks,
Roderick

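(Side note, not from the thread: a quick way to see whether a given CPU
advertises ARAT is to look for the "arat" flag in /proc/cpuinfo on a
natively booted Linux kernel; under a Xen dom0 the flag may be filtered
by the hypervisor, so treat the result there with care.)

  # no output here means the CPU does not report the always-running APIC timer
  grep -o -m1 arat /proc/cpuinfo
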
>> Not in this year's stability tests. This year I have only ever seen
>> crashes of domU; dom0 was always stable.
>> But last year I hunted a very serious problem that causes nasty
>> hangs/crashes in dom0 (which crashes domU as a consequence). See this
>> mailing list post:
>> http://lists.xen.org/archives/html/xen-devel/2010-09/msg00556.html
>> My tests clearly show that if you have a CPU without ARAT and you don't
>> have the patch from my post, your Xen 4.0.1 or 4.1.1 will crash under
>> load and/or after a while. What is your CPU?
>
> Most of our machines use i7 950 CPUs. They don't seem to have ARAT.

Yes, the i7 950 does not have ARAT, as it is first-generation Nehalem.

> Some other machines use Xeon CPUs with ARAT support. We never had
> issues on the Xeon systems, so we may actually be suffering from the
> ARAT issue. Are you still using the patch you linked to in a production
> environment?

Absolutely. As I mentioned, I just re-ran the tests recently and found
that even Xen 4.1.1 (the earlier tests were for 4.0.1) is unstable
without my patch on non-ARAT CPUs.

> I wonder why a cleaned-up patch like that never made it into core.

The patch actually only disables HPET broadcast, which has some
downsides because it effectively disables C3 states. Without C3 states,
Nehalem and later CPUs cannot enter turbo mode, so there is some loss of
performance.

> I'm going to do some testing (may take a while).

Please let me know. There are already some people confirming that HPET
broadcast is buggy. With more evidence I will contact Keir again and
suggest having HPET broadcast fixed or removed.

Regards Andreas

On Tue, Nov 29, 2011 at 10:15 AM, Andreas Kinzler <ml-xen-devel@hfp.de> wrote:
>>> Not in this year's stability tests. This year I have only ever seen
>>> crashes of domU; dom0 was always stable.
>>> But last year I hunted a very serious problem that causes nasty
>>> hangs/crashes in dom0 (which crashes domU as a consequence). See this
>>> mailing list post:
>>> http://lists.xen.org/archives/html/xen-devel/2010-09/msg00556.html
>>> My tests clearly show that if you have a CPU without ARAT and you don't
>>> have the patch from my post, your Xen 4.0.1 or 4.1.1 will crash under
>>> load and/or after a while. What is your CPU?
>>
>> Most of our machines use i7 950 CPUs. They don't seem to have ARAT.
>
> Yes, the i7 950 does not have ARAT, as it is first-generation Nehalem.
>
>> Some other machines use Xeon CPUs with ARAT support. We never had
>> issues on the Xeon systems, so we may actually be suffering from the
>> ARAT issue. Are you still using the patch you linked to in a production
>> environment?
>
> Absolutely. As I mentioned, I just re-ran the tests recently and found
> that even Xen 4.1.1 (the earlier tests were for 4.0.1) is unstable
> without my patch on non-ARAT CPUs.
>
>> I wonder why a cleaned-up patch like that never made it into core.
>
> The patch actually only disables HPET broadcast, which has some
> downsides because it effectively disables C3 states. Without C3 states,
> Nehalem and later CPUs cannot enter turbo mode, so there is some loss of
> performance.

Would disabling the low CPU power states and turbo clocks in the BIOS
help as well? Just curious. I have seen other 'weird' performance
issues between machines using the same hardware. Some CPU-intensive
algorithm could be twice as slow running on Dom0 compared to the same
kernel without Xen. On other, identical systems I didn't see that issue.
I didn't have time to investigate, but I felt there may have been BIOS
setting differences.

>> I'm going to do some testing (may take a while).
>
> Please let me know. There are already some people confirming that HPET
> broadcast is buggy. With more evidence I will contact Keir again and
> suggest having HPET broadcast fixed or removed.
>
> Regards Andreas

Thanks,
Roderick

On Tue, Nov 29, 2011 at 07:15:57PM +0100, Andreas Kinzler wrote:
>>> Not in this year's stability tests. This year I have only ever seen
>>> crashes of domU; dom0 was always stable.
>>> But last year I hunted a very serious problem that causes nasty
>>> hangs/crashes in dom0 (which crashes domU as a consequence). See this
>>> mailing list post:
>>> http://lists.xen.org/archives/html/xen-devel/2010-09/msg00556.html
>>> My tests clearly show that if you have a CPU without ARAT and you don't
>>> have the patch from my post, your Xen 4.0.1 or 4.1.1 will crash under
>>> load and/or after a while. What is your CPU?
>>
>> Most of our machines use i7 950 CPUs. They don't seem to have ARAT.
>
> Yes, the i7 950 does not have ARAT, as it is first-generation Nehalem.
>
>> Some other machines use Xeon CPUs with ARAT support. We never had
>> issues on the Xeon systems, so we may actually be suffering from the
>> ARAT issue. Are you still using the patch you linked to in a production
>> environment?
>
> Absolutely. As I mentioned, I just re-ran the tests recently and found
> that even Xen 4.1.1 (the earlier tests were for 4.0.1) is unstable
> without my patch on non-ARAT CPUs.

Did you try 4.1.2? This looks quite similar to one particular bug where
the vectors were not migrated properly.

On Tue, Nov 29, 2011 at 3:02 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
> On Tue, Nov 29, 2011 at 07:15:57PM +0100, Andreas Kinzler wrote:
>>>> Not in this year's stability tests. This year I have only ever seen
>>>> crashes of domU; dom0 was always stable.
>>>> But last year I hunted a very serious problem that causes nasty
>>>> hangs/crashes in dom0 (which crashes domU as a consequence). See this
>>>> mailing list post:
>>>> http://lists.xen.org/archives/html/xen-devel/2010-09/msg00556.html
>>>> My tests clearly show that if you have a CPU without ARAT and you don't
>>>> have the patch from my post, your Xen 4.0.1 or 4.1.1 will crash under
>>>> load and/or after a while. What is your CPU?
>>> Most of our machines use i7 950 CPUs. They don't seem to have ARAT.
>>
>> Yes, the i7 950 does not have ARAT, as it is first-generation Nehalem.
>>
>>> Some other machines use Xeon CPUs with ARAT support. We never had
>>> issues on the Xeon systems, so we may actually be suffering from the
>>> ARAT issue. Are you still using the patch you linked to in a production
>>> environment?
>>
>> Absolutely. As I mentioned, I just re-ran the tests recently and found
>> that even Xen 4.1.1 (the earlier tests were for 4.0.1) is unstable
>> without my patch on non-ARAT CPUs.
>
> Did you try 4.1.2? This looks quite similar to one particular bug where
> the vectors were not migrated properly.

I haven't tried Xen 4.1.2 yet. We likely had the issue on Xen 4.0.1 too,
though our data is not conclusive. Was the Xen 4.1.2 bug you refer to
also around in Xen 4.0.1?

I'm still preparing our tests. Is there any special logging option that
would be useful for capturing more information? All systems are now set
up with serial consoles and we log Xen and Dom0 there.

Roderick

>> The patch actually only disables HPET broadcast, which has some
>> downsides because it effectively disables C3 states. Without C3 states,
>> Nehalem and later CPUs cannot enter turbo mode, so there is some loss of
>> performance.
>
> Would disabling the low CPU power states and turbo clocks in the BIOS
> help as well? Just curious. I have seen other 'weird' performance

I would leave the BIOS at its defaults; I'd guess that playing with BIOS
settings only makes things worse.

> issues between machines using the same hardware. Some CPU-intensive
> algorithm could be twice as slow running on Dom0 compared to the same
> kernel without Xen. On other, identical systems I didn't see that issue.
> I didn't have time to investigate, but I felt there may have been BIOS
> setting differences.

Xen does not use the performance governor by default, which has a
notable performance impact in my tests. My Xen boot config is:

kernel /xen.gz dom0_mem=1024m dom0_max_vcpus=2 cpufreq=xen:performance
module /linux-2.6.32.36-pvops0-ak2 root=/dev/sda5

Regards Andreas

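(For readers who want to try the same thing: a complete GRUB legacy
menu.lst entry built around the two lines Andreas quotes might look
roughly like the sketch below. Only the kernel and module lines come
from his message; the title and the initrd module are placeholders for
whatever the local install actually uses.)

  title Xen with performance governor (sketch)
    kernel /xen.gz dom0_mem=1024m dom0_max_vcpus=2 cpufreq=xen:performance
    module /linux-2.6.32.36-pvops0-ak2 root=/dev/sda5
    module /initrd-2.6.32.36-pvops0-ak2.img
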
On Wed, Nov 30, 2011 at 12:01 AM, Roderick Colenbrander <thunderbird2k@gmail.com> wrote:
> On Tue, Nov 29, 2011 at 3:02 PM, Konrad Rzeszutek Wilk
> <konrad@darnok.org> wrote:
>> [...]
>> Did you try 4.1.2? This looks quite similar to one particular bug where
>> the vectors were not migrated properly.
>
> I haven't tried Xen 4.1.2 yet. We likely had the issue on Xen 4.0.1 too,
> though our data is not conclusive. Was the Xen 4.1.2 bug you refer to
> also around in Xen 4.0.1?
>
> I'm still preparing our tests. Is there any special logging option that
> would be useful for capturing more information? All systems are now set
> up with serial consoles and we log Xen and Dom0 there.
>
> Roderick

It took about a week, but the systems went down again. Linux is down,
but the hypervisor is still reachable on the serial console. Is there
anything interesting to dump from there?

Thanks,
Roderick

> It took about a week, but the systems went down again. Linux is down,
> but the hypervisor is still reachable on the serial console. Is there
> anything interesting to dump from there?

Just the interrupts information. I think that is Ctrl-A, Ctrl-A,
Ctrl-A, and then '*' to capture everything (I don't remember the right
one for just interrupts).

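(For reference, and from memory rather than from this thread, the usual
Xen 4.x serial-console debug keys look roughly like the list below; 'h'
prints the authoritative list on any given build.)

  Ctrl-A Ctrl-A Ctrl-A   switch the serial console input to Xen
  h                      print the list of available debug keys
  i                      dump interrupt bindings (what the next mail shows)
  *                      print all diagnostics at once
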
On Wed, Dec 7, 2011 at 8:38 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
>> It took about a week, but the systems went down again. Linux is down,
>> but the hypervisor is still reachable on the serial console. Is there
>> anything interesting to dump from there?
>>
> Just the interrupts information. I think that is Ctrl-A, Ctrl-A,
> Ctrl-A, and then '*' to capture everything (I don't remember the right
> one for just interrupts).

Here's the interrupt information:

(XEN) [2011-12-06 19:13:37] [i: dump interrupt bindings]
(XEN) [2011-12-06 19:13:37] Guest interrupt information:
(XEN) [2011-12-06 19:13:37] IRQ: 0 affinity:00000000,00000000,00000000,00000001 vec:f0 type=IO-APIC-edge status=00000000 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 1 affinity:00000000,00000000,00000000,00000001 vec:30 type=IO-APIC-edge status=00000014 in-flight=0 domain-list=0: 1(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 2 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:e2 type=XT-PIC status=00000000 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 3 affinity:00000000,00000000,00000000,00000001 vec:38 type=IO-APIC-edge status=00000006 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 4 affinity:00000000,00000000,00000000,00000001 vec:40 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 5 affinity:00000000,00000000,00000000,00000001 vec:f1 type=IO-APIC-edge status=00000000 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 6 affinity:00000000,00000000,00000000,00000001 vec:48 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 7 affinity:00000000,00000000,00000000,00000001 vec:50 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 8 affinity:00000000,00000000,00000000,00000001 vec:58 type=IO-APIC-edge status=00000010 in-flight=0 domain-list=0: 8(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 9 affinity:00000000,00000000,00000000,00000001 vec:60 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 9(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 10 affinity:00000000,00000000,00000000,00000001 vec:68 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 11 affinity:00000000,00000000,00000000,00000001 vec:70 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 12 affinity:00000000,00000000,00000000,00000001 vec:78 type=IO-APIC-edge status=00000010 in-flight=0 domain-list=0: 12(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 13 affinity:00000000,00000000,00000000,00000001 vec:88 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 14 affinity:00000000,00000000,00000000,00000001 vec:90 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 15 affinity:00000000,00000000,00000000,00000001 vec:98 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:37] IRQ: 16 affinity:00000000,00000000,00000000,00000001 vec:c8 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 16(-S--),1201: 16(--M-),
(XEN) [2011-12-06 19:13:37] IRQ: 18 affinity:00000000,00000000,00000000,00000001 vec:51 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 18(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 19 affinity:00000000,00000000,00000000,00000001 vec:d0 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 19(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 21 affinity:00000000,00000000,00000000,00000001 vec:61 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 21(-S--),
(XEN) [2011-12-06 19:13:37] IRQ: 23 affinity:00000000,00000000,00000000,00000001 vec:59 type=IO-APIC-level status=00000010 in-flight=1 domain-list=0: 23(PS-M),
(XEN) [2011-12-06 19:13:37] IRQ: 24 affinity:00000000,00000000,00000000,00000001 vec:28 type=DMA_MSI status=00000000 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 25 affinity:00000000,00000000,00000000,00000001 vec:a0 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 26 affinity:00000000,00000000,00000000,00000001 vec:a8 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 27 affinity:00000000,00000000,00000000,00000001 vec:b0 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 28 affinity:00000000,00000000,00000000,00000001 vec:b8 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 29 affinity:00000000,00000000,00000000,00000001 vec:c0 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 30 affinity:00000000,00000000,00000000,00000001 vec:d8 type=PCI-MSI status=00000014 in-flight=0 domain-list=0:274(PS--),
(XEN) [2011-12-06 19:13:38] IRQ: 31 affinity:00000000,00000000,00000000,00000001 vec:21 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:273(PS--),
(XEN) [2011-12-06 19:13:38] IRQ: 32 affinity:00000000,00000000,00000000,00000001 vec:29 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:272(PS--),
(XEN) [2011-12-06 19:13:38] IRQ: 33 affinity:00000000,00000000,00000000,00000001 vec:31 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:271(PS--),
(XEN) [2011-12-06 19:13:38] IRQ: 34 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:39 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 35 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:41 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 36 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:49 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2011-12-06 19:13:38] IRQ: 37 affinity:00000000,00000000,00000000,00000010 vec:65 type=PCI-MSI status=00000010 in-flight=0 domain-list=1201: 55(--M-),
(XEN) [2011-12-06 19:13:38] IO-APIC interrupt information:
(XEN) [2011-12-06 19:13:38] IRQ 0 Vec240:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 2: vector=240, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 1 Vec 48:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 1: vector=48, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 3 Vec 56:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 3: vector=56, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 4 Vec 64:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 4: vector=64, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 5 Vec241:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 5: vector=241, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 6 Vec 72:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 6: vector=72, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 7 Vec 80:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 7: vector=80, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 8 Vec 88:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 8: vector=88, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 9 Vec 96:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 9: vector=96, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=level, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 10 Vec104:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 10: vector=104, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 11 Vec112:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 11: vector=112, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 12 Vec120:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 12: vector=120, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 13 Vec136:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 13: vector=136, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 14 Vec144:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 14: vector=144, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 15 Vec152:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 15: vector=152, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, irr=0, trigger=edge, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 16 Vec200:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 16: vector=200, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, irr=0, trigger=level, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 18 Vec 81:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 18: vector=81, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, irr=0, trigger=level, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 19 Vec208:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 19: vector=208, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, irr=0, trigger=level, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 21 Vec 97:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 21: vector=97, delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, irr=0, trigger=level, mask=0, dest_id:1
(XEN) [2011-12-06 19:13:38] IRQ 23 Vec 89:
(XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 23: vector=89, delivery_mode=1, dest_mode=logical, delivery_status=1, polarity=1, irr=1, trigger=level, mask=0, dest_id:1

I noticed some potentially interesting things. The system in question
is using PCI passthrough. On the serial console I remember seeing that
the PCI device got unmapped from DomU and mapped again in Dom0. The
serial console log still had a lot of information about this DomU,
which I guess was going down; I guess it wasn't fully down yet.

Thanks,
Roderick

On Wed, Dec 07, 2011 at 08:44:19PM +0000, Roderick Colenbrander wrote:
> On Wed, Dec 7, 2011 at 8:38 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
> >> It took about a week, but the systems went down again. Linux is down,
> >> but the hypervisor is still reachable on the serial console. Is there
> >> anything interesting to dump from there?
> >>
> > Just the interrupts information. I think that is Ctrl-A, Ctrl-A,
> > Ctrl-A, and then '*' to capture everything (I don't remember the right
> > one for just interrupts).
>
> Here's the interrupt information:
> [interrupt dump snipped; quoted unchanged from the previous mail]
> I noticed some potentially interesting things. The system in question
> is using PCI passthrough. On the serial console I remember seeing that
> the PCI device got unmapped from DomU and mapped again in Dom0. The
> serial console log still had a lot of information about this DomU,
> which I guess was going down; I guess it wasn't fully down yet.

One of the pieces of debug information that gets printed with '*' is the
guest PCI passthrough data, such as which IRQs, BARs, etc. are mapped
in. Did any of those exist?

My original thought of what is going on was that the interrupts had
stopped working completely. It does not look like that; the interrupts
are firing and the event channels that are bound to them have their bits
set, saying "hey, I am pending". Oddly, there are a bunch of MSI ones
that are masked, which usually means they are disabled.

Then there is

  affinity:00000000,00000000,00000000,00000001 vec:c8 type=IO-APIC-level
  status=00000010 in-flight=0 domain-list=0: 16(-S--),1201: 16(--M-),
  affinity:00000000,00000000,00000000,00000010 vec:65 type=PCI-MSI
  status=00000010 in-flight=0 domain-list=1201: 55(--M-),

The guest has masked both interrupts, so it is off/dying, but somehow
the unmapping has not happened. In other words, what you just analyzed
and found out.

Sadly, there is no smoking gun.

So a couple of things that I would do: ensure that the Xen hypervisor
boots with 'sync_console console_to_ring', and when this crash happens
see if you can get a stack trace from both the Xen hypervisor (there are
some debug keys to get that; I think it is 'r'?) and also from the Linux
kernel.

But also see if the problem disappears with the latest 4.1.x hypervisor.
And it might also be worth upgrading the dom0 to a 3.0 kernel. Hmm, it
would be very interesting to see where dom0 _is_ stuck. The hypervisor
is running fine, so it all points to dom0 crashing somewhere.

Can you make sure that dom0 runs with 'debug loglevel=8' as well? That
should give a wealth of information when/if a crash happens.

Oh, and try to pass in Ctrl-Alt-SysRq-t (on serial concentrators I think
you just need to type 'break' on the telnet line) and then 't'. But I am
not entirely sure about it; you might want to double-check with Google
how to enable Alt-SysRq.

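(Putting Konrad's suggestions together: a GRUB legacy entry for the
debug boot might look roughly like the sketch below. Only 'sync_console
console_to_ring' for Xen and 'debug loglevel=8' for dom0 come from his
mail; the serial console options, kernel/initrd names and root device
are assumptions standing in for whatever the machines already use.)

  title Xen debug boot (sketch)
    kernel /xen.gz com1=115200,8n1 console=com1 sync_console console_to_ring dom0_mem=1024m
    module /vmlinuz-2.6.32.48 root=/dev/sda5 console=hvc0 debug loglevel=8
    module /initrd-2.6.32.48.img
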
On Thu, Dec 8, 2011 at 11:46 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
> On Wed, Dec 07, 2011 at 08:44:19PM +0000, Roderick Colenbrander wrote:
>> Here's the interrupt information:
>> [interrupt dump snipped; quoted unchanged from the earlier mails]
>> I noticed some potentially interesting things. The system in question
>> is using PCI passthrough. On the serial console I remember seeing that
>> the PCI device got unmapped from DomU and mapped again in Dom0. The
>> serial console log still had a lot of information about this DomU,
>> which I guess was going down; I guess it wasn't fully down yet.
>
> One of the pieces of debug information that gets printed with '*' is the
> guest PCI passthrough data, such as which IRQs, BARs, etc. are mapped
> in. Did any of those exist?

There is not much PCI-related information, just interrupt stuff; no PCI
BARs or other interesting PCI data:

(XEN) [2011-12-06 19:13:20] 03:00.0 - dom 1201 - MSIs < 37 >
(XEN) [2011-12-06 19:13:20] MSI 37 vec=65 lowest edge assert log lowest dest=00000010 mask=0/0/-1

> My original thought of what is going on was that the interrupts had
> stopped working completely. It does not look like that; the interrupts
> are firing and the event channels that are bound to them have their bits
> set, saying "hey, I am pending". Oddly, there are a bunch of MSI ones
> that are masked, which usually means they are disabled.
>
> [...]
>
> So a couple of things that I would do: ensure that the Xen hypervisor
> boots with 'sync_console console_to_ring', and when this crash happens
> see if you can get a stack trace from both the Xen hypervisor (there are
> some debug keys to get that; I think it is 'r'?) and also from the Linux
> kernel.

I set up some new tests, so it will take some days to get a crash. For
what it is worth, I do have a '*' log left from this system and it has
some stack traces. If you think it would be useful, I could send it
gzipped, but I don't want to spam this list if it may not be useful.

One thing I did notice was the following:

(XEN) [2011-12-06 19:13:39] active vcpus:
(XEN) [2011-12-06 19:13:39]   1: [1201.2] pri=-2 flags=0 cpu=6 credit=-2411 [w=256]
(XEN) [2011-12-06 19:13:39]   2: [1201.1] pri=-2 flags=0 cpu=5 credit=-2460 [w=256]
(XEN) [2011-12-06 19:13:39]   3: [0.2] pri=0 flags=0 cpu=2 credit=142 [w=256]
(XEN) [2011-12-06 19:13:39]   4: [0.1] pri=-2 flags=0 cpu=1 credit=-2612 [w=256]
(XEN) [2011-12-06 19:13:39] CPU[00] sort=12511062, sibling=00000000,00000000,00000000,00000003, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [32767.0] pri=0 flags=0 cpu=0
(XEN) [2011-12-06 19:13:39]   1: [0.0] pri=-1 flags=0 cpu=0 credit=333 [w=256]
(XEN) [2011-12-06 19:13:39] CPU[01] sort=12511063, sibling=00000000,00000000,00000000,00000003, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [0.1] pri=-2 flags=0 cpu=1 credit=-2914 [w=256]
(XEN) [2011-12-06 19:13:39]   1: [32767.1] pri=-64 flags=0 cpu=1
(XEN) [2011-12-06 19:13:39] CPU[02] sort=12511063, sibling=00000000,00000000,00000000,0000000c, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [32767.2] pri=-64 flags=0 cpu=2
(XEN) [2011-12-06 19:13:39] CPU[03] sort=12511063, sibling=00000000,00000000,00000000,0000000c, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [32767.3] pri=-64 flags=0 cpu=3
(XEN) [2011-12-06 19:13:39] CPU[04] sort=12511063, sibling=00000000,00000000,00000000,00000030, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [32767.4] pri=-64 flags=0 cpu=4
(XEN) [2011-12-06 19:13:39] CPU[05] sort=12511063, sibling=00000000,00000000,00000000,00000030, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [1201.1] pri=-2 flags=0 cpu=5 credit=-3617 [w=256]
(XEN) [2011-12-06 19:13:39]   1: [32767.5] pri=-64 flags=0 cpu=5
(XEN) [2011-12-06 19:13:39] CPU[06] sort=12511063, sibling=00000000,00000000,00000000,000000c0, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [1201.2] pri=-2 flags=0 cpu=6 credit=-3918 [w=256]
(XEN) [2011-12-06 19:13:39]   1: [32767.6] pri=-64 flags=0 cpu=6
(XEN) [2011-12-06 19:13:39] CPU[07] sort=12511063, sibling=00000000,00000000,00000000,000000c0, core=00000000,00000000,00000000,000000ff
(XEN) [2011-12-06 19:13:39]   run: [32767.7] pri=-64 flags=0 cpu=7

The system in question has a 4-core i7 CPU where Dom0 is pinned to
cores 0-3 and DomU is pinned to cores 4-7. Aren't the negative credit
numbers quite big?

> But also see if the problem disappears with the latest 4.1.x hypervisor.
> And it might also be worth upgrading the dom0 to a 3.0 kernel. Hmm, it
> would be very interesting to see where dom0 _is_ stuck. The hypervisor
> is running fine, so it all points to dom0 crashing somewhere.
>
> Can you make sure that dom0 runs with 'debug loglevel=8' as well? That
> should give a wealth of information when/if a crash happens.
>
> Oh, and try to pass in Ctrl-Alt-SysRq-t (on serial concentrators I think
> you just need to type 'break' on the telnet line) and then 't'. But I am
> not entirely sure about it; you might want to double-check with Google
> how to enable Alt-SysRq.

I tried to get SysRq working. It worked fine on some of my other
machines (sysrq = 'shift-control-o'), but not on this broken box.
Apparently the kernel is stuck in some place where SysRq doesn't work.

I'm going to retest with the relevant logging turned on. I also upgraded
to the latest 2.6.32 kernel (.48 I think).
I will also upgrade to the latest 4.1.x build from the mercurial
repository. It will take a few days before I get a crash.

Thanks so far!
Roderick

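(As a side note on the SysRq part: one common way to check that the
magic SysRq key is enabled in dom0 at all is the standard procfs/sysctl
knob. This is a generic Linux sketch, not something taken from this
thread.)

  # 0 means SysRq is disabled, 1 enables all functions
  cat /proc/sys/kernel/sysrq
  # enable all SysRq functions on the running kernel
  echo 1 > /proc/sys/kernel/sysrq
  # make it persistent across reboots (assuming /etc/sysctl.conf is read at boot)
  echo "kernel.sysrq = 1" >> /etc/sysctl.conf
  # trigger the task dump directly, equivalent to Alt-SysRq-t
  echo t > /proc/sysrq-trigger
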
One interesting observation: this morning another of my stress machines
hit the problem. I had never seen the problem on that machine before,
and it didn't have any of the new logging / software updates yet. The
system was in the same state as the other machine I reported about
before. I tried the SysRq stuff and other things. While I was about to
reboot the system, a login prompt appeared again on the VGA console. I
don't know whether any of the things I did triggered it or not. Anyway,
it means Linux is not really dead. I tried logging in, but I don't even
see characters appearing. The system feels very, very, very slow.

The other tests are still running.

Roderick

On Thu, Dec 8, 2011 at 5:33 PM, Roderick Colenbrander <thunderbird2k@gmail.com> wrote:
> On Thu, Dec 8, 2011 at 11:46 PM, Konrad Rzeszutek Wilk
> <konrad@darnok.org> wrote:
> [full quote of the two previous mails, including the interrupt dump, snipped]
> > One thing I did notice was the following: > (XEN) [2011-12-06 19:13:39] active vcpus: > (XEN) [2011-12-06 19:13:39] 1: [1201.2] pri=-2 flags=0 cpu=6 > credit=-2411 [w=256] > (XEN) [2011-12-06 19:13:39] 2: [1201.1] pri=-2 flags=0 cpu=5 > credit=-2460 [w=256] > (XEN) [2011-12-06 19:13:39] 3: [0.2] pri=0 flags=0 cpu=2 > credit=142 [w=256] > (XEN) [2011-12-06 19:13:39] 4: [0.1] pri=-2 flags=0 cpu=1 > credit=-2612 [w=256] > (XEN) [2011-12-06 19:13:39] CPU[00] sort=12511062, > sibling=00000000,00000000,00000000,00000003, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [32767.0] pri=0 flags=0 cpu=0 > (XEN) [2011-12-06 19:13:39] 1: [0.0] pri=-1 flags=0 cpu=0 > credit=333 [w=256] > (XEN) [2011-12-06 19:13:39] CPU[01] sort=12511063, > sibling=00000000,00000000,00000000,00000003, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [0.1] pri=-2 flags=0 cpu=1 > credit=-2914 [w=256] > (XEN) [2011-12-06 19:13:39] 1: [32767.1] pri=-64 flags=0 cpu=1 > (XEN) [2011-12-06 19:13:39] CPU[02] sort=12511063, > sibling=00000000,00000000,00000000,0000000c, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [32767.2] pri=-64 flags=0 cpu=2 > (XEN) [2011-12-06 19:13:39] CPU[03] sort=12511063, > sibling=00000000,00000000,00000000,0000000c, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [32767.3] pri=-64 flags=0 cpu=3 > (XEN) [2011-12-06 19:13:39] CPU[04] sort=12511063, > sibling=00000000,00000000,00000000,00000030, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [32767.4] pri=-64 flags=0 cpu=4 > (XEN) [2011-12-06 19:13:39] CPU[05] sort=12511063, > sibling=00000000,00000000,00000000,00000030, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [1201.1] pri=-2 flags=0 cpu=5 > credit=-3617 [w=256] > (XEN) [2011-12-06 19:13:39] 1: [32767.5] pri=-64 flags=0 cpu=5 > (XEN) [2011-12-06 19:13:39] CPU[06] sort=12511063, > sibling=00000000,00000000,00000000,000000c0, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [1201.2] pri=-2 flags=0 cpu=6 > credit=-3918 [w=256] > (XEN) [2011-12-06 19:13:39] 1: [32767.6] pri=-64 flags=0 cpu=6 > (XEN) [2011-12-06 19:13:39] CPU[07] sort=12511063, > sibling=00000000,00000000,00000000,000000c0, > core=00000000,00000000,00000000,000000ff > (XEN) [2011-12-06 19:13:39] run: [32767.7] pri=-64 flags=0 cpu=7 > > The system in question has a 4-core i7 CPU where Dom0 is pinned to > core 0-3 and DomU is pinned 4-7. Aren''t the negative credit numbers > quite big? > >> >> But also see if the problem disappears with the latest 4.1.x hypervisor. >> And it might also be worth upgrading the dom0 to a 3.0. Hmm, it would be >> very interesting to see where the dom0 _is_ stuck at. The hypervisor is >> running fine so it all points to dom0 crashing somewhere.. >> >> Can you make sure that dom0 runs with ''debug loglevel=8'' as well. That >> should give some wealth of information when/if a crash happens. >> Oh, and try to pass in Ctrl-Alt-SysRQ-t (on serial concentrators I think >> you just need to type ''break'' on the telnet line)and then t. But I am >> not entirely sure about it - might want to double check Google and see >> how to enable Alt-SysRQ. > > I tried to get SysRq working. It worked fine on some of my other > machines (sysrq= ''shift-control-o''), but not on this broken box. > Apparently the kernel is stuck in some place where SysRq doesn''t work. 
> > I''m going to retest with the relevant logging turned on. I also > upgraded to the latest 2.6.32 kernel (.48 I think). I will also > upgrade to the latest 4.1.x build from the mercurial repository. It > will take a few days before I get crash. > > Thanks so far! > Roderick
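A small aside on the SysRq angle: besides the keyboard combination, the handler can be enabled and a dump triggered through /proc, as long as the kernel can still run user space. The sketch below only uses the standard Linux interfaces (the equivalent of echo 1 > /proc/sys/kernel/sysrq followed by echo t > /proc/sysrq-trigger); on a box that is merely very slow rather than completely stuck this is sometimes easier than getting the serial break sequence right.

/*
 * Minimal sketch: enable the SysRq handler and ask for a task-state dump
 * ('t') via /proc. Needs root; the output lands in dmesg / on the serial
 * console.
 */
#include <stdio.h>

static int write_char(const char *path, char c)
{
    FILE *f = fopen(path, "w");

    if (!f) {
        perror(path);
        return -1;
    }
    fputc(c, f);
    fclose(f);
    return 0;
}

int main(void)
{
    /* Enable all SysRq functions for the keyboard/serial handler. */
    if (write_char("/proc/sys/kernel/sysrq", '1'))
        return 1;
    /* Trigger SysRq 't' directly, bypassing the keyboard path. */
    return write_char("/proc/sysrq-trigger", 't') ? 1 : 0;
}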
On Fri, Dec 09, 2011 at 02:02:31PM -0800, Roderick Colenbrander wrote:> One interesting observation. This morning I had another of my stress > machines with the problem. I never had it on this problem before and > it didn''t have any of the new logging / software updates yet. > > The system was in the same state as the other machine which I reported > about before. I tried SysRq stuff and other things. While I was about > to reboot the system, a login prompt appeared again on the VGA. I > don''t know whether any of the stuff I did triggered it or not. Anyway > it means Linux is not really dead. I tried logging, but I don''t even > see characters appearing. The system feels to be very, very, very > slow. >Hmm.. I wonder if this is the same issue I''m sometimes seeing on my laptop.. suddenly it starts becoming slower, and after a while it''s *very* slow.. not dead, but unusably slow.. I haven''t had time to (try to) debug it.. -- Pasi> The other tests are still running. > > Roderick > > On Thu, Dec 8, 2011 at 5:33 PM, Roderick Colenbrander > <thunderbird2k@gmail.com> wrote: > > On Thu, Dec 8, 2011 at 11:46 PM, Konrad Rzeszutek Wilk > > <konrad@darnok.org> wrote: > >> On Wed, Dec 07, 2011 at 08:44:19PM +0000, Roderick Colenbrander wrote: > >>> On Wed, Dec 7, 2011 at 8:38 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote: > >>> >> It took about a week, but the systems went down again. Linux is down, > >>> >> but the hypervisor is still reachable on the serial console. Is there > >>> >> anything interesting to dump from there? > >>> >> > >>> > Just the interrupts information. I think that is Ctrl-A, Ctrl-A, > >>> > Ctrl-A, and then ''*'' to capture everything (I don''t remember the right > >>> > one for just interrupts). > >>> > >>> Here''s the interrupt information: > >>> (XEN) [2011-12-06 19:13:37] [i: dump interrupt bindings] > >>> (XEN) [2011-12-06 19:13:37] Guest interrupt information: > >>> (XEN) [2011-12-06 19:13:37] IRQ: 0 > >>> affinity:00000000,00000000,00000000,00000001 vec:f0 type=IO-APIC-edge > >>> status=00000000 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 1 > >>> affinity:00000000,00000000,00000000,00000001 vec:30 type=IO-APIC-edge > >>> status=00000014 in-flight=0 domain-list=0: 1(-S--), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 2 > >>> affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:e2 type=XT-PIC > >>> status=00000000 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 3 > >>> affinity:00000000,00000000,00000000,00000001 vec:38 type=IO-APIC-edge > >>> status=00000006 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 4 > >>> affinity:00000000,00000000,00000000,00000001 vec:40 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 5 > >>> affinity:00000000,00000000,00000000,00000001 vec:f1 type=IO-APIC-edge > >>> status=00000000 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 6 > >>> affinity:00000000,00000000,00000000,00000001 vec:48 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 7 > >>> affinity:00000000,00000000,00000000,00000001 vec:50 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 8 > >>> affinity:00000000,00000000,00000000,00000001 vec:58 type=IO-APIC-edge > >>> status=00000010 in-flight=0 domain-list=0: 8(-S--), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 9 > >>> affinity:00000000,00000000,00000000,00000001 vec:60 type=IO-APIC-level > >>> status=00000010 in-flight=0 domain-list=0: 9(-S--), > >>> (XEN) [2011-12-06 
19:13:37] IRQ: 10 > >>> affinity:00000000,00000000,00000000,00000001 vec:68 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 11 > >>> affinity:00000000,00000000,00000000,00000001 vec:70 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 12 > >>> affinity:00000000,00000000,00000000,00000001 vec:78 type=IO-APIC-edge > >>> status=00000010 in-flight=0 domain-list=0: 12(-S--), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 13 > >>> affinity:00000000,00000000,00000000,00000001 vec:88 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 14 > >>> affinity:00000000,00000000,00000000,00000001 vec:90 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 15 > >>> affinity:00000000,00000000,00000000,00000001 vec:98 type=IO-APIC-edge > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:37] IRQ: 16 > >>> affinity:00000000,00000000,00000000,00000001 vec:c8 type=IO-APIC-level > >>> status=00000010 in-flight=0 domain-list=0: 16(-S--),1201: 16(--M-), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 18 > >>> affinity:00000000,00000000,00000000,00000001 vec:51 type=IO-APIC-level > >>> status=00000010 in-flight=0 domain-list=0: 18(-S--), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 19 > >>> affinity:00000000,00000000,00000000,00000001 vec:d0 type=IO-APIC-level > >>> status=00000010 in-flight=0 domain-list=0: 19(-S--), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 21 > >>> affinity:00000000,00000000,00000000,00000001 vec:61 type=IO-APIC-level > >>> status=00000010 in-flight=0 domain-list=0: 21(-S--), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 23 > >>> affinity:00000000,00000000,00000000,00000001 vec:59 type=IO-APIC-level > >>> status=00000010 in-flight=1 domain-list=0: 23(PS-M), > >>> (XEN) [2011-12-06 19:13:37] IRQ: 24 > >>> affinity:00000000,00000000,00000000,00000001 vec:28 type=DMA_MSI > >>> status=00000000 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 25 > >>> affinity:00000000,00000000,00000000,00000001 vec:a0 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 26 > >>> affinity:00000000,00000000,00000000,00000001 vec:a8 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 27 > >>> affinity:00000000,00000000,00000000,00000001 vec:b0 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 28 > >>> affinity:00000000,00000000,00000000,00000001 vec:b8 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 29 > >>> affinity:00000000,00000000,00000000,00000001 vec:c0 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 30 > >>> affinity:00000000,00000000,00000000,00000001 vec:d8 type=PCI-MSI > >>> status=00000014 in-flight=0 domain-list=0:274(PS--), > >>> (XEN) [2011-12-06 19:13:38] IRQ: 31 > >>> affinity:00000000,00000000,00000000,00000001 vec:21 type=PCI-MSI > >>> status=00000010 in-flight=0 domain-list=0:273(PS--), > >>> (XEN) [2011-12-06 19:13:38] IRQ: 32 > >>> affinity:00000000,00000000,00000000,00000001 vec:29 type=PCI-MSI > >>> status=00000010 in-flight=0 domain-list=0:272(PS--), > >>> (XEN) [2011-12-06 19:13:38] IRQ: 33 > >>> affinity:00000000,00000000,00000000,00000001 vec:31 type=PCI-MSI > >>> status=00000010 in-flight=0 domain-list=0:271(PS--), > >>> (XEN) [2011-12-06 19:13:38] IRQ: 34 > >>> affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:39 type=PCI-MSI > >>> 
status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 35 > >>> affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:41 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 36 > >>> affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:49 type=PCI-MSI > >>> status=00000002 mapped, unbound > >>> (XEN) [2011-12-06 19:13:38] IRQ: 37 > >>> affinity:00000000,00000000,00000000,00000010 vec:65 type=PCI-MSI > >>> status=00000010 in-flight=0 domain-list=1201: 55(--M-), > >>> (XEN) [2011-12-06 19:13:38] IO-APIC interrupt information: > >>> (XEN) [2011-12-06 19:13:38] IRQ 0 Vec240: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 2: vector=240, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 1 Vec 48: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 1: vector=48, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 3 Vec 56: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 3: vector=56, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 4 Vec 64: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 4: vector=64, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 5 Vec241: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 5: vector=241, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 6 Vec 72: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 6: vector=72, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 7 Vec 80: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 7: vector=80, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 8 Vec 88: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 8: vector=88, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 9 Vec 96: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 9: vector=96, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=level, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 10 Vec104: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 10: vector=104, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 11 Vec112: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 11: vector=112, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 12 Vec120: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 12: vector=120, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 13 Vec136: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 13: vector=136, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, 
mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 14 Vec144: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 14: vector=144, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 15 Vec152: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 15: vector=152, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=0, > >>> irr=0, trigger=edge, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 16 Vec200: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 16: vector=200, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, > >>> irr=0, trigger=level, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 18 Vec 81: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 18: vector=81, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, > >>> irr=0, trigger=level, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 19 Vec208: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 19: vector=208, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, > >>> irr=0, trigger=level, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 21 Vec 97: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 21: vector=97, > >>> delivery_mode=1, dest_mode=logical, delivery_status=0, polarity=1, > >>> irr=0, trigger=level, mask=0, dest_id:1 > >>> (XEN) [2011-12-06 19:13:38] IRQ 23 Vec 89: > >>> (XEN) [2011-12-06 19:13:38] Apic 0x00, Pin 23: vector=89, > >>> delivery_mode=1, dest_mode=logical, delivery_status=1, polarity=1, > >>> irr=1, trigger=level, mask=0, dest_id:1 > >>> > >>> I noticed some potential interesting things. The system in question is > >>> using PCI passthrough. On the serial console I remember seeing that > >>> the PCI device got unmapped from DomU and got mapped again in Dom0. > >>> The serial console log still had a lot of information about this DomU > >>> which I guess was going down. I guess it wasn''t fully down yet. > >> > >> One of the debug informations that gets printed with ''*'' is the guest > >> PCI passthrough data. Such as which IRQ, BAR, etc are mapped in. Did any > >> of those exist? > > > > There is not much PCI related information. Just interrupt stuff, no > > PCI BARs or other interesting PCI stuff: > > (XEN) [2011-12-06 19:13:20] 03:00.0 - dom 1201 - MSIs < 37 > > > (XEN) [2011-12-06 19:13:20] MSI 37 vec=65 lowest edge assert > > log lowest dest=00000010 mask=0/0/-1 > > > > > >> My original thought of what is going on is that the interrupts either > >> stopped completly working (does not look like - it looks like the > >> interrupts are firring and the event channels that are bound to it have > >> the bits set saying - hey I am pending). Oddly there are bunch of MSI > >> ones that are masked which usually means they are disabled. > >> > >> Then there is the > >> affinity:00000000,00000000,00000000,00000001 vec:c8 type=IO-APIC-level > >> status=00000010 in-flight=0 domain-list=0: 16(-S--),1201: 16(--M-), > >> affinity:00000000,00000000,00000000,00000010 vec:65 type=PCI-MSI > >> status=00000010 in-flight=0 domain-list=1201: 55(--M-), > >> > >> The guest has masked both interrupts - so it is off//dying, but somehow > >> the unmapping has not happend. In other words, what you just analyzed > >> and found out. > >> > >> Sadly, there is no smoking gun.. 
> >> > >> So a couple of things that I would do is to ensure that the Xen > >> hypervisor boots with ''sync_console console_to_ring'' and when this > >> crash happens see if I can get a stack trace from both the Xen > >> hypervisor (there are some debug parameters to get that - I think it is > >> ''r''?), and also from the Linux kernel. > > > > I set up some new tests, so that will take some days to get a crash. > > For what it is worth, I do have a ''*'' log left from this system and it > > has some stack trace. If you think it is useful, I could send it > > gzipped, but I don''t want to spam this list if it may not be useful. > > > > One thing I did notice was the following: > > (XEN) [2011-12-06 19:13:39] active vcpus: > > (XEN) [2011-12-06 19:13:39] 1: [1201.2] pri=-2 flags=0 cpu=6 > > credit=-2411 [w=256] > > (XEN) [2011-12-06 19:13:39] 2: [1201.1] pri=-2 flags=0 cpu=5 > > credit=-2460 [w=256] > > (XEN) [2011-12-06 19:13:39] 3: [0.2] pri=0 flags=0 cpu=2 > > credit=142 [w=256] > > (XEN) [2011-12-06 19:13:39] 4: [0.1] pri=-2 flags=0 cpu=1 > > credit=-2612 [w=256] > > (XEN) [2011-12-06 19:13:39] CPU[00] sort=12511062, > > sibling=00000000,00000000,00000000,00000003, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [32767.0] pri=0 flags=0 cpu=0 > > (XEN) [2011-12-06 19:13:39] 1: [0.0] pri=-1 flags=0 cpu=0 > > credit=333 [w=256] > > (XEN) [2011-12-06 19:13:39] CPU[01] sort=12511063, > > sibling=00000000,00000000,00000000,00000003, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [0.1] pri=-2 flags=0 cpu=1 > > credit=-2914 [w=256] > > (XEN) [2011-12-06 19:13:39] 1: [32767.1] pri=-64 flags=0 cpu=1 > > (XEN) [2011-12-06 19:13:39] CPU[02] sort=12511063, > > sibling=00000000,00000000,00000000,0000000c, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [32767.2] pri=-64 flags=0 cpu=2 > > (XEN) [2011-12-06 19:13:39] CPU[03] sort=12511063, > > sibling=00000000,00000000,00000000,0000000c, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [32767.3] pri=-64 flags=0 cpu=3 > > (XEN) [2011-12-06 19:13:39] CPU[04] sort=12511063, > > sibling=00000000,00000000,00000000,00000030, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [32767.4] pri=-64 flags=0 cpu=4 > > (XEN) [2011-12-06 19:13:39] CPU[05] sort=12511063, > > sibling=00000000,00000000,00000000,00000030, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [1201.1] pri=-2 flags=0 cpu=5 > > credit=-3617 [w=256] > > (XEN) [2011-12-06 19:13:39] 1: [32767.5] pri=-64 flags=0 cpu=5 > > (XEN) [2011-12-06 19:13:39] CPU[06] sort=12511063, > > sibling=00000000,00000000,00000000,000000c0, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [1201.2] pri=-2 flags=0 cpu=6 > > credit=-3918 [w=256] > > (XEN) [2011-12-06 19:13:39] 1: [32767.6] pri=-64 flags=0 cpu=6 > > (XEN) [2011-12-06 19:13:39] CPU[07] sort=12511063, > > sibling=00000000,00000000,00000000,000000c0, > > core=00000000,00000000,00000000,000000ff > > (XEN) [2011-12-06 19:13:39] run: [32767.7] pri=-64 flags=0 cpu=7 > > > > The system in question has a 4-core i7 CPU where Dom0 is pinned to > > core 0-3 and DomU is pinned 4-7. Aren''t the negative credit numbers > > quite big? > > > >> > >> But also see if the problem disappears with the latest 4.1.x hypervisor. > >> And it might also be worth upgrading the dom0 to a 3.0. 
Hmm, it would be > >> very interesting to see where the dom0 _is_ stuck at. The hypervisor is > >> running fine so it all points to dom0 crashing somewhere.. > >> > >> Can you make sure that dom0 runs with ''debug loglevel=8'' as well. That > >> should give some wealth of information when/if a crash happens. > >> Oh, and try to pass in Ctrl-Alt-SysRQ-t (on serial concentrators I think > >> you just need to type ''break'' on the telnet line)and then t. But I am > >> not entirely sure about it - might want to double check Google and see > >> how to enable Alt-SysRQ. > > > > I tried to get SysRq working. It worked fine on some of my other > > machines (sysrq= ''shift-control-o''), but not on this broken box. > > Apparently the kernel is stuck in some place where SysRq doesn''t work. > > > > I''m going to retest with the relevant logging turned on. I also > > upgraded to the latest 2.6.32 kernel (.48 I think). I will also > > upgrade to the latest 4.1.x build from the mercurial repository. It > > will take a few days before I get crash. > > > > Thanks so far! > > Roderick > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel
On Sun, Dec 11, 2011 at 4:52 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> On Fri, Dec 09, 2011 at 02:02:31PM -0800, Roderick Colenbrander wrote:
>> One interesting observation. This morning I had another of my stress
>> machines with the problem. I never had it on this problem before and
>> it didn''t have any of the new logging / software updates yet.
>>
>> The system was in the same state as the other machine which I reported
>> about before. I tried SysRq stuff and other things. While I was about
>> to reboot the system, a login prompt appeared again on the VGA. I
>> don''t know whether any of the stuff I did triggered it or not. Anyway
>> it means Linux is not really dead. I tried logging, but I don''t even
>> see characters appearing. The system feels to be very, very, very
>> slow.
>>
>
> Hmm.. I wonder if this is the same issue I''m sometimes seeing on my laptop..
> suddenly it starts becoming slower, and after a while it''s *very* slow..
> not dead, but unusably slow..
>
> I haven''t had time to (try to) debug it..
>
> -- Pasi
>

I''m seeing slowness issues on our systems as well: some code suddenly starts running really, really slowly, a local TCP ''heartbeat''-like mechanism from Dom0 to a DomU starts timing out, and code which should execute quickly becomes orders of magnitude slower for no obvious reason. Typically we see evidence of this in our logging.

I felt there was a connection between this slowness and the ''unresponsive Dom0'', but I haven''t been able to confirm it. Normally we see weird things in our logs, but on the unresponsive systems I haven''t seen anything strange in the logs yet. Most likely the logs weren''t synced to disk in time.

After more investigation, it seems that in my case all issues happen after the DomU is up and we somehow ''start'' using it. During startup of our own software, both DomU and Dom0 see a spike in CPU/disk activity before things settle down a bit. My feeling (based on some logs) is that this is when Dom0 sometimes becomes unresponsive.

I''m still running tests on a number of machines, but it takes ages to get results.

Roderick
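For what it is worth, the ''heartbeat'' is conceptually nothing more than a small TCP ping/pong with a timeout; a rough sketch is below (the address, port, payload and 5-second timeout are made-up illustration values, not our actual software):

/*
 * Rough illustration of a dom0 -> domU TCP heartbeat with a timeout.
 * 10.0.0.2:7777 and the 5 second limit are made up for the example.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    struct sockaddr_in addr;
    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
    char buf[4];
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    /* Bound the send/recv below to 5 seconds (connect has its own, longer timeout). */
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
    setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(7777);                    /* made-up port */
    inet_pton(AF_INET, "10.0.0.2", &addr.sin_addr); /* made-up domU address */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        send(fd, "ping", 4, 0) != 4 ||
        recv(fd, buf, sizeof(buf), 0) != 4) {
        fprintf(stderr, "heartbeat failed or timed out\n");
        close(fd);
        return 1;
    }
    close(fd);
    puts("heartbeat ok");
    return 0;
}

When the slowness hits, a check like this starts missing its deadline even though the domU is still technically alive, which matches what we see in the logs.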
On Sun, Dec 11, 2011 at 6:58 PM, Roderick Colenbrander <thunderbird2k@gmail.com> wrote:
> On Sun, Dec 11, 2011 at 4:52 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
>> On Fri, Dec 09, 2011 at 02:02:31PM -0800, Roderick Colenbrander wrote:
>>> One interesting observation. This morning I had another of my stress
>>> machines with the problem. I never had it on this problem before and
>>> it didn''t have any of the new logging / software updates yet.
>>>
>>> The system was in the same state as the other machine which I reported
>>> about before. I tried SysRq stuff and other things. While I was about
>>> to reboot the system, a login prompt appeared again on the VGA. I
>>> don''t know whether any of the stuff I did triggered it or not. Anyway
>>> it means Linux is not really dead. I tried logging, but I don''t even
>>> see characters appearing. The system feels to be very, very, very
>>> slow.
>>>
>>
>> Hmm.. I wonder if this is the same issue I''m sometimes seeing on my laptop..
>> suddenly it starts becoming slower, and after a while it''s *very* slow..
>> not dead, but unusably slow..
>>
>> I haven''t had time to (try to) debug it..
>>
>> -- Pasi
>>
>
> I''m seeing slowness issues on our systems as well: some code suddenly
> starts running really, really slowly, a local TCP ''heartbeat''-like
> mechanism from Dom0 to a DomU starts timing out, and code which should
> execute quickly becomes orders of magnitude slower for no obvious reason.
> Typically we see evidence of this in our logging.
>
> I felt there was a connection between this slowness and the
> ''unresponsive Dom0'', but I haven''t been able to confirm it. Normally
> we see weird things in our logs, but on the unresponsive systems I
> haven''t seen anything strange in the logs yet. Most likely the logs
> weren''t synced to disk in time.
>
> After more investigation, it seems that in my case all issues happen
> after the DomU is up and we somehow ''start'' using it. During startup
> of our own software, both DomU and Dom0 see a spike in CPU/disk
> activity before things settle down a bit. My feeling (based on some
> logs) is that this is when Dom0 sometimes becomes unresponsive.
>
> I''m still running tests on a number of machines, but it takes ages to
> get results.
>
> Roderick

Today, yet another of my machines went down. To explain the setup: I''m running tests across a couple of similar machines. On some of the machines I updated to the latest 2.6.32.x and Xen 4.1.x (and I upgrade every machine that goes down). The machine which just went down wasn''t using the latest versions yet, but some more logging was enabled in ''dmesg''. This box still had Xen 4.1.2 and 2.6.32.37.

I''m not sure yet if the logging is too useful, but it may be of some use. The log output is mostly from the serial console. I looked at some log files on disk. Some files give clues, but some important log files didn''t get synced to disk before bad things happened. The last timestamp I was able to find was at ''16:13:55'', when some of our own software wrote a line to its logs.
The few log files which got synced to disk tell that the VM started around 16:13:02 which matches PCIe device ownership to the VM: (XEN) [2011-12-10 16:13:03] [VT-D]iommu.c:1501: d0:PCIe: unmap bdf = 3:0.0 (XEN) [2011-12-10 16:13:03] [VT-D]iommu.c:1374: d1109:PCIe: map bdf = 3:0.0 (XEN) [2011-12-10 16:13:03] [VT-D]io.c:304: d1109: bind: m_gsi=55 g_gsi=32 device=4 intx=0 (XEN) [2011-12-10 16:13:23] [VT-D]io.c:328: d1109: unbind: m_gsi=55 g_gsi=32 device=4 intx=0 (XEN) [2011-12-10 16:13:23] [VT-D]io.c:386: d1109 unmap: m_irq=55 device=4 intx=0 (XEN) [2011-12-10 16:13:23] [VT-D]io.c:304: d1109: bind: m_gsi=16 g_gsi=32 device=4 intx=0 We are using the tapdisk driver for disk ''xvdc'' which is referred to as ''tdc'' I guess. It is a copy-on-write VHD disk which uses a raw disk image as its backing store. Apparently there are all sorts of read errors. Not sure if the read errors are too bad. [905276.510737] blkback: ring-ref 15658, event-channel 13, protocol 1 (unspecified, assuming native) [905276.512128] blkback: ring-ref 15658, event-channel 13, protocol 1 (unspecified, assuming native) [905277.113215] block tdc: sector-size: 512 capacity: 118785 [905277.215872] blkback: ring-ref 15775, event-channel 14, protocol 1 (unspecified, assuming native) [905680.196850] end_request: I/O error, dev tdc, sector 45689 [905680.197293] end_request: I/O error, dev tdc, sector 45689 [905680.197624] end_request: I/O error, dev tdc, sector 45688 [905680.197993] end_request: I/O error, dev tdc, sector 45696 More blktap issues hours later, right? The real problem must have happened much earlier then. [912393.552881] blkback: ring-ref 15810, event-channel 13, protocol 1 (unspecified, assuming native) [912394.101667] block tdc: sector-size: 512 capacity: 118785 [912394.384715] blkback: ring-ref 15660, event-channel 14, protocol 1 (unspecified, assuming native) [912402.433492] BUG: unable to handle kernel NULL pointer dereference at 00000000000002f0 [912402.434077] IP: [<ffffffff81249ac2>] blktap_device_end_request+0x4e/0x68 [912402.434384] PGD 384bb067 PUD 3a55f067 PMD 0 [912402.434752] Oops: 0000 [#1] SMP [912402.435096] last sysfs file: /sys/devices/virtual/block/tdc/removable [912402.435363] CPU 2 [912402.435653] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf [912402.436092] Pid: 16994, comm: tapdisk2 Not tainted 2.6.32.37 #1 X8STi [912402.436595] RIP: e030:[<ffffffff81249ac2>] [<ffffffff81249ac2>] blktap_device_end_request+0x4e/0x68 [912402.437117] RSP: e02b:ffff8800616f9d08 EFLAGS: 00010046 [912402.437380] RAX: 0000000000000000 RBX: ffff880014ef29b0 RCX: 0000000000000018 [912402.437884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88007fcebd58 [912402.438368] RBP: 0000000000000000 R08: ffffffff816a9488 R09: ffff8800123af040 [912402.438855] R10: 00000001365d23c7 R11: ffffffff810861e1 R12: ffff88007bea4e70 [912402.439358] R13: ffff88007b8aa800 R14: 0000000000000000 R15: ffff880014ef29b0 [912402.439848] FS: 00007fde3bbe2730(0000) GS:ffff880028070000(0000) knlGS:0000000000000000 [912402.440341] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [912402.440621] CR2: 00000000000002f0 CR3: 000000001f721000 CR4: 0000000000002660 [912402.441102] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [912402.441583] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [912402.442083] Process tapdisk2 (pid: 16994, threadinfo ffff8800616f8000, task ffff88007bab4ec0) [912402.442569] Stack: [912402.442825] ffffea000185d038 0000000000000001 ffff88007bab4ec0 0000000000000000 
[912402.443199] <0> ffff88007b8aa800 ffffffff81248c3b 00000000000001ff 0000000000000000 [912402.443871] <0> 00033e583f482159 000000000000e030 ffff88006f6da480 0000039f000003a0 [912402.444793] Call Trace: [912402.445054] [<ffffffff81248c3b>] ? blktap_ring_ioctl+0x137/0x228 [912402.445326] [<ffffffff810bea63>] ? core_sys_select+0x1ee/0x21e [912402.445599] [<ffffffff810dc64f>] ? really_put_req+0x62/0x8f [912402.445871] [<ffffffff8100e865>] ? xen_force_evtchn_callback+0x9/0xa [912402.446159] [<ffffffff8100eef2>] ? check_events+0x12/0x20 [912402.446434] [<ffffffff810dc4e4>] ? aio_read_evt+0x26/0xe4 [912402.446704] [<ffffffff810dd71a>] ? sys_io_getevents+0x10e/0x356 [912402.446984] [<ffffffff810bcc47>] ? vfs_ioctl+0x56/0x6c [912402.447255] [<ffffffff810bd133>] ? do_vfs_ioctl+0x460/0x49e [912402.447543] [<ffffffff81059699>] ? getnstimeofday+0x55/0xaf [912402.447814] [<ffffffff810bd1c2>] ? sys_ioctl+0x51/0x70 [912402.448083] [<ffffffff81011a02>] ? system_call_fastpath+0x16/0x1b [912402.448351] Code: e8 a5 f5 ff ff 49 8b 44 24 40 48 8b b8 f0 02 00 00 e8 a5 37 27 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 85 13 f8 ff 49 8b 44 24 40 <48> 8b 80 f0 02 00 00 fe 00 ff 14 25 28 0d 6a 81 5b 5b 41 5c 41 [912402.451398] RIP [<ffffffff81249ac2>] blktap_device_end_request+0x4e/0x68 [912402.451733] RSP <ffff8800616f9d08> [912402.451997] CR2: 00000000000002f0 [912402.452265] ---[ end trace f20c5f7e976fe78b ]--- Half a minute later I get tons of these: [912432.458625] ata6.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen [912432.458893] ata6.00: failed command: READ FPDMA QUEUED [912432.459172] ata6.00: cmd 60/08:00:f2:13:e3/00:00:0d:00:00/40 tag 0 ncq 4096 in And tons of these: [912478.690536] sd 5:0:0:0: [sda] Unhandled error code [912478.690798] sd 5:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [912478.691306] sd 5:0:0:0: [sda] CDB: Read(10): 28 00 00 e6 95 05 00 00 68 00 I wonder if we are having real SATA issues or whether this is somehow caused by the ''real'' problem which is tapdisk? Finally I also get timeouts in the network code. I think these are just caused by the bad things happening. [912737.470673] WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0xd2/0x16e() [912737.471170] Hardware name: X8STi [912737.471426] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out [912737.471686] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf [912737.472106] Pid: 0, comm: swapper Tainted: G D 2.6.32.37 #1 [912737.472586] Call Trace: [912737.472838] <IRQ> [<ffffffff813c8e12>] ? dev_watchdog+0xd2/0x16e [912737.473130] [<ffffffff813c8e12>] ? dev_watchdog+0xd2/0x16e [912737.473391] [<ffffffff81040fb1>] ? warn_slowpath_common+0x77/0xa3 [912737.473668] [<ffffffff813c8d40>] ? dev_watchdog+0x0/0x16e [912737.473928] [<ffffffff81041039 > ] ? warn_slowpk _dev_program_evt+ 0xe3/0x18c [90/0x60 [912737.474988] [<ffffffff8100e865>] ? xen_force_evtchn_callback+0x9/0xa [912737.480706] [<ffffffff8100eef2>] ? check_events+0x12/0x20 [912737.480965] [<ffffffff813c8d15>] ? netif_tx_lock+0x3d/0x68 [912737.481222] [<ffffffff813c8d40>] ? dev_watchdog+0x0/0x16e [912737.481481] [<ffffffff813b83a0>] ? netdev_drivername+0x3b/0x40 [912737.481741] [<ffffffff813c8e12>] ? dev_watchdog+0xd2/0x16e [912737.482015] [<ffffffff81049614>] ? run_timer_softirq+0x16c/0x1d9 [912737.482274] [<ffffffff81078fb8>] ? handle_percpu_irq+0x4e/0x6a [912737.482534] [<ffffffff81045bc8>] ? __do_softirq+0x91/0x116 [912737.482794] [<ffffffff81012b6c>] ? call_softirq+0x1c/0x30 [912737.483052] [<ffffffff810140e7>] ? 
do_softirq+0x3f/0x7c [912737.483324] [<ffffffff81045a64>] ? irq_exit+0x36/0x76 [912737.483584] [<ffffffff8123b6ad>] ? xen_evtchn_do_upcall+0x33/0x42 [912737.483843] [<ffffffff81012bbe>] ? xen_do_hypervisor_callback+0x1e/0x30 [912737.484101] <EOI> [<ffffffff8100eedf>] ? xen_restore_fl_direct_end+0x0/0x1 [912737.484394] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 [912737.484670] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 [912737.484930] [<ffffffff8100e8e3>] ? xen_safe_halt+0xc/0x15 [912737.485188] [<ffffffff81017f07>] ? default_idle+0x21/0x3d [912737.485447] [<ffffffff81010df1>] ? cpu_idle+0x59/0x91 [912737.485704] ---[ end trace f20c5f7e976fe78c ]--- Does this feel like blktap issues? Or is it more likely that something else happened which causes blktap and the other things to fail? Roderick
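A note that may help when reading the oops: the faulting address 00000000000002f0 is tiny, which usually means a field was read through a NULL struct pointer, so the fault address is simply the offset of that field (the faulting instruction in the 'Code:' line is a load at displacement 0x2f0 from a register that is 0 in the register dump). A tiny illustration with a made-up struct, not the real blktap types:

/*
 * Why a NULL struct pointer faults at a small address like 0x2f0:
 * the address that gets touched is just base (NULL) + field offset.
 * The struct below is invented purely to place a field at offset 0x2f0.
 */
#include <stdio.h>
#include <stddef.h>

struct fake_device {
    char pad[0x2f0];   /* padding so the next field lands at offset 0x2f0 */
    int  pending;
};

int main(void)
{
    struct fake_device *dev = NULL;

    /* Only computing the address here, not dereferencing it. */
    printf("offset of pending: %#zx\n", offsetof(struct fake_device, pending));
    printf("&dev->pending    : %p  <- the would-be fault address\n",
           (void *)&dev->pending);
    return 0;
}

So whatever structure blktap_device_end_request() dereferences at that offset was already NULL at that point, which would be consistent with the device having been torn down while a request was still outstanding, though that is just a guess.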
> > Does this feel like blktap issues? Or is it more likely that something else > happened which causes blktap and the other things to fail? >How easy is it to use another backend type without changing too much else? James
On Mon, Dec 12, 2011 at 5:50 PM, James Harper <james.harper@bendigoit.com.au> wrote:
>> Does this feel like blktap issues? Or is it more likely that something else
>> happened which causes blktap and the other things to fail?
>>
> How easy is it to use another backend type without changing too much else?
>
> James

I may be able to change the tapdisk-based disk to a different format (from vhd to qcow) to force qdisk. Looking at that now.

Roderick
> Today, yet another of my machines went down. To explain the setup, I''m > running tests across a couple of similar machines. On some of the > machines I updated to latest 2.6.32.x and Xen 4.1.x (and I upgrade > every machine which goes down). The machine which just went down > wasn''t using the latest versions yet, but some more logging was > enabled in ''dmesg''. > > This box still had Xen 4.1.2 and 2.6.32.37. I''m not sure if the > logging is too useful, yet but it may be of some use. The log output > is mostly from the serial console. I looked at some log files on > disks. Some files give clues, but some important log files didn''t get > synced to disk before bad things happened. The last timestamp I was > able to find was at ''16:13:55'' which some of our own software wrote > some line to its logs. > > The few log files which got synced to disk tell that the VM started > around 16:13:02 which matches PCIe device ownership to the VM: > > (XEN) [2011-12-10 16:13:03] [VT-D]iommu.c:1501: d0:PCIe: unmap bdf = 3:0.0 > (XEN) [2011-12-10 16:13:03] [VT-D]iommu.c:1374: d1109:PCIe: map bdf = 3:0.0 > (XEN) [2011-12-10 16:13:03] [VT-D]io.c:304: d1109: bind: m_gsi=55 > g_gsi=32 device=4 intx=0 > (XEN) [2011-12-10 16:13:23] [VT-D]io.c:328: d1109: unbind: m_gsi=55 > g_gsi=32 device=4 intx=0 > (XEN) [2011-12-10 16:13:23] [VT-D]io.c:386: d1109 unmap: m_irq=55 > device=4 intx=0 > (XEN) [2011-12-10 16:13:23] [VT-D]io.c:304: d1109: bind: m_gsi=16 > g_gsi=32 device=4 intx=0 > > > We are using the tapdisk driver for disk ''xvdc'' which is referred to > as ''tdc'' I guess. It is a copy-on-writetdc would be what the disk is in dom 0.> VHD disk which uses a raw disk image as its backing store. Apparently > there are all sorts of read errors. > Not sure if the read errors are too bad. > > [905276.510737] blkback: ring-ref 15658, event-channel 13, protocol 1 > (unspecified, assuming native) > [905276.512128] blkback: ring-ref 15658, event-channel 13, protocol 1 > (unspecified, assuming native) > [905277.113215] block tdc: sector-size: 512 capacity: 118785 > [905277.215872] blkback: ring-ref 15775, event-channel 14, protocol 1 > (unspecified, assuming native) > [905680.196850] end_request: I/O error, dev tdc, sector 45689 > [905680.197293] end_request: I/O error, dev tdc, sector 45689 > [905680.197624] end_request: I/O error, dev tdc, sector 45688 > [905680.197993] end_request: I/O error, dev tdc, sector 45696 > > More blktap issues hours later, right? The real problem must have > happened much earlier then.Any chance you can eliminate blktap from the picture? 
Can you use blkback?> > [912393.552881] blkback: ring-ref 15810, event-channel 13, protocol 1 > (unspecified, assuming native) > [912394.101667] block tdc: sector-size: 512 capacity: 118785 > [912394.384715] blkback: ring-ref 15660, event-channel 14, protocol 1 > (unspecified, assuming native) > [912402.433492] BUG: unable to handle kernel NULL pointer dereference > at 00000000000002f0 > [912402.434077] IP: [<ffffffff81249ac2>] blktap_device_end_request+0x4e/0x68 > [912402.434384] PGD 384bb067 PUD 3a55f067 PMD 0 > [912402.434752] Oops: 0000 [#1] SMP > [912402.435096] last sysfs file: /sys/devices/virtual/block/tdc/removable > [912402.435363] CPU 2 > [912402.435653] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf > [912402.436092] Pid: 16994, comm: tapdisk2 Not tainted 2.6.32.37 #1 X8STi > [912402.436595] RIP: e030:[<ffffffff81249ac2>] [<ffffffff81249ac2>] > blktap_device_end_request+0x4e/0x68 > [912402.437117] RSP: e02b:ffff8800616f9d08 EFLAGS: 00010046 > [912402.437380] RAX: 0000000000000000 RBX: ffff880014ef29b0 RCX: > 0000000000000018 > [912402.437884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: > ffff88007fcebd58 > [912402.438368] RBP: 0000000000000000 R08: ffffffff816a9488 R09: > ffff8800123af040 > [912402.438855] R10: 00000001365d23c7 R11: ffffffff810861e1 R12: > ffff88007bea4e70 > [912402.439358] R13: ffff88007b8aa800 R14: 0000000000000000 R15: > ffff880014ef29b0 > [912402.439848] FS: 00007fde3bbe2730(0000) GS:ffff880028070000(0000) > knlGS:0000000000000000 > [912402.440341] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > [912402.440621] CR2: 00000000000002f0 CR3: 000000001f721000 CR4: > 0000000000002660 > [912402.441102] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [912402.441583] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [912402.442083] Process tapdisk2 (pid: 16994, threadinfo > ffff8800616f8000, task ffff88007bab4ec0) > [912402.442569] Stack: > [912402.442825] ffffea000185d038 0000000000000001 ffff88007bab4ec0 > 0000000000000000 > [912402.443199] <0> ffff88007b8aa800 ffffffff81248c3b 00000000000001ff > 0000000000000000 > [912402.443871] <0> 00033e583f482159 000000000000e030 ffff88006f6da480 > 0000039f000003a0 > [912402.444793] Call Trace: > [912402.445054] [<ffffffff81248c3b>] ? blktap_ring_ioctl+0x137/0x228 > [912402.445326] [<ffffffff810bea63>] ? core_sys_select+0x1ee/0x21e > [912402.445599] [<ffffffff810dc64f>] ? really_put_req+0x62/0x8f > [912402.445871] [<ffffffff8100e865>] ? xen_force_evtchn_callback+0x9/0xa > [912402.446159] [<ffffffff8100eef2>] ? check_events+0x12/0x20 > [912402.446434] [<ffffffff810dc4e4>] ? aio_read_evt+0x26/0xe4 > [912402.446704] [<ffffffff810dd71a>] ? sys_io_getevents+0x10e/0x356 > [912402.446984] [<ffffffff810bcc47>] ? vfs_ioctl+0x56/0x6c > [912402.447255] [<ffffffff810bd133>] ? do_vfs_ioctl+0x460/0x49e > [912402.447543] [<ffffffff81059699>] ? getnstimeofday+0x55/0xaf > [912402.447814] [<ffffffff810bd1c2>] ? sys_ioctl+0x51/0x70 > [912402.448083] [<ffffffff81011a02>] ? 
system_call_fastpath+0x16/0x1b > [912402.448351] Code: e8 a5 f5 ff ff 49 8b 44 24 40 48 8b b8 f0 02 00 > 00 e8 a5 37 27 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 85 13 f8 ff 49 > 8b 44 24 40 <48> 8b 80 f0 02 00 00 fe 00 ff 14 25 28 0d 6a 81 5b 5b 41 > 5c 41 > [912402.451398] RIP [<ffffffff81249ac2>] blktap_device_end_request+0x4e/0x68 > [912402.451733] RSP <ffff8800616f9d08> > [912402.451997] CR2: 00000000000002f0 > [912402.452265] ---[ end trace f20c5f7e976fe78b ]--- > > Half a minute later I get tons of these: > [912432.458625] ata6.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen.. frozen..> [912432.458893] ata6.00: failed command: READ FPDMA QUEUED > [912432.459172] ata6.00: cmd 60/08:00:f2:13:e3/00:00:0d:00:00/40 tag 0 > ncq 4096 in > > And tons of these: > [912478.690536] sd 5:0:0:0: [sda] Unhandled error code > [912478.690798] sd 5:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET > driverbyte=DRIVER_OK > [912478.691306] sd 5:0:0:0: [sda] CDB: Read(10): 28 00 00 e6 95 05 00 00 68 00 > > I wonder if we are having real SATA issues or whether this is somehowThat really looks to cause blktap to die when the command failed.> caused by the ''real'' problem which is tapdisk? > > Finally I also get timeouts in the network code. I think these are > just caused by the bad things happening. > [912737.470673] WARNING: at net/sched/sch_generic.c:261 > dev_watchdog+0xd2/0x16e() > [912737.471170] Hardware name: X8STi > [912737.471426] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed outand then this time out..> [912737.471686] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf > [912737.472106] Pid: 0, comm: swapper Tainted: G D 2.6.32.37 #1 > [912737.472586] Call Trace: > [912737.472838] <IRQ> [<ffffffff813c8e12>] ? dev_watchdog+0xd2/0x16e > [912737.473130] [<ffffffff813c8e12>] ? dev_watchdog+0xd2/0x16e > [912737.473391] [<ffffffff81040fb1>] ? warn_slowpath_common+0x77/0xa3 > [912737.473668] [<ffffffff813c8d40>] ? dev_watchdog+0x0/0x16e > [912737.473928] [<ffffffff81041039 > ] ? warn_slowpk > _dev_program_evt+ 0xe3/0x18c > [90/0x60 > [912737.474988] [<ffffffff8100e865>] ? xen_force_evtchn_callback+0x9/0xa > [912737.480706] [<ffffffff8100eef2>] ? check_events+0x12/0x20 > [912737.480965] [<ffffffff813c8d15>] ? netif_tx_lock+0x3d/0x68 > [912737.481222] [<ffffffff813c8d40>] ? dev_watchdog+0x0/0x16e > [912737.481481] [<ffffffff813b83a0>] ? netdev_drivername+0x3b/0x40 > [912737.481741] [<ffffffff813c8e12>] ? dev_watchdog+0xd2/0x16e > [912737.482015] [<ffffffff81049614>] ? run_timer_softirq+0x16c/0x1d9 > [912737.482274] [<ffffffff81078fb8>] ? handle_percpu_irq+0x4e/0x6a > [912737.482534] [<ffffffff81045bc8>] ? __do_softirq+0x91/0x116 > [912737.482794] [<ffffffff81012b6c>] ? call_softirq+0x1c/0x30 > [912737.483052] [<ffffffff810140e7>] ? do_softirq+0x3f/0x7c > [912737.483324] [<ffffffff81045a64>] ? irq_exit+0x36/0x76 > [912737.483584] [<ffffffff8123b6ad>] ? xen_evtchn_do_upcall+0x33/0x42 > [912737.483843] [<ffffffff81012bbe>] ? xen_do_hypervisor_callback+0x1e/0x30 > [912737.484101] <EOI> [<ffffffff8100eedf>] ? xen_restore_fl_direct_end+0x0/0x1 > [912737.484394] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 > [912737.484670] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 > [912737.484930] [<ffffffff8100e8e3>] ? xen_safe_halt+0xc/0x15 > [912737.485188] [<ffffffff81017f07>] ? default_idle+0x21/0x3d > [912737.485447] [<ffffffff81010df1>] ? cpu_idle+0x59/0x91 > [912737.485704] ---[ end trace f20c5f7e976fe78c ]--- > > > Does this feel like blktap issues? 
Or is it more likely that something
> else happened which causes blktap and the other things to fail?

It looks like the interrupts stopped. Perhaps you can run the C code (attached) on the serial console and see _what_ (or perhaps _when_) the interrupts stop (and also verify that the timeouts and ''frozen'' errors happen because no interrupts are being received).

There are a couple of bugs that were discovered in the interrupt code (that had been there since 2.6.27!) that went to the stable tree - so they should have been backported to the 2.6.32.x tree - but I honestly don''t recall the names. I can look them up tomorrow.
> > Does this feel like blktap issues? Or is it more likely that something
> > else happened which causes blktap and the other things to fail?
>
> It looks like the interrupts stopped. Perhaps you can run the C code
> (attached) on the serial console and see _what_ (or perhaps _when_)
> the interrupts stop (and also verify that the timeouts and ''frozen''
> errors happen because no interrupts are being received).

The C code is http://darnok.org/xen/read_interrupts.c

> There are a couple of bugs that were discovered in the interrupt code
> (that had been there since 2.6.27!) that went to the stable tree - so
> they should have been backported to the 2.6.32.x tree - but I honestly
> don''t recall the names. I can look them up tomorrow.

xen: x86_32: do not enable iterrupts when returning from exception in
interrupt context (git commit d198d499148a0c64a41b3aba9e7dd43772832b91)
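The idea behind a watcher like that - sketched below only to show the principle, this is not the actual read_interrupts.c - is to sample /proc/interrupts periodically and complain when a per-IRQ counter stops moving for a long time or appears to go backwards:

/*
 * Minimal sketch of a /proc/interrupts watcher (NOT the real read_interrupts.c):
 * once a second, sum the per-CPU counts for every numbered IRQ and warn when a
 * counter goes backwards or has not moved for a while.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_IRQS 4096

int main(void)
{
    static long long prev[MAX_IRQS], idle[MAX_IRQS];
    char line[1024];

    for (;;) {
        FILE *f = fopen("/proc/interrupts", "r");

        if (!f) {
            perror("/proc/interrupts");
            return 1;
        }
        while (fgets(line, sizeof(line), f)) {
            char *colon = strchr(line, ':');
            char *p;
            long long cur = 0, percpu;
            int irq;

            /* Only numbered rows; skip the CPU header and NMI/LOC/... lines. */
            if (!colon || sscanf(line, " %d :", &irq) != 1 || irq < 0 || irq >= MAX_IRQS)
                continue;

            /* Sum the per-CPU columns; the first non-numeric token ends the counts. */
            for (p = colon + 1; sscanf(p, " %lld", &percpu) == 1; ) {
                cur += percpu;
                while (*p == ' ')
                    p++;
                while (*p && *p != ' ')
                    p++;
            }

            if (cur < prev[irq]) {
                printf("IRQ %d went backwards (%lld -> %lld)!\n", irq, prev[irq], cur);
            } else if (cur == prev[irq]) {
                if (++idle[irq] % 60 == 0)
                    printf("IRQ %d has not fired for ~%lld seconds\n", irq, idle[irq]);
            } else {
                idle[irq] = 0;
            }
            prev[irq] = cur;
        }
        fclose(f);
        sleep(1);
    }
}

If, say, the ahci and eth0 rows simply stop counting right before the 'frozen' and watchdog messages appear, that would back up the theory that the interrupts stopped rather than the disk or NIC themselves failing.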
Thanks. On Tuesday I set up your monitoring tool on the already-running tests. Some of the systems were on the verge of showing results, but we just had a power outage and I have to restart the tests. It will probably take until the Holidays before there are any results :(

Roderick

On Tue, Dec 13, 2011 at 9:20 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
>> > Does this feel like blktap issues? Or is it more likely that something
>> > else happened which causes blktap and the other things to fail?
>>
>> It looks like the interrupts stopped. Perhaps you can run the C code
>> (attached) on the serial console and see _what_ (or perhaps _when_)
>> the interrupts stop (and also verify that the timeouts and ''frozen''
>> errors happen because no interrupts are being received).
>
> The C code is http://darnok.org/xen/read_interrupts.c
>
>> There are a couple of bugs that were discovered in the interrupt code
>> (that had been there since 2.6.27!) that went to the stable tree - so
>> they should have been backported to the 2.6.32.x tree - but I honestly
>> don''t recall the names. I can look them up tomorrow.
>
> xen: x86_32: do not enable iterrupts when returning from exception in
> interrupt context (git commit d198d499148a0c64a41b3aba9e7dd43772832b91)
I have finally got some results (the tests are still running). Let me first summarize what tests were running. In total I had 4 machines. All systems had at least Xen 4.1.2 + Linux 2.6.32.48 (and one had a Xen 4.1.3 snapshot). I divided the systems into 2 groups and each group ran a different test. There are 2 differences between the groups:

1) One was the use of blktap vs not using blktap. The main reason for keeping the blktap systems in, even though it adds noise, is that it kept some of the tests close to what they previously were.

2) The non-blktap systems used a different CPU pinning configuration. Previously Dom0 was using cores 0-3 and DomU 4-7, but due to hyperthreading both VMs ended up using the same physical cores (0 and 4 are on the same physical core).

Now to the initial results:

- So far each machine has been up for 10 days.

- One machine failed early on due to a PCI device assignment issue. I suspect that at some point, due to a race condition, ownership of the PCI device wasn''t transferred correctly back from DomU to Dom0 on VM shutdown. Both xm and xl fail with ''Error: failed to assign device 03:00.0: maybe it has already been assigned to other domain, or maybe it doesn''t exist.'' From a quick glance at the logs it looks like, on shutdown of a previous VM, the ''unmap bdf'' never happened. Not sure why, but my guess is a toolchain bug.

- One of the non-blktap machines had a kernel Oops. It happened on VM shutdown and I wouldn''t be surprised if it was similar to the crashes we wanted to debug. The system is still in a usable state, though that may be thanks to the use of Linux 2.6.32.48 or the different CPU pinning configuration.

Some of the serial output:

Thu Dec 22 23:17:38 2011 - 1232 0: 87 113 365 blkif-backend 525325sec
Thu Dec 22 23:30:07 2011 - 1259 0: 12 6 222 xenbus 526065sec
Thu Dec 22 23:30:07 2011 - 1250 0: 62 42 1461 ahci 526065sec
Thu Dec 22 23:30:07 2011 - 1249 0: 37 28 75 eth0-rx-0 526065sec
Thu Dec 22 23:30:07 2011 - 1248 0: 71 305 933 eth0-tx-0 526065sec
Thu Dec 22 23:30:07 2011 - 1243 0: 6 3 134 evtchn:xenstored 526065sec
Thu Dec 22 23:30:07 2011 - 1241 0: 6 3 87582 evtchn:xenstored 526065sec
Thu Dec 22 23:30:07 2011 - 1240 0: 256 261 54162 evtchn:qemu-dm 526065sec
Thu Dec 22 23:30:07 2011 - 1239 0: 244 251 7219 evtchn:qemu-dm 526065sec
Thu Dec 22 23:30:07 2011 - 1238 0: 289 273 5931 evtchn:qemu-dm 526065sec
Thu Dec 22 23:30:07 2011 - 1237 0: 248 245 4279 evtchn:qemu-dm 526065sec
Thu Dec 22 23:30:07 2011 - 1236 0: 12 7 315 blkif-backend 526065sec
Thu Dec 22 23:30:07 2011 - 1234 0: 7 4 43 veth1 526065sec
Thu Dec 22 23:30:07 2011 - 1232 CPU0 going backwards (-3304)!
Thu Dec 22 23:30:07 2011 - 19 CPU0 going backwards (-212)!
Thu Dec 22 23:30:07 2011 - 18 0: 35 17 35 ehci_hcd:usb1, uhci_hcd:usb8 526065sec
Thu Dec 22 23:30:07 2011 - 16 CPU0 going backwards (-176)!
Thu Dec 22 23:30:07 2011 - 12 CPU0 going backwards (-4)!
Thu Dec 22 23:30:07 2011 - 4 CPU0 going backwards (-1)!
Thu Dec 22 23:30:12 2011 - 1250 0: 384 213 1461 ahci 526070sec Thu Dec 22 23:30:12 2011 - 1249 0: 14 21 75 eth0-rx-0 526070sec Thu Dec 22 23:30:12 2011 - 1240 0: 260 260 54162 evtchn:qemu-dm 526070sec Thu Dec 22 23:30:12 2011 - 1239 0: 279 265 7219 evtchn:qemu-dm 526070sec Thu Dec 22 23:30:13 2011 - 1238 0: 271 272 5931 evtchn:qemu-dm 526070sec Thu Dec 22 23:30:13 2011 - 1237 0: 261 253 4279 evtchn:qemu-dm 526070sec Thu Dec 22 23:30:13 2011 - 1236 0: 145 76 315 blkif-backend 526070sec Thu Dec 22 23:30:18 2011 - 1250 0: 372 293 1461 ahci 526075sec Thu Dec 22 23:30:18 2011 - 1249 0: 26 24 75 eth0-rx-0 526075sec Thu Dec 22 23:30:18 2011 - 1248 0: 16 86 933 eth0-tx-0 526075sec Thu Dec 22 23:30:18 2011 - 1240 0: 234 247 54162 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1239 0: 234 249 7219 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1238 0: 241 256 5931 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1237 0: 224 239 4279 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1236 0: 100 88 315 blkif-backend 526075sec Thu Dec 22 23:30:23 2011 - 1249 0: 16 20 75 eth0-rx-0 526080sec Thu Dec 22 23:30:23 2011 - 1248 0: 7 46 933 eth0-tx-0 526080sec Thu Dec 22 23:30:23 2011 - 1240 0: 8 128 54162 evtchn:qemu-dm 526080sec Thu Dec 22 23:30:23 2011 - 1239 0: 16 133 7219 evtchn:qemu-dm 526080sec Thu Dec 22 23:30:23 2011 - 1238 0: 28 142 5931 evtchn:qemu-dm 526080sec Thu Dec 22 23:30:23 2011 - 1233 CPU0 going backwards (-12648)! Thu Dec 22 23:30:23 2011 - 19 CPU0 going backwards (-212)! Thu Dec 22 23:30:23 2011 - 18 0: 35 17 42 ehci_hcd:usb1, uhci_hcd:usb8 526080sec Thu Dec 22 23:30:23 2011 - 16 CPU0 going backwards (-176)! Thu Dec 22 23:30:23 2011 - 12 CPU0 going backwards (-4)! Thu Dec 22 23:30:23 2011 - 4 CPU0 going backwards (-1)! [533804.785589] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [533804.793913] IP: [<ffffffff814bdba9>] _spin_lock+0x5/0x15 [533804.799556] PGD 0 [533804.801896] Oops: 0002 [#1] SMP [533804.805448] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/class [533804.813736] CPU 0 [533804.816066] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf [533804.822914] Pid: 21632, comm: qemu-dm Not tainted 2.6.32.48 #1 X8STi [533804.829595] RIP: e030:[<ffffffff814bdba9>] [<ffffffff814bdba9>] _spin_lock+0x5/0x15 [533804.837873] RSP: e02b:ffff88005f187c50 EFLAGS: 00010202 [533804.843513] RAX: 0000000000000100 RBX: ffff88007d6923c0 RCX: 0000000000000003 [533804.851192] RDX: ffff88007deb3180 RSI: ffff88007d6923c0 RDI: 0000000000000008 [533804.858871] RBP: ffff88007e260128 R08: 0000000000000000 R09: 0000000000000000 [533804.866552] R10: ffff88007121eb40 R11: ffffffff811b862b R12: ffff88007fac1e40 [533804.874541] R13: ffff88007e3c3340 R14: ffff88007e3c3340 R15: ffff88007fdfbc00 [533804.882243] FS: 00007f5cdcefe6f0(0000) GS:ffff880028038000(0000) knlGS:0000000000000000 [533804.890874] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [533804.896948] CR2: 0000000000000008 CR3: 0000000001001000 CR4: 0000000000002660 [533804.904627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [533804.912306] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [533804.919985] Process qemu-dm (pid: 21632, threadinfo ffff88005f186000, task ffff880074f4e270) [533804.928971] Stack: [533804.931312] ffffffff810a9ad0 ffff88007deb3180 ffff88007e260100 ffff88007e260100 [533804.938762] <0> ffffffff8124222d 0000000000000008 0000000000000008 ffff88007deb3180 [533804.946900] <0> ffffffff810b245e ffff88007fac1e40 
ffff88007deb3180 0000000000000000 [533804.955465] Call Trace: [533804.958244] [<ffffffff810a9ad0>] ? mmu_notifier_unregister+0x27/0xa5 [533804.965019] [<ffffffff8124222d>] ? gntdev_release+0xc3/0xd1 [533804.971007] [<ffffffff810b245e>] ? __fput+0x100/0x1af [533804.976477] [<ffffffff810af998>] ? filp_close+0x5b/0x62 [533804.982119] [<ffffffff81042989>] ? put_files_struct+0x64/0xc1 [533804.988280] [<ffffffff810441f0>] ? do_exit+0x209/0x68d [533804.993836] [<ffffffff810441f8>] ? do_exit+0x211/0x68d [533804.999390] [<ffffffff810446e9>] ? do_group_exit+0x75/0x9c [533805.005294] [<ffffffff8104cf1d>] ? get_signal_to_deliver+0x30d/0x338 [533805.012063] [<ffffffff81010f7a>] ? do_notify_resume+0x8a/0x6b1 [533805.018310] [<ffffffff810bdb3a>] ? poll_select_copy_remaining+0xd0/0xf3 [533805.025339] [<ffffffff81011c8e>] ? int_signal+0x12/0x17 [533805.030980] Code: 00 00 00 01 74 05 e8 67 1c d2 ff 48 89 d0 5e c3 ff 14 25 20 2d 6a 81 f0 81 2f 00 00 00 01 74 05 e8 4d 1c d2 ff c3 b8 00 01 00 00 <f0> 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c3 f0 81 2f 00 00 [533805.050454] RIP [<ffffffff814bdba9>] _spin_lock+0x5/0x15 [533805.056182] RSP <ffff88005f187c50> [533805.059997] CR2: 0000000000000008 [533805.063638] ---[ end trace 85ee7cbec9ce72ac ]--- [533805.068584] Fixing recursive fault but reboot is needed! If I had to guess, some of the gnttab code in qemu triggered a bug in the gntdev code? I have some experience with gnttab/gntdev, but don''t know the inner parts of it very well. Roderick On Fri, Dec 16, 2011 at 2:25 AM, Roderick Colenbrander <thunderbird2k@gmail.com> wrote:> Thanks. On Tuesday I set up your monitoring tool on the already > running tests. Some of the systems were on the verge of showing > results, but we just had a power outage and I have to restart the > tests. It will probably take until the Holidays before there any > results :( > > Roderick > > On Tue, Dec 13, 2011 at 9:20 PM, Konrad Rzeszutek Wilk > <konrad@darnok.org> wrote: >>> > Does this feel like blktap issues? Or is it more likely that something >>> > else happened which causes blktap and the other things to fail? >>> >>> It looks like the interrupts stopped. Perhaps you can run the C code >>> (attached) on the serial console and see _what_ (or perhaps _when_) >>> the interrupts stops (and also verify that the timeouts and ''frozen'' >>> happen due to no interrupts being received). >> >> The C code is http://darnok.org/xen/read_interrupts.c >> >>> >>> There are a couple of bugs that were discovered in the interrupt code >>> (that had been there since 2.6.27!) that went to the stable tree - so >>> they should been backported to 2.6.32.x tree - but I honestly don''t >>> recall the names. I can look them up tomorrow. >> >> xen: x86_32: do not enable iterrupts when returning from exception in >> interrupt context (git commit d198d499148a0c64a41b3aba9e7dd43772832b91) >>
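(As a point of reference for the pinning scheme described above - Dom0 on CPUs 0-3 and the DomU on 4-7 - such a split is typically expressed along the following lines. The domain name, paths and vcpu count here are placeholders, not taken from the actual setup.)

# Xen boot line in the GRUB entry: give dom0 four vcpus and pin them to CPUs 0-3
kernel /boot/xen.gz dom0_max_vcpus=4 dom0_vcpus_pin

# DomU config file: keep the guest vcpus on the remaining threads
vcpus = 4
cpus = "4-7"

# or repin a running domain, e.g.
xl vcpu-pin <domain> all 4-7

On a 4-core/8-thread part the hyperthread siblings are (0,4), (1,5), (2,6) and (3,7), so this particular split still shares every physical core between Dom0 and the DomU - which is exactly the overlap mentioned above.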
On Mon, Dec 26, 2011 at 06:45:32PM +0000, Roderick Colenbrander wrote:> I have finally got some results (the tests are still running). Let me > first summarize what tests were running.Thanks for being so dilligient in providing full details. It helps after reading this after the holidays.> > In total I had 4 machines. All systems had at least Xen 4.1.2 + Linux > 2.6.32.48 (and one had a Xen 4.1.3 snapshot). I divided the systems > into 2 groups and each group ran a different test. There are 2 > differencess between the groups: > 1) one was the use of blktap vs not using blktap. The main reason for > having the blktap systems in even though it adds noise, is that it > kept some of tests close to what they previously were. > 2) the non-blktap systems used a different cpu pinning configuration. > Previously Dom0 was using 0-3 and DomU 4-7, but due to hyperthreading > both VMs used the same cores (0 and 4 are on the same physical core). > > Now to the initial results. > - so far each machine has been up for 10 days. > - one machine failed early on due to a PCI device assignment issue. I > suspect that at some point due to a race condition, ownership of the > PCI device wasn''t transferred correctly back from DomU to Dom0 on VM > shutdown. Xm and Xl respectively fail with ''Error: failed to assign > device 03:00.0: maybe it has already been assigned to other domain, or > maybe it doesn''t exist.'' From a quick glance at the logs it looks likeLets ignore that one. It is harmelss and I should probably remove it.> on shutdown of a previous VM, the ''unmap bdf'' never happened. Not sure > why, but I guess due to a potential toolchain bug. > > - One of the non-blktap machines had a kernel Oops. It happened on VM > shutdown and I wouldn''t be surprised if it was similar to the crashes > we wanted to debug. The system is in a usable state though, but this > may have been due to the use of Linux 2.6.32.48 or the different CPU > pinning configuration. Some of the serial output: > > Thu Dec 22 23:17:38 2011 - 1232 0: 87 113 365 blkif-backend > 525325sec > > Thu Dec 22 23:30:07 2011 - 1259 0: 12 6 222 xenbus > 526065sec > Thu Dec 22 23:30:07 2011 - 1250 0: 62 42 1461 ahci > 526065sec > Thu Dec 22 23:30:07 2011 - 1249 0: 37 28 75 eth0-rx-0 > 526065sec > Thu Dec 22 23:30:07 2011 - 1248 0: 71 305 933 eth0-tx-0 > 526065sec > Thu Dec 22 23:30:07 2011 - 1243 0: 6 3 134 > evtchn:xenstored 526065sec > Thu Dec 22 23:30:07 2011 - 1241 0: 6 3 87582 > evtchn:xenstored 526065sec > Thu Dec 22 23:30:07 2011 - 1240 0: 256 261 54162 evtchn:qemu-dm > 526065sec > Thu Dec 22 23:30:07 2011 - 1239 0: 244 251 7219 evtchn:qemu-dm > 526065sec > Thu Dec 22 23:30:07 2011 - 1238 0: 289 273 5931 evtchn:qemu-dm > 526065sec > Thu Dec 22 23:30:07 2011 - 1237 0: 248 245 4279 evtchn:qemu-dm > 526065sec > Thu Dec 22 23:30:07 2011 - 1236 0: 12 7 315 blkif-backend > 526065sec > Thu Dec 22 23:30:07 2011 - 1234 0: 7 4 43 veth1 > 526065sec > Thu Dec 22 23:30:07 2011 - 1232 CPU0 going backwards (-3304)! > Thu Dec 22 23:30:07 2011 - 19 CPU0 going backwards (-212)! > Thu Dec 22 23:30:07 2011 - 18 0: 35 17 35 ehci_hcd:usb1, > uhci_hcd:usb8 526065sec > Thu Dec 22 23:30:07 2011 - 16 CPU0 going backwards (-176)! > Thu Dec 22 23:30:07 2011 - 12 CPU0 going backwards (-4)! > Thu Dec 22 23:30:07 2011 - 4 CPU0 going backwards (-1)! 
> Thu Dec 22 23:30:12 2011 - 1250 0: 384 213 1461 ahci > 526070sec > Thu Dec 22 23:30:12 2011 - 1249 0: 14 21 75 eth0-rx-0 > 526070sec > Thu Dec 22 23:30:12 2011 - 1240 0: 260 260 54162 evtchn:qemu-dm > 526070sec > Thu Dec 22 23:30:12 2011 - 1239 0: 279 265 7219 evtchn:qemu-dm > 526070sec > Thu Dec 22 23:30:13 2011 - 1238 0: 271 272 5931 evtchn:qemu-dm > 526070sec > Thu Dec 22 23:30:13 2011 - 1237 0: 261 253 4279 evtchn:qemu-dm > 526070sec > Thu Dec 22 23:30:13 2011 - 1236 0: 145 76 315 blkif-backend > 526070sec > Thu Dec 22 23:30:18 2011 - 1250 0: 372 293 1461 ahci > 526075sec > Thu Dec 22 23:30:18 2011 - 1249 0: 26 24 75 eth0-rx-0 > 526075sec > Thu Dec 22 23:30:18 2011 - 1248 0: 16 86 933 eth0-tx-0 > 526075sec > Thu Dec 22 23:30:18 2011 - 1240 0: 234 247 54162 evtchn:qemu-dm > 526075sec > Thu Dec 22 23:30:18 2011 - 1239 0: 234 249 7219 evtchn:qemu-dm > 526075sec > Thu Dec 22 23:30:18 2011 - 1238 0: 241 256 5931 evtchn:qemu-dm > 526075sec > Thu Dec 22 23:30:18 2011 - 1237 0: 224 239 4279 evtchn:qemu-dm > 526075sec > Thu Dec 22 23:30:18 2011 - 1236 0: 100 88 315 blkif-backend > 526075sec > Thu Dec 22 23:30:23 2011 - 1249 0: 16 20 75 eth0-rx-0 > 526080sec > Thu Dec 22 23:30:23 2011 - 1248 0: 7 46 933 eth0-tx-0 > 526080sec > Thu Dec 22 23:30:23 2011 - 1240 0: 8 128 54162 evtchn:qemu-dm > 526080sec > Thu Dec 22 23:30:23 2011 - 1239 0: 16 133 7219 evtchn:qemu-dm > 526080sec > Thu Dec 22 23:30:23 2011 - 1238 0: 28 142 5931 evtchn:qemu-dm > 526080sec > Thu Dec 22 23:30:23 2011 - 1233 CPU0 going backwards (-12648)! > Thu Dec 22 23:30:23 2011 - 19 CPU0 going backwards (-212)! > Thu Dec 22 23:30:23 2011 - 18 0: 35 17 42 ehci_hcd:usb1, > uhci_hcd:usb8 526080sec > Thu Dec 22 23:30:23 2011 - 16 CPU0 going backwards (-176)! > Thu Dec 22 23:30:23 2011 - 12 CPU0 going backwards (-4)! > Thu Dec 22 23:30:23 2011 - 4 CPU0 going backwards (-1)! 
> > [533804.785589] BUG: unable to handle kernel NULL pointer dereference > at 0000000000000008 > [533804.793913] IP: [<ffffffff814bdba9>] _spin_lock+0x5/0x15 > [533804.799556] PGD 0 > [533804.801896] Oops: 0002 [#1] SMP > [533804.805448] last sysfs file: > /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/class > [533804.813736] CPU 0 > [533804.816066] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf > [533804.822914] Pid: 21632, comm: qemu-dm Not tainted 2.6.32.48 #1 X8STi > [533804.829595] RIP: e030:[<ffffffff814bdba9>] [<ffffffff814bdba9>] > _spin_lock+0x5/0x15 > [533804.837873] RSP: e02b:ffff88005f187c50 EFLAGS: 00010202 > [533804.843513] RAX: 0000000000000100 RBX: ffff88007d6923c0 RCX: > 0000000000000003 > [533804.851192] RDX: ffff88007deb3180 RSI: ffff88007d6923c0 RDI: > 0000000000000008 > [533804.858871] RBP: ffff88007e260128 R08: 0000000000000000 R09: > 0000000000000000 > [533804.866552] R10: ffff88007121eb40 R11: ffffffff811b862b R12: > ffff88007fac1e40 > [533804.874541] R13: ffff88007e3c3340 R14: ffff88007e3c3340 R15: > ffff88007fdfbc00 > [533804.882243] FS: 00007f5cdcefe6f0(0000) GS:ffff880028038000(0000) > knlGS:0000000000000000 > [533804.890874] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > [533804.896948] CR2: 0000000000000008 CR3: 0000000001001000 CR4: > 0000000000002660 > [533804.904627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [533804.912306] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [533804.919985] Process qemu-dm (pid: 21632, threadinfo > ffff88005f186000, task ffff880074f4e270) > [533804.928971] Stack: > [533804.931312] ffffffff810a9ad0 ffff88007deb3180 ffff88007e260100 > ffff88007e260100 > [533804.938762] <0> ffffffff8124222d 0000000000000008 0000000000000008 > ffff88007deb3180 > [533804.946900] <0> ffffffff810b245e ffff88007fac1e40 ffff88007deb3180 > 0000000000000000 > [533804.955465] Call Trace: > [533804.958244] [<ffffffff810a9ad0>] ? mmu_notifier_unregister+0x27/0xa5 > [533804.965019] [<ffffffff8124222d>] ? gntdev_release+0xc3/0xd1 > [533804.971007] [<ffffffff810b245e>] ? __fput+0x100/0x1af > [533804.976477] [<ffffffff810af998>] ? filp_close+0x5b/0x62 > [533804.982119] [<ffffffff81042989>] ? put_files_struct+0x64/0xc1 > [533804.988280] [<ffffffff810441f0>] ? do_exit+0x209/0x68d > [533804.993836] [<ffffffff810441f8>] ? do_exit+0x211/0x68d > [533804.999390] [<ffffffff810446e9>] ? do_group_exit+0x75/0x9c > [533805.005294] [<ffffffff8104cf1d>] ? get_signal_to_deliver+0x30d/0x338 > [533805.012063] [<ffffffff81010f7a>] ? do_notify_resume+0x8a/0x6b1 > [533805.018310] [<ffffffff810bdb3a>] ? poll_select_copy_remaining+0xd0/0xf3 > [533805.025339] [<ffffffff81011c8e>] ? int_signal+0x12/0x17 > [533805.030980] Code: 00 00 00 01 74 05 e8 67 1c d2 ff 48 89 d0 5e c3 > ff 14 25 20 2d 6a 81 f0 81 2f 00 00 00 01 74 05 e8 4d 1c d2 ff c3 b8 > 00 01 00 00 <f0> 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c3 f0 81 2f > 00 00 > [533805.050454] RIP [<ffffffff814bdba9>] _spin_lock+0x5/0x15 > [533805.056182] RSP <ffff88005f187c50> > [533805.059997] CR2: 0000000000000008 > [533805.063638] ---[ end trace 85ee7cbec9ce72ac ]--- > [533805.068584] Fixing recursive fault but reboot is needed! > > If I had to guess, some of the gnttab code in qemu triggered a bug in > the gntdev code? I have some experience with gnttab/gntdev, but don''t > know the inner parts of it very well.It certainly looks that way. Or rather that we hit a race - what mmu_notifier_unregister tried to call was already de-allocated. 
It certainly looks that way. Or rather that we hit a race - what mmu_notifier_unregister tried to call was already de-allocated.
[edit: Perhaps this is related to a TLB flush fix: 7899891c7d161752f29abcc9bc0a9c6c3a3af26c xen mmu: fix a race window causing leave_mm BUG()]

That would explain the hang you got. The qemu-dm is stuck waiting for gntdev to respond (you should be able to verify this by attaching gdb to qemu-dm and seeing the backtrace - it should be stuck at some ioctl call). And the kernel sees that this particular process is not doing anything. Also we have some gntdev in a bad state (but that should be OK for the other processes) - so I am not sure how that "hangs" your box.

The interrupts being disabled does not seem to occur with this crash? Is the read_interrupts code you are running still spewing data, or does it stop as well?

But let's fix one thing at a time.

Looking at the code in 2.6.32 it is the version that went upstream as git commit ab31523c2fcac557226bac72cbdf5fafe01f9a26 and it has not had any of the features/bug-fixes that went with the upstream version:

ab31523 xen/gntdev: allow usermode to map granted pages
8d3eaea xen/gntdev: add VM_PFNMAP to vma
9329e76 xen: gntdev: move use of GNTMAP_contains_pte next to the map_op
ba5d101 xen/gntdev: stop using "token" argument
f0a70c8 xen/gntdev: Fix circular locking dependency
a12b4eb xen gntdev: use gnttab_map_refs and gnttab_unmap_refs
ef91082 xen-gntdev: Change page limit to be global instead of per-open
a879211 xen-gntdev: Use find_vma rather than iterating our vma list manually
68b025c xen-gntdev: Add reference counting to maps
aab8f11 xen-gntdev: Support mapping in HVM domains
bdc612d xen/gntalloc,gntdev: Add unmap notify ioctl
90b6f30 xen-gntdev: Fix memory leak when mmap fails
0ea22f0 xen-gntdev: Fix unmap notify on PV domains
84e4075 xen-gntdev: Use map->vma for checking map validity
b57c186 xen-gntdev: Avoid unmapping ranges twice
12996fc xen-gntdev: Avoid double-mapping memory
9960be9 xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only mappings
77c35ac xen-gntdev: Fix incorrect use of zero handle
f4ee4af xen-gntdev: Add cast to pointer
38eaeb0 xen: gntdev: fix build warning
d79647a xen/gntdev,gntalloc: Remove unneeded VM flags
ca47cea xen-gntdev: Use ballooned pages for grant mappings
12f0258 xen-gntdev: return -EFAULT on copy_to_user failure
a93e20a xen-gntdev: unlock on error path in gntdev_mmap()

The big question is whether the bug you are hitting is fixed by one of those git commits or by an unrelated fix (like the vmalloc_sync_all or the kmap_atomic one), which are:

461ae488ecb125b140d7ea29ceeedbcce9327003 mm: sync vmalloc address space page tables in alloc_vm_area(
2cd1c8d4dc7ecca9e9431e2dabe41ae9c7d89e51 x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode

But the errors that one gets without those two git commits are much different from what you get. So I doubt it is one of those.

The CPU time going backwards is disturbing. It does imply that the Xen hypervisor has stopped updating the timer information. Or maybe it has not, but the gntdev has crashed and left all the interrupts disabled.

So three questions:
 1). Is the read_intr printing any data after the crash?
 2). If you attach gdb to qemu-dm can you see where it is stuck at?
 3). Is the console active at all? If not, can you SSH in (some network cards switch to polling so you can still log in to a machine even if the interrupts are turned off - the atl1c something can do it and I think the r8169 as well, but not sure).
 4).
If you had configured your dom0 console to use the serial console instead of going through the Xen hypervisor console (so console=ttyS0 on Linux, and no com1=XXX and console=com1 on the Xen hypervisor line), could you get a backtrace of where the Linux kernel is using Alt-SysRq?

Thanks and sorry for taking so long. Just coming back from holidays.
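(For concreteness, the two console layouts being contrasted in point 4 would look roughly like this in a GRUB entry; the file names and baud rate are illustrative assumptions, not the actual configuration used here.)

# dom0 drives the serial port itself, so native Linux oops/SysRq output goes to ttyS0:
kernel /boot/xen.gz
module /boot/vmlinuz console=ttyS0,115200

# versus routing dom0 output through the Xen hypervisor console:
kernel /boot/xen.gz com1=115200,8n1 console=com1
module /boot/vmlinuz console=hvc0

(Independent of where the console goes, a blocked-task dump can also be forced from a local shell with something like "echo w > /proc/sysrq-trigger", provided sysrq is enabled.)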
Oops initially only replied to Konrad. This is a resend to xen-devel. On Tue, Jan 3, 2012 at 4:55 PM, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:> On Mon, Dec 26, 2011 at 06:45:32PM +0000, Roderick Colenbrander wrote: >> I have finally got some results (the tests are still running). Let me >> first summarize what tests were running. > > Thanks for being so dilligient in providing full details. It helps after > reading this after the holidays. >> >> In total I had 4 machines. All systems had at least Xen 4.1.2 + Linux >> 2.6.32.48 (and one had a Xen 4.1.3 snapshot). I divided the systems >> into 2 groups and each group ran a different test. There are 2 >> differencess between the groups: >> 1) one was the use of blktap vs not using blktap. The main reason for >> having the blktap systems in even though it adds noise, is that it >> kept some of tests close to what they previously were. >> 2) the non-blktap systems used a different cpu pinning configuration. >> Previously Dom0 was using 0-3 and DomU 4-7, but due to hyperthreading >> both VMs used the same cores (0 and 4 are on the same physical core). >> >> Now to the initial results. >> - so far each machine has been up for 10 days. >> - one machine failed early on due to a PCI device assignment issue. I >> suspect that at some point due to a race condition, ownership of the >> PCI device wasn''t transferred correctly back from DomU to Dom0 on VM >> shutdown. Xm and Xl respectively fail with ''Error: failed to assign >> device 03:00.0: maybe it has already been assigned to other domain, or >> maybe it doesn''t exist.'' From a quick glance at the logs it looks like > > Lets ignore that one. It is harmelss and I should probably remove it.Okay good to know that it should probably removed.>> on shutdown of a previous VM, the ''unmap bdf'' never happened. Not sure >> why, but I guess due to a potential toolchain bug. >> >> - One of the non-blktap machines had a kernel Oops. It happened on VM >> shutdown and I wouldn''t be surprised if it was similar to the crashes >> we wanted to debug. The system is in a usable state though, but this >> may have been due to the use of Linux 2.6.32.48 or the different CPU >> pinning configuration. 
Some of the serial output:
>> [snip: serial console log quoted in full earlier in the thread]
>> [snip: kernel Oops quoted in full earlier in the thread]
>>
>> If I had to guess, some of the gnttab code in qemu triggered a bug in
>> the gntdev code? I have some experience with gnttab/gntdev, but don''t
>> know the inner parts of it very well.
> > It certainly looks that way. Or rather that we hit a race - what > mmu_notifier_unregister tried to call was already de-allocated. > [edit: Perhaps this is related to a TLB flush fix: > =A07899891c7d161752f29abcc9bc0a9c6c3a3af26c xen mmu: fix a race window > causing leave_mm BUG()] > > That would explain the hang you got. The qemu-dm is stuck waiting for > gntdev to respond (you should be able to verify this by attaching gdb to > qemu-dm and seeing the backtrace - it should be stuck at some ioctl > call). And the kernel sees that this particular process is not doing > anything. Also we have some gntdev in a bad state (but that should be OK > to the other processes) - so I am not sure how that "hangs" your box. > > The interrupts being disabled does not seem to occur with this crash? > Does read_interrupts code you are running is still spewing data right? Or > does it stop as well? > > But lets fix one thing at a time. > > Looking at the code in 2.6.32 it is the version that went upstream > as git commit ab31523c2fcac557226bac72cbdf5fafe01f9a26 and it has > not had any of the features/bug-fixes that went with the upstream > version: > > ab31523 xen/gntdev: allow usermode to map granted pages > 8d3eaea xen/gntdev: add VM_PFNMAP to vma > 9329e76 xen: gntdev: move use of GNTMAP_contains_pte next to the map_op > ba5d101 xen/gntdev: stop using "token" argument > f0a70c8 xen/gntdev: Fix circular locking dependency > a12b4eb xen gntdev: use gnttab_map_refs and gnttab_unmap_refs > ef91082 xen-gntdev: Change page limit to be global instead of per-open > a879211 xen-gntdev: Use find_vma rather than iterating our vma list > manually > 68b025c xen-gntdev: Add reference counting to maps > aab8f11 xen-gntdev: Support mapping in HVM domains > bdc612d xen/gntalloc,gntdev: Add unmap notify ioctl > 90b6f30 xen-gntdev: Fix memory leak when mmap fails > 0ea22f0 xen-gntdev: Fix unmap notify on PV domains > 84e4075 xen-gntdev: Use map->vma for checking map validity > b57c186 xen-gntdev: Avoid unmapping ranges twice > 12996fc xen-gntdev: Avoid double-mapping memory > 9960be9 xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only > mappings > 77c35ac xen-gntdev: Fix incorrect use of zero handle > f4ee4af xen-gntdev: Add cast to pointer > 38eaeb0 xen: gntdev: fix build warning > d79647a xen/gntdev,gntalloc: Remove unneeded VM flags > ca47cea xen-gntdev: Use ballooned pages for grant mappings > 12f0258 xen-gntdev: return -EFAULT on copy_to_user failure > a93e20a xen-gntdev: unlock on error path in gntdev_mmap() > > The big question is whether the bug you are hitting is fixed by one of those > git commits or that an unrelated fix (like the vmalloc_sync_all or the > kmap_atomic one) which are: > 461ae488ecb125b140d7ea29ceeedbcce9327003 mm: sync vmalloc address space page tables in alloc_vm_area( > 2cd1c8d4dc7ecca9e9431e2dabe41ae9c7d89e51 x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode > > But the errors that one gets without those two git commits is much > different from what you get. So I doubt it is one of those.I kept a close look at most of the gntdev related patches since they are important to us. As you said, most of the changes look related to this bug.> > The CPU time going backwards is disturbing. It does imply that the Xen > hypervisor has stopped updating the timer information. Or maybe it has > not, but the gntdev has crashed and left all the interrupts disabled. >Just for the record, aside from the ''Oops'' the machine is working fine. 
It''s responsive, networking works, you can login etcetera.> So three questions: > =A01). Is the read_intr printing any data after the crash?The tool was still running afterwards. Let me show a couple of lines before (was in previous email) and after the Oops. Thu Dec 22 23:30:18 2011 - 1250 0: 372 293 1461 ahci 526075sec Thu Dec 22 23:30:18 2011 - 1249 0: 26 24 75 eth0-rx-0 526075sec Thu Dec 22 23:30:18 2011 - 1248 0: 16 86 933 eth0-tx-0 526075sec Thu Dec 22 23:30:18 2011 - 1240 0: 234 247 54162 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1239 0: 234 249 7219 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1238 0: 241 256 5931 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1237 0: 224 239 4279 evtchn:qemu-dm 526075sec Thu Dec 22 23:30:18 2011 - 1236 0: 100 88 315 blkif-backend 526075sec Thu Dec 22 23:30:23 2011 - 1249 0: 16 20 75 eth0-rx-0 526080sec Thu Dec 22 23:30:23 2011 - 1248 0: 7 46 933 eth0-tx-0 526080sec Thu Dec 22 23:30:23 2011 - 1240 0: 8 128 54162 evtchn:qemu-dm 526080sec Thu Dec 22 23:30:23 2011 - 1239 0: 16 133 7219 evtchn:qemu-dm 526080sec Thu Dec 22 23:30:23 2011 - 1238 0: 28 142 5931 evtchn:qemu-dm 526080sec Thu Dec 22 23:30:23 2011 - 1233 CPU0 going backwards (-12648)! Thu Dec 22 23:30:23 2011 - 19 CPU0 going backwards (-212)! Thu Dec 22 23:30:23 2011 - 18 0: 35 17 42 ehci_hcd:usb1, uhci_hcd:usb8 526080sec Thu Dec 22 23:30:23 2011 - 16 CPU0 going backwards (-176)! Thu Dec 22 23:30:23 2011 - 12 CPU0 going backwards (-4)! Thu Dec 22 23:30:23 2011 - 4 CPU0 going backwards (-1)! [533804.785589] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [533804.793913] IP: [<ffffffff814bdba9>] _spin_lock+0x5/0x15 [533804.799556] PGD 0 [533804.801896] Oops: 0002 [#1] SMP ... ... [533805.068584] Fixing recursive fault but reboot is needed! Thu Dec 22 23:30:28 2011 - 1249 0: 15 18 75 eth0-rx-0 526085sec Thu Dec 22 23:30:28 2011 - 1241 0: 1361 682 87582 evtchn:xenstored 526085sec Thu Dec 22 23:30:28 2011 - 1240 CPU0 going backwards (-391001)! Thu Dec 22 23:30:28 2011 - 1239 CPU0 going backwards (-45746)! Thu Dec 22 23:30:28 2011 - 1238 CPU0 going backwards (-73084)! Thu Dec 22 23:30:28 2011 - 1237 CPU0 going backwards (-82968)! Thu Dec 22 23:30:28 2011 - 1236 CPU0 going backwards (-6887)! Thu Dec 22 23:30:28 2011 - 1235 CPU0 going backwards (-440)! Thu Dec 22 23:30:28 2011 - 1234 CPU0 going backwards (-243)! Thu Dec 22 23:30:28 2011 - 19 CPU0 going backwards (-215)! Thu Dec 22 23:30:28 2011 - 18 CPU0 going backwards (-2)! Thu Dec 22 23:30:28 2011 - 16 CPU0 going backwards (-179)! Thu Dec 22 23:30:28 2011 - 12 CPU0 going backwards (-2)! Thu Dec 22 23:30:33 2011 - 1249 0: 32 25 75 eth0-rx-0 526090sec Thu Dec 22 23:30:33 2011 - 1248 0: 18 21 933 eth0-tx-0 526090sec Thu Dec 22 23:30:38 2011 - 1250 0: 6 23 1461 ahci 526095sec Thu Dec 22 23:30:38 2011 - 1249 0: 18 22 75 eth0-rx-0 526095sec Thu Dec 22 23:30:43 2011 - 1249 0: 10 16 75 eth0-rx-0 526100sec Thu Dec 22 23:30:48 2011 - 1249 0: 14 15 75 eth0-rx-0 526105sec Thu Dec 22 23:30:53 2011 - 1249 0: 10 12 75 eth0-rx-0 526110sec Thu Dec 22 23:30:58 2011 - 1249 0: 12 12 75 eth0-rx-0 526115sec> =A02). If you attach gdb to qemu-dm can you see where it is stuck at?I don''t seem to be able to attach to the qemu-dm process. When I try to attach, gdb never completes attaching to the process. Not sure why it doesn''t work, it works fine on other processes.> =A03). Is the console activate at all? 
If not, can you SSH in (some > =A0 =A0 network cards switch to polling so you can still login in a machine > =A0 =A0 even if the interrupts are turned off - the atl1c something can do > =A0 =A0 it and I think the r8169 as well, but not sure).The system is still responsive. Both network and the VGA console work fine.> =A04). If you had configured your dom0 console to use the serial console > =A0 =A0 instead of going through the Xen hypervisor console (so > =A0 =A0 console=3DttyS0 on Linux, and no com1=3DXXX and console=3Dcom1 on Xen > =A0 =A0 hypervisor line), could you get a back-track of where the Linux > =A0 =A0 kernel is at using Alt-sysRq?I tried to get some Alt-sysRq output just now. After two tries, the kernel wasn''t too happy with me and died for good. I hope the limited information is of some use. [1654538.605575] SysRq : Show Blocked State [1654538.609723] task PC stack pid father [1654538.616093] qemu-dm D 0000000000000000 0 21632 21633 0x00000004 [1654538.623354] ffffffff8169e3b0 0000000000000246 0000000000000000 ffff88005f187a28 [1654538.631319] ffff88005f187968 0000000000000000 000000000000f4f8 ffff88005f187fd8 [1654538.639284] 0000000000013c80 0000000000013c80 ffff880074f4e270 ffff880074f4e518 [1654538.647249] Call Trace: [1654538.650114] [<ffffffff811fd43c>] ? vgacon_cursor+0x0/0x17e [1654538.656100] [<ffffffff810440b8>] ? do_exit+0xd1/0x68d [1654538.661655] [<ffffffff8100eeef>] ? xen_restore_fl_direct_end+0x0/0x1 [1654538.668511] [<ffffffff814bdbd4>] ? _spin_unlock_irqrestore+0xc/0xd [1654538.675192] [<ffffffff81041573>] ? release_console_sem+0x17d/0x1ae [1654538.681873] [<ffffffff814bea6c>] ? oops_end+0xae/0xb3 [1654538.687429] [<ffffffff8102d636>] ? no_context+0x1ff/0x20e [1654538.693332] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654538.700186] [<ffffffff8102d7eb>] ? __bad_area_nosemaphore+0x1a6/0x1ca [1654538.707129] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654538.713985] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654538.719887] [<ffffffff81035e28>] ? check_preempt_wakeup+0x0/0x15f [1654538.726482] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654538.733346] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654538.739247] [<ffffffff81035e28>] ? check_preempt_wakeup+0x0/0x15f [1654538.745842] [<ffffffff814bfeba>] ? do_page_fault+0x116/0x2fc [1654538.752004] [<ffffffff814bdf45>] ? page_fault+0x25/0x30 [1654538.757738] [<ffffffff811b862b>] ? cap_file_free_security+0x0/0x1 [1654538.764338] [<ffffffff814bdba9>] ? _spin_lock+0x5/0x15 [1654538.769980] [<ffffffff810a9ad0>] ? mmu_notifier_unregister+0x27/0xa5 [1654538.776836] [<ffffffff8124222d>] ? gntdev_release+0xc3/0xd1 [1654538.782911] [<ffffffff810b245e>] ? __fput+0x100/0x1af [1654538.788467] [<ffffffff810af998>] ? filp_close+0x5b/0x62 [1654538.794195] [<ffffffff81042989>] ? put_files_struct+0x64/0xc1 [1654538.800444] [<ffffffff810441f0>] ? do_exit+0x209/0x68d [1654538.806087] [<ffffffff810441f8>] ? do_exit+0x211/0x68d [1654538.811727] [<ffffffff810446e9>] ? do_group_exit+0x75/0x9c [1654538.817717] [<ffffffff8104cf1d>] ? get_signal_to_deliver+0x30d/0x338 [1654538.824572] [<ffffffff81010f7a>] ? do_notify_resume+0x8a/0x6b1 [1654538.830907] [<ffffffff810bdb3a>] ? poll_select_copy_remaining+0xd0/0xf3 [1654538.838022] [<ffffffff81011c8e>] ? int_signal+0x12/0x17 It looks like the gntdev file got closed by qemu-dm which triggers the gntdev cleanup code. What I was wondering about is ''gntdev_release''. 
static int gntdev_release(struct inode *inode, struct file *flip) { struct gntdev_priv *priv =3D flip->private_data; .. spin_lock(&priv->lock); <do cleanup> spin_unlock(&priv->lock); mmu_notifier_unregister(..); kfree(priv); return 0; } I don''t know enough about the file descriptor code in Linux. What happens if multiple applications (or threads) open gntdev, do they get the same "flip"? If they do then it looks like there can be a race if two threads call this code close to each other. I remember there being some ref counting mechanism on file descriptors, so I guess this is not the problem. Wed Jan 4 22:48:45 2012 - 1250 0: 6 4 1461 ahci 1644285sec Wed Jan 4 22:48:45 2012 - 1249 0: 19 19 122 eth0-rx-0 1644285sec Wed Jan 4 22:48:50 2012 - 1249 0: 19 19 122 eth0-rx-0 1644290sec Wed Jan 4 22:48:55 2012 - 1249 0: 14 16 122 eth0-rx-0 1644295sec Wed Jan 4 22:49:00 2012 - 1249 0: 25 21 122 eth0-rx-0 1644300sec Wed Jan 4 22:49:00 2012 - 1248 0: 13 7 933 eth0-tx-0 1644300sec Wed Jan 4 22:49:05 2012 - 1249 0: 21 21 122 eth0-rx-0 1644305sec Wed Jan 4 22:49:05 2012 - 1248 0: 7 7 933 eth0-tx-0 1644305sec Wed Jan 4 22:49:10 2012 - 1249 0: 21 21 122 eth0-rx-0 1644310sec Wed Jan 4 22:49:10 2012 - 1248 0: 15 11 933 eth0-tx-0 1644310sec Wed Jan 4 22:49:15 2012 - 1249 0: 15 18 122 eth0-rx-0 1644315sec Wed Jan 4 22:49:21 2012 - 1250 0: 9 5 1461 ahci 1644320sec Wed Jan 4 22:49:21 2012 - 1249 0: 28 23 122 eth0-rx-0 1644320sec Wed Jan 4 22:49:21 2012 - 1248 0: 20 13 933 eth0-tx-0 1644320sec Wed Jan 4 22:49:26 2012 - 1249 0: 14 18 122 eth0-rx-0 1644325sec Wed Jan 4 22:49:31 2012 - 1249 0: 23 21 122 eth0-rx-0 1644330sec Wed Jan 4 22:49:31 2012 - 1248 0: 21 14 933 eth0-tx-0 1644330sec Wed Jan 4 22:49:36 2012 - 1249 0: 18 19 122 eth0-rx-0 1644335sec Wed Jan 4 22:49:41 2012 - 1249 0: 14 17 122 eth0-rx-0 1644340sec Wed Jan 4 22:49:46 2012 - 1249 0: 18 17 122 eth0-rx-0 1644345sec Wed Jan 4 22:49:51 2012 - 1249 0: 21 19 122 eth0-rx-0 1644350sec Wed Jan 4 22:49:56 2012 - 1250 0: 6 4 1461 ahci 1644355sec Wed Jan 4 22:49:56 2012 - 1249 0: 11 15 122 eth0-rx-0 1644355sec Wed Jan 4 22:50:01 2012 - 1249 0: 31 23 122 eth0-rx-0 1644360sec Wed Jan 4 22:50:01 2012 - 1248 0: 15 8 933 eth0-tx-0 1644360sec Wed Jan 4 22:50:06 2012 - 1249 0: 20 22 122 eth0-rx-0 1644365sec Wed Jan 4 22:50:06 2012 - 1248 0: 9 8 933 eth0-tx-0 1644365sec Wed Jan 4 22:50:11 2012 - 1249 0: 28 25 122 eth0-rx-0 1644370sec Wed Jan 4 22:50:11 2012 - 1248 0: 14 11 933 eth0-tx-0 1644370sec [1654627.877585] SysRq : Show backtrace of all active CPUs [1654627.883041] sending NMI to all CPUs: [1654627.887060] BUG: unable to handle kernel paging request at ffffffffff5fc310 [1654627.894437] IP: [<ffffffff81026743>] _flat_send_IPI_mask+0x4b/0x78 [1654627.901049] PGD 1003067 PUD 1004067 PMD 17a5067 PTE 0 [1654627.906500] Thread overran stack, or stack corrupted [1654627.911883] Oops: 0002 [#2] SMP [1654627.915515] last sysfs file: /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1:1.0/uevent [1654627.924320] CPU 0 [1654627.926746] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf [1654627.933681] Pid: 0, comm: swapper Tainted: G D 2.6.32.48 #1 X8STi [1654627.940883] RIP: e030:[<ffffffff81026743>] [<ffffffff81026743>] _flat_send_IPI_mask+0x4b/0x78 [1654627.950121] RSP: e02b:ffff88002803b988 EFLAGS: 00010006 [1654627.955872] RAX: 000000000f000000 RBX: 0000000000000800 RCX: 00000000ffff0095 [1654627.963644] RDX: ffff880028038000 RSI: 0000000000000002 RDI: 000000000000000f [1654627.971410] RBP: 0000000000000002 R08: 0000000000153363 R09: ffffffff812608df 
[1654627.979176] R10: ffff8800000ba6c0 R11: ffffffff811fd43c R12: ffffffff817000b0 [1654627.986949] R13: 000000000000000f R14: 000000000000000f R15: 0000000000000026 [1654627.994715] FS: 00007f4c15299910(0000) GS:ffff880028038000(0000) knlGS:0000000000000000 [1654628.003434] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b [1654628.009595] CR2: ffffffffff5fc310 CR3: 000000007b581000 CR4: 0000000000002660 [1654628.017362] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1654628.025127] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [1654628.032891] Process swapper (pid: 0, threadinfo ffffffff8169a000, task ffffffff8169e3b0) [1654628.041609] Stack: [1654628.044036] ffffffff8100e871 0000000000000000 000000000000006c 0000000000000001 [1654628.051576] <0> ffff88007e613800 ffffffff810240ef ffffffff816cce90 ffffffff81265fb5 [1654628.059807] <0> 00000000f28a6723 0000000000000000 0000000000000001 ffff88007e613800 [1654628.068469] Call Trace: [1654628.071330] <IRQ> [1654628.073835] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.080691] [<ffffffff810240ef>] ? arch_trigger_all_cpu_backtrace+0x3d/0x5d [1654628.088370] [<ffffffff81265fb5>] ? __handle_sysrq+0xaf/0x14b [1654628.094531] [<ffffffff8125f057>] ? kbd_event+0x35e/0x616 [1654628.100347] [<ffffffff814bdbd4>] ? _spin_unlock_irqrestore+0xc/0xd [1654628.107029] [<ffffffff8136ec5f>] ? input_pass_event+0x5a/0x7b [1654628.113276] [<ffffffff81370b8e>] ? input_event+0x5c/0x82 [1654628.119094] [<ffffffff8139fb34>] ? hidinput_hid_event+0x291/0x2c1 [1654628.125853] [<ffffffff8139c992>] ? hid_process_event+0xdd/0x121 [1654628.132274] [<ffffffff8139cd83>] ? hid_report_raw_event+0x3ad/0x42f [1654628.139042] [<ffffffff8139cdf1>] ? hid_report_raw_event+0x41b/0x42f [1654628.145812] [<ffffffff8139d040>] ? hid_input_report+0x23b/0x260 [1654628.152235] [<ffffffff813a5042>] ? hid_irq_in+0xad/0x212 [1654628.158050] [<ffffffff81344b34>] ? usb_hcd_giveback_urb+0x76/0xa9 [1654628.164646] [<ffffffff813632d6>] ? uhci_giveback_urb+0x10c/0x226 [1654628.171153] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.178010] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654628.183912] [<ffffffff81363b71>] ? uhci_scan_schedule+0x677/0x937 [1654628.190505] [<ffffffff81366284>] ? uhci_irq+0x8c0/0x8dd [1654628.196236] [<ffffffff810550bf>] ? hrtimer_interrupt+0xe3/0x18c [1654628.202658] [<ffffffff8100ee1e>] ? xen_vcpuop_set_next_event+0x0/0x60 [1654628.209598] [<ffffffff8100ee1e>] ? xen_vcpuop_set_next_event+0x0/0x60 [1654628.216541] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.223397] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654628.229299] [<ffffffff8100ee1e>] ? xen_vcpuop_set_next_event+0x0/0x60 [1654628.236240] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.243097] [<ffffffff81344605>] ? usb_hcd_irq+0x39/0x7e [1654628.248913] [<ffffffff81077993>] ? handle_IRQ_event+0x2b/0xbd [1654628.255160] [<ffffffff810791cf>] ? handle_fasteoi_irq+0x78/0xaf [1654628.261584] [<ffffffff8123b382>] ? __xen_evtchn_do_upcall+0x1d0/0x28d [1654628.268526] [<ffffffff8123bc9c>] ? xen_evtchn_do_upcall+0x2e/0x42 [1654628.275120] [<ffffffff81012b7e>] ? xen_do_hypervisor_callback+0x1e/0x30 [1654628.282235] <EOI> [1654628.284748] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 [1654628.291084] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 [1654628.297419] [<ffffffff8100e8ef>] ? xen_safe_halt+0xc/0x15 [1654628.303322] [<ffffffff81017ec7>] ? 
default_idle+0x21/0x3d [1654628.309223] [<ffffffff81010dd7>] ? cpu_idle+0x59/0x91 [1654628.314780] [<ffffffff81728c8c>] ? start_kernel+0x381/0x38d [1654628.320855] [<ffffffff8172bde1>] ? xen_start_kernel+0x5aa/0x5b0 [1654628.327274] Code: 48 8b 05 f1 00 6e 00 83 fe 02 8b 58 34 75 0a ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 c3 5f ff f6 c4 10 75 f2 44 89 e8 c1 e0 18 <89> 04 25 10 c3 5f ff 89 e8 09 d8 80 cf 04 83 fd 02 0f 44 c3 89 [1654628.346854] RIP [<ffffffff81026743>] _flat_send_IPI_mask+0x4b/0x78 [1654628.353535] RSP <ffff88002803b988> [1654628.357436] CR2: ffffffffff5fc310 [1654628.361164] ---[ end trace 85ee7cbec9ce72ad ]--- Then it died: [1654628.366199] Kernel panic - not syncing: Fatal exception in interrupt [1654628.372968] Pid: 0, comm: swapper Tainted: G D 2.6.32.48 #1 [1654628.379649] Call Trace: [1654628.382509] <IRQ> [<ffffffff814bbd88>] ? panic+0x86/0x13e [1654628.388490] [<ffffffff812608df>] ? set_cursor+0x3d/0x63 [1654628.394219] [<ffffffff8100eeef>] ? xen_restore_fl_direct_end+0x0/0x1 [1654628.401074] [<ffffffff812608df>] ? set_cursor+0x3d/0x63 [1654628.406802] [<ffffffff812608df>] ? set_cursor+0x3d/0x63 [1654628.412533] [<ffffffff811fd43c>] ? vgacon_cursor+0x0/0x17e [1654628.418520] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.425376] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654628.431277] [<ffffffff811fd43c>] ? vgacon_cursor+0x0/0x17e [1654628.437266] [<ffffffff812608df>] ? set_cursor+0x3d/0x63 [1654628.442995] [<ffffffff814bea64>] ? oops_end+0xa6/0xb3 [1654628.448553] [<ffffffff8102d636>] ? no_context+0x1ff/0x20e [1654628.454452] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654628.460356] [<ffffffff811fd43c>] ? vgacon_cursor+0x0/0x17e [1654628.466343] [<ffffffff8102d7eb>] ? __bad_area_nosemaphore+0x1a6/0x1ca [1654628.473286] [<ffffffff8100eeef>] ? xen_restore_fl_direct_end+0x0/0x1 [1654628.480143] [<ffffffff81041ad5>] ? vprintk+0x310/0x33c [1654628.485783] [<ffffffff8100eeef>] ? xen_restore_fl_direct_end+0x0/0x1 [1654628.492639] [<ffffffff8100c269>] ? __raw_callee_save_xen_pte_val+0x11/0x1e [1654628.500013] [<ffffffff811fd43c>] ? vgacon_cursor+0x0/0x17e [1654628.506003] [<ffffffff8102cf52>] ? spurious_fault+0x147/0x1db [1654628.512251] [<ffffffff814bfe0d>] ? do_page_fault+0x69/0x2fc [1654628.518327] [<ffffffff814bdf45>] ? page_fault+0x25/0x30 [1654628.524056] [<ffffffff811fd43c>] ? vgacon_cursor+0x0/0x17e [1654628.535471] [<ffffffff812608df>] ? set_cursor+0x3d/0x63 [1654628.541200] [<ffffffff81026743>] ? _flat_send_IPI_mask+0x4b/0x78 [1654628.547715] [<ffffffff8102672d>] ? _flat_send_IPI_mask+0x35/0x78 [1654628.554225] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.561080] [<ffffffff810240ef>] ? arch_trigger_all_cpu_backtrace+0x3d/0x5d [1654628.568759] [<ffffffff81265fb5>] ? __handle_sysrq+0xaf/0x14b [1654628.574921] [<ffffffff8125f057>] ? kbd_event+0x35e/0x616 [1654628.580736] [<ffffffff814bdbd4>] ? _spin_unlock_irqrestore+0xc/0xd [1654628.587418] [<ffffffff8136ec5f>] ? input_pass_event+0x5a/0x7b [1654628.593666] [<ffffffff81370b8e>] ? input_event+0x5c/0x82 [1654628.599482] [<ffffffff8139fb34>] ? hidinput_hid_event+0x291/0x2c1 [1654628.606077] [<ffffffff8139c992>] ? hid_process_event+0xdd/0x121 [1654628.612500] [<ffffffff8139cd83>] ? hid_report_raw_event+0x3ad/0x42f [1654628.619268] [<ffffffff8139cdf1>] ? hid_report_raw_event+0x41b/0x42f [1654628.626045] [<ffffffff8139d040>] ? hid_input_report+0x23b/0x260 [1654628.632469] [<ffffffff813a5042>] ? hid_irq_in+0xad/0x212 [1654628.638284] [<ffffffff81344b34>] ? 
usb_hcd_giveback_urb+0x76/0xa9 [1654628.644879] [<ffffffff813632d6>] ? uhci_giveback_urb+0x10c/0x226 [1654628.651387] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.658243] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654628.664145] [<ffffffff81363b71>] ? uhci_scan_schedule+0x677/0x937 [1654628.670740] [<ffffffff81366284>] ? uhci_irq+0x8c0/0x8dd [1654628.676468] [<ffffffff810550bf>] ? hrtimer_interrupt+0xe3/0x18c [1654628.682891] [<ffffffff8100ee1e>] ? xen_vcpuop_set_next_event+0x0/0x60 [1654628.689834] [<ffffffff8100ee1e>] ? xen_vcpuop_set_next_event+0x0/0x60 [1654628.696774] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.703629] [<ffffffff8100ef02>] ? check_events+0x12/0x20 [1654628.709532] [<ffffffff8100ee1e>] ? xen_vcpuop_set_next_event+0x0/0x60 [1654628.716487] [<ffffffff8100e871>] ? xen_force_evtchn_callback+0x9/0xa [1654628.723348] [<ffffffff81344605>] ? usb_hcd_irq+0x39/0x7e [1654628.729163] [<ffffffff81077993>] ? handle_IRQ_event+0x2b/0xbd [1654628.735412] [<ffffffff810791cf>] ? handle_fasteoi_irq+0x78/0xaf [1654628.741834] [<ffffffff8123b382>] ? __xen_evtchn_do_upcall+0x1d0/0x28d [1654628.748776] [<ffffffff8123bc9c>] ? xen_evtchn_do_upcall+0x2e/0x42 [1654628.755371] [<ffffffff81012b7e>] ? xen_do_hypervisor_callback+0x1e/0x30 [1654628.762485] <EOI> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 [1654628.769428] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001 [1654628.775764] [<ffffffff8100e8ef>] ? xen_safe_halt+0xc/0x15 [1654628.781667] [<ffffffff81017ec7>] ? default_idle+0x21/0x3d [1654628.787568] [<ffffffff81010dd7>] ? cpu_idle+0x59/0x91 [1654628.793124] [<ffffffff81728c8c>] ? start_kernel+0x381/0x38d [1654628.799199] [<ffffffff8172bde1>] ? xen_start_kernel+0x5aa/0x5b0 (XEN) [2012-01-04 22:50:14] Domain 0 crashed: rebooting machine in 5 seconds.> > Thanks and sorry for taking so long. Just coming back from holidays. >One of the ''blktap'' machines is still running tests. It is running Linux 2.6.32.48 and Xen 4.1.3-rc1. By now it has probably ran a 5000 VMs and it has been up for about 20 days. Before it wouldn''t survive ~1000 VMs and 5 days. It would then get into that very, very slow state. My feeling is that Linux 2.6.32.48 had some fixes which the problems I was having. Before I ran Xen 4.1.3-rc1 in combination with Linux 2.6.32.37 on it and it had the slowness problem. It would be nice if we can get that other bug fixed as well. It could be around in 3.x kernels as well. I would love to move to Linux 3.2, but it requires a lot of testing which will take 1-2 months. At least 2.6.32.48 feels more stable and we may be able to fix a few more bugs in it. Thanks for your help! Roderick
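(Coming back to the gntdev_release question above - whether two processes opening the device could end up sharing the same flip: in the normal character-device pattern every open() produces its own struct file and therefore its own private_data, and ->release() only runs once the last reference to that struct file goes away; fork() and dup() just bump the reference count. The sketch below illustrates that pattern with a made-up misc device; it is not the actual gntdev source.)

/* Hypothetical "demo" misc device, only to illustrate the per-open
 * private_data / release-on-last-fput pattern discussed above. */
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct demo_priv {
    spinlock_t lock;
    int state;                  /* stand-in for per-open state (e.g. grant maps) */
};

static int demo_open(struct inode *inode, struct file *filp)
{
    /* Each open() gets a fresh struct file, so each open() also gets
     * its own priv here; two openers never share this pointer. */
    struct demo_priv *priv = kzalloc(sizeof(*priv), GFP_KERNEL);

    if (!priv)
        return -ENOMEM;
    spin_lock_init(&priv->lock);
    filp->private_data = priv;
    return 0;
}

static int demo_release(struct inode *inode, struct file *filp)
{
    /* Runs exactly once, when the last reference to this struct file
     * is dropped; duplicated descriptors only delay this point. */
    struct demo_priv *priv = filp->private_data;

    spin_lock(&priv->lock);
    /* tear down per-open state here */
    spin_unlock(&priv->lock);
    kfree(priv);
    return 0;
}

static const struct file_operations demo_fops = {
    .owner   = THIS_MODULE,
    .open    = demo_open,
    .release = demo_release,
};

static struct miscdevice demo_dev = {
    .minor = MISC_DYNAMIC_MINOR,
    .name  = "demo",
    .fops  = &demo_fops,
};

static int __init demo_init(void)
{
    return misc_register(&demo_dev);
}

static void __exit demo_exit(void)
{
    misc_deregister(&demo_dev);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");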