This is against current x86_64 defconfig build:

e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH                  <2b>
  TDT                  <31>
  next_to_use          <31>
  next_to_clean        <2b>
buffer_info[next_to_clean]
  time_stamp           <10004d5f2>
  next_to_watch        <2d>
  jiffies              <10004d7ce>
  next_to_watch.status <0>

... repeat until eventually ...

NETDEV WATCHDOG: eth0: transmit timed out

This is on a simple scp to dom0 from an external box. After a bit the watchdog resets and ping works, only to repeat itself when I try to scp again.

thanks,
-chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Ah, finally people experiencing the same bug as me!

It is much, much worse for me: as soon as I ping from domU the network goes down. It started with the subarch change.

Adam Wendt
IPCoast, Inc.

On Wed, 8 Feb 2006 12:11, Chris Wright <chrisw@sous-sol.org> sent:
>* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
>> Yep, this is the bug I warned y'all about at the summit, but you asked
>> for the code to be checked in anyway...
>
>Hehe, get what you ask for...
>
>> A bug shared is a bug fixed quicker? :-)
>
>Let's hope ;-)
>
>> For us, this only manifests on x86_64, and arrived with the subarch xen
>> version of 2.6.12. Extensive inspection of the arch->subarch conversion
>> suggests that nothing should have changed, so this is likely a latent
>> bug being triggered by slight timing changes.
>>
>> It sounds like it's rather easier for you to trigger than it was for us
>> -- we had to run xm-test several times to get it to happen. Happy
>> hunting, and good luck :-)
>
>It's trivial for me to trigger. I'll keep poking at it.
>
>thanks,
>-chris
> This is against current x86_64 defconfig build:
>
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue             <0>
>   TDH                  <2b>
>   TDT                  <31>
>   next_to_use          <31>
>   next_to_clean        <2b>
> buffer_info[next_to_clean]
>   time_stamp           <10004d5f2>
>   next_to_watch        <2d>
>   jiffies              <10004d7ce>
>   next_to_watch.status <0>
>
> ... repeat until eventually ...
>
> NETDEV WATCHDOG: eth0: transmit timed out
>
> this is on simple scp to dom0 from external box. after a bit
> watchdog resets, and ping works, only to repeat itself when a
> try to scp again

Yep, this is the bug I warned y'all about at the summit, but you asked for the code to be checked in anyway...

A bug shared is a bug fixed quicker? :-)

For us, this only manifests on x86_64, and arrived with the subarch xen version of 2.6.12. Extensive inspection of the arch->subarch conversion suggests that nothing should have changed, so this is likely a latent bug being triggered by slight timing changes.

It sounds like it's rather easier for you to trigger than it was for us -- we had to run xm-test several times to get it to happen. Happy hunting, and good luck :-)

Ian
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
> Yep, this is the bug I warned y'all about at the summit, but you asked
> for the code to be checked in anyway...

Hehe, get what you ask for...

> A bug shared is a bug fixed quicker? :-)

Let's hope ;-)

> For us, this only manifests on x86_64, and arrived with the subarch xen
> version of 2.6.12. Extensive inspection of the arch->subarch conversion
> suggests that nothing should have changed, so this is likely a latent
> bug being triggered by slight timing changes.
>
> It sounds like it's rather easier for you to trigger than it was for us
> -- we had to run xm-test several times to get it to happen. Happy
> hunting, and good luck :-)

It's trivial for me to trigger. I'll keep poking at it.

thanks,
-chris
> Ah, finally people experiencing same bug as me!
>
> It is much much worse for me, as soon as I ping from domU
> network goes down, started with the subarch change.

For this bug it might actually be helpful to start collecting information about the hardware it's observed on.

For us, the bug is hard to repro, despite us having tried on several different machines (2 and 4 way SMP, Opteron and Xeon, tg3 and e1000 NICs).

If the bug is easier to trigger for you, please post a summary of the hardware and anything unusual about your config (i.e. not default bridged).

Thanks,
Ian
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
> For us, the bug is hard to repro, despite us having tried on several
> different machines (2 and 4 way SMP, Opteron and Xeon, tg3 and e1000
> NICs).

xeon, 4 cpu (2-ht), e1000, 4G

- works fine w/ 32-bit
- dom0 is UP (SMP fails as well)
- this is dom0 only, no xend, no domUs, no bridging
- limiting to 2G works fine, sounds like something with swiotlb

Also, while it was working, I blasted with packets, and eventually got:

irq 19: nobody cared (try booting with the "irqpoll" option)

Call Trace: <IRQ> <ffffffff80148508>{__report_bad_irq+56}
       <ffffffff80148721>{note_interrupt+449} <ffffffff80147dcc>{handle_IRQ_event+76}
       <ffffffff80147ec2>{__do_IRQ+162} <ffffffff8011077b>{do_IRQ+75}
       <ffffffff802f16b5>{evtchn_do_upcall+117} <ffffffff8010e5f1>{do_hypervisor_callback+37}
       <ffffffff8011ccc5>{ia32_syscall+13} <ffffffff8010a22a>{hypercall_page+554}
       <ffffffff8010a22a>{hypercall_page+554} <ffffffff802f14de>{force_evtchn_callback+14}
       <ffffffff80147db5>{handle_IRQ_event+53} <ffffffff80147ea8>{__do_IRQ+136}
       <ffffffff8011077b>{do_IRQ+75} <ffffffff802f16b5>{evtchn_do_upcall+117}
       <ffffffff8010e5f1>{do_hypervisor_callback+37}  <EOI>
       <ffffffff8011ccc5>{ia32_syscall+13}
handlers:
[<ffffffff80377b80>] (ata_interrupt+0x0/0x1b0)
[<ffffffff80396570>] (usb_hcd_irq+0x0/0x70)
Disabling IRQ #19

thanks,
-chris
> xeon, 4 cpu (2-ht), e1000, 4G
>
> - works fine w/ 32-bit
> - dom0 is UP (SMP fails as well)
> - this is dom0 only, no xend, no domUs, no bridging
> - limiting to 2G works fine, sounds like something with swiotlb

That's interesting, but I'd be surprised if it was an swiotlb thing -- it looks so much more like an interrupt problem. e1000 and tg3 shouldn't be going anywhere near swiotlb anyhow.

Please can you try a PAE kernel just to check you don't have the problem.

> Also, while it was working, I blasted with packets, and
> eventually got:
>
> irq 19: nobody cared (try booting with the "irqpoll" option)

What devices are on irq 19?

It might be worth trying booting nousb on the kernel command line (or usb-handoff)

Thanks,
Ian
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
> > xeon, 4 cpu (2-ht), e1000, 4G
> >
> > - works fine w/ 32-bit
> > - dom0 is UP (SMP fails as well)
> > - this is dom0 only, no xend, no domUs, no bridging
> > - limiting to 2G works fine, sounds like something with swiotlb
>
> That's interesting, but I'd be surprised if it was an swiotlb thing --
> it looks so much more like an interrupt problem. e1000 and tg3 shouldn't
> be going anywhere near swiotlb anyhow.
>
> Please can you try a PAE kernel just to check you don't have the
> problem.

It's 64-bit.
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:

whoops, missed this part.

> What devices are on irq 19?
>
> It might be worth trying booting nousb on the kernel command line (or
> usb-handoff)

 19:       5748     Phys-irq  libata, uhci_hcd:usb3

with ata, that effectively killed the box. trying with nousb, but i wonder if it's not an evtchn problem?

thanks,
-chris
> > That's interesting, but I'd be surprised if it was an swiotlb thing --
> > it looks so much more like an interrupt problem. e1000 and tg3
> > shouldn't be going anywhere near swiotlb anyhow.
> >
> > Please can you try a PAE kernel just to check you don't have the
> > problem.
>
> It's 64-bit.

Yep, but I'm wondering whether it's worth trying a PAE kernel as that might give a datapoint to indicate whether swiotlb might be involved.

My money is still on an interrupt problem (virtual or otherwise), though.

Ian
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
> > > That's interesting, but I'd be surprised if it was an swiotlb thing --
> > > it looks so much more like an interrupt problem. e1000 and tg3
> > > shouldn't be going anywhere near swiotlb anyhow.
> > >
> > > Please can you try a PAE kernel just to check you don't have the
> > > problem.
> >
> > It's 64-bit.
>
> Yep, but I'm wondering whether it's worth trying a PAE kernel as that
> might give a datapoint to indicate whether swiotlb might be involved.

Yeah, sorry, I was confused at first. I'm building PAE atm.

thanks,
-chris
> > What devices are on irq 19?
> >
> > It might be worth trying booting nousb on the kernel command line (or
> > usb-handoff)
>
>  19:       5748     Phys-irq  libata, uhci_hcd:usb3
>
> with ata, that effectively killed the box. trying with
> nousb, but i wonder if it's not evntchn problem?

Something else to try might be booting with maxcpus=1 on the xen command line, but if you're running just a uniproc dom0 this really ought not to make any difference.

When the box is in a bad state, it might be worth using the serial debug keys to get some information about the ioapic and event channels.

I'm glad you've got an easy way of repro'ing this. I've just tried again on a couple of our machines and it took me ages to trigger.

Thanks,
Ian
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
> Yep, but I'm wondering whether it's worth trying a PAE kernel as that
> might give a datapoint to indicate whether swiotlb might be involved.

OK, PAE works fine.

thanks,
-chris
Christian Leber
2006-Feb-09 15:02 UTC
Re: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang
On Wed, Feb 08, 2006 at 11:36:06PM -0000, Ian Pratt wrote:
> For this bug it might actually be helpful to start collecting
> information about the hardware it's observed on.
>
> For us, the bug is hard to repro, despite us having tried on several
> different machines (2 and 4 way SMP, Opteron and Xeon, tg3 and e1000
> NICs).

It's not on Xen, but I get something similar with scp (and this Tx Unit Hang seems to be a rare problem):

(2.6.15)
[4294726.019000] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
[4294726.019000]   TDH                  <cb>
[4294726.019000]   TDT                  <cb>
[4294726.019000]   next_to_use          <cb>
[4294726.019000]   next_to_clean        <df>
[4294726.019000] buffer_info[next_to_clean]
[4294726.019000]   dma                  <1aa25cce>
[4294726.019000]   time_stamp           <fffc40e7>
[4294726.019000]   next_to_watch        <df>
[4294726.019000]   jiffies              <fffc5183>
[4294726.019000]   next_to_watch.status <0>

That happens on AthlonXP+ViaKT600 but not on Intel PIII with Intel 815 chipset.

https://launchpad.net/distros/ubuntu/+source/linux-source-2.6.15/+bug/30476

Christian Leber

--
"Omnis enim res, quae dando non deficit, dum habetur et non datur,
nondum habetur, quomodo habenda est." (Aurelius Augustinus)
Translation: <http://gnuhh.org/work/fsf-europe/augustinus.html>
* Christian Leber (christian@leber.de) wrote:
> That happens on AthlonXP+ViaKT600 but not on Intel PIII with Intel 815
> chipset.
> https://launchpad.net/distros/ubuntu/+source/linux-source-2.6.15/+bug/30476

Does that have >=2.6.15.2 patchset?

thanks,
-chris
Christian Leber
2006-Feb-09 23:29 UTC
Re: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang
On Thu, Feb 09, 2006 at 09:24:57AM -0800, Chris Wright wrote:
> > That happens on AthlonXP+ViaKT600 but not on Intel PIII with Intel 815
> > chipset.
> > https://launchpad.net/distros/ubuntu/+source/linux-source-2.6.15/+bug/30476
>
> Does that have >=2.6.15.2 patchset?

No, but it's >=2.6.15.1 and the 2.6.15.2 changelog doesn't seem to be related to ethernet drivers.

I tried also 2.6.16-rc2 and it has the same problem.

Christian Leber
Kamble, Nitin A
2006-Feb-09 23:55 UTC
RE: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang
> - limiting to 2G works fine, sounds like something with swiotlb

I noticed it too, and exactly the same. I also notice this in the dom0 dmesg:

PCI-DMA: Disabling IOMMU.
WARNING more than 4GB of memory but IOMMU not compiled in.
WARNING 32bit PCI may malfunction.
You might want to enable CONFIG_GART_IOMMU
Memory: 5868412k/6071120k available (3553k kernel code, 202040k reserved, 1376k data, 300k init)

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corp
On 9 Feb 2006, at 23:55, Kamble, Nitin A wrote:
>> - limiting to 2G works fine, sounds like something with swiotlb
>
> I noticed it too and exactly same. I also notice this in the dom0
> dmesg.
>
> PCI-DMA: Disabling IOMMU.
> WARNING more than 4GB of memory but IOMMU not compiled in.
> WARNING 32bit PCI may malfunction.
> You might want to enable CONFIG_GART_IOMMU
> Memory: 5868412k/6071120k available (3553k kernel code, 202040k
> reserved, 1376k data, 300k init)

That is harmless. In fact our SWIOTLB probably is enabled (look at the lines just above the ones you posted). It's because we don't properly (yet) respect the new plug-n-play dma_ops structures in x86_64.

I've checked in a temporary fix to remove the above misleading lines.

 -- Keir
Muli Ben-Yehuda
2006-Feb-10 11:33 UTC
Re: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang
On Fri, Feb 10, 2006 at 11:20:50AM +0000, Keir Fraser wrote:
> That is harmless. In fact our SWIOTLB probably is enabled (look at the
> lines just above the ones you posted). It's because we don't properly
> (yet) respect the new plug-n-play dma_ops structures in x86_64.
>
> I've checked in a temporary fix to remove the above misleading
> lines.

There was also a harmless bug in the initial dma_ops patch that caused the wrong printk in some cases. Jon Mason submitted a fix that is in mainline now. Not sure if this is the case here, but FYI.

Cheers,
Muli
--
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/
* Kamble, Nitin A (nitin.a.kamble@intel.com) wrote:
> > - limiting to 2G works fine, sounds like something with swiotlb
>
> I noticed it too and exactly same. I also notice this in the dom0 dmesg.

After spending hours trying to find something -- anything -- wrong with irq delivery and e1000 hung tx unit, I went back to my original hunch, which was swiotlb related. When TSO is enabled, some debugging showed this:

swiotlb_map_page: returns d586a000
dma_map_page: returns ffffffffd586a000

Indeed.

     a43:       e8 00 00 00 00          callq  a48 <dma_map_page+0xc8>
                        a44: R_X86_64_PC32      swiotlb_map_page+0xfffffffffffffffc
     a48:       48 63 d8                movslq %eax,%rbx

Whoops. Prototype mismatch. And had we been paying attention:

/home/chrisw/hg/xen/xen-unstable/linux-2.6.16-rc2-xen0/arch/x86_64/kernel/../../i386/kernel/pci-dma-xen.c:107: warning: implicit declaration of function ‘swiotlb_map_page’
/home/chrisw/hg/xen/xen-unstable/linux-2.6.16-rc2-xen0/arch/x86_64/kernel/../../i386/kernel/pci-dma-xen.c: In function ‘dma_unmap_page’:
/home/chrisw/hg/xen/xen-unstable/linux-2.6.16-rc2-xen0/arch/x86_64/kernel/../../i386/kernel/pci-dma-xen.c:125: warning: implicit declaration of function ‘swiotlb_unmap_page’

Here's a quick patch that fixes the issue (not ready to apply to -unstable, since it's a file that's not in sparse tree). Nitin, this should fix your problem as well. I'll work on a proper patch later this evening or tomorrow morning.

thanks,
-chris
--

--- linux-2.6.16-rc2/include/asm-x86_64/swiotlb.h	2006-02-15 21:42:24.000000000 -0500
+++ linux-2.6.16-rc2-xen0/include/asm-x86_64/swiotlb.h	2006-02-15 21:19:15.000000000 -0500
@@ -38,6 +38,11 @@
 extern void swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg,
 			 int nents, int direction);
 extern int swiotlb_dma_mapping_error(dma_addr_t dma_addr);
+extern dma_addr_t swiotlb_map_page(struct device *hwdev, struct page *page,
+                                   unsigned long offset, size_t size,
+                                   enum dma_data_direction direction);
+extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
+                               size_t size, enum dma_data_direction direction);
 extern void swiotlb_free_coherent (struct device *hwdev, size_t size,
 				   void *vaddr, dma_addr_t dma_handle);
 extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
On 16 Feb 2006, at 03:07, Chris Wright wrote:
> Here's a quick patch that fixes the issue (not ready to apply to
> -unstable, since it's a file that's not in sparse tree). Nitin, this
> should fix your problem as well. I'll work on a proper patch later
> this evening or tomorrow morning.

Thanks for tracking this one down: it's been our major outstanding bug for a while now. We checked in a suitable fix to -unstable (change pci-dma-xen.c to explicitly include the asm-i386/mach-xen version of swiotlb.h).

 -- Keir
>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 16.02.06 12:36:29 >>>
>
>On 16 Feb 2006, at 03:07, Chris Wright wrote:
>
>> Here's a quick patch that fixes the issue (not ready to apply to
>> -unstable, since it's a file that's not in sparse tree). Nitin, this
>> should fix your problem as well. I'll work on a proper patch later
>> this evening or tomorrow morning.
>
>Thanks for tracking this one down: it's been our major outstanding bug
>for a while now. We checked in a suitable fix to -unstable (change
>pci-dma-xen.c to explicitly include the asm-i386/mach-xen version of
>swiotlb.h).

This doesn't sound like a good thing to do, as that way all but this one file will include the x86-64 version of it, and you can easily get things out of sync (if e.g. the x86-64 version changes). I would much favor the change being done as originally posted; we have a similar fix in our tree.

Jan
Guillaume Thouvenin
2006-Feb-16 13:10 UTC
Re: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang
On Wed, 15 Feb 2006 19:07:15 -0800 Chris Wright <chrisw@sous-sol.org> wrote:
>
> --- linux-2.6.16-rc2/include/asm-x86_64/swiotlb.h	2006-02-15 21:42:24.000000000 -0500
> +++ linux-2.6.16-rc2-xen0/include/asm-x86_64/swiotlb.h	2006-02-15 21:19:15.000000000 -0500
> @@ -38,6 +38,11 @@
>  extern void swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sg,
>  			 int nents, int direction);
>  extern int swiotlb_dma_mapping_error(dma_addr_t dma_addr);
> +extern dma_addr_t swiotlb_map_page(struct device *hwdev, struct page *page,
> +                                   unsigned long offset, size_t size,
> +                                   enum dma_data_direction direction);
> +extern void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dma_address,
> +                               size_t size, enum dma_data_direction direction);
>  extern void swiotlb_free_coherent (struct device *hwdev, size_t size,
>  				   void *vaddr, dma_addr_t dma_handle);
>  extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);

The patch fixes the problem of the tx hang and it also fixes another problem on my box. With the xen unstable (changeset 8833), I couldn't open a ssh connection on the domain 0 until I ran the xend daemon (I don't know why running the xend daemon allows the connection). With the patch, I can open a ssh connection as soon as the ssh daemon is running on domain0.

Just a remark, if I enable PAE, it doesn't solve the problem of the tx hang on my computer which is an Intel Xeon (1 CPU) with hyper-threading enabled. I'm using a debian distribution.

thanks,
Guillaume
On 16 Feb 2006, at 11:45, Jan Beulich wrote:
>> Thanks for tracking this one down: it's been our major outstanding bug
>> for a while now. We checked in a suitable fix to -unstable (change
>> pci-dma-xen.c to explicitly include the asm-i386/mach-xen version of
>> swiotlb.h).
>
> This doesn't sound like a good thing to do, as that way all but this
> one file will include the x86-64 version of it, and you can easily get
> things out of sync (if e.g. the x86-64 version changes). I would much
> favor the change being done as originally posted; we have a similar
> fix in our tree.

In our tree, pci-dma-xen.c is the only file that uses the core swiotlb functions. Since it's an i386 file linked against our xen-i386 swiotlb, it seems to make sense for it to include explicitly the i386 swiotlb header file.

The best fix of course is to merge the swiotlbs: maybe by incrementally modifying the xen-specific one to get it closer to the generic swiotlb code.

 -- Keir
On 16 Feb 2006, at 13:10, Guillaume Thouvenin wrote:
> Just a remark, if I enable PAE, it doesn't solve the problem of the tx
> hang on my computer which is an Intel Xeon (1 CPU) with hyper-threading
> enabled. I'm using a debian distribution.

Does this go away if you specify 'mem=2G' as a Xen boot parameter?

 -- Keir
Guillaume Thouvenin
2006-Feb-17 07:26 UTC
Re: [Xen-devel] x86_64 eth0 e1000_clean_tx_irq tx hang
On Thu, 16 Feb 2006 13:55:18 +0000 Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> On 16 Feb 2006, at 13:10, Guillaume Thouvenin wrote:
>
> > Just a remark, if I enable PAE, it doesn't solve the problem of the tx
> > hang on my computer which is an Intel Xeon (1 CPU) with hyper-threading
> > enabled. I'm using a debian distribution.
>
> Does this go away if you specify 'mem=2G' as a Xen boot parameter?

Yes, it goes away.

Guillaume