thr3ads.net - Xen devel - [Xen-devel] DMA trouble with current xen-sparse [Oct 2005]

Stephen C. Tweedie

2005-Oct-28 19:21 UTC

[Xen-devel] DMA trouble with current xen-sparse

Hi,

I've been trying to get current xen-sparse up and running on a 2-cpu box
and have had a number of problems.  One has been that networking is
completely unstable: I get kernel panics under the slightest network
load.

The trouble is that this is a 1G box, so its memory is not large enough
to automatically enable the swiotlb.  (arch/xen/i386/kernel/swiotlb.c
enables swiotlb automatically for dom0 only if there's at least 2G of
memory.)  And the first time we get a pci_map_single() request for a
dom0-contiguous region which crosses a page boundary, we hit the BUG_ON
at arch/xen/i386/kernel/pci-dma.c:270 due to dma_map_single() checking:

		IOMMU_BUG_ON(range_straddles_page_boundary(ptr, size));

And this happens *instantly* on any loaded tcp connection on my e1000
NIC.  All I need to do to kill the box is to ssh in and type "find\n".
Instant dom0 death after the ssh client receives about a dozen lines of
output.  The stack trace is appended below.
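
For readers who want to see what the failing check amounts to, here is a minimal user-space sketch. It is not copied from the Xen tree: the function name, the `SKETCH_PAGE_SIZE` constant and the 4 KB page size are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KB pages, as on i386. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * User-space stand-in for the kernel check: true when the byte range
 * [addr, addr + size) does not fit inside a single page.
 */
bool sketch_straddles_page_boundary(uintptr_t addr, size_t size)
{
	/* Offset of the first byte within its page. */
	uintptr_t offset = addr & (SKETCH_PAGE_SIZE - 1);

	return offset + size > SKETCH_PAGE_SIZE;
}
```

Under these assumptions, a 1514-byte buffer starting 2048 bytes into a page passes the check, while the same buffer starting 3072 bytes into a page fails it.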

The PCI mapping documentation certainly says that pci_map_single() needs
to be able to map a single region, not just a single page.  If it can't,
then I suspect we really need to enable swiotlb by default, because
we'll just be unstable without it.

The kernel panics after this with "Fatal DMA error! Please use
'swiotlb=force'".  But of course the default for Xen is to instantly
reboot at this point before the error is visible.  And even after
catching the message with serial console, I found that "swiotlb=force"
*also* dies on this box, with

(XEN) (file=memory.c, line=57) Could not allocate order=14 extent: id=0 flags=0 (0 of 1)
kernel BUG at arch/xen/i386/mm/hypervisor.c:354 (xen_create_contiguous_region)!
 [<c011a77d>] xen_create_contiguous_region+0x26d/0x2b0
 [<c0112596>] swiotlb_init_with_default_size+0x86/0x1c0
 [<c0112735>] swiotlb_init+0x65/0xa0

because we don't have a large enough zone at boot time to create the
64MB swiotlb.
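
The "order=14" in the Xen message lines up with the 64MB figure: an order-n extent is 2^n contiguous pages, and 2^14 pages of 4 KB each is 64 MB. A small sketch of that arithmetic follows; the helper name and the 4 KB page size are assumptions for illustration, not anything taken from the Xen or Linux sources.

```c
#include <stddef.h>

/* Assumption: 4 KB pages, as on i386. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= bytes. */
unsigned int sketch_order_for_bytes(size_t bytes)
{
	unsigned int order = 0;

	while ((SKETCH_PAGE_SIZE << order) < bytes)
		order++;
	return order;
}
```

With these assumptions a 64 MB request works out to order 14. An 8 MB buffer would correspond to order 11 if it were requested as a single extent, which is a far smaller contiguous region to find at boot.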

Booting with "swiotlb=force swiotlb=8m" works around both of these bugs
and allows me to boot; fortunately things are much more stable after I
get this far.

Cheers,
 Stephen

---

kernel BUG at arch/xen/i386/kernel/pci-dma.c:270 (dma_map_single)!
 [<c010ecd6>] dma_map_single+0xf6/0x160
 [<f49cd40b>] e1000_xmit_frame+0x40b/0xd30 [e1000]
 [<c0313510>] qdisc_restart+0x100/0x2f0
 [<c03241d0>] ip_finish_output2+0x0/0x250
 [<c030d594>] nf_hook_slow+0x64/0x110
 [<c03010ff>] dev_queue_xmit+0x9f/0x340
 [<c032404c>] ip_finish_output+0x15c/0x2e0
 [<c03241d0>] ip_finish_output2+0x0/0x250
 [<c0324947>] ip_queue_xmit+0x2b7/0x560
 [<c0323ec0>] dst_output+0x0/0x30
 [<c0155bf2>] poison_obj+0x32/0x60
 [<c0155408>] dbg_redzone1+0x18/0x60
 [<c0155e06>] check_poison_obj+0x26/0x1c0
 [<c0155bf2>] poison_obj+0x32/0x60
 [<c0155408>] dbg_redzone1+0x18/0x60
 [<c0157dbc>] cache_alloc_debugcheck_after+0x4c/0x1b0
 [<c0336e24>] tcp_transmit_skb+0x3d4/0x810
 [<c02fab10>] skb_clone+0x20/0x1d0
 [<c0337efd>] tcp_write_xmit+0x10d/0x330
 [<c0334943>] __tcp_data_snd_check+0xa3/0xe0
 [<c02fa961>] kfree_skbmem+0x21/0x30
 [<c0335069>] tcp_rcv_established+0x2a9/0x910
 [<f4b3f036>] ipt_hook+0x36/0x40 [iptable_filter]
 [<c033ef5a>] tcp_v4_do_rcv+0xfa/0x150
 [<c033f8d5>] tcp_v4_rcv+0x925/0x980
 [<c030d594>] nf_hook_slow+0x64/0x110
 [<c03208d0>] ip_local_deliver_finish+0x0/0x270
 [<c03206bc>] ip_local_deliver+0xdc/0x2f0
 [<c03208d0>] ip_local_deliver_finish+0x0/0x270
 [<c0320f0e>] ip_rcv+0x3ce/0x5b0
 [<c03210f0>] ip_rcv_finish+0x0/0x320
 [<c0301be0>] netif_receive_skb+0x250/0x310
 [<f49cf3ae>] e1000_clean_rx_irq+0x13e/0x5d0 [e1000]
 [<f49ce8a2>] e1000_clean+0x52/0x1c0 [e1000]
 [<c0301f2c>] net_rx_action+0xdc/0x220
 [<c0128f4a>] __do_softirq+0x8a/0x120
 [<c012905d>] do_softirq+0x7d/0x80
 [<c010ee22>] do_IRQ+0x22/0x30
 [<c01049be>] evtchn_do_upcall+0x9e/0xe0
 [<c010a2f0>] hypervisor_callback+0x2c/0x34
 [<c0107b30>] xen_idle+0x40/0x80
 [<c0107bd4>] cpu_idle+0x64/0xb0
 [<c0436a4f>] start_kernel+0x1af/0x210
 [<c0436380>] unknown_bootoption+0x0/0x220



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2005-Oct-29 08:55 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On 28 Oct 2005, at 20:21, Stephen C. Tweedie wrote:
> The trouble is that this is a 1G box, so its memory is not large enough
> to automatically enable the swiotlb.  (arch/xen/i386/kernel/swiotlb.c
> enables swiotlb automatically for dom0 only if there's at least 2G of
> memory.)  And the first time we get a pci_map_single() request for a
> dom0-contiguous region which crosses a page boundary, we hit the BUG_ON
> at arch/xen/i386/kernel/pci-dma.c:270 due to dma_map_single() checking:
>
> 		IOMMU_BUG_ON(range_straddles_page_boundary(ptr, size));
>
> And this happens *instantly* on any loaded tcp connection on my e1000
> NIC.  All I need to do to kill the box is to ssh in and type "find\n".
> Instant dom0 death after the ssh client receives about a dozen lines of
> output.  The stack trace is appended below.

Is the network interface set up to use jumbo frames? Otherwise I
wouldn't expect alloc_skb() to allocate a data area that straddles a
page boundary, since the allocation will come from one of the
sub-page-sized power-of-two kmem caches.

If the problem is jumbo frames, we might need to add a hook to 
alloc_skb(). Using swiotlb will suck hugely.

  -- Keir


Muli Ben-Yehuda

2005-Oct-30 09:52 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Fri, Oct 28, 2005 at 03:21:20PM -0400, Stephen C. Tweedie wrote:
> Hi,
>
> I've been trying to get current xen-sparse up and running on a 2-cpu box
> and have had a number of problems.  One has been that networking is
> completely unstable: I get kernel panics under the slightest network
> load.

FYI, I opened bugzilla #373 to track this issue.

Cheers,
Muli
-- 
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/


Ian Pratt

2005-Nov-02 15:32 UTC

RE: [Xen-devel] DMA trouble with current xen-sparse

> On 28 Oct 2005, at 20:21, Stephen C. Tweedie wrote:
>
> > The trouble is that this is a 1G box, so its memory is not large
> > enough to automatically enable the swiotlb.
> > (arch/xen/i386/kernel/swiotlb.c enables swiotlb automatically for
> > dom0 only if there's at least 2G of memory.)  And the first time we
> > get a pci_map_single() request for a dom0-contiguous region which
> > crosses a page boundary, we hit the BUG_ON at
> > arch/xen/i386/kernel/pci-dma.c:270 due to dma_map_single() checking:

Does your card support TSO? What revision e1000 is it?

Please can you try turning it off with:
  ethtool -K eth0 tso off

If TSO is the problem we'll come up with a better fix than using
swiotlb.


Thanks,
Ian

Stephen Tweedie

2005-Nov-02 15:36 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

Hi,

On Wed, Nov 02, 2005 at 03:32:58PM -0000, Ian Pratt wrote:

> Does your card support TSO? What revision e1000 is it?

Yes, and I'll check on Friday once I'm back from travelling (but it is
a very recent box.)

> Please can you try turning it off with:
>   ethtool -K eth0 tso off

I already tried that and it did not help.  I've also tried both gcc32
and gcc4 with no success.

Cheers,
 Stephen

Daniel Veillard

2005-Nov-02 15:59 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Wed, Nov 02, 2005 at 10:36:17AM -0500, Stephen Tweedie wrote:
> Hi,
>
> On Wed, Nov 02, 2005 at 03:32:58PM -0000, Ian Pratt wrote:
>
> > Does your card support TSO? What revision e1000 is it?
>
> Yes, and I'll check on Friday once I'm back from travelling (but it is
> a very recent box.)

  I am seeing the exact same problem with my Dell Latitude D800 laptop using
Ethernet controller: Broadcom Corporation NetXtreme BCM5705M Gigabit Ethernet
(rev 01)
This is a relatively common and not so recent configuration.

> > Please can you try turning it off with:
> >   ethtool -K eth0 tso off
>
> I already tried that and it did not help.  I've also tried both gcc32
> and gcc4 with no success.

[root@localhost ~]# ethtool -K eth0 tso off
Cannot set device tcp segmentation offload settings: Operation not supported

  too bad ...
  with 'swiotlb=force swiotlb=8m' kernel parameters the box is stable,
without it very basic network access can crash it (say 'locate lib' over ssh)
and then the whole system reboots.

  100% reproducible for me, and without crazy hardware :-)

   Hope this helps,

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Keir Fraser

2005-Nov-02 17:12 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On 2 Nov 2005, at 15:59, Daniel Veillard wrote:
> [root@localhost ~]# ethtool -K eth0 tso off
> Cannot set device tcp segmentation offload settings: Operation not
> supported
>
>   too bad ...
>   with 'swiotlb=force swiotlb=8m' kernel parameters the box is stable,
> without it very basic network access can crash it (say 'locate lib'
> over ssh)
> and then the whole system reboots.
>
>   100% reproducible for me, and without crazy hardware :-)

It'd be interesting to know what form of skbuffs get sent to the driver
when this happens. e.g., how big is the skbuff data area, is the skbuff
fragmented, etc.

  -- Keir


Daniel Veillard

2005-Nov-02 23:04 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Wed, Nov 02, 2005 at 05:12:27PM +0000, Keir Fraser wrote:
>
> On 2 Nov 2005, at 15:59, Daniel Veillard wrote:
>
> >[root@localhost ~]# ethtool -K eth0 tso off
> >Cannot set device tcp segmentation offload settings: Operation not
> >supported
> >
> >  too bad ...
> >  with 'swiotlb=force swiotlb=8m' kernel parameters the box is stable,
> >without it very basic network access can crash it (say 'locate lib'
> >over ssh)
> >and then the whole system reboots.
> >
> >  100% reproducible for me, and without crazy hardware :-)
>
> It'd be interesting to know what form of skbuffs get sent to the driver
> when this happens. e.g., how big is the skbuff data area, is the skbuff
> fragmented, etc.

  I'm not a kernel hacker, but if you give me a patch displaying that
information at the IOMMU_BUG_ON pointed by Stephen, I will gladly rebuild
and try to reboot over it to give you the information (I have no serial
so hint on avoiding the instant reboot of the dom0 would help). Oh yeah
it's just dom0 on top of the hypervisor, no domU even started.

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Vincent Hanquez

2005-Nov-03 02:45 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Wed, Nov 02, 2005 at 06:04:25PM -0500, Daniel Veillard wrote:
>   I'm not a kernel hacker, but if you give me a patch displaying that
> information at the IOMMU_BUG_ON pointed by Stephen, I will gladly rebuild
> and try to reboot over it to give you the information (I have no serial
> so hint on avoiding the instant reboot of the dom0 would help). Oh yeah
> it's just dom0 on top of the hypervisor, no domU even started.

Hi Daniel,

could you try the following patch just to have a bit more information
about the pointer and the size ?

diff -r ca2e91ab4311 linux-2.6-xen-sparse/arch/xen/i386/kernel/pci-dma.c
--- a/linux-2.6-xen-sparse/arch/xen/i386/kernel/pci-dma.c	Thu Nov  3 01:45:07 2005
+++ b/linux-2.6-xen-sparse/arch/xen/i386/kernel/pci-dma.c	Wed Nov  2 21:32:34 2005
@@ -267,6 +267,8 @@
 		dma = swiotlb_map_single(dev, ptr, size, direction);
 	} else {
 		dma = virt_to_bus(ptr);
+		if (range_straddles_page_boundary(ptr, size))
+			printk("ptr: %p %zd\n", ptr, size);
 		IOMMU_BUG_ON(range_straddles_page_boundary(ptr, size));
 		IOMMU_BUG_ON(address_needs_mapping(dev, dma));
 	}


Sticking a while (1); after the printk would help you avoid the reboot,
something like:

if (range_straddles_page_boundary(ptr, size)) {
	printk("ptr: %p %zd\n", ptr, size);
	while (1);
}

Cheers,
-- 
Vincent Hanquez

Daniel Veillard

2005-Nov-03 14:51 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Thu, Nov 03, 2005 at 03:45:27AM +0100, Vincent Hanquez wrote:
> On Wed, Nov 02, 2005 at 06:04:25PM -0500, Daniel Veillard wrote:
> >   I'm not a kernel hacker, but if you give me a patch displaying that
> > information at the IOMMU_BUG_ON pointed by Stephen, I will gladly rebuild
> > and try to reboot over it to give you the information (I have no serial
> > so hint on avoiding the instant reboot of the dom0 would help). Oh yeah
> > it's just dom0 on top of the hypervisor, no domU even started.
>
> Hi Daniel,

  Hi, Salut :-)

> could you try the following patch just to have a bit more information
> about the pointer and the size ?
[...]
> stick a while (1) ; after the printk would help you to avoid the reboot
> something like:

Sure, took a bit of time to recompile the kernel (I didn't do this for
years) and it crashed as expected, here are the info:

  ptr: f160ed8e 1514

the size looks like a full ethernet frame, i.e. 1500 of payload, 2 ethernet
addresses and the 2 bytes for the ethernet type, that looks kosher to me
but clearly it is not aligned.
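
To make the reported values concrete, here is a small user-space sketch of the arithmetic, assuming 4 KB pages. The helper name and the page-size constant are invented for illustration and are not taken from the kernel. The pointer f160ed8e sits 0xd8e (3470) bytes into its page, which leaves 626 bytes before the page ends, so the remaining 888 bytes of the 1514-byte frame land in the following page.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KB pages (i386) */

/* Bytes of [addr, addr + size) that fall beyond the end of addr's page. */
size_t sketch_bytes_past_page_end(uint32_t addr, size_t size)
{
	size_t offset = addr & (SKETCH_PAGE_SIZE - 1);  /* position inside the page */
	size_t room = SKETCH_PAGE_SIZE - offset;        /* bytes left in that page */

	return size > room ? size - room : 0;
}
```

Calling `sketch_bytes_past_page_end(0xf160ed8e, 1514)` gives 888 under these assumptions, which is why the page-boundary check fires for this buffer.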

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Ian Pratt

2005-Nov-04 14:50 UTC

RE: [Xen-devel] DMA trouble with current xen-sparse

> Sure, took a bit of time to recompile the kernel (I didn't do
> this for years) and it crashed as expected, here are the info:
>
>   ptr: f160ed8e 1514
>
> the size looks a full ethernet frame, i.e. 1500 of payload, 2
> ethernet addresses and the 2bytes for the ethernet type, that
> looks kosher to me but clearly it is not aligned.

Please can you try using either our -xen or -xen0 kernel config. I
strongly suspect there's something in your config that is breaking this
for you, just not sure what.

(NB: make sure you 'rm dist/install/boot/config*' to avoid make world
grabbing your old config)

Best,
Ian

Ian Pratt

2005-Nov-07 13:51 UTC

RE: [Xen-devel] DMA trouble with current xen-sparse

> Sure, took a bit of time to recompile the kernel (I didn't do
> this for years) and it crashed as expected, here are the info:
>
>   ptr: f160ed8e 1514
>
> the size looks a full ethernet frame, i.e. 1500 of payload, 2
> ethernet addresses and the 2bytes for the ethernet type, that
> looks kosher to me but clearly it is not aligned.

This allocation isn't aligned to the next power of 2 boundary ---
usually 1514 byte allocations are 2KB aligned.

You're not enabling some experimental option in your config that changes
the alignment of slab allocations are you?

Ian

Daniel Veillard

2005-Nov-07 14:15 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Mon, Nov 07, 2005 at 01:51:35PM -0000, Ian Pratt wrote:
>
> > Sure, took a bit of time to recompile the kernel (I didn't do
> > this for years) and it crashed as expected, here are the info:
> >
> >   ptr: f160ed8e 1514
> >
> > the size looks a full ethernet frame, i.e. 1500 of payload, 2
> > ethernet addresses and the 2bytes for the ethernet type, that
> > looks kosher to me but clearly it is not aligned.
>
> This allocation isn't aligned to the next power of 2 boundary ---
> usually 1514 byte allocations are 2KB aligned.
>
> You're not enabling some experimental option in your config that changes
> the alignment of slab allocations are you?

  Hi Ian,

sorry for not responding to your previous message. The point is that I don't
really know offhand myself those kernel internals aspects. Stephen can certainly
provide a more informed answer. I checked our kernel config, and I see

  CONFIG_DEBUG_SLAB=y

to be set up in our kernel-2.6.12-i686-hypervisor.config. Browsing to check
all the other DEBUG option which might be potentially relevant I only found
CONFIG_DEBUG_KERNEL CONFIG_DEBUG_HIGHMEM and CONFIG_DEBUG_INFO enabled.
CONFIG_DEBUG_DRIVER is not set. The Xen options are:

CONFIG_XEN=y
CONFIG_ARCH_XEN=y
CONFIG_NO_IDLE_HZ=y
CONFIG_XEN_WRITABLE_PAGETABLES=y
# CONFIG_XEN_SHADOW_MODE is not set
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_FOREIGN_PAGES=y
CONFIG_HAVE_ARCH_DEV_ALLOC_SKB=y
CONFIG_XEN_BLKDEV_GRANT=y
# CONFIG_XEN_BLKDEV_TAP_BE is not set
# CONFIG_XEN_BLKDEV_TAP is not set
# CONFIG_XEN_NETDEV_GRANT_TX is not set
# CONFIG_XEN_NETDEV_GRANT_RX is not set
# CONFIG_SMP_ALTERNATIVES is not set
CONFIG_X86=y
# CONFIG_X86_64 is not set
CONFIG_XENARCH="i386"
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
# CONFIG_M686 is not set

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Stephen C. Tweedie

2005-Nov-07 21:28 UTC

RE: [Xen-devel] DMA trouble with current xen-sparse

Hi,

On Fri, 2005-11-04 at 14:50 +0000, Ian Pratt wrote:
> > Sure, took a bit of time to recompile the kernel (I didn't do
> > this for years) and it crashed as expected, here are the info:
> >
> >   ptr: f160ed8e 1514
> >
> > the size looks a full ethernet frame, i.e. 1500 of payload, 2
> > ethernet addresses and the 2bytes for the ethernet type, that
> > looks kosher to me but clearly it is not aligned.
>
> Please can you try using either our -xen or -xen0 kernel config. I
> strongly suspect there's something in your config that is breaking this
> for you, just not sure what.

I just tried to build it; it would not boot.  That was building the
2.6.12 xen-sparse w/ gcc4; retrying with gcc32 now.

But I suspect that the problem is CONFIG_DEBUG_SLAB.  That sets up slab
redzoning which checks for buffer overruns.  One consequence is that
cached objects grow very slightly --- enough that the 2k kmalloc cache
gets created with 3 objects per order-2 slab, ie. all MTU-sized frames
are going to be allocated from an 8k slab and one in three will straddle
the page boundary.
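
The "one in three" figure can be checked with a short user-space sketch. It assumes 4 KB pages and objects packed back to back from the start of a page-aligned slab, and it ignores the slab management header and cache colouring that the real allocator may add. The 2072-byte stride used in the note below is a hypothetical value standing in for "2048 bytes plus a little debug overhead"; the real size depends on the kernel and its debug options.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KB pages (i386) */

/*
 * Count the objects that cross a page boundary when `count` objects of
 * `stride` bytes (stride > 0) are packed back to back from the start of
 * a page-aligned slab.
 */
unsigned int sketch_count_straddlers(size_t stride, unsigned int count)
{
	unsigned int i, straddlers = 0;

	for (i = 0; i < count; i++) {
		size_t first = (size_t)i * stride;  /* first byte of object i */
		size_t last = first + stride - 1;   /* last byte of object i */

		if (first / SKETCH_PAGE_SIZE != last / SKETCH_PAGE_SIZE)
			straddlers++;
	}
	return straddlers;
}
```

With an exact 2048-byte stride, four objects fill 8 KB and none crosses a page. With the hypothetical 2072-byte stride only three fit in 8 KB, and the middle one spans the boundary between the two pages.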

I may not have time to verify that today, but it sounds like a likely
explanation for what we're seeing.

NB. even without redzoning, the slab allocator will try both order-1 and
order-2 slab sizes to see what minimises the wasted space in a slab, so
any subsystem that's doing its own allocation of objects from a pool
outside kmalloc may hit a size that creates these page-straddling
caches.

There's a hacky quick-fix, which is to change

#define BREAK_GFP_ORDER_HI	 1

from 1 to 0 in mm/slab.c.  But that's just going to waste more slab
cache space for many caches.   Without that change, the fact is that an
important debugging option is creating cross-page objects routinely, and
that the slab allocator can create such objects quite normally even
without that option; so it may end up being something that Xen just has
to deal with.

--Stephen

Ian Pratt

2005-Nov-07 23:03 UTC

RE: [Xen-devel] DMA trouble with current xen-sparse

> from 1 to 0 in mm/slab.c.  But that's just going to waste more slab
> cache space for many caches.   Without that change, the fact
> is that an important debugging option is creating cross-page
> objects routinely, and that the slab allocator can create such
> objects quite normally even without that option; so it may
> end up being something that Xen just has to deal with.

The best xen fix for this is for us to hook alloc_skb (rather than just
dev_alloc_skb). This will enable us to solve the jumbo frames issue too.
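
The thread does not show what the hook looks like, so the following is only a user-space illustration of one property such an allocation path could enforce: a sub-page buffer that never crosses a page boundary. It is not the Xen change. Both function names are hypothetical, 4 KB pages are assumed, and C11 `aligned_alloc` stands in for whatever the kernel would use. The sketch also says nothing about jumbo frames, which are larger than a page and would need contiguous pages instead.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)  /* assumption: 4 KB pages (i386) */

/* Round size up to the next power of two (size must be non-zero). */
size_t sketch_round_up_pow2(size_t size)
{
	size_t p = 1;

	while (p < size)
		p <<= 1;
	return p;
}

/*
 * Hypothetical allocator: returns a buffer of at least `size` bytes that
 * lies within one page, or NULL if size is 0, larger than a page, or the
 * allocation fails.  Release the buffer with free().
 */
void *sketch_alloc_within_page(size_t size)
{
	size_t rounded;

	if (size == 0 || size > SKETCH_PAGE_SIZE)
		return NULL;
	rounded = sketch_round_up_pow2(size);
	if (rounded < sizeof(void *))
		rounded = sizeof(void *);  /* keep the alignment request portable */
	/* A power-of-two block aligned to its own size cannot cross a
	 * boundary of any larger power of two, including the page size. */
	return aligned_alloc(rounded, rounded);
}
```

For a 1514-byte frame this rounds the request up to 2048 bytes aligned on a 2048-byte boundary, which matches the "2KB aligned" behaviour described earlier in the thread for the non-debug kmalloc caches.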

Ian 

Daniel Veillard

2005-Nov-08 06:41 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On Mon, Nov 07, 2005 at 04:28:59PM -0500, Stephen C. Tweedie wrote:
> Hi,
>
> On Fri, 2005-11-04 at 14:50 +0000, Ian Pratt wrote:
>
> > > Sure, took a bit of time to recompile the kernel (I didn't do
> > > this for years) and it crashed as expected, here are the info:
> > >
> > >   ptr: f160ed8e 1514
> > >
> > > the size looks a full ethernet frame, i.e. 1500 of payload, 2
> > > ethernet addresses and the 2bytes for the ethernet type, that
> > > looks kosher to me but clearly it is not aligned.
> >
> > Please can you try using either our -xen or -xen0 kernel config. I
> > strongly suspect there's something in your config that is breaking this
> > for you, just not sure what.
>
> I just tried to build it; it would not boot.  That was building the
> 2.6.12 xen-sparse w/ gcc4; retrying with gcc32 now.
>
> But I suspect that the problem is CONFIG_DEBUG_SLAB.  That sets up slab
> redzoning which checks for buffer overruns.  One consequence is that

  Just to confirm that CONFIG_DEBUG_SLAB is the one exposing the issue.
I recompiled the exact same kernel with just that option turned off
and the tg3 driver does not seem to hang anymore.

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Stephen C. Tweedie

2005-Nov-08 15:25 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

Hi,

On Tue, 2005-11-08 at 15:55 +0000, Keir Fraser wrote:
> >   Just to confirm that CONFIG_DEBUG_SLAB is the one exposing the issue.
> > I recompiled the exact same kernel with just that option turned off
> > and the tg3 driver does not seem to hang anymore.
>
> This is now fixed in our tree (changeset 7700:98bcd8fbd5e3). Should get
> pushed to the public repository in an hour or two...

Thanks; I'll have a look at that when it shows up.  My main test box at
work just died, though, so it might be a while before I can test it out
properly.

Cheers,
 Stephen



Keir Fraser

2005-Nov-08 15:55 UTC

Re: [Xen-devel] DMA trouble with current xen-sparse

On 8 Nov 2005, at 06:41, Daniel Veillard wrote:
>> I just tried to build it; it would not boot.  That was building the
>> 2.6.12 xen-sparse w/ gcc4; retrying with gcc32 now.
>>
>> But I suspect that the problem is CONFIG_DEBUG_SLAB.  That sets up
>> slab redzoning which checks for buffer overruns.  One consequence
>> is that
>
>   Just to confirm that CONFIG_DEBUG_SLAB is the one exposing the issue.
> I recompiled the exact same kernel with just that option turned off
> and the tg3 driver does not seem to hang anymore.

This is now fixed in our tree (changeset 7700:98bcd8fbd5e3). Should get
pushed to the public repository in an hour or two...

  -- Keir

