Hi list,
I have a very strange problem with my network. I have two internet
connections: A (1 Gbit/s) and B (100 Mbit/s).
Network layout:
   A     B
   |     |
   [Brd1]
   /    \
 [L1]  [L2]
   \    /
   [GW1]
.............
   Clients
.............
Brd1 runs bgpd, and balances the traffic through L1 and L2.
L1 and L2 do traffic shaping.
GW1 does some packet filtering, and balances the traffic through L1 and L2.
Every interface is gigabit. (Realtek NICs)
I'm using IMQ on L1 and L2 to separate the traffic into 2 zones,
international and local, with HTB for shaping.
The system works fine for some time, but when the traffic hits 200Mbps and
occasionally bursts to 250-300Mbps, L1 and L2 behave strangely (packet
loss > 30%, latency up by 20ms). Sometimes they even hang, leaving me with
only one solution: rebooting them.
I've checked the CPU usage; it stays around 80% during the highest
traffic.
I've examined the logs, and here is what I've found:
Feb 11 08:04:05 l1 kernel: cpu 0 cold: low 0, high 0, batch 1 used:0
Feb 11 08:04:05 l1 kernel: DMA32 per-cpu: empty
Feb 11 08:04:05 l1 kernel: Normal per-cpu:
Feb 11 08:04:05 l1 kernel: cpu 0 hot: low 0, high 186, batch 31 used:79
Feb 11 08:04:05 l1 kernel: cpu 0 cold: low 0, high 62, batch 15 used:52
Feb 11 08:04:05 l1 kernel: HighMem per-cpu: empty
Feb 11 08:04:05 l1 kernel: Free pages: 3032kB (0kB HighMem)
Feb 11 08:04:05 l1 kernel: Active:15050 inactive:8995 dirty:0 writeback:0 unstable:0 free:758 slab:102918 mapped:3203 pagetables:101
Feb 11 08:04:05 l1 kernel: DMA free:2016kB min:88kB low:108kB high:132kB active:28kB inactive:1092kB present:16384kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 495 495
Feb 11 08:04:05 l1 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 495 495
Feb 11 08:04:05 l1 kernel: Normal free:1016kB min:2800kB low:3500kB high:4200kB active:60172kB inactive:34888kB present:507584kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 0 0
Feb 11 08:04:05 l1 kernel: HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 0 0
...........
Feb 11 08:04:05 l1 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Feb 11 08:04:05 l1 kernel: Free swap = 987956kB
Feb 11 08:04:05 l1 kernel: Total swap = 987956kB
Feb 11 08:04:05 l1 kernel: Free swap: 987956kB
Feb 11 08:04:05 l1 kernel: 130992 pages of RAM
Feb 11 08:04:05 l1 kernel: 0 pages of HIGHMEM
Feb 11 08:04:05 l1 kernel: 2137 reserved pages
Feb 11 08:04:05 l1 kernel: 28840 pages shared
Feb 11 08:04:05 l1 kernel: 0 pages swap cached
Feb 11 08:04:05 l1 kernel: 0 pages dirty
Feb 11 08:04:05 l1 kernel: 0 pages writeback
Feb 11 08:04:05 l1 kernel: 3203 pages mapped
Feb 11 08:04:05 l1 kernel: 102918 pages slab
Feb 11 08:04:05 l1 kernel: 101 pages pagetables
Feb 11 08:04:05 l1 kernel: ksoftirqd/0: page allocation failure. order:0, mode:0x20
Feb 11 08:04:05 l1 kernel: [<c0137fa6>] __alloc_pages+0x1e6/0x2b0
Feb 11 08:04:05 l1 kernel: [<c013ac50>] kmem_getpages+0x30/0x90
Feb 11 08:04:05 l1 kernel: [<c013b89c>] cache_grow+0x8c/0x120
Feb 11 08:04:05 l1 kernel: [<c013ba4f>] cache_alloc_refill+0x11f/0x1d0
Feb 11 08:04:05 l1 kernel: [<c013bd6f>] __kmalloc+0x4f/0x60
Feb 11 08:04:05 l1 kernel: [<c028f200>] __alloc_skb+0x40/0x130
Feb 11 08:04:05 l1 kernel: [<c023c4a0>] e1000_alloc_rx_buffers+0x60/0x360
Feb 11 08:04:05 l1 kernel: [<c023bc83>] e1000_clean_rx_irq+0x1d3/0x4a0
Feb 11 08:04:05 l1 kernel: [<c02649bb>] rtl8169_rx_fill+0x5b/0x70
Feb 11 08:04:05 l1 kernel: [<c023b4fa>] e1000_clean+0x9a/0x150
Feb 11 08:04:05 l1 kernel: [<c011d790>] ksoftirqd+0x0/0x80
Feb 11 08:04:05 l1 kernel: [<c0294ce1>] net_rx_action+0x61/0xe0
Feb 11 08:04:05 l1 kernel: [<c011d479>] __do_softirq+0x79/0x90
Feb 11 08:04:05 l1 kernel: [<c011d4b6>] do_softirq+0x26/0x30
Feb 11 08:04:05 l1 kernel: [<c011d7dd>] ksoftirqd+0x4d/0x80
Feb 11 08:04:05 l1 kernel: [<c012a3cc>] kthread+0x9c/0xb0
Feb 11 08:04:05 l1 kernel: [<c012a330>] kthread+0x0/0xb0
Feb 11 08:04:05 l1 kernel: [<c0100f65>] kernel_thread_helper+0x5/0x10
And it continues like this for a long, long time ....
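In case it helps with the diagnosis, here is a small sketch (Python, purely
illustrative, not part of the setup) that compares the free memory of each
populated zone with its min watermark and works out how much of the RAM sits
in slab. The numbers are copied from the dump above; the 4 kB page size is an
assumption for this 32-bit x86 box, and all function and variable names are
made up for the example.

```python
import re

# Zone lines copied from the kernel dump above (values in kB).
# Only zones that actually have memory present are listed.
ZONE_LINES = """\
DMA free:2016kB min:88kB low:108kB high:132kB
Normal free:1016kB min:2800kB low:3500kB high:4200kB
"""

PAGE_KB = 4  # assumption: 4 kB pages (32-bit x86)


def parse_zones(text):
    """Return {zone_name: {'free': kB, 'min': kB, ...}} for each zone line."""
    zones = {}
    for line in text.splitlines():
        name, _, rest = line.partition(" ")
        fields = dict(re.findall(r"(\w+):(\d+)kB", rest))
        zones[name] = {key: int(value) for key, value in fields.items()}
    return zones


def zones_below_min(zones):
    """Names of zones whose free memory is under the 'min' watermark."""
    return [name for name, z in zones.items() if z["free"] < z["min"]]


zones = parse_zones(ZONE_LINES)
starved = zones_below_min(zones)

# Page counts taken from the "pages slab" and "pages of RAM" lines above.
slab_pages, ram_pages = 102918, 130992
slab_mb = slab_pages * PAGE_KB / 1024
ram_mb = ram_pages * PAGE_KB / 1024
slab_share = slab_pages / ram_pages

print("zones below min watermark:", starved)
print(f"slab: {slab_mb:.0f} MB of {ram_mb:.0f} MB ({slab_share:.0%})")
```

If I read the result correctly, the Normal zone is below its min watermark
(1016kB free against 2800kB) and most of the RAM is held by slab, which
would at least be consistent with the order:0 allocation failure in
ksoftirqd, but I may be misreading the dump.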
Does anybody know what's wrong, or how I can fix this?
Thanks.
Andrei SANDU.