Hi All,

Running into temporary pauses in our VMs which correspond to these errors in dmesg on the dom0:

[1721485.352560] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[1721485.352563] cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 3, min order: 0
[1721485.352566] node 0: slabs: 81, objs: 1296, free: 0
[1721485.352576] swapper: page allocation failure: order:0, mode:0x4020
[1721485.352579] Pid: 0, comm: swapper Not tainted 3.0.3 #1
[1721485.352582] Call Trace:
[1721485.352584] <IRQ> [<ffffffff810bea48>] warn_alloc_failed+0x12b/0x142
[1721485.352595] [<ffffffff810be8f2>] ? get_page_from_freelist+0x51c/0x547
[1721485.352601] [<ffffffff810068a5>] ? xen_force_evtchn_callback+0xd/0xf
[1721485.352605] [<ffffffff810bf30d>] __alloc_pages_nodemask+0x606/0x67d
[1721485.352610] [<ffffffff81006eef>] ? xen_restore_fl_direct_reloc+0x4/0x4
[1721485.352614] [<ffffffff810ed11e>] new_slab+0x7e/0x1f6
[1721485.352617] [<ffffffff810ed430>] __slab_alloc+0x19a/0x33c
[1721485.352623] [<ffffffff815b1655>] ? __netdev_alloc_skb+0x1d/0x3c
[1721485.352627] [<ffffffff810ed84e>] __kmalloc_track_caller+0x106/0x145
[1721485.352631] [<ffffffff815b1655>] ? __netdev_alloc_skb+0x1d/0x3c
[1721485.352634] [<ffffffff815b066d>] __alloc_skb+0x69/0x129
[1721485.352638] [<ffffffff815b1655>] __netdev_alloc_skb+0x1d/0x3c
[1721485.352643] [<ffffffff81469c03>] e1000_alloc_rx_buffers+0x7f/0x14c
[1721485.352647] [<ffffffff81469f84>] e1000_clean_rx_irq+0x265/0x28c
[1721485.352651] [<ffffffff810068a5>] ? xen_force_evtchn_callback+0xd/0xf
[1721485.352655] [<ffffffff8146b44a>] e1000_clean+0x75/0x24e
[1721485.352658] [<ffffffff81006eef>] ? xen_restore_fl_direct_reloc+0x4/0x4
[1721485.352663] [<ffffffff815b888e>] net_rx_action+0xdd/0x20f
[1721485.352668] [<ffffffff810492d7>] __do_softirq+0xd3/0x1bb
[1721485.352673] [<ffffffff81094be4>] ? handle_edge_irq+0x9d/0xbc
[1721485.352678] [<ffffffff81731b1c>] call_softirq+0x1c/0x30
[1721485.352682] [<ffffffff8100bd89>] do_softirq+0x61/0xbf
[1721485.352686] [<ffffffff81049082>] irq_exit+0x43/0xb2
[1721485.352691] [<ffffffff813712b1>] xen_evtchn_do_upcall+0x2f/0x3c
[1721485.352695] [<ffffffff81731b6e>] xen_do_hypervisor_callback+0x1e/0x30
[1721485.352697] <EOI> [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[1721485.352705] [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[1721485.352709] [<ffffffff8100693c>] ? xen_safe_halt+0x10/0x1a
[1721485.352714] [<ffffffff81010fc3>] ? default_idle+0x5e/0xa6
[1721485.352718] [<ffffffff81009f81>] ? cpu_idle+0x6d/0xa3
[1721485.352723] [<ffffffff81701664>] ? rest_init+0x68/0x6a
[1721485.352729] [<ffffffff81cb2d18>] ? start_kernel+0x412/0x41d
[1721485.352733] [<ffffffff81cb22cb>] ? x86_64_start_reservations+0xb6/0xba
[1721485.352737] [<ffffffff81cb5f55>] ? xen_start_kernel+0x59b/0x5a2
[1721485.352739] Mem-Info:
[1721485.352741] DMA per-cpu:
[1721485.352744] CPU 0: hi: 0, btch: 1 usd: 0
[1721485.352746] DMA32 per-cpu:
[1721485.352748] CPU 0: hi: 186, btch: 31 usd: 176
[1721485.352750] Normal per-cpu:
[1721485.352752] CPU 0: hi: 186, btch: 31 usd: 0
[1721485.352757] active_anon:2403 inactive_anon:13164 isolated_anon:0
[1721485.352758] active_file:66256 inactive_file:75740 isolated_file:0
[1721485.352759] unevictable:507 dirty:3175 writeback:40742 unstable:0
[1721485.352760] free:13180 slab_reclaimable:5805 slab_unreclaimable:9005
[1721485.352761] mapped:1983 shmem:4 pagetables:1147 bounce:0
[1721485.352768] DMA free:15904kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15680kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1721485.352774] lowmem_reserve[]: 0 952 10042 10042
[1721485.352784] DMA32 free:36816kB min:1212kB low:1512kB high:1816kB active_anon:9612kB inactive_anon:52656kB active_file:265024kB inactive_file:302960kB unevictable:2028kB isolated(anon):0kB isolated(file):0kB present:975072kB mlocked:2028kB dirty:12700kB writeback:162968kB mapped:7932kB shmem:16kB slab_reclaimable:23220kB slab_unreclaimable:36020kB kernel_stack:2360kB pagetables:4588kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1721485.352790] lowmem_reserve[]: 0 0 9090 9090
[1721485.352799] Normal free:0kB min:11600kB low:14500kB high:17400kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:9308160kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1721485.352805] lowmem_reserve[]: 0 0 0 0
[1721485.352810] DMA: 0*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15904kB
[1721485.352822] DMA32: 588*4kB 256*8kB 150*16kB 168*32kB 285*64kB 34*128kB 6*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 36816kB
[1721485.352834] Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[1721485.352845] 153277 total pagecache pages
[1721485.352848] 10888 pages in swap cache
[1721485.352850] Swap cache stats: add 3069760, delete 3058872, find 3791763/4300237
[1721485.352852] Free swap = 1816844kB
[1721485.352854] Total swap = 1943856kB
[1721485.377309] 2621424 pages RAM
[1721485.377312] 2427446 pages reserved
[1721485.377313] 108302 pages shared
[1721485.377315] 126166 pages non-shared
[1721485.377319] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[1721485.377323] cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 3, min order: 0
[1721485.377326] node 0: slabs: 81, objs: 1296, free: 0
[1721485.377560] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[1721485.377564] cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 3, min order: 0
[1721485.377567] node 0: slabs: 81, objs: 1296, free: 0

xen7 ~ # xm info
host : xen7
release : 3.0.3
version : #1 SMP Mon Aug 22 14:25:38 PDT 2011
machine : x86_64
nr_cpus : 24
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2266
hw_caps : bfebfbff:2c100800:00000000:00003f40:009ee3fd:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 98294
free_memory : 36580
free_cpus : 0
xen_major : 4
xen_minor : 1
xen_extra : .1
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
xen_commandline : console=com1,com2,vga com1=115200,8n1 com2=115200,8n1 dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true
cc_compiler : gcc version 4.3.4 (Gentoo 4.3.4 p1.1, pie-10.1.5)
cc_compile_by : root
cc_compile_domain : nmsrv.com
cc_compile_date : Mon Aug 22 11:28:50 PDT 2011
xend_config_format : 4

Seeing this on multiple dom0's which are all running identical hardware (Supermicro X8DTT w/ Intel 82574L gige). Dom0's are limited to 1gb (dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true) although they don't go above 250mb used.

Not sure if this is a xen bug, network driver issue or something else?

- Nathan

--
Nathan March <nathan@gt.net>
Gossamer Threads Inc.
http://www.gossamer-threads.com/
Tel: (604) 687-5804
Fax: (604) 687-5806

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
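For a rough sense of scale, the page counts at the end of the Mem-Info dump in the report above can be converted to MiB. This is a minimal Python sketch and not part of the original report: the helper names are made up for illustration, and the 4096-byte page size is an assumption taken from the xen_pagesize line of the xm info output.

```python
import re

# Assumption: 4 KiB pages, as suggested by "xen_pagesize : 4096" in xm info.
PAGE_SIZE = 4096


def tally_pages(dmesg_text):
    """Return (ram_pages, reserved_pages) parsed from a Mem-Info dump."""
    ram = re.search(r"(\d+) pages RAM", dmesg_text)
    reserved = re.search(r"(\d+) pages reserved", dmesg_text)
    if ram is None or reserved is None:
        raise ValueError("no 'pages RAM' / 'pages reserved' lines found")
    return int(ram.group(1)), int(reserved.group(1))


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


# The two relevant lines, copied from the dmesg output in the report.
sample = """\
[1721485.377309] 2621424 pages RAM
[1721485.377312] 2427446 pages reserved
"""

ram, reserved = tally_pages(sample)
usable = ram - reserved  # pages the kernel counts as RAM but not reserved
print(
    f"RAM: {pages_to_mib(ram):.0f} MiB, "
    f"reserved: {pages_to_mib(reserved):.0f} MiB, "
    f"remaining: {pages_to_mib(usable):.0f} MiB"
)
```

With the figures from the report this prints RAM: 10240 MiB, reserved: 9482 MiB, remaining: 758 MiB. The sketch only does the arithmetic; it does not say why the pages are reserved.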
Konrad Rzeszutek Wilk
2011-Sep-12 20:17 UTC
Re: [Xen-devel] SLUB allocation error on 3.0.3 / 4.1.1
> total_memory : 98294
> free_memory : 36580
> free_cpus : 0
> xen_major : 4
> xen_minor : 1
> xen_extra : .1
> xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p
> hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler : credit
> xen_pagesize : 4096
> platform_params : virt_start=0xffff800000000000
> xen_changeset : unavailable
> xen_commandline : console=com1,com2,vga com1=115200,8n1
> com2=115200,8n1 dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true
> cc_compiler : gcc version 4.3.4 (Gentoo 4.3.4 p1.1, pie-10.1.5)
> cc_compile_by : root
> cc_compile_domain : nmsrv.com
> cc_compile_date : Mon Aug 22 11:28:50 PDT 2011
> xend_config_format : 4
>
> Seeing this on multiple dom0's which are all running identical
> hardware (Supermicro X8DTT w/ Intel 82574L gige). Dom0's are limited
> to 1gb (dom0_mem=1024M dom0_max_vcpus=1 dom0_vcpus_pin=true)
> although they don't go above 250mb used.
>
> Not sure if this is a xen bug, network driver issue or something else?

It is a Linux kernel bug. The kernel does not respect the dom0_mem=max:X argument, so you end up with 98GB of pagetables in Dom0 and can't allocate enough memory for your normal drivers (since most of the memory is used for your unused pagetables).

The workaround is to add "mem=1GB" to your Linux command line (and keep the dom0_mem=.. argument).

A patch in 3.0.4 (or 3.0.5) should soon surface which will fix this.
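After rebooting with the workaround, one way to confirm that the dom0 kernel actually received a mem= limit is to look at its command line. The following is a minimal Python sketch, assuming /proc/cmdline is readable in dom0. The helper names and the example command line (root device, console) are hypothetical; only the mem=1GB token comes from the reply above.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def mem_limit(cmdline):
    """Return the value of the mem= parameter, or None if it is absent."""
    return parse_cmdline(cmdline).get("mem")


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as fh:
        return fh.read()


# Hypothetical dom0 kernel command line, for illustration only.
example = "root=/dev/sda1 ro console=hvc0 mem=1GB"
print(mem_limit(example))
```

On a live dom0, mem_limit(read_cmdline()) returning None would suggest the mem= argument never reached the kernel, for example because it was added to the Xen line of the boot entry instead of the Linux line.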