Eric Tessler
2007-Aug-05 06:59 UTC
[Xen-users] OOM killer observed during heavy I/O from VMs (XEN 3.0.4 and XEN 3.1)
Under both XEN 3.0.4 (2.6.16.33) and XEN 3.1 (2.6.18), I can make the OOM killer appear in dom0 of my server by doing heavy I/O from within a VM. If I start 5 VMs on the same server, each VM doing constant I/O over its boot disk (reading and writing a 2GB file), after about 30 minutes the OOM killer appears in dom0 and starts killing processes. This was observed with 256MB in dom0. If I bump the memory in dom0 up to 512MB, the problem seems to go away (I have not seen the OOM killer in this configuration). This used to work in XEN 2.0.7 with 256MB in dom0.

Below I have included the OOM killer output and memory/slab info for the server. I want to get it working with 256MB in dom0 - this smells like a bug in the kernel/Xen, because at the time of the kill there is plenty of free swap (about 499MB of the 512MB configured) and about 40MB of free memory.

Here is the OOM killer info from the message log when it hits:

Aug 4 18:36:08 DMA: 1078*4kB 102*8kB 34*16kB 15*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 7*4096kB = 36168kB
Aug 4 18:36:08 DMA32: empty
Aug 4 18:36:08 Normal: empty
Aug 4 18:36:08 HighMem: empty
Aug 4 18:36:08 Swap cache: add 29953, delete 29286, find 9785/14072, race 0+50
Aug 4 18:36:08 Free swap = 498904kB
Aug 4 18:36:08 Total swap = 524280kB
Aug 4 18:36:08 Free swap:  498904kB
Aug 4 18:36:08 67584 pages of RAM
Aug 4 18:36:08 0 pages of HIGHMEM
Aug 4 18:36:08 18261 reserved pages
Aug 4 18:36:13 31927 pages shared
Aug 4 18:36:19 667 pages swap cached
Aug 4 18:36:24 1 pages dirty
Aug 4 18:36:30 27115 pages writeback
Aug 4 18:36:32 1090 pages mapped
Aug 4 18:36:36 6800 pages slab
Aug 4 18:36:44 695 pages pagetables
Aug 4 18:36:51 oom-killer: gfp_mask=0x200d2, order=0
Aug 4 18:36:51  [<c0105801>] show_trace+0x21/0x30
Aug 4 18:36:51  [<c010593e>] dump_stack+0x1e/0x20
Aug 4 18:36:51  [<c0140de0>] out_of_memory+0x90/0xc0
Aug 4 18:36:52  [<c014219d>] __alloc_pages+0x2ed/0x320
Aug 4 18:36:52  [<c014d0c0>] do_wp_page+0xa0/0x4c0
Aug 4 18:36:52  [<c014ddd8>] do_swap_page+0x2f8/0x480
Aug 4 18:36:52  [<c014e972>] __handle_mm_fault+0x302/0x430
Aug 4 18:36:53  [<c011581f>] do_page_fault+0x1df/0x906
Aug 4 18:36:53  [<c01054cb>] error_code+0x2b/0x30
Aug 4 18:36:53  [<c0104f91>] handle_signal+0x81/0x170
Aug 4 18:36:53  [<c0105136>] do_signal+0xb6/0x170
Aug 4 18:36:54  [<c010522a>] do_notify_resume+0x3a/0x3c
Aug 4 18:36:54  [<c01053ef>] work_notifysig+0x13/0x18
Aug 4 18:36:54 Mem-info:
Aug 4 18:36:54 DMA per-cpu:
Aug 4 18:36:54 cpu 0 hot: high 90, batch 15 used:11
Aug 4 18:36:54 cpu 0 cold: high 30, batch 7 used:28
Aug 4 18:36:54 DMA32 per-cpu: empty
Aug 4 18:36:54 Normal per-cpu: empty
Aug 4 18:36:54 HighMem per-cpu: empty
Aug 4 18:36:54 Free pages: 36492kB (0kB HighMem)
Aug 4 18:36:54 Active:988 inactive:27911 dirty:1 writeback:26988 unstable:0 free:9123 slab:6800 mapped:1089 pagetables:695
Aug 4 18:36:54 DMA free:36492kB min:32768kB low:40960kB high:49152kB active:3952kB inactive:111644kB present:270336kB pages_scanned:29094 all_unreclaimable? no
Aug 4 18:36:54 lowmem_reserve[]: 0 0 0 0
Aug 4 18:36:54 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Aug 4 18:36:54 lowmem_reserve[]: 0 0 0 0
Aug 4 18:36:54 Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Aug 4 18:36:54 lowmem_reserve[]: 0 0 0 0
Aug 4 18:36:54 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Aug 4 18:36:54 lowmem_reserve[]: 0 0 0 0

Memory/slab info just before the OOM hits:

MemTotal:      262344 kB
MemFree:        43616 kB
Buffers:        99044 kB
Cached:          7940 kB
SwapCached:      5420 kB
Active:          9300 kB
Inactive:      103380 kB
HighTotal:          0 kB
HighFree:           0 kB
LowTotal:      262344 kB
LowFree:        43616 kB
SwapTotal:     524280 kB
SwapFree:      501272 kB
Dirty:           4836 kB
Writeback:      92056 kB
Mapped:          7936 kB
Slab:           25428 kB
CommitLimit:   655452 kB
Committed_AS:  765172 kB
PageTables:      2072 kB
VmallocTotal:  593912 kB
VmallocUsed:     2812 kB
VmallocChunk:  590208 kB

Cache                 Num    Total   Size   Pages
fib6_nodes              8      113     32     113
ip6_dst_cache          28       30    256      15
ndisc_cache             1       20    192      20
RAWv6                   4        6    640       6
UDPv6                   1        7    576       7
tw_sock_TCPv6           0        0    128      30
request_sock_TCPv6      0        0    128      30
TCPv6                   0        0   1152       7
nbd-wi                 52       78     48      78
bridge_fdb_cache       35       59     64      59
ip_fib_alias           15      113     32     113
ip_fib_hash            15      113     32     113
ext3_inode_cache      309      576    460       8
ext3_xattr              0        0     44      84
journal_handle         64      169     20     169
journal_head          196      504     52      72
revoke_table            6      254     12     254
revoke_record           0        0     16     203
dm_tio              11142    11165     16     203
dm_io               11105    11154     20     169
scsi_cmd_cache         10       10    384      10
sgpool-128             34       34   3072       2
sgpool-64              35       35   1536       5
sgpool-32              35       35    768       5
sgpool-16              36       40    384      10
sgpool-8               40       40    192      20
scsi_io_context         0        0    104      37
rpc_buffers             8        8   2048       2
rpc_tasks               8       20    192      20
rpc_inode_cache         0        0    448       9
UNIX                   42       50    384      10
ip_mrt_cache            0        0    128      30
tcp_bind_bucket        49      203     16     203
inet_peer_cache         1       59     64      59
secpath_cache           0        0    128      30
xfrm_dst_cache          0        0    320      12
ip_dst_cache           15       30    256      15
arp_cache               7       30    128      30
RAW                     2        9    448       9
UDP                     5        9    448       9
tw_sock_TCP             3       30    128      30
request_sock_TCP        0        0     64      59
TCP                    63       72   1024       4
flow_cache              0        0    128      30
uhci_urb_priv           0        0     40      92
blktapif_cache          0        0    108      36
blkif_cache            15       35    112      35
cfq_ioc_pool            0        0     48      78
cfq_pool                0        0     96      40
crq_pool                0        0     44      84
deadline_drq            0        0     48      78
as_arq               1197     1260     60      63
nfs_write_data         36       36    448       9
nfs_read_data          32       36    448       9
nfs_inode_cache         0        0    560       7
nfs_page                0        0     64      59
isofs_inode_cache       0        0    340      11
ext2_inode_cache        0        0    420       9
dnotify_cache           0        0     20     169
eventpoll_pwq           0        0     36     101
eventpoll_epi           0        0    128      30
inotify_event_cache     0        0     28     127
inotify_watch_cache     0        0     36     101
kioctx                  0        0    192      20
kiocb                   0        0    128      30
fasync_cache            0        0     16     203
shmem_inode_cache    1515     1521    404       9
posix_timers_cache      1       40     96      40
uid_cache               2       59     64      59
blkdev_ioc            178      254     28     127
blkdev_queue          548      548    900       4
blkdev_requests      1205     1265    168      23
biovec-(256)          260      260   3072       2
biovec-128            264      265   1536       5
biovec-64             290      290    768       5
biovec-16             440      560    192      20
biovec-4              284      295     64      59
biovec-1            32822    41006     16     203
bio                 32855    33512     64      59
sock_inode_cache      122      130    384      10
skbuff_fclone_cache    46      100    384      10
skbuff_head_cache     620      920    192      20
xen-skb-65536           0        0  65536       1
xen-skb-32768           0        0  32768       1
xen-skb-16384           0        0  16384       1
xen-skb-8192            0        0   8192       1
xen-skb-4096          544      557   4096       1
xen-skb-2048           24       54   2048       2
xen-skb-512            48       48    512       8
file_lock_cache         4       44     88      44
acpi_operand          637      736     40      92
acpi_parse_ext          0        0     44      84
acpi_parse              0        0     28     127
acpi_state              0        0     48      78
proc_inode_cache       17       48    328      12
sigqueue               54       54    144      27
radix_tree_node       810     1456    276      14
bdev_cache             82       90    448       9
sysfs_dir_cache     13325    13340     40      92
mnt_cache              21       30    128      30
inode_cache          2416     2976    312      12
dentry_cache         4407    12989    124      31
filp                  710     1060    192      20
names_cache             2        2   4096       1
idr_layer_cache       209      232    136      29
buffer_head         25038    28938     48      78
mm_struct              87       90    448       9
vm_area_struct       1782     2478     92      42
fs_cache               84      565     32     113
files_cache            85      135    448       9
signal_cache          193      200    384      10
sighand_cache         191      204   1344       3
task_struct           208      303   1280       3
anon_vma              751     2034      8     339
pgd                    89       89   4096       1
pmd                   351      351   4096       1
size-131072(DMA)        0        0 131072       1
size-131072             1        1 131072       1
size-65536(DMA)         0        0  65536       1
size-65536             12       12  65536       1
size-32768(DMA)         0        0  32768       1
size-32768              0        0  32768       1
size-16384(DMA)         0        0  16384       1
size-16384              3        3  16384       1
size-8192(DMA)          0        0   8192       1
size-8192             211      218   8192       1
size-4096(DMA)          0        0   4096       1
size-4096              81       86   4096       1
size-2048(DMA)          0        0   2048       2
size-2048              91       92   2048       2
size-1024(DMA)          0        0   1024       4
size-1024             242      248   1024       4
size-512(DMA)           0        0    512       8
size-512              492      496    512       8
size-256(DMA)           0        0    256      15
size-256              892      900    256      15
size-192(DMA)           0        0    192      20
size-192              689      700    192      20
size-128(DMA)           0        0    128      30
size-128              356      360    128      30
size-96(DMA)            0        0    128      30
size-96              3300     3300    128      30
size-64(DMA)            0        0     64      59
size-32(DMA)            0        0     32     113
size-64              5362     5782     64      59
size-32              6732     9492     32     113
kmem_cache            150      150    128      30

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
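[Editorial note: for readers who want to reproduce the workload described above, the per-VM I/O loop can be sketched roughly as below. The file path is illustrative and the sizes are scaled down from the 2GB in the report so the sketch is safe to run anywhere; in the original setup one such loop ran continuously inside each of the 5 VMs.]

```shell
# Rough sketch of the per-VM I/O stress loop: repeatedly write a file
# to the boot disk, sync it out, then read it back. Scaled down to 8MiB
# per pass (the report used a 2GB file running indefinitely).
FILE=/tmp/io-stress.dat
for pass in 1 2 3; do
    dd if=/dev/zero of="$FILE" bs=1M count=8 conv=fsync 2>/dev/null  # write pass, flushed to disk
    dd if="$FILE" of=/dev/null bs=1M 2>/dev/null                     # read pass
done
echo "done: $(stat -c %s "$FILE") bytes per pass"
```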
Thomas Mueller
2007-Aug-09 17:46 UTC
Re: [Xen-users] OOM killer observed during heavy I/O from VMs (XEN 3.0.4 and XEN 3.1)
hi there

> Under both XEN 3.0.4 (2.6.16.33) and XEN 3.1 (2.6.18), I can make the OOM killer
> appear in dom0 of my server by doing heavy I/O from within a VM.

Just wanted to say that I'm facing probably the same problem with XEN 3.1 and 256MB RAM in dom0. I also had it with 512MB, but not as often. Sometimes it happens when I "dd" a new file image (10GB) in dom0.

- Thomas
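[Editorial note: the dom0-side "dd a new file-image" workload Thomas describes can be sketched as below. The path is illustrative and the size is scaled down from 10GB to 16MiB so the sketch can run anywhere; a dd this large fills the dom0 page cache with dirty pages, which matches the large Writeback figures in Eric's logs.]

```shell
# Sketch of creating a guest file-image in dom0; conv=fsync forces the
# dirty pages out to disk before dd exits (size scaled down for demo).
dd if=/dev/zero of=/tmp/disk-image.img bs=1M count=16 conv=fsync 2>/dev/null

# A possible mitigation is oflag=direct, which bypasses the dom0 page
# cache entirely (requires a filesystem that supports O_DIRECT):
#   dd if=/dev/zero of=/tmp/disk-image.img bs=1M count=16 oflag=direct
ls -l /tmp/disk-image.img
```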
Eric Tessler
2007-Aug-10 04:25 UTC
Re: [Xen-users] OOM killer observed during heavy I/O from VMs (XEN 3.0.4 and XEN 3.1)
Can you send me your OOM killer output from /var/log/messages - I want to see if it's the same one that I am hitting.

If we increase our memory to 384MB or higher, we don't get the OOM killer - at least we have not seen it yet (we want to keep dom0 at 256MB).

Eric

Thomas Mueller <tmu@muellerit.ch> wrote:

> hi there
>
> > Under both XEN 3.0.4 (2.6.16.33) and XEN 3.1 (2.6.18), I can make the OOM killer
> > appear in dom0 of my server by doing heavy I/O from within a VM.
>
> just wanted to say, that I'm facing probably the same problem with XEN 3.1
> and 256MB RAM in dom0. I also had this with 512MB but not so often.
> Sometimes this happens if I "dd" a new file-image (10GB) in dom0.
>
> - Thomas
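[Editorial note: a sketch of how dom0 memory is typically pinned at a fixed size on Xen of this era, so ballooning cannot shrink it below the intended amount; exact file paths, kernel names, and section layout vary by distribution and are illustrative here. `dom0_mem` is a hypervisor boot option and `dom0-min-mem` a xend setting.]

```shell
# /boot/grub/menu.lst (Debian-style layout shown; names are illustrative):
#
#   title  Xen 3.1
#   kernel /boot/xen-3.1.gz dom0_mem=256M
#   module /boot/vmlinuz-2.6.18-xen root=/dev/sda1 ro
#   module /boot/initrd.img-2.6.18-xen
#
# And in /etc/xen/xend-config.sxp, keep xend from ballooning dom0 lower:
#
#   (dom0-min-mem 256)
```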
Artur Linhart - Linux communication
2007-Sep-26 13:27 UTC
RE: [Xen-users] OOM killer observed during heavy I/O from VMs (XEN 3.0.4 and XEN 3.1)
Hello,

I got this error also with 384 MB (having 7 Linux VMs and 1 W2K HVM on Xen 3.1.0 under Debian Etch dom0); so far the oom-killer has always selected the W2K instance to kill...

I think this depends very much on whether the DomU is on LVM, and especially on whether there is a snapshot of it. If you then, for example, copy a new image by "dd" onto such a partition that has a snapshot, the LVM subsystem has to allocate a lot of space for the new blocks in the snapshot, and that seems to lead to the memory problem described here. When I dropped the snapshot partition and used a smaller blocksize in dd, there was no problem. Sure, keeping a snapshot of a partition whose contents are about to be completely replaced is not good anyway - I had simply forgotten to drop the snapshot I created earlier for backup purposes...

I have read somewhere that a process can be prevented from being killed by the oom-killer, but I do not know how to apply that to a given domain after it has been started. http://linux-mm.org/OOM_Killer says: "Any particular process leader may be immunized against the oom killer if the value of it''s /proc/<pid>/oomadj is set to the constant OOM_DISABLE (currently defined as -17)." But how can I figure out which process is the one serving my DomU? And what can happen to the LVM subsystem if the memory is not increased and no DomU can be killed by the oom-killer? Does anybody have experience with this?

With regards,
Archie
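[Editorial note: a partial answer to Archie's question, as a hedged sketch. For an HVM guest such as the W2K domain, the dom0 process serving it is its qemu-dm instance; matching the domain name on the qemu-dm command line with pgrep is an assumption that holds for common qemu-dm invocations but is not guaranteed. On these 2.6.16/2.6.18 kernels the tunable file is /proc/<pid>/oom_adj (later kernels use oom_score_adj), and writing a negative value requires root.]

```shell
# Immunize a dom0 process against the OOM killer by writing
# OOM_DISABLE (-17) into its /proc/<pid>/oom_adj (needs root).
protect_pid() {
    pid=$1
    if [ -w "/proc/$pid/oom_adj" ]; then
        echo -17 > "/proc/$pid/oom_adj"
        echo "protected pid $pid"
    else
        echo "no writable /proc/$pid/oom_adj (wrong pid, or not root?)"
    fi
}

# Hypothetical lookup for an HVM guest named "w2k" (assumes the domain
# name appears on the qemu-dm command line, which is version-dependent):
#   pid=$(pgrep -f "qemu-dm.*w2k") && protect_pid "$pid"

protect_pid 4999999   # demo call with a pid that cannot exist
```

Note that this only shields the device-model process itself; it does not address the underlying dom0 memory pressure, so the OOM killer will simply pick another victim.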