This is a fairly simple test and it does work with Xen 4.2.

# xl info
host : tst035.dumpdata.com
release : 3.10.0-rc5upstream-00438-g335262d-dirty
version 3f00:17bae3ff:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 8016
free_memory : 5852
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
xen_major : 4
xen_minor : 3
xen_extra : -unstable
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : Mon Jun 10 14:42:51 2013 +0200 git:44434f3-dirty
xen_commandline : com1=115200,8n1 tmem=1 dom0_mem=max:2G cpufreq=xen:performance,verbose noreboot console=com1,vga loglvl=all guest_loglvl=all
cc_compiler : gcc (GCC) 4.4.4 20100503 (Red Hat 4.4.4-2)
cc_compile_by : konrad
cc_compile_domain : (none)
cc_compile_date : Mon Jun 10 17:01:43 EDT 2013
xend_config_format : 4

# [ 133.475684] xen-blkback:(backend_changed:585) .
Jun 11 13:39:58 tst035 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/51712
[ 133.477018] xen-blkback:(xen_vbd_create:421) Successful creation of handle=ca00 (dom=1)
[ 133.477018] .
[ 133.479632] xen-blkback:(frontend_changed:665) Initialising.
mapping kernel into physical memory
about to get started...
Jun 11 13:39:59 tst035 logger: /etc/xen/scripts/vif-bridge: online type_if=vif XENBUS_PATH=backend/vif/1/0
[ 133.635819] device vif1.0 entered promiscuous mode
[ 133.639363] IPv6: ADDRCONF(NETDEV_UP): vif1.0: link is not ready
Jun 11 13:39:59 tst035 logger: /etc/xen/scripts/vif-bridge: Successful vif-bridge online for vif1.0, bridge switch.
Jun 11 13:39:59 tst035 logger: /etc/xen/scripts/vif-bridge: Writing backend/vif/1/0/hotplug-status connected to xenstore.
[ 135.864732] IPv6: ADDRCONF(NETDEV_CHANGE): vif1.0: link becomes ready
[ 135.865760] switch: port 2(vif1.0) entered forward
[ 135.965777] xen-blkback:(frontend_changed:665) Initialised.
[ 135.966711] xen-blkback:(connect_ring:820) /local/domain/1/d persistent grants
[ 135.968942] xen-blkback:(connect:734) /local/domain/1/device/vbd/51712.
[ 135.981089] xen-blkback:(frontend_changed:665) Connected.
... snip..
[ 140.441073] xen-blkback: grant 38 added to the tree of persistent grants, using 28/1056
[ 140.441640] xen-blkback: grant 39 added to the tree of persistent grants, using 29/1056
[ 140.442284] xen-blkback: grant 40 added to the tree of persistent grants, using 30/1056
[ 140.442840] xen-blkback: grant 41 added to the tree of persistent grants, using 31/1056
[ 140.443389] xen-blkback: grant 42 added to the tree of persistent grants, using 32/1056
[ 140.443920] xen-blkback: grant 43 added to the tree of persistent grants, using 33/1056
[ 140.444449] xen-blkback: grant 44 added to the tree of persistent grants, using 34/1056
(XEN) tmem: initializing tmem capability for domid=1...<G><2>ok
(XEN) tmem: allocating persistent-private tmem pool for domid=1...<G><2>pool_id=0
[ 150.879132] switch: port 2(vif1.0) entered forwarding state
(XEN) Xen BUG at domain_page.c:143
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c4c0160461>] map_domain_page+0x450/0x514
(XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
(XEN) rax: 0000000000000020 rbx: ffff8300c68f9000 rcx: 0000000000000000
(XEN) rdx: 0000000000000020 rsi: 0000000000000020 rdi: 0000000000000000
(XEN) rbp: ffff82c4c02c7cc8 rsp: ffff82c4c02c7c88 r8: ffff820060001000
(XEN) r9: 00000000ffffffff r10: ffff820060006000 r11: 0000000000000000
(XEN) r12: ffff83022e1bb000 r13: 00000000001ebcdc r14: 0000000000000020
(XEN) r15: 0000000000000004 cr0: 0000000080050033 cr4: 00000000000426f0
(XEN) cr3: 0000000209541000 cr2: ffff88002b683fd0
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c4c02c7c88:
(XEN) ffff83022e1bb2d8 0000000000000286 ffff82c4c012760a ffff83022e1bb000
(XEN) ffff82e003d79b80 ffff82c4c02c7d60 00000000001ebcdc 0000000000000000
(XEN) ffff82c4c02c7d38 ffff82c4c01373de ffff82c4c0127b6b ffffffffffffffff
(XEN) 00000000c02c7d38 ffff82c4c02c7d58 ffff83022e1bb2d8 0000000000000286
(XEN) 0000000000000027 0000000000000000 0000000000001000 0000000000000000
(XEN) 0000000000000000 00000000001ebcdc ffff82c4c02c7d98 ffff82c4c01377c4
(XEN) 0000000000000000 ffff820040014000 ffff82e003d79b80 00000000001ebcdc
(XEN) ffff82c4c02c7d98 ffff830210ecf390 00000000fffffff4 ffff820040010010
(XEN) ffff82004001cf50 ffff83022e1bcc90 ffff82c4c02c7e18 ffff82c4c0135929
(XEN) ffff82c4c02c7db8 ffff82004001cf50 0000000000000000 00000000001ebcdc
(XEN) 0000000000000000 0000000000000000 0000e8a200000000 ffff82c4c02c7e00
(XEN) ffff82c4c02c7e18 ffff83022e1bcc90 ffff830210ecf390 0000000000000000
(XEN) 0000000000000001 000000000000009a ffff82c4c02c7ef8 ffff82c4c0136510
(XEN) 0000002700001000 0000000000000000 ffff82c4c02c7e90 97c4284effffffc2
(XEN) ffff82c4c02c7e68 ffff82c4c015719d ffff82c4c0127b09 0000000000000000
(XEN) ffff82c4c02c7e88 ffff82c4c018c13c ffff82c4c0319100 ffff82c4c02c7f18
(XEN) 0000000000000004 0000000000000001 0000000000000000 0000000000000000
(XEN) 000000000000e8a2 0000000000000000 00000000001ebcdc 000000000000e030
(XEN) 0000000000000246 ffff8300c68f9000 0000000000000000 0000000000000000
(XEN) 0000000000000001 0000000000000000 00007d3b3fd380c7 ffff82c4c02236db
(XEN) Xen call trace:
(XEN) [<ffff82c4c0160461>] map_domain_page+0x450/0x514
(XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
(XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
(XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
(XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
(XEN) [<ffff82c4c02236db>] syscall_enter+0xeb/0x145
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at domain_page.c:143
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)
>>> On 11.06.13 at 15:45, konrad wilk <konrad.wilk@oracle.com> wrote:
> This is a fairly simple test and it does work with Xen 4.2.
> [...]
> (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
> [...]
> (XEN) Xen call trace:
> (XEN) [<ffff82c4c0160461>] map_domain_page+0x450/0x514
> (XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
> (XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
> (XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
> (XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
> (XEN) [<ffff82c4c02236db>] syscall_enter+0xeb/0x145
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Xen BUG at domain_page.c:143
> (XEN) ****************************************

For one, you won't see this with debug=n.

And then it seems quite likely that tmem shows behavior that I'm
unaware of, and hence I may have broken it with the 16Tb support
patches - I suspect it simply drives the hypervisor out of domain page
mapping resources. After all, x86-64 didn't do any such mapping yet in
4.2.

But tmem being unsupported due to the still pending security audit
makes this a low priority issue anyway. And as you may or may not
recall, it is being disabled for systems with more than 5Tb too.
So quite a bit of work on the tmem side...

Jan
On 6/11/2013 10:46 AM, Jan Beulich wrote:
>>>> On 11.06.13 at 15:45, konrad wilk <konrad.wilk@oracle.com> wrote:
>> This is a fairly simple test and it does work with Xen 4.2.
>> [...]
>> (XEN) Panic on CPU 0:
>> (XEN) Xen BUG at domain_page.c:143
>> (XEN) ****************************************
> For one, you won't see this with debug=n.
>
> And then it seems quite likely that tmem shows behavior that I'm
> unaware of, and hence I may have broken it with the 16Tb
> support patches - I suspect it simply drives the hypervisor out of
> domain page mapping resources. After all, x86-64 didn't do any
> such mapping yet in 4.2.

I think this is a more subtle bug.
I applied a debug patch (see attached) and with the help of it and the logs:

(XEN) domain_page.c:160:d1 mfn (1ebe96) -> 6 idx: 32(i:1,j:0), branch:1
(XEN) domain_page.c:166:d1 [0] idx=26, mfn=0x1ebcd8, refcnt: 0
(XEN) domain_page.c:166:d1 [1] idx=12, mfn=0x1ebcd9, refcnt: 0
(XEN) domain_page.c:166:d1 [2] idx=2, mfn=0x210e9a, refcnt: 0
(XEN) domain_page.c:166:d1 [3] idx=14, mfn=0x210e9b, refcnt: 0
(XEN) domain_page.c:166:d1 [4] idx=7, mfn=0x210e9c, refcnt: 0
(XEN) domain_page.c:166:d1 [5] idx=10, mfn=0x210e9d, refcnt: 0
(XEN) domain_page.c:166:d1 [6] idx=5, mfn=0x210e9e, refcnt: 0
(XEN) domain_page.c:166:d1 [7] idx=13, mfn=0x1ebe97, refcnt: 0
(XEN) Xen BUG at domain_page.c:169
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
(XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff8300c68f9000 rcx: 0000000000000000
(XEN) rdx: ffff8302125b2020 rsi: 000000000000000a rdi: ffff82c4c027a6e8
(XEN) rbp: ffff8302125afcc8 rsp: ffff8302125afc48 r8: 0000000000000004
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: ffff83022e2ef000 r13: 00000000001ebe96 r14: 0000000000000020
(XEN) r15: ffff8300c68f9080 cr0: 0000000080050033 cr4: 00000000000426f0
(XEN) cr3: 0000000209541000 cr2: ffffffffff600400
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff8302125afc48:
(XEN) 00000000001ebe97 0000000000000000 0000000000000000 ffff830200000001
(XEN) ffff8302125afcc8 ffff82c400000000 00000000001ebe97 000000080000000d
(XEN) ffff83022e2ef2d8 0000000000000286 ffff82c4c0127b6b ffff83022e2ef000
(XEN) ffff82e003d7d2c0 ffff8302125afd60 00000000001ebe96 0000000000000000
(XEN) ffff8302125afd38 ffff82c4c01373de 0000000000000000 ffffffffffffffff
(XEN) 0000000000000001 ffff8302125afd58 ffff83022e2ef2d8 0000000000000286
(XEN) 0000000000000027 0000000000000000 0000000000001000 0000000000000000
(XEN) 0000000000000000 00000000001ebe96 ffff8302125afd98 ffff82c4c01377c4
(XEN) 0000000000000000 ffff820040017000 ffff82e003d7d2c0 00000000001ebe96
(XEN) ffff8302125afd98 ffff830210ecf390 00000000fffffff4 ffff820040009010
(XEN) ffff820040000f50 ffff83022e2f0c90 ffff8302125afe18 ffff82c4c0135929
(XEN) 000000160000001e ffff820040000f50 0000000000000000 00000000001ebe96
(XEN) 0000000000000000 0000000000000000 0000a2f6125afe28 ffff8302125afe00
(XEN) 0000001675f02b51 ffff83022e2f0c90 ffff830210ecf390 0000000000000000
(XEN) 0000000000000001 0000000000000065 ffff8302125afef8 ffff82c4c0136510
(XEN) ffff830200001000 0000000000000000 ffff8302125afe90 255ece02125b2040
(XEN) 00000003125afe68 00000016742667d1 ffff8302125b2100 0000003d52299000
(XEN) ffff8300c68f9000 0000000001c9c380 ffff8302125b2100 ffff8302125b1808
(XEN) 0000000000000004 0000000000000004 0000000000000000 0000000000000000
(XEN) 000000000000a2f6 0000000000000000 00000000001ebe96 ffff82c4c0126e77
(XEN) Xen call trace:
(XEN) [<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
(XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
(XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
(XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
(XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
(XEN) [<ffff82c4c022391b>] syscall_enter+0xeb/0x145
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Xen BUG at domain_page.c:169
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)

It looks as if the path that is taken is:
110          idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor);
111          if ( unlikely(idx >= dcache->entries) )
112          {

115              /* /First/, clean the garbage map and update the inuse list. */
116              for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ )
117              {
118                  dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0);
119                  accum |= ~dcache->inuse[i];

Here computes the accum

120              }
121
122              if ( accum )
123                  idx = find_first_zero_bit(dcache->inuse, dcache->entries);

Ok, finds the idx (32),

124              else
125              {

.. does not go here.

142              }
143          BUG_ON(idx >= dcache->entries);

And hits the BUG_ON().

But I am not sure if that is appropriate. Perhaps the BUG_ON was meant as a
check for the loop (lines 128 -> 141) - in case it looped around and never
found an empty place. But if that is the condition then that would also look
suspect as it might have found an empty hash entry and the idx would still
end up being 32.

> But tmem being unsupported due to the still pending security
> audit makes this a low priority issue anyway. And as you may

I think the discussion on whether a fix should go in Xen 4.3 (or the stable
releases) is something we can discuss when a patch has been fabricated.

> or may not recall, it is being disabled for systems with more
> than 5Tb too. So quite a bit of work on the tmem side...
>
> Jan
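To make the bookkeeping walked through above easier to follow, here is a small
standalone model of the two-bitmap scheme (plain C written for this write-up,
not Xen code; names and slot counts are simplified): unmapping only marks a
slot as garbage, and the reclaim pass at the start of the quoted snippet is
what actually frees slots, so a workload that always unmaps promptly never
runs dry.

/* Standalone model of the inuse/garbage bookkeeping -- not Xen code.
 * unmap only records the slot in "garbage"; the next map that finds no
 * free slot folds the garbage back into "inuse" (in Xen this is also the
 * point where the stale TLB entries get flushed). */
#include <stdio.h>

#define ENTRIES 32              /* matches dcache->entries in this thread */

static unsigned long inuse, garbage;

static int map_slot(void)
{
    unsigned int idx;

    for ( idx = 0; idx < ENTRIES && (inuse & (1UL << idx)); idx++ )
        ;
    if ( idx == ENTRIES )       /* no free slot: reclaim deferred unmaps */
    {
        inuse &= ~garbage;
        garbage = 0;
        for ( idx = 0; idx < ENTRIES && (inuse & (1UL << idx)); idx++ )
            ;
    }
    if ( idx == ENTRIES )
        return -1;              /* genuinely full -- tmem's situation below */
    inuse |= 1UL << idx;
    return idx;
}

static void unmap_slot(int idx)
{
    garbage |= 1UL << idx;      /* deferred release; the inuse bit stays set */
}

int main(void)
{
    int i, idx;

    for ( i = 0; i < 100; i++ )
    {
        if ( (idx = map_slot()) < 0 )
        {
            printf("out of slots after %d maps\n", i);
            return 1;
        }
        unmap_slot(idx);        /* short-lived use: everything gets unmapped */
    }
    printf("%d short-lived maps fit in %d slots\n", i, ENTRIES);
    return 0;
}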
On Tue, Jun 11, 2013 at 4:30 PM, konrad wilk <konrad.wilk@oracle.com> wrote:
> I think this is a more subtle bug.
> I applied a debug patch (see attached) and with the help of it and the logs:
>
> (XEN) domain_page.c:160:d1 mfn (1ebe96) -> 6 idx: 32(i:1,j:0), branch:1
> [...]
> (XEN) Xen BUG at domain_page.c:169
> [...]
> 122              if ( accum )
> 123                  idx = find_first_zero_bit(dcache->inuse, dcache->entries);
>
> Ok, finds the idx (32),
> [...]
> 143          BUG_ON(idx >= dcache->entries);
>
> And hits the BUG_ON().
>
> But I am not sure if that is appropriate. Perhaps the BUG_ON was meant as a
> check for the loop (lines 128 -> 141) - in case it looped around and never
> found an empty place. But if that is the condition then that would also look
> suspect as it might have found an empty hash entry and the idx would still
> end up being 32.

Right -- it is really curious that "accum |= ~dcache->inuse[x]" managed
to be non-zero, while find_first_zero_bit() goes off the end (as it seems).

It seems like you should add a printk in the first loop:

    if(~dcache->inuse[i]) printk(...);

Also, I don't think you've printed what dcache->entries is -- is it 32?

 -George
>>> On 11.06.13 at 17:30, konrad wilk <konrad.wilk@oracle.com> wrote:
> I think this is a more subtle bug.
> I applied a debug patch (see attached) and with the help of it and the logs:
>
> (XEN) domain_page.c:160:d1 mfn (1ebe96) -> 6 idx: 32(i:1,j:0), branch:1
> [...]
> (XEN) Xen BUG at domain_page.c:169
> [...]
> And hits the BUG_ON().
>
> But I am not sure if that is appropriate. Perhaps the BUG_ON was meant
> as a check for the loop (lines 128 -> 141) - in case it looped around and
> never found an empty place. But if that is the condition then that would
> also look suspect as it might have found an empty hash entry and the idx
> would still end up being 32.

The BUG_ON() here is definitely valid - a few lines down, after the
enclosing if(), we use it in ways that requires this to not have
triggered. It basically tells you whether an in range idx was found,
which apparently isn't the case here.

As I think George already pointed out - printing accum here would
be quite useful: It should have at least one of the low 32 bits set,
given that dcache->entries must be at most 32 according to the
data you already got logged.

Jan
On 6/11/2013 12:38 PM, Jan Beulich wrote:
>>>> On 11.06.13 at 17:30, konrad wilk <konrad.wilk@oracle.com> wrote:
>> I think this is a more subtle bug.
>> I applied a debug patch (see attached) and with the help of it and the logs:
>> [...]
>> But I am not sure if that is appropriate. Perhaps the BUG_ON was meant
>> as a check for the loop (lines 128 -> 141) - in case it looped around and
>> never found an empty place. But if that is the condition then that would
>> also look suspect as it might have found an empty hash entry and the idx
>> would still end up being 32.
> The BUG_ON() here is definitely valid - a few lines down, after the
> enclosing if(), we use it in ways that requires this to not have
> triggered. It basically tells you whether an in range idx was found,
> which apparently isn't the case here.
>
> As I think George already pointed out - printing accum here would
> be quite useful: It should have at least one of the low 32 bits set,
> given that dcache->entries must be at most 32 according to the
> data you already got logged.

Of course, here is the new log (and the debug attachment)

(XEN) domain_page.c:122:d1 [0]: ffffffff, idx: 32
(XEN) domain_page.c:167:d1 mfn (1eba98) -> 0 idx: 32(i:1,j:0), branch:9 0xffffffff00000000
(XEN) domain_page.c:173:d1 [0] idx=0, mfn=0x182790, refcnt: 0
(XEN) domain_page.c:173:d1 [1] idx=29, mfn=0x1946f9, refcnt: 0
(XEN) domain_page.c:173:d1 [2] idx=15, mfn=0x1946fa, refcnt: 0
(XEN) domain_page.c:173:d1 [3] idx=11, mfn=0x1946fb, refcnt: 0
(XEN) domain_page.c:173:d1 [4] idx=17, mfn=0x1946fc, refcnt: 0
(XEN) domain_page.c:173:d1 [5] idx=21, mfn=0x1946fd, refcnt: 0
(XEN) domain_page.c:173:d1 [6] idx=10, mfn=0x180296, refcnt: 0
(XEN) domain_page.c:173:d1 [7] idx=4, mfn=0x180297, refcnt: 0
(XEN) Xen BUG at domain_page.c:176
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c0160742>] map_domain_page+0x6b8/0x77c
(XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff8300c68f9000 rcx: 0000000000000000
(XEN) rdx: ffff83020e84c020 rsi: 000000000000000a rdi: ffff82c4c027a6e8
(XEN) rbp: ffff83020e847cc8 rsp: ffff83020e847c28 r8: 0000000000000004
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: ffff83022d815000 r13: 00000000001eba98 r14: 0000000000000020
(XEN) r15: ffff8300c68f9080 cr0: 0000000080050033 cr4: 00000000000426f0
(XEN) cr3: 000000019c644000 cr2: ffff88000ef124b0
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83020e847c28:
(XEN) 0000000000180297 0000000000000000 ffff830200000000 0000000000000009
(XEN) ffffffff00000000 ffff82c4c0116542 0000000000000297 ffffffff00000000
(XEN) ffff830200000020 00000000c01714f6 0000000000180297 0000000800000004
(XEN) ffff83022d8152d8 0000000000000286 ffff82c4c012760a ffff83022d815000
(XEN) ffff82e003d75300 ffff83020e847d60 00000000001eba98 0000000000000000
(XEN) ffff83020e847d38 ffff82c4c01373de 0000000000000000 ffffffffffffffff
(XEN) 0000000000000001 ffff83020e847d58 ffff83022d8152d8 0000000000000286
(XEN) 0000000000000027 0000000000000000 0000000000001000 0000000000000000
(XEN) 0000000000000000 00000000001eba98 ffff83020e847d98 ffff82c4c01377c4
(XEN) 0000000000000000 ffff82004001a000 ffff82e003d75300 00000000001eba98
(XEN) ffff83020e847d98 ffff83020354f390 00000000fffffff4 ffff820040002010
(XEN) ffff820040001580 ffff83022d816c90 ffff83020e847e18 ffff82c4c0135929
(XEN) ffff83020e847db8 ffff820040001580 0000000000000000 00000000001eba98
(XEN) 0000000000000000 0000000000000000 000001f200000000 ffff83020e847e00
(XEN) ffff83020e847e18 ffff83022d816c90 ffff83020354f390 0000000000000000
(XEN) 0000000000000001 0000000000000091 ffff83020e847ef8 ffff82c4c0136510
(XEN) ffff830200001000 0000000000000000 ffff83020e847e90 bbbc0ca3c027bba0
(XEN) ffff82c4c027bba0 ffff82c4c02e0000 0000000000000002 ffff83020e847e78
(XEN) ffff82c4c0127b09 ffff82c4c027bba0 ffff83020e847e98 ffff82c4c01299af
(XEN) 0000000000000004 0000000000000005 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82c4c0160742>] map_domain_page+0x6b8/0x77c
(XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
(XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
(XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
(XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
(XEN) [<ffff82c4c02239bb>] syscall_enter+0xeb/0x145
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Xen BUG at domain_page.c:176
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)
> The BUG_ON() here is definitely valid - a few lines down, after the
> enclosing if(), we use it in ways that requires this to not have
> triggered. It basically tells you whether an in range idx was found,
> which apparently isn't the case here.
>
> As I think George already pointed out - printing accum here would
> be quite useful: It should have at least one of the low 32 bits set,
> given that dcache->entries must be at most 32 according to the
> data you already got logged.

With extra debugging (see attached patch)

(XEN) domain_page.c:125:d1 mfn: 1eb483, [0]: bffff1ff, ~ffffffff40000e00, idx: 9 garbage: 40000e00, inuse: ffffffff
(XEN) domain_page.c:125:d1 mfn: 1eb480, [0]: fdbfffff, ~ffffffff02400000, idx: 22 garbage: 2400000, inuse: ffffffff
(XEN) domain_page.c:125:d1 mfn: 2067ca, [0]: fffff7ff, ~ffffffff00000800, idx: 11 garbage: 800, inuse: ffffffff
(XEN) domain_page.c:125:d1 mfn: 183642, [0]: ffffffff, ~ffffffff00000000, idx: 32 garbage: 0, inuse: ffffffff
(XEN) domain_page.c:170:d1 mfn (183642) -> 2 idx: 32(i:1,j:0), branch:9 0xffffffff00000000
(XEN) domain_page.c:176:d1 [0] idx=13, mfn=0x203b00, refcnt: 0
(XEN) domain_page.c:176:d1 [1] idx=25, mfn=0x1839e1, refcnt: 0
(XEN) domain_page.c:176:d1 [2] idx=3, mfn=0x1824d2, refcnt: 0
(XEN) domain_page.c:176:d1 [3] idx=5, mfn=0x1eb48b, refcnt: 0
(XEN) domain_page.c:176:d1 [4] idx=28, mfn=0x203b04, refcnt: 0
(XEN) domain_page.c:176:d1 [5] idx=0, mfn=0x1eb485, refcnt: 0
(XEN) domain_page.c:176:d1 [6] idx=30, mfn=0x203afe, refcnt: 0
(XEN) domain_page.c:176:d1 [7] idx=20, mfn=0x203aff, refcnt: 0

And that does paint the picture that we have exhausted the full 32
entries of the mapcache.

Now off to find out who is holding them and why. Aren't these
operations (map/unmap domain_page) supposed to be short-lived?

> Jan
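For reference, the usage model these mappings are built around is strictly
map, use, unmap within a single operation. A minimal sketch in the Xen idiom:
map_domain_page()/unmap_domain_page() are the real interfaces, while the
caller copy_from_guest_frame() and its arguments are hypothetical, invented
here purely for illustration.

/* Sketch of the intended short-lived usage; "copy_from_guest_frame" is a
 * hypothetical illustrative caller, not a function in the Xen tree. */
static void copy_from_guest_frame(unsigned long mfn, void *dst)
{
    void *va = map_domain_page(mfn);   /* claims one of the 32 slots */

    memcpy(dst, va, PAGE_SIZE);
    unmap_domain_page(va);             /* slot becomes reclaimable again */
}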
On 6/11/2013 2:52 PM, konrad wilk wrote:
> With extra debugging (see attached patch)
> [...]
> And that does paint the picture that we have exhausted the full 32
> entries of the mapcache.
>
> Now off to find out who is holding them and why. Aren't these
> operations (map/unmap domain_page) supposed to be short-lived?

And found the culprit. With some EIP logging:

(XEN) domain_page.c:214:d1 [0] mfn=0x1ff67a idx=0, mfn=0x1ff67a, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [1] mfn=18fef2, [EIP=0]
(XEN) domain_page.c:216:d1 [2] mfn=1eb518, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [3] mfn=170a08, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [4] mfn=18feef, [EIP=0]
(XEN) domain_page.c:216:d1 [5] mfn=1eb4c8, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [6] mfn=202699, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [7] mfn=18fef0, [EIP=0]
(XEN) domain_page.c:216:d1 [8] mfn=0, [EIP=0]
(XEN) domain_page.c:214:d1 [9] mfn=0x18e7ed idx=9, mfn=0x18e7ed, refcnt: 0 [EIP=0]
(XEN) domain_page.c:214:d1 [10] mfn=0x18f629 idx=10, mfn=0x18f629, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [11] mfn=1eb47e, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:214:d1 [12] mfn=0x18209c idx=12, mfn=0x18209c, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [13] mfn=18fef5, [EIP=0]
(XEN) domain_page.c:214:d1 [14] mfn=0x18f62b idx=14, mfn=0x18f62b, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [15] mfn=1eb459, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [16] mfn=1eb512, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [17] mfn=170d2b, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [18] mfn=20272b, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [19] mfn=16c22c, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [20] mfn=18fef4, [EIP=0]
(XEN) domain_page.c:216:d1 [21] mfn=18e7e9, [EIP=0]
(XEN) domain_page.c:216:d1 [22] mfn=18feee, [EIP=0]
(XEN) domain_page.c:216:d1 [23] mfn=1eb4a3, [tmh_persistent_pool_page_get+0x26d/0x2d8]
(XEN) domain_page.c:216:d1 [24] mfn=18fef3, [EIP=0]
(XEN) domain_page.c:214:d1 [25] mfn=0x18f62f idx=25, mfn=0x18f62f, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [26] mfn=18ff02, [__get_page_type+0x1001/0x146a]
(XEN) domain_page.c:214:d1 [27] mfn=0x18fefe idx=27, mfn=0x18fefe, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [28] mfn=18ff00, [__get_page_type+0xcc3/0x146a]
(XEN) domain_page.c:216:d1 [29] mfn=0, [EIP=0]
(XEN) domain_page.c:214:d1 [30] mfn=0x18f628 idx=30, mfn=0x18f628, refcnt: 0 [EIP=0]
(XEN) domain_page.c:216:d1 [31] mfn=1eb4ed, [tmh_persistent_pool_page_get+0x26d/0x2d8]

And from a brief look at the code it looks as if any call into the xmalloc
pool code ends up calling map_domain_page(). Since most of the tmem code is
using the pool to store guest pages (looking briefly at tmem_malloc), this
would explain why we ran out of the 32 slots - especially as we don't free
them until the guest puts the persistent pages back.

The fix.. well, not here yet, but I think it would mostly concentrate on the
tmem code.

Thanks for the suggestion on looking at the accum value.
>>> On 11.06.13 at 23:06, konrad wilk <konrad.wilk@oracle.com> wrote:
> On 6/11/2013 2:52 PM, konrad wilk wrote:
>> And that does paint the picture that we have exhausted the full 32
>> entries of the mapcache.
>>
>> Now off to find out who is holding them and why. Aren't these
>> operations (map/unmap domain_page) supposed to be short-lived?

Yes, they are.

> And found the culprit. With some EIP logging:
> [...]
> (XEN) domain_page.c:216:d1 [31] mfn=1eb4ed, [tmh_persistent_pool_page_get+0x26d/0x2d8]
>
> And from a brief look at the code it looks as if any call into the xmalloc
> pool code ends up calling map_domain_page(). Since most of the tmem code is
> using the pool to store guest pages (looking briefly at tmem_malloc), this
> would explain why we ran out of the 32 slots - especially as we don't free
> them until the guest puts the persistent pages back.

Yes, this is not (and never was) a valid use model for map_domain_page().

What's really odd is the respective difference between
tmh_mempool_page_get() (using page_to_virt() on the result of
tmh_alloc_page()) and tmh_persistent_pool_page_get() (using
__map_domain_page() on what _tmh_alloc_page_thispool() returned), while
both allocation functions end up calling alloc_domheap_page(). With Dan
no longer around, it may be hard to understand the reasons behind this
brokenness.

As tmem gets disabled anyway when there is memory not covered by the
1:1 mapping, switching tmh_persistent_pool_page_get() to use
page_to_virt() would appear to be the obvious immediate solution.
Re-enabling it on such huge memory systems is going to require a
re-design anyway afaict.

Jan
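Based purely on Jan's description above, the immediate change would look
roughly like the sketch below. The function body is reconstructed from this
thread for illustration and is not copied from the tree; only the interfaces
named by Jan (page_to_virt(), __map_domain_page(), _tmh_alloc_page_thispool())
are taken as given.

/* Sketch only: reconstructed from Jan's description of
 * tmh_persistent_pool_page_get(), with the one-line change he proposes. */
static void *tmh_persistent_pool_page_get(unsigned long size)
{
    struct page_info *pi = _tmh_alloc_page_thispool(current->domain);

    if ( pi == NULL )
        return NULL;

    /* Was: return __map_domain_page(pi); -- that pins a mapcache slot for
     * as long as the pool keeps the page, which is what exhausts the cache.
     * Using the 1:1 direct mapping instead is safe while tmem stays disabled
     * on hosts whose memory is not fully covered by that mapping. */
    return page_to_virt(pi);
}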
create ^
title it map_domain_page second-stage emergency fallback path never taken
thanks

On Tue, Jun 11, 2013 at 7:52 PM, konrad wilk <konrad.wilk@oracle.com> wrote:
>> The BUG_ON() here is definitely valid - a few lines down, after the
>> enclosing if(), we use it in ways that requires this to not have
>> triggered. It basically tells you whether an in range idx was found,
>> which apparently isn't the case here.
>>
>> As I think George already pointed out - printing accum here would
>> be quite useful: It should have at least one of the low 32 bits set,
>> given that dcache->entries must be at most 32 according to the
>> data you already got logged.
>
> With extra debugging (see attached patch)
>
> (XEN) domain_page.c:125:d1 mfn: 1eb483, [0]: bffff1ff, ~ffffffff40000e00, idx: 9 garbage: 40000e00, inuse: ffffffff
> (XEN) domain_page.c:125:d1 mfn: 1eb480, [0]: fdbfffff, ~ffffffff02400000, idx: 22 garbage: 2400000, inuse: ffffffff
> (XEN) domain_page.c:125:d1 mfn: 2067ca, [0]: fffff7ff, ~ffffffff00000800, idx: 11 garbage: 800, inuse: ffffffff
> (XEN) domain_page.c:125:d1 mfn: 183642, [0]: ffffffff, ~ffffffff00000000, idx: 32 garbage: 0, inuse: ffffffff

So regardless of the fact that tmem is obviously holding what are
supposed to be short-term references for so long, there is something
that seems not quite right about this failure path.

It looks like the algorithm is:
1. Clean the garbage map and update the inuse list
2. If anything has been cleaned up, use the first not-inuse entry
3. Otherwise, do something else ("replace a hash entry" -- not sure
   exactly what that means).

What we see above is that this failure path succeeds three times, but
fails the fourth time: there are, in fact, no zero entries after the
garbage clean-up; however, because "inuse" is 32-bit (effectively) and
"accum" is 64-bit, ~inuse always has bits 32-63 set, and so will
always return true and never fall back to the "something else" path.

This is probably not something we need to fix for 4.3, but we should
put it on our to-do list.

 -George
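A standalone model of the mismatch George describes (plain C written for this
write-up, not Xen code) reproduces the failure directly: with 32 entries
tracked in a 64-bit word, the "anything reclaimed?" test can never come out
false, so step 3 above is unreachable.

/* Standalone model -- not Xen code -- of the failure described above:
 * 32 mapcache entries tracked in one 64-bit word mean ~inuse always has
 * bits 32-63 set, so "accum" is non-zero even with every entry occupied,
 * and the scan then runs off the end of the valid range. */
#include <stdio.h>

/* Simplified single-word stand-in for the real find_first_zero_bit(). */
static unsigned int first_zero_bit(unsigned long word, unsigned int nbits)
{
    unsigned int i;

    for ( i = 0; i < nbits && (word & (1UL << i)); i++ )
        ;
    return i;                           /* == nbits when it runs off the end */
}

int main(void)
{
    unsigned int entries = 32;          /* dcache->entries in this thread */
    unsigned long inuse = 0xffffffffUL; /* all 32 real slots occupied */
    unsigned long garbage = 0;          /* nothing was unmapped */
    unsigned long accum = 0;
    unsigned int idx;

    inuse &= ~garbage;                  /* step 1: clean the garbage map */
    accum |= ~inuse;                    /* bits 32-63 leak in: 0xffffffff00000000 */

    if ( accum )                        /* always true, so step 3 is dead code */
    {
        idx = first_zero_bit(inuse, entries);
        printf("accum=%#lx, idx=%u, entries=%u -> BUG_ON(idx >= entries)\n",
               accum, idx, entries);
    }
    else
        printf("would fall back to replacing a hash entry\n");
    return 0;
}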
xen@bugs.xenproject.org
2013-Jun-12 11:15 UTC
Processed: Re: Xen 4.3 + tmem = Xen BUG at domain_page.c:143
Processing commands for xen@bugs.xenproject.org:

> create ^
Created new bug #14 rooted at `<51B7720B.10607@oracle.com>'
Title: `Re: [Xen-devel] Xen 4.3 + tmem = Xen BUG at domain_page.c:143'

> title it map_domain_page second-stage emergency fallback path never taken
Set title for #14 to `map_domain_page second-stage emergency fallback path never taken'

> thanks
Finished processing.

Modified/created Bugs:
 - 14: http://bugs.xenproject.org/xen/bug/14 (new)

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs
Contact xen-bugs-owner@bugs.xenproject.org with any infrastructure issues
On Wed, Jun 12, 2013 at 12:00 PM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> This is probably not something we need to fix for 4.3, but we should
> put it on our to-do list.

Konrad,

Could you try the attached patch with your debug patch, to see if it
successfully falls back to the "replace a map hash" path?

 -George
>>> On 12.06.13 at 13:00, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> create ^
> title it map_domain_page second-stage emergency fallback path never taken
> thanks
>
> On Tue, Jun 11, 2013 at 7:52 PM, konrad wilk <konrad.wilk@oracle.com> wrote:
>> [...]
>> (XEN) domain_page.c:125:d1 mfn: 183642, [0]: ffffffff, ~ffffffff00000000, idx: 32 garbage: 0, inuse: ffffffff
>
> So regardless of the fact that tmem is obviously holding what are
> supposed to be short-term references for so long, there is something
> that seems not quite right about this failure path.
>
> It looks like the algorithm is:
> 1. Clean the garbage map and update the inuse list
> 2. If anything has been cleaned up, use the first not-inuse entry
> 3. Otherwise, do something else ("replace a hash entry" -- not sure
>    exactly what that means).
>
> What we see above is that this failure path succeeds three times, but
> fails the fourth time: there are, in fact, no zero entries after the
> garbage clean-up; however, because "inuse" is 32-bit (effectively) and
> "accum" is 64-bit, ~inuse always has bits 32-63 set, and so will
> always return true and never fall back to the "something else" path.

Right, that's what occurred to me too yesterday, but then again I knew
I had seen this code path executed. Now that I look again, I think I
understand why: All of my Dom0-s and typical DomU-s have a vCPU count
divisible by 4, and with MAPCACHE_VCPU_ENTRIES being 16, the full
unsigned long would always be used.

> This is probably not something we need to fix for 4.3, but we should
> put it on our to-do list.

Actually I think we should fix this right away.

Jan
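Assuming, as Jan's remark implies, that the per-domain cache holds
MAPCACHE_VCPU_ENTRIES (16) slots per vCPU and tracks them in 64-bit words,
the number of "phantom" free bits in the last word depends only on the vCPU
count, which is why a 2-vCPU guest (32 entries, as in Konrad's logs) hits the
problem while a 4-vCPU one does not. A quick illustration:

/* Quick illustration of Jan's point; the entries-per-vCPU relation is
 * taken from his remark above, the rest is plain arithmetic. */
#include <stdio.h>

int main(void)
{
    const unsigned int MAPCACHE_VCPU_ENTRIES = 16, BITS_PER_LONG = 64;
    unsigned int vcpus;

    for ( vcpus = 1; vcpus <= 4; vcpus++ )
    {
        unsigned int entries = vcpus * MAPCACHE_VCPU_ENTRIES;
        unsigned int phantom = (BITS_PER_LONG - entries % BITS_PER_LONG)
                               % BITS_PER_LONG;

        printf("%u vCPU(s): %2u entries, %2u phantom bits in the last word\n",
               vcpus, entries, phantom);
    }
    return 0;
}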
>>> On 12.06.13 at 13:37, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Wed, Jun 12, 2013 at 12:00 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
>> This is probably not something we need to fix for 4.3, but we should
>> put it on our to-do list.
>
> Could you try the attached patch with your debug patch, to see if it
> successfully falls back to the "replace a map hash" path?

This could be a fix, but I'd prefer to do this with a change to an
entirely cold code path, i.e. by setting the unused bits in the last
word in mapcache_vcpu_init().

Jan
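A sketch of the idea Jan prefers: permanently mark the tail bits of the last
inuse word at vCPU-init time, so ~inuse can never advertise slots beyond
dcache->entries and the hot-path logic stays unchanged. The helper name and
the struct name are assumptions made for this sketch; dcache->inuse and
dcache->entries are the fields visible in the code quoted earlier.

/* Sketch of the cold-path fix Jan prefers -- not the actual patch.
 * mark_unused_tail_bits() is a hypothetical helper to be called from
 * mapcache_vcpu_init() once dcache->entries has been (re)computed. */
static void mark_unused_tail_bits(struct mapcache_domain *dcache)
{
    unsigned int i;

    /* Set every bit from dcache->entries up to the next word boundary,
     * so the cleanup loop's "accum |= ~inuse[i]" never sees them as free. */
    for ( i = dcache->entries; i & (BITS_PER_LONG - 1); i++ )
        __set_bit(i, dcache->inuse);
}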
On 12/06/13 13:12, Jan Beulich wrote:
>>>> On 12.06.13 at 13:00, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> [...]
>> What we see above is that this failure path succeeds three times, but
>> fails the fourth time: there are, in fact, no zero entries after the
>> garbage clean-up; however, because "inuse" is 32-bit (effectively) and
>> "accum" is 64-bit, ~inuse always has bits 32-63 set, and so will
>> always return true and never fall back to the "something else" path.
> Right, that's what occurred to me too yesterday, but then again I knew
> I had seen this code path executed. Now that I look again, I think I
> understand why: All of my Dom0-s and typical DomU-s have a vCPU count
> divisible by 4, and with MAPCACHE_VCPU_ENTRIES being 16, the full
> unsigned long would always be used.
>
>> This is probably not something we need to fix for 4.3, but we should
>> put it on our to-do list.
> Actually I think we should fix this right away.

How often is the second path taken in practice?

And, you said this doesn't happen with debug=n builds -- why not exactly?

I'm trying to assess the actual risk of not fixing it, vs the risk of
fixing it.

 -George
>>> On 12.06.13 at 15:16, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 12/06/13 13:12, Jan Beulich wrote:
>> [...]
>> Actually I think we should fix this right away.
>
> How often is the second path taken in practice?

On non-debug builds, not at all except on systems with more than 5Tb
(as all of the map_domain_page() code). Once domain page mappings are
needed, this depends on the use pattern of the function. In any case
this is going to be way more frequent than the one time per-vCPU setup.

> And, you said this doesn't happen with debug=n builds -- why not exactly?

That was with the user (tmem) in mind: For <= 5Tb systems, as said
above, map_domain_page() has a short cut. And for >5Tb systems tmem
gets turned off. But any other users of the function could still run
into this on huge memory systems, and with this being one of the listed
new features of 4.3 I think we should fix it.

Jan
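To make that answer concrete, the non-debug fast path can be thought of as
the sketch below. Apart from mfn_to_virt(), the names are hypothetical
stand-ins and the real code is organised differently; the point is only that
on hosts whose RAM is entirely reachable through the 1:1 direct map (up to
roughly 5Tb) the mapcache, and therefore the BUG_ON discussed here, is never
exercised.

/* Conceptual sketch only; direct_map_covers() and mapcache_map() are
 * hypothetical stand-ins, not functions in the Xen tree. */
void *map_domain_page_sketch(unsigned long mfn)
{
    /* debug=n, <= 5Tb host: the frame is already reachable through the
     * hypervisor's 1:1 direct mapping, no mapcache slot is consumed. */
    if ( direct_map_covers(mfn) )
        return mfn_to_virt(mfn);

    /* > 5Tb host (or debug=y): fall back to the per-vCPU mapcache,
     * which is where the slot exhaustion and the BUG_ON live. */
    return mapcache_map(mfn);
}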
Konrad Rzeszutek Wilk
2013-Jun-12 14:13 UTC
Re: Xen 4.3 + tmem = Xen BUG at domain_page.c:143
On Wed, Jun 12, 2013 at 12:37:53PM +0100, George Dunlap wrote:
> On Wed, Jun 12, 2013 at 12:00 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
> > This is probably not something we need to fix for 4.3, but we should
> > put it on our to-do list.
>
> Konrad,
>
> Could you try the attached patch with your debug patch, to see if it
> successfully falls back to the "replace a map hash" path?

It does, but I believe it falls into the BUG_ON scenario where idx = 32 and
the BUG_ON gets hit. Please see the serial log (attached is the debug patch +
your patch):

(XEN) domain_page.c:137:d1 mfn: 1eb533, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1eb864, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1eb862, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1eb531, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1ebd14, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 18e8bb, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 2062e2, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1eb525, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 20653e, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1ed3a6, [0]: ff000000->ff000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff
(XEN) domain_page.c:137:d1 mfn: 1ed3a6, [0]: ff000000->7f000000, garbage: 80000000 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffff7fffffff
(XEN) domain_page.c:137:d1 mfn: 1eb525, [0]: ff000000->df000000, garbage: 20000000 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffdfffffff
(XEN) domain_page.c:137:d1 mfn: 1ed3a4, [0]: df000000->df000000, garbage: 0 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffffffffff

BusyBox v1.14.3 (2013-06-10 16:30:12 EDT) built-in shell (ash)
Enter 'help' for a list of built-in commands.
# (XEN) domain_page.c:137:d1 mfn: 18e8c8, [0]: ff000000->7f000000, garbage: 80000000 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffff7fffffff (XEN) domain_page.c:137:d1 mfn: 1eb531, [0]: ff000000->ef000000, garbage: 10000000 -> ~0 (mask: 0), idx: 0, ~garbage: ffffffffeb531) -> 1 idx: 32(i:1,j:8), branch:a accum: 0x0 (XEN) domain_page.c:222:d1 [0] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [1] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [2] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [3] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [4] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [5] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [6] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [7] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [8] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [9] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [10] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [11] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [12] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [13] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [14] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [15] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [16] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [17] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [18] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [19] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [20] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [21] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [22] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [23] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [24] mfn=1eb531, [EIP=0] (XEN) domain_page.c:222:d1 [25] mfn=2062e2, [EIP=0] (XEN) domain_page.c:222:d1 [26] mfn=18e8bb, [EIP=0] (XEN) domain_page.c:222:d1 [27] mfn=18e8c8, [EIP=0] (XEN) domain_page.c:222:d1 [28] mfn=0, [EIP=0] (XEN) domain_page.c:222:d1 [29] mfn=1ed3a5, [EIP=0] (XEN) domain_page.c:222:d1 [30] mfn=1ed3a6, [EIP=0] (XEN) domain_page.c:222:d1 [31] mfn=1ebc14, [EIP=0] (XEN) Xen BUG at domain_page.c:226 (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<ffff82c4c01609bf>] map_domain_page+0x920/0xa03 (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor (XEN) rax: 0000000000000000 rbx: ffff8300c68f8000 rcx: 0000000000000000 (XEN) rdx: ffff82c4c03191a0 rsi: 000000000000000a rdi: ffff82c4c027a6e8 (XEN) rbp: ffff82c4c02c7e58 rsp: ffff82c4c02c7d78 r8: 0000000000000004 (XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001 (XEN) r12: ffff83022d7b7000 r13: 00000000001eb531 r14: 0000000000000020 (XEN) r15: 0000000000000020 cr0: 000000008005003b cr4: 00000000000426f0 (XEN) cr3: 000000020653e000 cr2: 00007f808ece33e0 (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008 (XEN) Xen stack trace from rsp=ffff82c4c02c7d78: (XEN) ffff82c4c02f0de0 0000000000000001 0000000000000008 000000000000000a (XEN) 0000000000000000 ffff830000000000 ffffffffefffffff 0000000000000000 (XEN) 0000000000000000 0000002000000000 00000000ff000000 00000000001eb531 (XEN) ffff82c4c02c7e20 0000000000000020 ffff82c4c02c7e18 ffff8300c68f83c8 (XEN) 0000000ac02c7f08 ffff82c400000000 ffff83022d7b72d8 0000000000000286 (XEN) ffff82c4c03191c0 0000000000000002 0000000900000000 ffff83022d7b7000 (XEN) ffff83022d7b7000 ffff8300c68f8000 0000000000000000 ffff88003f904000 (XEN) ffff82c4c02c7ee8 ffff82c4c017a242 ffffffff81039c19 00000000000426f0 (XEN) 80100001ebc14065 ffff82c4c02c0000 ffffffffffffffff ffff82c4c02c7ed0 (XEN) ffff82c4c0126e77 ffff82c4c02e0000 ffff82c4c02c00e7 00000000001eb531 (XEN) ffffffffffffffff ffff8300c68f8000 0000000000000000 ffff88003e29fe40 (XEN) ffff88003e29fee0 0000000000000080 ffff82c4c02c7ef8 ffff82c4c017a50f (XEN) 
00007d3b3fd380c7 ffff82c4c0223c4b ffffffff810011ca 000000000000000e (XEN) ffff88003e27f800 ffff88003f90c740 0000000000000001 ffff88003f911640 (XEN) ffff88003e29fe30 ffff88003f904000 0000000000000286 0000000000000000 (XEN) ffff880000000000 00003ffffffff000 000000000000000e ffffffff810011ca (XEN) 0000000000000000 80100001ebc14065 ffff88003f904000 0001010000000000 (XEN) ffffffff810011ca 000000000000e033 0000000000000286 ffff88003e29fdf8 (XEN) 000000000000e02b 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 ffff8300c68f8000 0000000000000000 (XEN) Xen call trace: (XEN) [<ffff82c4c01609bf>] map_domain_page+0x920/0xa03 (XEN) [<ffff82c4c017a242>] __do_update_va_mapping+0xd6/0x316 (XEN) [<ffff82c4c017a50f>] do_update_va_mapping+0x1e/0x22 (XEN) [<ffff82c4c0223c4b>] syscall_enter+0xeb/0x145 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Xen BUG at domain_page.c:226 (XEN) **************************************** (XEN) (XEN) Manual reset required ('noreboot' specified)
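The log above shows the boundary condition being discussed: with all 32 entries in use and accum ending up 0, the index that comes back equals the number of entries, which is exactly what a bounds check of the kind the BUG_ON in the trace represents then catches. The following is a small illustrative sketch with an assumed helper name, not the Xen implementation.

    /* Illustrative only.  Shows how scanning a fully-set in-use word yields
     * an index equal to the entry count, tripping a bounds check like the
     * BUG_ON reported in the trace above. */
    #include <stdio.h>

    static unsigned int first_clear_bit(unsigned long word, unsigned int limit)
    {
        unsigned int i;
        for ( i = 0; i < limit; i++ )
            if ( !(word & (1UL << i)) )
                break;
        return i;                 /* == limit when every bit is set */
    }

    int main(void)
    {
        unsigned int entries = 32;
        unsigned long inuse = 0xffffffffUL;   /* all 32 entries occupied */
        unsigned int idx = first_clear_bit(inuse, entries);

        printf("idx = %u, entries = %u\n", idx, entries);
        if ( idx >= entries )
            printf("out of range -> this is where a BUG_ON(idx >= entries) style check fires\n");
        return 0;
    }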
On 12/06/2013 13:12, "Jan Beulich" <JBeulich@suse.com> wrote:>> What we see above is that this failure path succeeds three times, but >> fails the fourth time: there are, in fact, no zero entries after the >> garbage clean-up; however, because "inuse" is 32-bit (effectively) and >> "accum" is 64-bit, ~inuse always has bits 32-63 set, and so will >> always return true and never fall back to the "something else" > > Right, that''s what occurred to me too yesterday, but the again > I knew I had seen this code path executed. Now that I look again, > I think I understand why: All of my Dom0-s and typical DomU-s > have a vCPU count divisible by 4, and with MAPCACHE_VCPU_ENTRIES > being 16, the full unsigned long would always be used.Why are we so tight on MAPCACHE_VCPU_ENTRIES? Why not say double that number and get rid of the accum and the ''replace a hash entry instead'' logic instead? We never used to have it, and it''s kind of extra complication and a bit gross. -- Keir
On 12/06/2013 16:11, "Keir Fraser" <keir.xen@gmail.com> wrote:> On 12/06/2013 13:12, "Jan Beulich" <JBeulich@suse.com> wrote: > >>> What we see above is that this failure path succeeds three times, but >>> fails the fourth time: there are, in fact, no zero entries after the >>> garbage clean-up; however, because "inuse" is 32-bit (effectively) and >>> "accum" is 64-bit, ~inuse always has bits 32-63 set, and so will >>> always return true and never fall back to the "something else" >> >> Right, that''s what occurred to me too yesterday, but the again >> I knew I had seen this code path executed. Now that I look again, >> I think I understand why: All of my Dom0-s and typical DomU-s >> have a vCPU count divisible by 4, and with MAPCACHE_VCPU_ENTRIES >> being 16, the full unsigned long would always be used. > > Why are we so tight on MAPCACHE_VCPU_ENTRIES? Why not say double that number > and get rid of the accum and the ''replace a hash entry instead'' logic > instead? We never used to have it, and it''s kind of extra complication and a > bit gross.Could even pull MAPHASH_ENTRIES into config.h and explicitly add it to MAPCACHE_VCPU_ENTRIES. That would be neat would it not? And I believe we have space in the mapcache''s perdomain slot to enlarge MAPCACHE_VCPU_ENTRIES in this way (i.e., from 16 to 24). -- Keir> -- Keir > >
>>> On 12.06.13 at 17:11, Keir Fraser <keir.xen@gmail.com> wrote:
> On 12/06/2013 13:12, "Jan Beulich" <JBeulich@suse.com> wrote:
> [...]
>
> Why are we so tight on MAPCACHE_VCPU_ENTRIES? Why not say double that number
> and get rid of the accum and the 'replace a hash entry instead' logic
> instead? We never used to have it, and it's kind of extra complication and a
> bit gross.

First of all, doubling the entries is not an argument for dropping that code -
the old 32-bit implementation really would have needed this too from a
theoretical perspective: the number of readily available (garbage) entries is
bounded by MAPCACHE_VCPU_ENTRIES - MAPHASH_ENTRIES (because the hash entries
actively block getting treated as garbage).

Adding MAPHASH_ENTRIES into the calculation of MAPCACHE_VCPU_ENTRIES would
result in header dependency problems (I tried this when I re-activated that
code, as ideally we would want to set this value to what we anticipate we
might need _plus_ MAPHASH_ENTRIES). Consequently the code should be prepared
to recover entries from the hash anyway (this is particularly relevant when
MAPCACHE_VCPU_ENTRIES <= MAPHASH_ENTRIES - while that's not the case
currently, if someone decided to double the latter it would be).

Finally, the VA range for this already is an order-17 block (with the inuse
and garbage maps added at the end, i.e. it's slightly above 512M), so doubling
would require the code to be adjusted to handle a single per-domain block
covering two per-domain slots (each slot covering 1Gb).

Jan
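A back-of-the-envelope check of the figures quoted above. This is a sketch with assumed values: MAX_VIRT_CPUS = 8192 and MAPHASH_ENTRIES = 8 are not stated in the thread, they are only chosen to be consistent with the order-17 / "slightly above 512M" and "16 to 24" numbers mentioned here.

    /* Illustrative arithmetic only -- constants are assumptions, not the
     * actual Xen definitions. */
    #include <stdio.h>

    #define PAGE_SIZE              4096UL
    #define MAPCACHE_VCPU_ENTRIES  16UL      /* per the thread */
    #define MAPHASH_ENTRIES        8UL       /* assumed (see "16 to 24") */
    #define MAX_VIRT_CPUS          8192UL    /* assumed */

    int main(void)
    {
        unsigned long entries  = MAPCACHE_VCPU_ENTRIES * MAX_VIRT_CPUS;
        unsigned long va_bytes = entries * PAGE_SIZE;

        /* 131072 entries = 2^17 pages -> the order-17 block mentioned above */
        printf("mapcache entries: %lu\n", entries);
        /* 512 MiB of mappings, before the inuse/garbage bitmaps at the end */
        printf("VA for mappings : %lu MiB\n", va_bytes >> 20);
        /* per vCPU, only this many entries are guaranteed reclaimable,
         * because the hash entries never become garbage */
        printf("reclaimable/vCPU: %lu\n",
               MAPCACHE_VCPU_ENTRIES - MAPHASH_ENTRIES);
        return 0;
    }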
>>> On 12.06.13 at 17:27, Keir Fraser <keir.xen@gmail.com> wrote:
> On 12/06/2013 16:11, "Keir Fraser" <keir.xen@gmail.com> wrote:
> [...]
>
> Could even pull MAPHASH_ENTRIES into config.h and explicitly add it to
> MAPCACHE_VCPU_ENTRIES. That would be neat, would it not? And I believe we
> have space in the mapcache's perdomain slot to enlarge MAPCACHE_VCPU_ENTRIES
> in this way (i.e., from 16 to 24).

Yes, this would be an option, but I very much dislike pulling further stuff
not belonging there into config.h - this define really best lives side by
side with its related definitions. While I just checked the code and think
that this not being a power of two should be fine, I'd nevertheless be
careful with such a change.

Jan
On 12/06/2013 16:48, "Jan Beulich" <JBeulich@suse.com> wrote:>> Why are we so tight on MAPCACHE_VCPU_ENTRIES? Why not say double that number >> and get rid of the accum and the ''replace a hash entry instead'' logic >> instead? We never used to have it, and it''s kind of extra complication and a >> bit gross. > > First of all, doubling the entries is not an argument for dropping > that code - the old 32-bit implementation really would have > needed this too from a theoretical perspective: The number of > readily available (garbage) entries is bounded by > MAPCACHE_VCPU_ENTRIES - MAPHASH_ENTRIES (because the > hash entries actively block getting treated as garbage).So? We have control over both MAPCACHE_VCPU_ENTRUES and MAPHASH_ENTRIES. We can make these somewhat arbitrary constants big
On Wed, Jun 12, 2013 at 6:26 PM, Keir Fraser <keir.xen@gmail.com> wrote:
> On 12/06/2013 16:48, "Jan Beulich" <JBeulich@suse.com> wrote:
> [...]
>
> So? We have control over both MAPCACHE_VCPU_ENTRIES and MAPHASH_ENTRIES. We
> can make these somewhat arbitrary constants big

So is this bug fixed now? Can we close it in the bug tracker?

 -George
>>> On 05.07.13 at 18:56, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Wed, Jun 12, 2013 at 6:26 PM, Keir Fraser <keir.xen@gmail.com> wrote:
> [...]
>
> So is this bug fixed now? Can we close it in the bug tracker?

The bug itself got fixed, but the number of hash entries didn't get increased
so far (and as previously indicated I'm not intending to do so; Keir indicated
he might).

Jan
close ^
thanks

On Mon, Jul 8, 2013 at 9:58 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 05.07.13 at 18:56, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> [...]
>>
>> So is this bug fixed now? Can we close it in the bug tracker?
>
> The bug itself got fixed, but the number of hash entries didn't
> get increased so far (and as previously indicated I'm not
> intending to do so; Keir indicated he might).
>
> Jan
xen@bugs.xenproject.org
2013-Jul-08 09:15 UTC
Processed: Re: Xen 4.3 + tmem = Xen BUG at domain_page.c:143
Processing commands for xen@bugs.xenproject.org:

> close ^
Command failed: Cannot parse arguments at /srv/xen-devel-bugs/lib/emesinae/control.pl line 325, <M> line 41.
Stop processing here.

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs
Contact xen-bugs-owner@bugs.xenproject.org with any infrastructure issues
close 14
thanks

On Mon, Jul 8, 2013 at 10:07 AM, George Dunlap
<George.Dunlap@eu.citrix.com> wrote:
> close ^
> thanks
>
> On Mon, Jul 8, 2013 at 9:58 AM, Jan Beulich <JBeulich@suse.com> wrote:
>> [...]
>>
>> The bug itself got fixed, but the number of hash entries didn't
>> get increased so far (and as previously indicated I'm not
>> intending to do so; Keir indicated he might).
>>
>> Jan
xen@bugs.xenproject.org
2013-Jul-08 09:30 UTC
Processed: Re: Xen 4.3 + tmem = Xen BUG at domain_page.c:143
Processing commands for xen@bugs.xenproject.org:

> close 14
Closing bug #14

> thanks
Finished processing.

Modified/created Bugs:
- 14: http://bugs.xenproject.org/xen/bug/14

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs
Contact xen-bugs-owner@bugs.xenproject.org with any infrastructure issues