Ian Campbell
2009-Nov-21 11:32 UTC
[Xen-devel] [PATCH] xen: correctly restore pfn_to_mfn_list_list after resume
pvops kernels >= 2.6.30 can currently only be saved and restored once. The
second attempt to save results in:

    ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
    ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
    ERROR Internal error: Failed to map/save the p2m frame list

I finally narrowed it down to:

    commit cdaead6b4e657f960d6d6f9f380e7dfeedc6a09b
    Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    Date:   Fri Feb 27 15:34:59 2009 -0800

        xen: split construction of p2m mfn tables from registration

        Build the p2m_mfn_list_list early with the rest of the p2m table, but
        register it later when the real shared_info structure is in place.

        Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

The unforeseen side-effect of this change was to cause the mfn list list to
not be rebuilt on resume. Prior to this change it would have been rebuilt via
xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list().

Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com
---
 arch/x86/xen/mmu.c     |    2 +-
 arch/x86/xen/suspend.c |    2 ++
 arch/x86/xen/xen-ops.h |    1 +
 3 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3bf7b1d..bf4cd6b 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -185,7 +185,7 @@ static inline unsigned p2m_index(unsigned long pfn)
 }
 
 /* Build the parallel p2m_top_mfn structures */
-static void __init xen_build_mfn_list_list(void)
+void xen_build_mfn_list_list(void)
 {
 	unsigned pfn, idx;
 
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index 95be7b4..6343a5d 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -27,6 +27,8 @@ void xen_pre_suspend(void)
 
 void xen_post_suspend(int suspend_cancelled)
 {
+	xen_build_mfn_list_list();
+
 	xen_setup_shared_info();
 
 	if (suspend_cancelled) {
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 3252932..f9153a3 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -25,6 +25,7 @@ extern struct shared_info *HYPERVISOR_shared_info;
 
 void xen_setup_mfn_list_list(void);
 void xen_setup_shared_info(void);
+void xen_build_mfn_list_list(void);
 void xen_setup_machphys_mapping(void);
 pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn);
 void xen_ident_map_ISA(void);
-- 
1.5.6.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
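The failure mode described in the patch can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not kernel code: every `toy_` name is hypothetical, the guest is assumed to own one contiguous run of machine frames, and the save-time check is reduced to "each list entry must be a frame the guest currently owns". It shows a list built once at boot going stale when the guest is placed elsewhere on resume, and becoming valid again once the build step is repeated.

```c
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
#define TOY_P2M_FRAMES   4      /* p2m frames tracked by the toy guest */
#define TOY_GUEST_FRAMES 0x100  /* machine frames owned by the toy guest */

static unsigned long toy_guest_base;               /* first MFN the guest owns */
static unsigned long toy_mfn_list[TOY_P2M_FRAMES]; /* list read at save time */

/* Place the guest in machine memory, as boot or restore would. */
void toy_place_guest(unsigned long base_mfn)
{
    toy_guest_base = base_mfn;
}

/* Current MFN of the i-th p2m frame (stand-in for an address-to-MFN lookup). */
static unsigned long toy_frame_to_mfn(size_t i)
{
    return toy_guest_base + i;
}

/* Stand-in for xen_build_mfn_list_list(): recompute every entry. */
void toy_build_mfn_list(void)
{
    for (size_t i = 0; i < TOY_P2M_FRAMES; i++)
        toy_mfn_list[i] = toy_frame_to_mfn(i);
}

/* Stand-in for the save-time check: index of the first entry that is not
 * a frame owned by the guest, or -1 if the whole list is valid. */
int toy_first_bad_entry(void)
{
    for (size_t i = 0; i < TOY_P2M_FRAMES; i++) {
        unsigned long mfn = toy_mfn_list[i];
        if (mfn < toy_guest_base || mfn - toy_guest_base >= TOY_GUEST_FRAMES)
            return (int)i;
    }
    return -1;
}

/* Stand-in for xen_post_suspend(); 'rebuild' selects the patched behaviour. */
void toy_post_suspend(unsigned long new_base_mfn, int rebuild)
{
    toy_place_guest(new_base_mfn);
    if (rebuild)
        toy_build_mfn_list();
}
```

Calling `toy_post_suspend()` with `rebuild` set to 0 leaves the boot-time entries in place, so the check reports entry 0 as bad, which loosely mirrors the "entry 0" error quoted above. With `rebuild` set to 1 the check passes again.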
Bartosz Lis
2009-Nov-30 10:17 UTC
Re: [Xen-devel] [PATCH] xen: correctly restore pfn_to_mfn_list_list after resume
On Saturday, 21 November 2009 at 12:32:49, Ian Campbell wrote:
> pvops kernels >= 2.6.30 can currently only be saved and restored once. The
> second attempt to save results in:
>
> ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
> ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
> ERROR Internal error: Failed to map/save the p2m frame list
>
> I finally narrowed it down to:
>
> commit cdaead6b4e657f960d6d6f9f380e7dfeedc6a09b
> Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> Date: Fri Feb 27 15:34:59 2009 -0800
>
> xen: split construction of p2m mfn tables from registration
>
> Build the p2m_mfn_list_list early with the rest of the p2m
> table, but register it later when the real shared_info structure is in
> place.
>
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>
> The unforeseen side-effect of this change was to cause the mfn list list to
> not be rebuilt on resume. Prior to this change it would have been rebuilt
> via xen_post_suspend() -> xen_setup_shared_info() ->
> xen_setup_mfn_list_list().
>
> Fix by explicitly calling xen_build_mfn_list_list() from
> xen_post_suspend().
> [---]

Ian,

I downloaded and compiled the pvops kernel with your fixes a week ago (commit
e14a6cdfdf5b40330297701b4e6963f9eff6d8df Sat, 21 Nov 2009 23:59:07 +0000
(07:59 +0800)). It has now been running stably as xen0 for about 5 days on a
dual AMD Opteron 248 and a dual Intel Xeon E5520.

1. Opteron 248 guest

For all that time I have been compiling the Linux kernel in a loop (~700
compilation rounds) on a virtual machine with 2 vcpus. From time to time I
migrated the machine back and forth between two physical machines, both with
Opteron 248 CPUs. Save/restore/save/restore works fine: the kernel continues
to compile, and even the ssh session was not closed. I have tested only 64-bit
kernel/userland in both xen0/U.

2. Xeon E5520 guest

For 64-bit kernel/userland in both xen0/U, save/restore/save/restore works
fine: the kernel continues to compile and the ssh session stays open. I used
2 vcpus in the guest. I cannot test live migration on the Xeon E5520 (no SAN
connection).

Unfortunately save/restore does not work for a 64-bit kernel/userland in dom0
with a 32-bit kernel/userland in domU (tested with 1 and then with 2 vcpus).
Save hangs. The save file is ~1.5kB long and I get this on the guest's
console:

----8<----
[ 34.729250] BUG: unable to handle kernel paging request at c1527000
[ 34.729271] IP: [<c1006593>] xen_set_pmd+0x73/0xb0
[ 34.729288] *pdpt = 0000000403162027
[ 34.729299] Oops: 0003 [#1] SMP
[ 34.729312] last sysfs file: /sys/module/ip_tables/initstate
[ 34.729321] Modules linked in: sch_sfq xt_limit ipt_REJECT xt_tcpudp ipt_LOG xt_state xt_multiport iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_mangle ip_tables x_tables xenfs dm_multipath scsi_dh dm_mod st sd_mod crc_t10dif lpfc qla2xxx scsi_transport_fc scsi_tgt qla1280 scsi_mod psmouse uhci_hcd ehci_hcd usbcore pcspkr xen_netfront evdev ext3 jbd mbcache
[ 34.729485]
[ 34.729493] Pid: 1686, comm: kstop/0 xid: #0 Not tainted (2.6.31.6x_xenUnogrsecuritypae-BL5.5 #1)
[ 34.729504] EIP: 0061:[<c1006593>] EFLAGS: 00010046 CPU: 0
[ 34.729513] EIP is at xen_set_pmd+0x73/0xb0
[ 34.729520] EAX: c1527000 EBX: 031f3067 ECX: 00000004 EDX: c179b000
[ 34.729529] ESI: 00000004 EDI: c1527000 EBP: ddd75eb0 ESP: ddd75ea0
[ 34.729538] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 34.729547] Process kstop/0 (pid: 1686, ti=ddd74000 task=df8f48c0 task.ti=ddd74000)
[ 34.729557] Stack:
[ 34.729563] c1527000 031f3067 1fd61067 c1527000 ddd75ec4 c10cd650 00000000 00000000
[ 34.729595] <0> 00200000 ddd75f20 c10ceb14 00000000 123ab067 00000000 00000fff 00001000
[ 34.729630] <0> 00000fff c1463000 c1474f60 01ba9067 00000000 ddd75f14 c1006f3a c153001c
[ 34.729670] Call Trace:
[ 34.729682] [<c10cd650>] ? __pte_alloc_kernel+0xa0/0xb0
[ 34.729693] [<c10ceb14>] ? apply_to_page_range+0x314/0x330
[ 34.729705] [<c1006f3a>] ? xen_force_evtchn_callback+0x1a/0x30
[ 34.729717] [<c10079c6>] ? arch_gnttab_unmap+0x26/0x30
[ 34.729729] [<c1007950>] ? unmap_pte_fn+0x0/0x50
[ 34.729742] [<c1204591>] ? gnttab_suspend+0x41/0x50
[ 34.729753] [<c120756a>] ? xen_suspend+0x3a/0xf0
[ 34.729765] [<c108873d>] ? stop_cpu+0x8d/0xd0
[ 34.729776] [<c1054022>] ? worker_thread+0x112/0x220
[ 34.729787] [<c10886b0>] ? stop_cpu+0x0/0xd0
[ 34.729798] [<c10587e0>] ? autoremove_wake_function+0x0/0x40
[ 34.729810] [<c1053f10>] ? worker_thread+0x0/0x220
[ 34.729821] [<c10584ec>] ? kthread+0x7c/0x90
[ 34.729831] [<c1058470>] ? kthread+0x0/0x90
[ 34.729843] [<c100ad17>] ? kernel_thread_helper+0x7/0x10
[ 34.729851] Code: 00 75 48 8b 45 f0 89 da 89 f1 83 05 fc 32 53 c1 01 e8 e2 fe ff ff 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 90 8d 74 26 00 8b 45 f0 <89> 18 89 70 04 eb e4 ba e0 32 53 c1 b9 33 00 00 00 31 c0 89 d7
[ 34.730076] EIP: [<c1006593>] xen_set_pmd+0x73/0xb0 SS:ESP 0069:ddd75ea0
[ 34.730093] CR2: 00000000c1527000
[ 34.730102] ---[ end trace cd1b831872a4c87f ]---
[ 34.730137] ------------[ cut here ]------------
[ 34.730147] WARNING: at /root/rpm/BUILD/kernel-xenUnogrsecuritypae-2.6.31.6x/linux-2.6.31/kernel/time/timekeeping.c:102 getnstimeofday+0x102/0x110()
[ 34.730160] Modules linked in: sch_sfq xt_limit ipt_REJECT xt_tcpudp ipt_LOG xt_state xt_multiport iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_mangle ip_tables x_tables xenfs dm_multipath scsi_dh dm_mod st sd_mod crc_t10dif lpfc qla2xxx scsi_transport_fc scsi_tgt qla1280 scsi_mod psmouse uhci_hcd ehci_hcd usbcore pcspkr xen_netfront evdev ext3 jbd mbcache
[ 34.730316] Pid: 0, comm: swapper xid: #0 Tainted: G D 2.6.31.6x_xenUnogrsecuritypae-BL5.5 #1
[ 34.730326] Call Trace:
[ 34.730338] [<c1333d7a>] ? printk+0x18/0x1e
[ 34.730349] [<c1040fcd>] warn_slowpath_common+0x6d/0xa0
[ 34.730360] [<c106b7d2>] ? getnstimeofday+0x102/0x110
[ 34.730370] [<c106b7d2>] ? getnstimeofday+0x102/0x110
[ 34.730381] [<c1041015>] warn_slowpath_null+0x15/0x20
[ 34.730392] [<c106b7d2>] getnstimeofday+0x102/0x110
[ 34.730403] [<c105c716>] ktime_get_ts+0x26/0x60
[ 34.730413] [<c105c766>] ktime_get+0x16/0x40
[ 34.730425] [<c107056c>] tick_nohz_stop_sched_tick+0x6c/0x390
[ 34.730437] [<c1009187>] cpu_idle+0x27/0x80
[ 34.730449] [<c1323e25>] rest_init+0x55/0x60
[ 34.730461] [<c14a186c>] start_kernel+0x2fb/0x301
[ 34.730472] [<c14a138e>] ? unknown_bootoption+0x0/0x1ad
[ 34.730483] [<c14a108d>] i386_start_kernel+0x7c/0x83
[ 34.730494] [<c14a418e>] xen_start_kernel+0x517/0x51f
[ 34.730502] ---[ end trace cd1b831872a4c880 ]---
----8<----

I'm going to try newer commits.

Regards,
-- 
Bartosz Lis @ Inst. of Information Technology, Technical Univ. of Lodz, Poland
bartoszl @ ics.p.lodz.pl
Ian Campbell
2009-Nov-30 10:36 UTC
Re: [Xen-devel] [PATCH] xen: correctly restore pfn_to_mfn_list_list after resume
On Mon, 2009-11-30 at 10:17 +0000, Bartosz Lis wrote:
> Unfortunately save/restore does not work for 64bit kernel/userland in dom0 and
> 32bit kernel/userland in domU (tested with 1 and then with 2 vcpus). Save
> hangs. Save file is ~1.5kB long and I'm getting on guest's console:

Your stack trace looks like the issue resolved by "xen: do not unmap grant
status on suspend when using v1 grant tables" sent to the list on Wednesday.

> ----8<----
> [ 34.729250] BUG: unable to handle kernel paging request at c1527000
> [ 34.729271] IP: [<c1006593>] xen_set_pmd+0x73/0xb0
> [ 34.729288] *pdpt = 0000000403162027
> [ 34.729299] Oops: 0003 [#1] SMP
> [ 34.729312] last sysfs file: /sys/module/ip_tables/initstate
> [ 34.729321] Modules linked in: sch_sfq xt_limit ipt_REJECT xt_tcpudp ipt_LOG xt_state xt_multiport iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_mangle ip_tables x_tables xenfs dm_multipath scsi_dh dm_mod st sd_mod crc_t10dif lpfc qla2xxx scsi_transport_fc scsi_tgt qla1280 scsi_mod psmouse uhci_hcd ehci_hcd usbcore pcspkr xen_netfront evdev ext3 jbd mbcache
> [ 34.729485]
> [ 34.729493] Pid: 1686, comm: kstop/0 xid: #0 Not tainted (2.6.31.6x_xenUnogrsecuritypae-BL5.5 #1)
> [ 34.729504] EIP: 0061:[<c1006593>] EFLAGS: 00010046 CPU: 0
> [ 34.729513] EIP is at xen_set_pmd+0x73/0xb0
> [ 34.729520] EAX: c1527000 EBX: 031f3067 ECX: 00000004 EDX: c179b000
> [ 34.729529] ESI: 00000004 EDI: c1527000 EBP: ddd75eb0 ESP: ddd75ea0
> [ 34.729538] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> [ 34.729547] Process kstop/0 (pid: 1686, ti=ddd74000 task=df8f48c0 task.ti=ddd74000)
> [ 34.729557] Stack:
> [ 34.729563] c1527000 031f3067 1fd61067 c1527000 ddd75ec4 c10cd650 00000000 00000000
> [ 34.729595] <0> 00200000 ddd75f20 c10ceb14 00000000 123ab067 00000000 00000fff 00001000
> [ 34.729630] <0> 00000fff c1463000 c1474f60 01ba9067 00000000 ddd75f14 c1006f3a c153001c
> [ 34.729670] Call Trace:
> [ 34.729682] [<c10cd650>] ? __pte_alloc_kernel+0xa0/0xb0
> [ 34.729693] [<c10ceb14>] ? apply_to_page_range+0x314/0x330
> [ 34.729705] [<c1006f3a>] ? xen_force_evtchn_callback+0x1a/0x30
> [ 34.729717] [<c10079c6>] ? arch_gnttab_unmap+0x26/0x30
> [ 34.729729] [<c1007950>] ? unmap_pte_fn+0x0/0x50
> [ 34.729742] [<c1204591>] ? gnttab_suspend+0x41/0x50
> [ 34.729753] [<c120756a>] ? xen_suspend+0x3a/0xf0
> [ 34.729765] [<c108873d>] ? stop_cpu+0x8d/0xd0
> [ 34.729776] [<c1054022>] ? worker_thread+0x112/0x220
> [ 34.729787] [<c10886b0>] ? stop_cpu+0x0/0xd0
> [ 34.729798] [<c10587e0>] ? autoremove_wake_function+0x0/0x40
> [ 34.729810] [<c1053f10>] ? worker_thread+0x0/0x220
> [ 34.729821] [<c10584ec>] ? kthread+0x7c/0x90
> [ 34.729831] [<c1058470>] ? kthread+0x0/0x90
> [ 34.729843] [<c100ad17>] ? kernel_thread_helper+0x7/0x10
> [ 34.729851] Code: 00 75 48 8b 45 f0 89 da 89 f1 83 05 fc 32 53 c1 01 e8 e2 fe ff ff 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 90 8d 74 26 00 8b 45 f0 <89> 18 89 70 04 eb e4 ba e0 32 53 c1 b9 33 00 00 00 31 c0 89 d7
> [ 34.730076] EIP: [<c1006593>] xen_set_pmd+0x73/0xb0 SS:ESP 0069:ddd75ea0
> [ 34.730093] CR2: 00000000c1527000
> [ 34.730102] ---[ end trace cd1b831872a4c87f ]---
> [ 34.730137] ------------[ cut here ]------------
> [ 34.730147] WARNING: at /root/rpm/BUILD/kernel-xenUnogrsecuritypae-2.6.31.6x/linux-2.6.31/kernel/time/timekeeping.c:102 getnstimeofday+0x102/0x110()
> [ 34.730160] Modules linked in: sch_sfq xt_limit ipt_REJECT xt_tcpudp ipt_LOG xt_state xt_multiport iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_mangle ip_tables x_tables xenfs dm_multipath scsi_dh dm_mod st sd_mod crc_t10dif lpfc qla2xxx scsi_transport_fc scsi_tgt qla1280 scsi_mod psmouse uhci_hcd ehci_hcd usbcore pcspkr xen_netfront evdev ext3 jbd mbcache
> [ 34.730316] Pid: 0, comm: swapper xid: #0 Tainted: G D 2.6.31.6x_xenUnogrsecuritypae-BL5.5 #1
> [ 34.730326] Call Trace:
> [ 34.730338] [<c1333d7a>] ? printk+0x18/0x1e
> [ 34.730349] [<c1040fcd>] warn_slowpath_common+0x6d/0xa0
> [ 34.730360] [<c106b7d2>] ? getnstimeofday+0x102/0x110
> [ 34.730370] [<c106b7d2>] ? getnstimeofday+0x102/0x110
> [ 34.730381] [<c1041015>] warn_slowpath_null+0x15/0x20
> [ 34.730392] [<c106b7d2>] getnstimeofday+0x102/0x110
> [ 34.730403] [<c105c716>] ktime_get_ts+0x26/0x60
> [ 34.730413] [<c105c766>] ktime_get+0x16/0x40
> [ 34.730425] [<c107056c>] tick_nohz_stop_sched_tick+0x6c/0x390
> [ 34.730437] [<c1009187>] cpu_idle+0x27/0x80
> [ 34.730449] [<c1323e25>] rest_init+0x55/0x60
> [ 34.730461] [<c14a186c>] start_kernel+0x2fb/0x301
> [ 34.730472] [<c14a138e>] ? unknown_bootoption+0x0/0x1ad
> [ 34.730483] [<c14a108d>] i386_start_kernel+0x7c/0x83
> [ 34.730494] [<c14a418e>] xen_start_kernel+0x517/0x51f
> [ 34.730502] ---[ end trace cd1b831872a4c880 ]---
> ----8<----
>
> I'm going to try newer commits.
>
> Regards,
>
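For readers without the referenced patch to hand, the idea its title suggests can be sketched as a toy model. This is an assumption-laden illustration, not the actual fix: all `toy_` names are hypothetical, and the model assumes only that version 2 grant tables have a separate status area while version 1 tables do not, so that unmapping the status area under v1 touches a range that was never mapped. Whether that is exactly what produced the oops above is not confirmed by this thread.

```c
#include <stdbool.h>

/* Toy model only: every toy_* name is hypothetical, not a kernel API. */
struct toy_gnttab {
    int  version;        /* grant table version in use: 1 or 2 */
    bool shared_mapped;  /* shared grant frames, present in v1 and v2 */
    bool status_mapped;  /* status frames, only ever mapped for v2 */
};

/* Map what the chosen version provides. */
void toy_gnttab_setup(struct toy_gnttab *gt, int version)
{
    gt->version = version;
    gt->shared_mapped = true;
    gt->status_mapped = (version == 2);
}

/* Unmap a region; returns false (a stand-in for a fault) if the
 * region was never mapped in the first place. */
static bool toy_unmap(bool *mapped)
{
    if (!*mapped)
        return false;
    *mapped = false;
    return true;
}

/* Stand-in for the suspend path. With guard_status false the status
 * area is always unmapped; with it true, only when the version is 2. */
bool toy_gnttab_suspend(struct toy_gnttab *gt, bool guard_status)
{
    bool ok = toy_unmap(&gt->shared_mapped);

    if (!guard_status || gt->version == 2)
        ok = toy_unmap(&gt->status_mapped) && ok;

    return ok;
}
```

In this model an unguarded suspend of a v1 table reports a failure, while the guarded variant succeeds for both v1 and v2.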