Ian Campbell
2011-Sep-20 13:40 UTC
[Xen-devel] Re: Bug#642154: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On Tue, 2011-09-20 at 14:20 +0100, Ben Hutchings wrote:
> On Tue, 2011-09-20 at 10:12 +0400, rush wrote:
> > Hi,
> >
> > There are several Not tainted lines in old messages file. There are all of them:
> >
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985513] Pid: 2605, comm:
> > debootstrap Not tainted 3.0.0-1-amd64 #1 Intel Corporation
> > S1200BTL/S1200BTL
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985621] RIP:
> > e030:[<ffffffff810106db>] [<ffffffff810106db>]
> > __sanitize_i387_state+0x23/0xe1
>
> Source/disassembly:
>
> void __sanitize_i387_state(struct task_struct *tsk)
> {
>         u64 xstate_bv;
>         int feature_bit = 0x2;
>         struct i387_fxsave_struct *fx = &tsk->thread.fpu.state->fxsave;
> ffffffff810106b8: 48 8b 97 48 04 00 00    mov 0x448(%rdi),%rdx
>
>         if (!fx)
>                 return;
> ffffffff810106bf: 48 85 d2                test %rdx,%rdx
> ffffffff810106c2: 0f 84 d0 00 00 00       je 0xffffffff81010798
>
>         BUG_ON(task_thread_info(tsk)->status & TS_USEDFPU);
> ffffffff810106c8: 48 8b 47 08             mov 0x8(%rdi),%rax
> ffffffff810106cc: f6 40 14 01             testb $0x1,0x14(%rax)
> ffffffff810106d0: 74 02                   je 0xffffffff810106d4
> ffffffff810106d2: 0f 0b                   ud2
>
>         xstate_bv = tsk->thread.fpu.state->xsave.xsave_hdr.xstate_bv;
> ffffffff810106db: 48 8b b2 00 02 00 00    mov 0x200(%rdx),%rsi
>
> So tsk->thread.fpu.state in RDX seems to be invalid.
>
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985716] RSP:
> > e02b:ffff8803bd2c5e00 EFLAGS: 00010246
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985767] RAX: 0000000000000000
> > RBX: 00007fff3d69ecc0 RCX: 0000000000000200
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985824] RDX: ffff8803be0e8e00
> > RSI: ffff8803bd2c5fd8 RDI: ffff8803bd65aa30
> [...]
>
> RDX looks like a reasonable kernel memory pointer. Given the hostname,
> I assume this kernel is running under Xen. So could this be a
> use-after-free where the freed page has been unmapped for reallocation
> by the hypervisor? Can that happen to arbitrary pages in the dom0
> kernel?

In a modern pvops kernel there is a tendency towards leaving a page of actual dom0 memory behind in these cases, rather than a hole. A page with no backing mfn should never be escaping into the "wild" anyway, but it's possible for a given process to see one if it is doing hypercall activities, mapping foreign pages etc.

There's been some similar looking threads on xen-devel recently but I haven't paid attention to the details, list & Konrad CC'd. Full log is at http://bugs.debian.org/642154.

> Ben.
>

-- 
Ian Campbell

Everybody has something to conceal.
		-- Humphrey Bogart

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Sep-22 19:00 UTC
Re: [Xen-devel] Re: Bug#642154: BUG: unable to handle kernel paging request at ffff8803bb6ad000
> There's been some similar looking threads on xen-devel recently but I
> haven't paid attention to the details, list & Konrad CC'd. Full log is
> at http://bugs.debian.org/642154.

Does xsave=0 make a difference?
Jonathan Nieder
2011-Oct-01 02:50 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
Konrad Rzeszutek Wilk wrote:

>> There's been some similar looking threads on xen-devel recently but I
>> haven't paid attention to the details, list & Konrad CC'd. Full log is
>> at http://bugs.debian.org/642154.
>
> Does xsave=0 make a difference?

Cc-ing the reporter. Rush, are you able to reproduce the oops you mentioned? Does adding noxsave to the kernel command line help?

Thanks,
Jonathan
rush
2011-Oct-01 07:01 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
2011/10/1, Jonathan Nieder <jrnieder@gmail.com>:
>
> Cc-ing the reporter. Rush, are you able to reproduce the oops you
> mentioned? Does adding noxsave to the kernel command line help?
>

I'm sorry, I'm no guru in such questions. Do I need to specify xsave=0 in the grub boot options, or is there another way to do it?
Ian Campbell
2011-Oct-01 09:19 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On Sat, 2011-10-01 at 11:01 +0400, rush wrote:
> 2011/10/1, Jonathan Nieder <jrnieder@gmail.com>:
> >
> > Cc-ing the reporter. Rush, are you able to reproduce the oops you
> > mentioned? Does adding noxsave to the kernel command line help?
> >
>
> I'm sorry, i'm not guru in such questions.
> Do I need to specify xsave=0 in grub boot options or there is another
> way to do it?

Assuming your system uses grub2, you should add it to GRUB_CMDLINE_LINUX in /etc/default/grub and then run "update-grub".

Ian.

-- 
Ian Campbell

Place stamp here.
rush
2011-Oct-01 17:34 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
Unfortunately, xsave=0 didn't have any effect. The oops is still here.

[ 21.095558] BUG: unable to handle kernel paging request at ffff8803bb7c5000
[ 21.095827] IP: [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 21.096002] PGD 1604067 PUD 3d82d9067 PMD 3d84b5067 PTE 0
[ 21.096355] Oops: 0000 [#1] SMP
[ 21.096578] CPU 3
[ 21.096646] Modules linked in: bridge stp xen_evtchn xenfs loop snd_pcm snd_timer snd i2c_i801 soundcore snd_page_alloc i2c_core pcspkr evdev joydev ghes video hed button processor ext4 mbcache jbd2 crc16 dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif ahci libahci libata scsi_mod ehci_hcd fan thermal usbcore thermal_sys e1000e [last unloaded: scsi_wait_scan]
[ 21.099742]
[ 21.099835] Pid: 1207, comm: update-exim4.co Not tainted 3.0.0-1-amd64 #1 Intel Corporation S1200BTL/S1200BTL
[ 21.100169] RIP: e030:[<ffffffff810106db>] [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 21.100376] RSP: e02b:ffff8803bbc77e00 EFLAGS: 00010246
[ 21.100481] RAX: 0000000000000000 RBX: 00007fffa6c18700 RCX: 0000000000000200
[ 21.100593] RDX: ffff8803bb7c4e00 RSI: ffff8803bbc77fd8 RDI: ffff8803bbd2ce20
[ 21.100705] RBP: ffff8803bbd2ce20 R08: dead000000200200 R09: ffff8803bda8c2d8
[ 21.100817] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 21.100929] R13: ffffffffffffffff R14: ffff8803bbd2ce20 R15: 00007fffa6c18700
[ 21.101043] FS: 00007f57dae43700(0000) GS:ffff8803d61a0000(0000) knlGS:0000000000000000
[ 21.101185] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 21.101293] CR2: ffff8803bb7c5000 CR3: 00000003bdcce000 CR4: 0000000000002660
[ 21.101405] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.101516] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 21.101629] Process update-exim4.co (pid: 1207, threadinfo ffff8803bbc76000, task ffff8803bbd2ce20)
[ 21.101774] Stack:
[ 21.101868] ffffffff81010919 0000000000000001 ffff8803bbc77f58 0000000000000011
[ 21.102254] ffff8803bbd2d2b0 ffffffffffffffff ffffffff81008fdd 00000000000002d8
[ 21.102642] 00007fffa6c18538 0000000000000011 0000000000040001 00000065000005d8
[ 21.103029] Call Trace:
[ 21.103128] [<ffffffff81010919>] ? save_i387_xstate+0x102/0x1f3
[ 21.103238] [<ffffffff81008fdd>] ? do_signal+0x212/0x649
[ 21.103346] [<ffffffff81009450>] ? do_notify_resume+0x25/0x6b
[ 21.103458] [<ffffffff8133bfe0>] ? int_signal+0x12/0x17
[ 21.103564] Code: e8 13 2a ff ff 66 90 c3 48 8b 97 48 04 00 00 48 85 d2 0f 84 d0 00 00 00 48 8b 47 08 f6 40 14 01 74 02 0f 0b 48 8b 05 45 4e 71 00
[ 21.106353] 8b b2 00 02 00 00 48 89 c1 48 21 f1 48 39 c1 0f 84 a7 00 00
[ 21.107848] RIP [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 21.108020] RSP <ffff8803bbc77e00>
[ 21.108119] CR2: ffff8803bb7c5000
[ 21.108218] ---[ end trace f589986fb387a3c2 ]---
[ 22.776339] BUG: unable to handle kernel paging request at ffff8803bb7c5000
[ 22.776579] IP: [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 22.776754] PGD 1604067 PUD 3d82d9067 PMD 3d84b5067 PTE 0
[ 22.777109] Oops: 0000 [#5] SMP
[ 22.777332] CPU 3
[ 22.777399] Modules linked in: bridge stp xen_evtchn xenfs loop snd_pcm snd_timer snd i2c_i801 soundcore snd_page_alloc i2c_core pcspkr evdev joydev ghes video hed button processor ext4 mbcache jbd2 crc16 dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif ahci libahci libata scsi_mod ehci_hcd fan thermal usbcore thermal_sys e1000e [last unloaded: scsi_wait_scan]
[ 22.780506]
[ 22.780600] Pid: 2070, comm: forks Tainted: G D 3.0.0-1-amd64 #1 Intel Corporation S1200BTL/S1200BTL
[ 22.780933] RIP: e030:[<ffffffff810106db>] [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 22.781141] RSP: e02b:ffff8803bc403e00 EFLAGS: 00010246
[ 22.781247] RAX: 0000000000000000 RBX: 00007fff10bfcdc0 RCX: 0000000000000200
[ 22.781359] RDX: ffff8803bb7c4e00 RSI: ffff8803bc403fd8 RDI: ffff8803bbd2ce20
[ 22.781472] RBP: ffff8803bbd2ce20 R08: ffff8803bc402000 R09: ffffffff81684640
[ 22.781584] R10: 00007f2f327999d0 R11: 0000000000000246 R12: 0000000000000000
[ 22.781696] R13: ffffffffffffffff R14: ffff8803bbd2ce20 R15: 00007fff10bfcdc0
[ 22.781809] FS: 00007f2f32799700(0000) GS:ffff8803d61a0000(0000) knlGS:0000000000000000
[ 22.781952] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 22.782059] CR2: ffff8803bb7c5000 CR3: 00000003b7d28000 CR4: 0000000000002660
[ 22.782170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 22.782283] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 22.782395] Process forks (pid: 2070, threadinfo ffff8803bc402000, task ffff8803bbd2ce20)
[ 22.782537] Stack:
[ 22.782634] ffffffff81010919 0000000000413201 ffff8803bc403f58 0000000000000011
[ 22.783021] ffff8803bbd2d2b0 ffffffffffffffff ffffffff81008fdd 0000000000000000
[ 22.783408] 00007fff10bfcbf8 0000000000000011 0000000000040001 0000fffe00000817
[ 22.783795] Call Trace:
[ 22.783895] [<ffffffff81010919>] ? save_i387_xstate+0x102/0x1f3
[ 22.784004] [<ffffffff81008fdd>] ? do_signal+0x212/0x649
[ 22.784112] [<ffffffff8133733a>] ? error_exit+0x2a/0x60
[ 22.784219] [<ffffffff81336e61>] ? retint_restore_args+0x5/0x6
[ 22.784328] [<ffffffff8100122a>] ? hypercall_page+0x22a/0x1000
[ 22.784436] [<ffffffff81009450>] ? do_notify_resume+0x25/0x6b
[ 22.784545] [<ffffffff8133bfe0>] ? int_signal+0x12/0x17
[ 22.784652] Code: e8 13 2a ff ff 66 90 c3 48 8b 97 48 04 00 00 48 85 d2 0f 84 d0 00 00 00 48 8b 47 08 f6 40 14 01 74 02 0f 0b 48 8b 05 45 4e 71 00
[ 22.787445] 8b b2 00 02 00 00 48 89 c1 48 21 f1 48 39 c1 0f 84 a7 00 00
[ 22.788942] RIP [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 22.789115] RSP <ffff8803bc403e00>
[ 22.789215] CR2: ffff8803bb7c5000
[ 22.789315] ---[ end trace f589986fb387a3c6 ]---

The grub boot options were:

menuentry 'Debian GNU/Linux, with Xen 4.0-amd64 and Linux 3.0.0-1-amd64' --class debian --class gnu-linux --class gnu --class os --class xen {
	insmod raid
	insmod mdraid1x
	insmod lvm
	insmod part_msdos
	insmod part_msdos
	insmod ext2
	set root='(xen-system)'
	search --no-floppy --fs-uuid --set=root 709c172b-19b2-417d-8a43-e1957bcdc2f6
	echo 'Loading Xen 4.0-amd64 ...'
	multiboot /boot/xen-4.0-amd64.gz placeholder
	echo 'Loading Linux 3.0.0-1-amd64 ...'
	module /boot/vmlinuz-3.0.0-1-amd64 placeholder root=/dev/mapper/xen-system ro xsave=0 quiet
	echo 'Loading initial ramdisk ...'
	module /boot/initrd.img-3.0.0-1-amd64
}

Rush.
Konrad Rzeszutek Wilk
2011-Oct-03 18:47 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
> echo 'Loading Xen 4.0-amd64 ...'
> multiboot /boot/xen-4.0-amd64.gz placeholder

Oops. I meant to try it in the hypervisor - so right after placeholder add "xsave=0"

> echo 'Loading Linux 3.0.0-1-amd64 ...'
> module /boot/vmlinuz-3.0.0-1-amd64 placeholder
> root=/dev/mapper/xen-system ro xsave=0 quiet
Ian Campbell
2011-Oct-03 18:53 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On Mon, 2011-10-03 at 14:47 -0400, Konrad Rzeszutek Wilk wrote:
> > echo 'Loading Xen 4.0-amd64 ...'
> > multiboot /boot/xen-4.0-amd64.gz placeholder
>
> Oops. I meant to try it in the hypervisor - so right after placeholder add "xsave=0"

Which in grub2 means add GRUB_CMDLINE_XEN="xsave=0" to /etc/default/grub (there is no commented out example in this case) and re-run update-grub.

Ian.

-- 
Ian Campbell

Many a bum show has been saved by the flag.
		-- George M. Cohan
rush
2011-Oct-08 06:13 UTC
[Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
OK, I tried it again, but the oops didn't go away.

menuentry 'Debian GNU/Linux, with Xen 4.0-amd64 and Linux 3.0.0-1-amd64' --class debian --class gnu-linux --class gnu --class os --class xen {
	insmod raid
	insmod mdraid1x
	insmod lvm
	insmod part_msdos
	insmod part_msdos
	insmod ext2
	set root='(xen-system)'
	search --no-floppy --fs-uuid --set=root 709c172b-19b2-417d-8a43-e1957bcdc2f6
	echo 'Loading Xen 4.0-amd64 ...'
	multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
	echo 'Loading Linux 3.0.0-1-amd64 ...'
	module /boot/vmlinuz-3.0.0-1-amd64 placeholder root=/dev/mapper/xen-system ro quiet
	echo 'Loading initial ramdisk ...'
	module /boot/initrd.img-3.0.0-1-amd64
}

Was it right?

[ 24.242539] BUG: unable to handle kernel paging request at ffff8803be1ab000
[ 24.242780] IP: [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 24.242956] PGD 1604067 PUD 3d82d9067 PMD 3d84ca067 PTE 0
[ 24.243309] Oops: 0000 [#1] SMP
[ 24.243533] CPU 0
[ 24.243601] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables xen_netback xen_blkback bridge stp xen_evtchn xenfs loop snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 joydev evdev pcspkr ghes i2c_core video processor hed button ext4 mbcache jbd2 crc16 dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif ahci libahci libata scsi_mod ehci_hcd fan thermal usbcore thermal_sys e1000e [last unloaded: scsi_wait_scan]
[ 24.247197]
[ 24.247291] Pid: 2526, comm: forks Not tainted 3.0.0-1-amd64 #1 Intel Corporation S1200BTL/S1200BTL
[ 24.247621] RIP: e030:[<ffffffff810106db>] [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 24.247829] RSP: e02b:ffff88034862be00 EFLAGS: 00010246
[ 24.247935] RAX: 0000000000000000 RBX: 00007fff1755a8c0 RCX: 0000000000000200
[ 24.248047] RDX: ffff8803be1aae00 RSI: ffff88034862bfd8 RDI: ffff8803bbf55650
[ 24.248159] RBP: ffff8803bbf55650 R08: ffff88034862a000 R09: ffffffff81684640
[ 24.248271] R10: 00007fe2b7cd09d0 R11: 0000000000000246 R12: 0000000000000000
[ 24.248384] R13: ffffffffffffffff R14: ffff8803bbf55650 R15: 00007fff1755a8c0
[ 24.248498] FS: 00007fe2b7cd0700(0000) GS:ffff8803d614f000(0000) knlGS:0000000000000000
[ 24.248641] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 24.248750] CR2: ffff8803be1ab000 CR3: 00000003bc23c000 CR4: 0000000000002660
[ 24.248862] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 24.248976] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 24.249088] Process forks (pid: 2526, threadinfo ffff88034862a000, task ffff8803bbf55650)
[ 24.249231] Stack:
[ 24.249325] ffffffff81010919 0000000000413201 ffff88034862bf58 0000000000000011
[ 24.249715] ffff8803bbf55ae0 ffffffffffffffff ffffffff81008fdd 0000000000000000
[ 24.250101] 00007fff1755a6f8 0000000000000011 0000000000040001 0000fffe000009df
[ 24.250491] Call Trace:
[ 24.250589] [<ffffffff81010919>] ? save_i387_xstate+0x102/0x1f3
[ 24.250700] [<ffffffff81008fdd>] ? do_signal+0x212/0x649
[ 24.250810] [<ffffffff8133733a>] ? error_exit+0x2a/0x60
[ 24.250916] [<ffffffff81009450>] ? do_notify_resume+0x25/0x6b
[ 24.251027] [<ffffffff8133bfe0>] ? int_signal+0x12/0x17
[ 24.251132] Code: e8 13 2a ff ff 66 90 c3 48 8b 97 48 04 00 00 48 85 d2 0f 84 d0 00 00 00 48 8b 47 08 f6 40 14 01 74 02 0f 0b 48 8b 05 45 4e 71 00
[ 24.253911] 8b b2 00 02 00 00 48 89 c1 48 21 f1 48 39 c1 0f 84 a7 00 00
[ 24.255408] RIP [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 24.255581] RSP <ffff88034862be00>
[ 24.255680] CR2: ffff8803be1ab000
[ 24.255780] ---[ end trace e9c161e4e81bf087 ]---

2011/10/3, Ian Campbell <ijc@hellion.org.uk>:
> On Mon, 2011-10-03 at 14:47 -0400, Konrad Rzeszutek Wilk wrote:
>> > echo 'Loading Xen 4.0-amd64 ...'
>> > multiboot /boot/xen-4.0-amd64.gz placeholder
>>
>> Oops. I meant to try it in the hypervisor - so right after placeholder add
>> "xsave=0"
>
> Which in grub2 means add GRUB_CMDLINE_XEN="xsave=0" to /etc/default/grub
> (there is no commented out example in this case) and re-run update-grub.
>
> Ian.
>
> --
> Ian Campbell
>
>
> Many a bum show has been saved by the flag.
> -- George M. Cohan
>
Konrad Rzeszutek Wilk
2011-Oct-10 16:49 UTC
Re: [Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
> OK, I tried it again, but Oops didn't gone.

.. snip..
> echo 'Loading Xen 4.0-amd64 ...'
> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0

.. snip..
> Was it right?

Yup.

I think.. this is a bit embarrassing. It took a bit of time for Intel folks to get the xsave part right and I remember seeing this error about a year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that ultimately went in 4.1.1 did not get ported over to 4.0 and you are just hitting that.

Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in the testing and try with the xsave (or without) and see if it works?

<holds his fingers hoping it is the xsave feature>
rush
2011-Oct-10 21:11 UTC
Re: [Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
2011/10/10, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>:
>
> Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> the testing and try with the xsave (or without) and see if it works?
>

Ok, but I need around a week for it (some difficulties with access to this server at the moment).

> <holds his fingers hoping it is the xsave feature>

Thank you (:
Jan Beulich
2011-Oct-11 07:07 UTC
Re: [Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
>>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
>> OK, I tried it again, but Oops didn't gone.
> .. snip..
>> echo 'Loading Xen 4.0-amd64 ...'
>> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
> .. snip..
>> Was it right?
>
> Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
> folks to get the xsave part right and I remember seeing this error about a
> year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
> ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
> hitting that.
>
> Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> the testing and try with the xsave (or without) and see if it works?
>
> <holds his fingers hoping it is the xsave feature>

Are both of you certain this isn't the problem of the kernel only looking at the xsaveopt feature flag (implying that this means xsave is also available)? I found it necessary to force-clear that flag in the kernel when OSXSAVE is not set (by calling x86_xsave_setup() when !cpu_has_xsave, which in turn was modified to look at X86_FEATURE_OSXSAVE rather than X86_FEATURE_XSAVE under Xen - all of which I'm afraid would need to be done differently in pv-ops).

If it is, the problem could be worked around by *en*abling xsave in Xen (which is off by default prior to 4.2), assuming none of the incomplete functionality would cause other headaches.

But yes, the CPUID handling code in 4.1.1 should properly hide XSAVEOPT when XSAVE is disabled, so just using this version ought to also get things going.

Jan
Ian Campbell
2011-Oct-11 08:02 UTC
Re: [Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
> >>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
> >> OK, I tried it again, but Oops didn't gone.
> > .. snip..
> >> echo 'Loading Xen 4.0-amd64 ...'
> >> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
> > .. snip..
> >> Was it right?
> >
> > Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
> > folks to get the xsave part right and I remember seeing this error about a
> > year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
> > ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
> > hitting that.
> >
> > Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> > the testing and try with the xsave (or without) and see if it works?
> >
> > <holds his fingers hoping it is the xsave feature>
>
> Are both of you certain this isn't the problem of the kernel only
> looking at the xsaveopt feature flag (implying that this means
> xsave is also available)? I found it necessary to force-clear that
> flag in the kernel when OSXSAVE is not set (by calling
> x86_xsave_setup() when !cpu_has_xsave, which in turn was
> modified to look at X86_FEATURE_OSXSAVE rather than
> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
> need to be done differently in pv-ops).

That all sounds familiar... In mainline we have (in xen_init_cpuid_mask):

        ...
        xsave_mask =
                (1 << (X86_FEATURE_XSAVE % 32)) |
                (1 << (X86_FEATURE_OSXSAVE % 32));

        /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
        if ((cx & xsave_mask) != xsave_mask)
                cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */

Which I think implements something similar to what you describe? IOW unless both XSAVE and OSXSAVE are available both are forcibly disabled.

While grepping I noticed that the kernel command line parameter to disable xsave appears to be "noxsave" rather than "xsave=0", Rush is that something you could try? (GRUB_CMDLINE_LINUX is the place to add it)

Ian.

> If it is, the problem could be worked around by *en*abling xsave
> in Xen (which is off by default prior to 4.2), assuming none of the
> incomplete functionality would cause other headaches.
>
> But yes, the CPUID handling code in 4.1.1 should properly hide
> XSAVEOPT when XSAVE is disabled, so just using this version
> ought to also get things going.
>
> Jan

-- 
Ian Campbell
Current Noise: Zyklon - Hammer Revelation

The ultimate game show will be the one where somebody gets killed at the end.
		-- Chuck Barris, creator of "The Gong Show"
Jan Beulich
2011-Oct-11 08:36 UTC
Re: [Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
>>> On 11.10.11 at 10:02, Ian Campbell <ijc@hellion.org.uk> wrote:
> On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
>> >>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> > On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
>> >> OK, I tried it again, but Oops didn't gone.
>> > .. snip..
>> >> echo 'Loading Xen 4.0-amd64 ...'
>> >> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
>> > .. snip..
>> >> Was it right?
>> >
>> > Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
>> > folks to get the xsave part right and I remember seeing this error about a
>> > year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
>> > ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
>> > hitting that.
>> >
>> > Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
>> > the testing and try with the xsave (or without) and see if it works?
>> >
>> > <holds his fingers hoping it is the xsave feature>
>>
>> Are both of you certain this isn't the problem of the kernel only
>> looking at the xsaveopt feature flag (implying that this means
>> xsave is also available)? I found it necessary to force-clear that
>> flag in the kernel when OSXSAVE is not set (by calling
>> x86_xsave_setup() when !cpu_has_xsave, which in turn was
>> modified to look at X86_FEATURE_OSXSAVE rather than
>> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
>> need to be done differently in pv-ops).
>
> That all sounds familiar... In mainline we have (in
> xen_init_cpuid_mask):
>
>         ...
>         xsave_mask =
>                 (1 << (X86_FEATURE_XSAVE % 32)) |
>                 (1 << (X86_FEATURE_OSXSAVE % 32));
>
>         /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
>         if ((cx & xsave_mask) != xsave_mask)
>                 cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
>
> Which I think implements something similar to what you describe? IOW
> unless both XSAVE and OSXSAVE are available both are forcibly disabled.

Apart from the need to disable XSAVEOPT, yes.

> While grepping I noticed that the kernel command line parameter to
> disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
> that something you could try? (GRUB_CMDLINE_LINUX is the place to add
> it)

Or "noxsaveopt" (if that's the problem, i.e. Rush's CPUs have that capability).

Jan

> Ian.
>
>> If it is, the problem could be worked around by *en*abling xsave
>> in Xen (which is off by default prior to 4.2), assuming none of the
>> incomplete functionality would cause other headaches.
>>
>> But yes, the CPUID handling code in 4.1.1 should properly hide
>> XSAVEOPT when XSAVE is disabled, so just using this version
>> ought to also get things going.
>>
>> Jan
>
> --
> Ian Campbell
> Current Noise: Zyklon - Hammer Revelation
>
> The ultimate game show will be the one where somebody gets killed at the
> end.
> -- Chuck Barris, creator of "The Gong Show"
Ian Campbell
2011-Oct-11 08:43 UTC
Re: [Xen-devel] Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On Tue, 2011-10-11 at 09:36 +0100, Jan Beulich wrote:
> >>> On 11.10.11 at 10:02, Ian Campbell <ijc@hellion.org.uk> wrote:
> > On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
> >> >>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >> > On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
> >> >> OK, I tried it again, but Oops didn't gone.
> >> > .. snip..
> >> >> echo 'Loading Xen 4.0-amd64 ...'
> >> >> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
> >> > .. snip..
> >> >> Was it right?
> >> >
> >> > Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
> >> > folks to get the xsave part right and I remember seeing this error about a
> >> > year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
> >> > ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
> >> > hitting that.
> >> >
> >> > Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> >> > the testing and try with the xsave (or without) and see if it works?
> >> >
> >> > <holds his fingers hoping it is the xsave feature>
> >>
> >> Are both of you certain this isn't the problem of the kernel only
> >> looking at the xsaveopt feature flag (implying that this means
> >> xsave is also available)? I found it necessary to force-clear that
> >> flag in the kernel when OSXSAVE is not set (by calling
> >> x86_xsave_setup() when !cpu_has_xsave, which in turn was
> >> modified to look at X86_FEATURE_OSXSAVE rather than
> >> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
> >> need to be done differently in pv-ops).
> >
> > That all sounds familiar... In mainline we have (in
> > xen_init_cpuid_mask):
> >
> >         ...
> >         xsave_mask =
> >                 (1 << (X86_FEATURE_XSAVE % 32)) |
> >                 (1 << (X86_FEATURE_OSXSAVE % 32));
> >
> >         /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
> >         if ((cx & xsave_mask) != xsave_mask)
> >                 cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
> >
> > Which I think implements something similar to what you describe? IOW
> > unless both XSAVE and OSXSAVE are available both are forcibly disabled.
>
> Apart from the need to disable XSAVEOPT, yes.

Oh, right, I hadn't noticed it was a different/third flag.

> > While grepping I noticed that the kernel command line parameter to
> > disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
> > that something you could try? (GRUB_CMDLINE_LINUX is the place to add
> > it)
>
> Or "noxsaveopt" (if that's the problem, i.e. Rush's CPUs have that
> capability).

Right, Rush can you try both "noxsave" and "noxsaveopt" independently please. If those work then we need to update the above logic to mask xsaveopt as well.

Thanks,
Ian.

> Jan
>
> > Ian.
> >
> >> If it is, the problem could be worked around by *en*abling xsave
> >> in Xen (which is off by default prior to 4.2), assuming none of the
> >> incomplete functionality would cause other headaches.
> >>
> >> But yes, the CPUID handling code in 4.1.1 should properly hide
> >> XSAVEOPT when XSAVE is disabled, so just using this version
> >> ought to also get things going.
> >>
> >> Jan
> >
> > --
> > Ian Campbell
> > Current Noise: Zyklon - Hammer Revelation
> >
> > The ultimate game show will be the one where somebody gets killed at the
> > end.
> > -- Chuck Barris, creator of "The Gong Show"

-- 
Ian Campbell
Current Noise: Zyklon - Transcendental War - Battle Between Gods

If you tell the truth you don't have to remember anything.
		-- Mark Twain
Olivier B.
2012-Mar-06 20:22 UTC
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000
On 11/10/2011 10:43, Ian Campbell wrote:
> On Tue, 2011-10-11 at 09:36 +0100, Jan Beulich wrote:
>>>>> On 11.10.11 at 10:02, Ian Campbell <ijc@hellion.org.uk> wrote:
>>> On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
>>>>>>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>>>>> On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
>>>>>> OK, I tried it again, but Oops didn't gone.
>>>>> .. snip..
>>>>>> echo 'Loading Xen 4.0-amd64 ...'
>>>>>> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
>>>>> .. snip..
>>>>>> Was it right?
>>>>>
>>>>> Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
>>>>> folks to get the xsave part right and I remember seeing this error about a
>>>>> year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
>>>>> ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
>>>>> hitting that.
>>>>>
>>>>> Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
>>>>> the testing and try with the xsave (or without) and see if it works?
>>>>>
>>>>> <holds his fingers hoping it is the xsave feature>
>>>>
>>>> Are both of you certain this isn't the problem of the kernel only
>>>> looking at the xsaveopt feature flag (implying that this means
>>>> xsave is also available)? I found it necessary to force-clear that
>>>> flag in the kernel when OSXSAVE is not set (by calling
>>>> x86_xsave_setup() when !cpu_has_xsave, which in turn was
>>>> modified to look at X86_FEATURE_OSXSAVE rather than
>>>> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
>>>> need to be done differently in pv-ops).
>>>
>>> That all sounds familiar... In mainline we have (in
>>> xen_init_cpuid_mask):
>>>
>>>         ...
>>>         xsave_mask =
>>>                 (1 << (X86_FEATURE_XSAVE % 32)) |
>>>                 (1 << (X86_FEATURE_OSXSAVE % 32));
>>>
>>>         /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
>>>         if ((cx & xsave_mask) != xsave_mask)
>>>                 cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
>>>
>>> Which I think implements something similar to what you describe? IOW
>>> unless both XSAVE and OSXSAVE are available both are forcibly disabled.
>>
>> Apart from the need to disable XSAVEOPT, yes.
>
> Oh, right, I hadn't noticed it was a different/third flag.
>
>>> While grepping I noticed that the kernel command line parameter to
>>> disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
>>> that something you could try? (GRUB_CMDLINE_LINUX is the place to add
>>> it)
>>
>> Or "noxsaveopt" (if that's the problem, i.e. Rush's CPUs have that
>> capability).
>
> Right, Rush can you try both "noxsave" and "noxsaveopt" independently
> please. If those work then we need to update the above logic to mask
> xsaveopt as well.
>
> Thanks,
> Ian.
>

For the record, same problem here with Xen 4.0 and an Intel Xeon CPU E31220 with "microcode 0x14" (and the problem doesn't exist with a CPU E31220 without that microcode).

The xen parameter "noxsaveopt" solved it.

Olivier