Hello,

We are getting alignment check with 2.6.32 kernel running as a domU on an AMD system, while dom0 is a 2.6.18 kernel. As far as I know we should not have run into such problem, since this is x86_64 kernel. I am aware of the fact that for alignment check trap AC bit needs to be set in eflags and AM should be set in CR0. I tracked cr0 and AM was getting set, and problem was occurring when something was setting AC flag at the time of calling memcpy_c(). I cheated and cleared the AM flag in cr0 (as one can see in this trace) but this didn't help. I haven't figured out what sets the AM flag...

Here is the trace:

[ 80.342300] alignment check: 0000 [#1] SMP
[ 80.342323] last sysfs file: /sys/devices/virtual/vc/vcsa7/dev
[ 80.342330] CPU 1
[ 80.342339] Pid: 3875, comm: loas_check Not tainted 2.6.32.10+drm33.1 #12
[ 80.342347] RIP: e030:[<ffffffff813eb2bb>] [<ffffffff813eb2bb>] memcpy_c+0xb/0x20
[ 80.342365] RSP: e02b:ffff88015556d9b0 EFLAGS: 00050246
[ 80.342371] RAX: ffff88017360cc8c RBX: ffff880176d91900 RCX: 0000000000000002
[ 80.342379] RDX: 0000000000000000 RSI: ffff880176d91958 RDI: ffff88017360cc8c
[ 80.342388] RBP: ffff88015556d9e8 R08: ffffffff81570260 R09: ffffffff81ae8840
[ 80.342395] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 80.342403] R13: 000000000000000e R14: ffff880173f3fc00 R15: ffff880176d91958
[ 80.342417] FS: 00007f0f4dafd6e0(0000) GS:ffff880028047000(0000) knlGS:0000000000000000
[ 80.342425] CS: e033 DS: 002b ES: 002b CR0: 000000008001003b
[ 80.342432] CR2: 00007f0f4db1a000 CR3: 000000017362d000 CR4: 0000000000000660
[ 80.342440] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 80.342448] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 80.342457] Process loas_check (pid: 3875, threadinfo ffff88015556c000, task ffff8801556cada0)
[ 80.342465] Stack:
[ 80.342469] ffffffff8157038f ffffffff81ae8840 ffff880173f3fc00 0000000000000000
[ 80.342483] <0> ffff880173f3fc00 ffff880161a36400 0000000000000000 ffff88015556da08
[ 80.342500] <0> ffffffff815705d6 ffff880180000000 ffff880173f3fc00 ffff88015556da28
[ 80.342518] Call Trace:
[ 80.342528] [<ffffffff8157038f>] ? ip_finish_output+0x12f/0x2f0
[ 80.342538] [<ffffffff815705d6>] ip_output+0x86/0xd0
[ 80.342546] [<ffffffff8156f600>] ip_local_out+0x20/0x30
[ 80.342555] [<ffffffff8156fed3>] ip_queue_xmit+0x223/0x3f0
[ 80.342565] [<ffffffff81584214>] ? tcp_send_active_reset+0x24/0x180
[ 80.342576] [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[ 80.342586] [<ffffffff8100e532>] ? check_events+0x12/0x20
[ 80.342595] [<ffffffff81583dd2>] tcp_transmit_skb+0x402/0x780
[ 80.342604] [<ffffffff81584279>] tcp_send_active_reset+0x89/0x180
[ 80.342614] [<ffffffff815770bc>] tcp_disconnect+0x6c/0x3c0
[ 80.342622] [<ffffffff81576e34>] tcp_close+0x3e4/0x480
[ 80.342632] [<ffffffff81598b92>] inet_release+0x42/0x70
[ 80.342643] [<ffffffff814ce5d8>] sock_release+0x18/0x60
[ 80.342652] [<ffffffff814ceab2>] sock_close+0x12/0x30
[ 80.342663] [<ffffffff8110e28e>] __fput+0xee/0x200
[ 80.342671] [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[ 80.342681] [<ffffffff8110e3b7>] fput+0x17/0x20
[ 80.342690] [<ffffffff8110a378>] filp_close+0x58/0x90
[ 80.342698] [<ffffffff8100e51f>] ? xen_restore_fl_direct_end+0x0/0x1
[ 80.342709] [<ffffffff8105691c>] put_files_struct+0xcc/0xe0
[ 80.342718] [<ffffffff81056980>] exit_files+0x50/0x60
[ 80.342726] [<ffffffff81058587>] do_exit+0x1b7/0x7f0
[ 80.342735] [<ffffffff81065cb6>] ? __dequeue_signal+0x16/0x160
[ 80.342745] [<ffffffff81058bfc>] do_group_exit+0x3c/0xa0
[ 80.342754] [<ffffffff81068328>] get_signal_to_deliver+0x1b8/0x380
[ 80.342764] [<ffffffff810106a9>] do_notify_resume+0xc9/0x8a0
[ 80.342775] [<ffffffff8100b8bb>] ? xen_mc_flush+0x11b/0x1d0
[ 80.342786] [<ffffffff8102cb52>] ? paravirt_end_context_switch+0x12/0x30
[ 80.342798] [<ffffffff81047afb>] ? finish_task_switch+0x5b/0xb0
[ 80.342808] [<ffffffff8101134e>] int_signal+0x12/0x17
[ 80.342815] Code: 81 ea d8 1f 00 00 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
[ 80.342952] RIP [<ffffffff813eb2bb>] memcpy_c+0xb/0x20
[ 80.342962] RSP <ffff88015556d9b0>
[ 80.342969] ---[ end trace 1442aa6e9e3d337d ]---
[ 80.342976] Fixing recursive fault but reboot is needed!

This happens 2 out of 3 times. I don't seem to find any similar recent reports and relevant commits so far, and we haven't had such problem running 2.6.24 domU (Ubuntu hardy) on the 2.6.18 dom0. I'm hoping someone can give a hand.

Thanks,
--Natalie

P.S. Just in case - here is the "original" trace before I tried to modify the cr0:

[ 64.544616] alignment check: 0000 [#1] SMP
[ 64.544640] last sysfs file: /sys/devices/virtual/vc/vcsa7/dev
[ 64.544647] CPU 1
[ 64.544655] Pid: 3737, comm: loas_check Not tainted 2.6.32.10+drm33.1 #8
[ 64.544663] RIP: e030:[<ffffffff813eb23b>] [<ffffffff813eb23b>] memcpy_c+0xb/0x20
[ 64.544681] RSP: e02b:ffff880152e7d9b0 EFLAGS: 00050246
[ 64.544687] RAX: ffff8801731e4c8c RBX: ffff880178185400 RCX: 0000000000000002
[ 64.544696] RDX: 0000000000000000 RSI: ffff880178185458 RDI: ffff8801731e4c8c
[ 64.544703] RBP: ffff880152e7d9e8 R08: ffffffff81570110 R09: ffffffff81ae6840
[ 64.544711] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 64.544718] R13: 000000000000000e R14: ffff88017332d800 R15: ffff880178185458
[ 64.544732] FS: 00007fa8de9336e0(0000) GS:ffff880028047000(0000) knlGS:0000000000000000
[ 64.544741] CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
[ 64.544748] CR2: 00000000081f1320 CR3: 0000000001001000 CR4: 0000000000000660
[ 64.544756] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 64.544764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 64.544772] Process loas_check (pid: 3737, threadinfo ffff880152e7c000, task ffff880152e72da0)
[ 64.544781] Stack:
[ 64.544785] ffffffff81570214 ffffffff81ae6840 ffff88017332d800 0000000000000000
[ 64.544798] <0> ffff88017332d800 ffff880152e13200 0000000000000000 ffff880152e7da08
[ 64.544814] <0> ffffffff81570486 ffff880180000000 ffff88017332d800 ffff880152e7da28
[ 64.544832] Call Trace:
[ 64.544842] [<ffffffff81570214>] ? ip_finish_output+0x104/0x2f0
[ 64.544853] [<ffffffff81570486>] ip_output+0x86/0xd0
[ 64.544862] [<ffffffff8156f4b0>] ip_local_out+0x20/0x30
[ 64.544870] [<ffffffff8156fd83>] ip_queue_xmit+0x223/0x3f0
[ 64.544880] [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[ 64.544889] [<ffffffff8100e532>] ? check_events+0x12/0x20
[ 64.544900] [<ffffffff81583c82>] tcp_transmit_skb+0x402/0x780
[ 64.544909] [<ffffffff81584129>] tcp_send_active_reset+0x89/0x180
[ 64.544920] [<ffffffff8111e16a>] ? __d_free+0x3a/0x60
[ 64.544929] [<ffffffff81576f6c>] tcp_disconnect+0x6c/0x3c0
[ 64.544938] [<ffffffff81576ce4>] tcp_close+0x3e4/0x480
[ 64.544946] [<ffffffff81598a42>] inet_release+0x42/0x70
[ 64.544956] [<ffffffff814ce558>] sock_release+0x18/0x60
[ 64.544964] [<ffffffff814cea32>] sock_close+0x12/0x30
[ 64.544974] [<ffffffff8110e20e>] __fput+0xee/0x200
[ 64.544982] [<ffffffff8100dcfd>] ? xen_force_evtchn_callback+0xd/0x10
[ 64.544991] [<ffffffff8110e337>] fput+0x17/0x20
[ 64.545000] [<ffffffff8110a2f8>] filp_close+0x58/0x90
[ 64.545009] [<ffffffff8100e51f>] ? xen_restore_fl_direct_end+0x0/0x1
[ 64.545019] [<ffffffff8105689c>] put_files_struct+0xcc/0xe0
[ 64.545028] [<ffffffff81056900>] exit_files+0x50/0x60
[ 64.545036] [<ffffffff81058507>] do_exit+0x1b7/0x7f0
[ 64.545046] [<ffffffff81065c36>] ? __dequeue_signal+0x16/0x160
[ 64.545055] [<ffffffff81058b7c>] do_group_exit+0x3c/0xa0
[ 64.545064] [<ffffffff810682a8>] get_signal_to_deliver+0x1b8/0x380
[ 64.545073] [<ffffffff81010649>] do_notify_resume+0xc9/0x880
[ 64.545084] [<ffffffff8100b8bb>] ? xen_mc_flush+0x11b/0x1d0
[ 64.545095] [<ffffffff8102cad2>] ? paravirt_end_context_switch+0x12/0x30
[ 64.545106] [<ffffffff81047a7b>] ? finish_task_switch+0x5b/0xb0
[ 64.545115] [<ffffffff810112ce>] int_signal+0x12/0x17
[ 64.545121] Code: 81 ea d8 1f 00 00 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
[ 64.545252] RIP [<ffffffff813eb23b>] memcpy_c+0xb/0x20
[ 64.545262] RSP <ffff880152e7d9b0>
[ 64.545269] ---[ end trace 11cf940a2c626919 ]---
[ 64.545276] Fixing recursive fault but reboot is needed!

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
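For readers following along, the EFLAGS and CR0 values printed in the two traces can be decoded bit by bit. The sketch below is not part of the original post; the bit positions are the architectural ones from the x86 manuals, and the helper name `decode` is made up for illustration.

```python
# Architectural bit positions for EFLAGS and CR0 (x86 / x86_64).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}


def decode(value, names):
    """Return the names of all known flag bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if (value >> bit) & 1]


if __name__ == "__main__":
    # Values copied from the two traces in this message.
    print("EFLAGS 00050246:", decode(0x00050246, EFLAGS_BITS))
    print("CR0 8005003b (original trace):", decode(0x8005003B, CR0_BITS))
    print("CR0 8001003b (after clearing AM):", decode(0x8001003B, CR0_BITS))
```

Both traces show the same EFLAGS value with AC set; the only difference between the two CR0 values is the AM bit.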
>>> Natalie Protasevich <protasnb@gmail.com> 30.03.10 05:08 >>>
> I cheated and cleared the AM flag in cr0 (as one can see in this
> trace) but this didn't help.

Assuming you did this in the hypervisor, this would point at a CPU bug. There should not be any alignment check exceptions with this bit clear.

> I haven't figured out what sets the AM flag...

The hypervisor sets up CR0 this way, and doesn't allow altering later. In order to allow the kernel to support alignment check exceptions for user mode, Xen needs to do it this way and clears AC each time passing control to (64-bit) kernel code.

What you'll need to do is look for where AC gets set, probably by modifying all asm-s using popf, as this is what seems bogus.

Jan
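The condition discussed in this reply can be written as a small predicate. This is only a sketch under stated assumptions: an alignment check is architecturally raised only when CR0.AM and EFLAGS.AC are both set and the access is made at CPL 3; the privilege level is taken from the low two bits of the CS value printed in the trace (e033); and the faulting instruction, the `f3 48 a5` bytes marked in the Code: line, is treated as `rep movsq` with 8-byte operands. The function name is hypothetical.

```python
CR0_AM = 1 << 18      # CR0 alignment mask bit
EFLAGS_AC = 1 << 18   # EFLAGS alignment check bit


def alignment_check_possible(cr0, eflags, cpl, address, size):
    """True if an access of `size` bytes at `address` could raise #AC."""
    enabled = bool(cr0 & CR0_AM) and bool(eflags & EFLAGS_AC) and cpl == 3
    return enabled and address % size != 0


if __name__ == "__main__":
    # Register values from the "original" trace (CR0 not yet modified).
    cpl = 0xE033 & 3                  # low two bits of CS
    rdi = 0xFFFF8801731E4C8C          # destination pointer, ends in ...c
    rsi = 0xFFFF880178185458          # source pointer, ends in ...8
    print(alignment_check_possible(0x8005003B, 0x00050246, cpl, rdi, 8))
    print(alignment_check_possible(0x8005003B, 0x00050246, cpl, rsi, 8))
    # Same access with AC cleared, as Xen is described as doing on kernel entry.
    print(alignment_check_possible(0x8005003B, 0x00050246 & ~EFLAGS_AC, cpl, rdi, 8))
```

Under these assumptions the destination pointer is only 4-byte aligned, so the copy is exactly the kind of access that faults once AC has leaked into kernel context. With the first trace's printed CR0 (8001003b) the predicate returns False, which is consistent with the remark above that a fault with AM really clear would be unexpected; the value a guest prints is not necessarily the one in effect, since the hypervisor does not allow altering it.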
Natalie Protasevich
2010-Mar-30 11:34 UTC
Re: [Xen-devel] Alignment check on domU (2.6.32)
On Tue, Mar 30, 2010 at 1:06 AM, Jan Beulich <JBeulich@novell.com> wrote:

> >>> Natalie Protasevich <protasnb@gmail.com> 30.03.10 05:08 >>>
> > I cheated and cleared the AM flag in cr0 (as one can see in this
> > trace) but this didn't help.
>
> Assuming you did this in the hypervisor, this would point at a CPU bug.
> There should not be any alignment check exceptions with this bit clear.
>
> > I haven't figured out what sets the AM flag...
>
> The hypervisor sets up CR0 this way, and doesn't allow altering later.
> In order to allow the kernel to support alignment check exceptions
> for user mode, Xen needs to do it this way and clears AC each time
> passing control to (64-bit) kernel code.

I was trying to be careful with AC/AM, but still messed up :) I meant to say that so far I can't find what sets the AC flag, although I sprinkled checks in various places. Next I was going to see how I can set a HW breakpoint.

> What you'll need to do is look for where AC gets set, probably by
> modifying all asm-s using popf, as this is what seems bogus.

Thanks for the hint! I will look at those...

--Natalie
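One way to start on the popf hint is simply to list every place in the kernel tree that mentions the instruction, so each site can be audited or instrumented by hand. The script below is a rough sketch, not something posted in the thread: the function names and the file extensions are assumptions, and it does a plain substring match, so it will also report identifiers or comments that merely contain "popf".

```python
import os


def scan_text(text):
    """Return (line_number, stripped_line) for each line mentioning popf."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(text.splitlines(), start=1)
        if "popf" in line
    ]


def find_popf_sites(root, extensions=(".c", ".h", ".S")):
    """Walk a source tree and collect (path, line_number, line) matches."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Tolerate odd encodings in old source files.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in scan_text(handle.read()):
                    hits.append((path, lineno, line))
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Usage: python find_popf.py /path/to/kernel/source
    for path, lineno, line in find_popf_sites(sys.argv[1]):
        print(f"{path}:{lineno}: {line}")
```

A substring match is used on purpose: inside inline asm strings the mnemonic often follows an escape such as `\t`, where a word-boundary regex would miss it.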
Jeremy Fitzhardinge
2010-Mar-30 16:53 UTC
Re: [Xen-devel] Alignment check on domU (2.6.32)
On 03/29/2010 08:08 PM, Natalie Protasevich wrote:
> Hello,
> We are getting alignment check with 2.6.32 kernel running as a domU on
> an AMD system,

Which 2.6.32 is it? Is it stock kernel.org, from xen.git, a distro, elsewhere?

> while dom0 is a 2.6.18 kernel.
> As far as I know we should not have run into such problem, since this
> is x86_64 kernel. I am aware of the fact that for alignment check trap
> AC bit needs to be set in eflags and AM should be set in CR0. I
> tracked cr0 and AM was getting set, and problem was occurring when
> something was setting AC flag at the time of calling memcpy_c(). I
> cheated and cleared the AM flag in cr0 (as one can see in this trace)
> but this didn't help. I haven't figured out what sets the AM flag...

Do you have any other domains running at the time?

What CPU is this?

Does it run the same kernel native OK?

J
Natalie Protasevich
2010-Mar-30 17:32 UTC
Re: [Xen-devel] Alignment check on domU (2.6.32)
On Tue, Mar 30, 2010 at 9:53 AM, Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> On 03/29/2010 08:08 PM, Natalie Protasevich wrote:
>
>> Hello,
>> We are getting alignment check with 2.6.32 kernel running as a domU on an
>> AMD system,
>
> Which 2.6.32 is it? Is it stock kernel.org, from xen.git, a distro,
> elsewhere?

It is a distro (Ubuntu Lucid) kernel, 2.6.32-based.

>> while dom0 is a 2.6.18 kernel.
>> As far as I know we should not have run into such problem, since this is
>> x86_64 kernel. I am aware of the fact that for alignment check trap AC bit
>> needs to be set in eflags and AM should be set in CR0. I tracked cr0 and AM
>> was getting set, and problem was occurring when something was setting AC
>> flag at the time of calling memcpy_c(). I cheated and cleared the AM flag in
>> cr0 (as one can see in this trace) but this didn't help. I haven't figured
>> out what sets the AM flag...
>
> Do you have any other domains running at the time?

No, just one domain, 3 vcpus.

> What CPU is this?

AMD's: Dual-Core AMD Opteron(tm) Processor 8214 HE
(I am planning to set up and try some other system.)

> Does it run the same kernel native OK?

Yes it does, no problems on the bare-metal system.

Thanks,
--Natalie