Christopher S. Aker
2008-Oct-13 21:24 UTC
[Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
I just compiled 2.6.27 (pae) using my .config from 2.6.26. It boots fine, but only one CPU shows up, despite "vcpus = 4" and all of the SMP goodness enabled. This happens on both Xen 3.2.x and 3.3.x (64 bit hypervisor).

beefcake:2.6.27-linode14# grep SMP .config
CONFIG_X86_SMP=y
CONFIG_X86_32_SMP=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_SMP=y
CONFIG_X86_FIND_SMP_CONFIG=y
# CONFIG_X86_VSMP is not set
CONFIG_PM_SLEEP_SMP=y

root@ubuntu:~# uname -a
Linux ubuntu 2.6.27-linode14 #1 SMP Sun Oct 12 20:34:47 EDT 2008 i686 GNU/Linux
root@ubuntu:~# cat /proc/cpuinfo
processor : 0
...
(and that's it -- just one CPU)

I must be missing something obvious, so I'd appreciate another set of eyes on this. My .config, boot log, and xen.conf are located here:

http://www.theshore.net/~caker/xen/2.6.27-linode14/config.txt
http://www.theshore.net/~caker/xen/2.6.27-linode14/bootlog.txt
http://www.theshore.net/~caker/xen/2.6.27-linode14/xen.conf

Thanks,
-Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Christopher S. Aker
2008-Nov-03 20:30 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> I just compiled 2.6.27 (pae) using my .config from 2.6.26. It boots
> fine, but only one CPU shows up, despite "vcpus = 4" and all of the SMP
> goodness enabled. This happens on both Xen 3.2.x and 3.3.x (64 bit
> hypervisor).
>
> beefcake:2.6.27-linode14# grep SMP .config
> CONFIG_X86_SMP=y
> CONFIG_X86_32_SMP=y
> CONFIG_USE_GENERIC_SMP_HELPERS=y
> CONFIG_SMP=y
> CONFIG_X86_FIND_SMP_CONFIG=y
> # CONFIG_X86_VSMP is not set
> CONFIG_PM_SLEEP_SMP=y
>
> root@ubuntu:~# uname -a
> Linux ubuntu 2.6.27-linode14 #1 SMP Sun Oct 12 20:34:47 EDT 2008 i686
> GNU/Linux
> root@ubuntu:~# cat /proc/cpuinfo
> processor : 0
> ...
> (and that's it -- just one CPU)
>
> I must be missing something obvious, so I'd appreciate another set of
> eyes on this. My .config, boot log, and xen.conf are located here:
>
> http://www.theshore.net/~caker/xen/2.6.27-linode14/config.txt
> http://www.theshore.net/~caker/xen/2.6.27-linode14/bootlog.txt
> http://www.theshore.net/~caker/xen/2.6.27-linode14/xen.conf

Same with 2.6.27.4 and a defconfig + the paravirt stuff enabled. So, what's the trick for getting SMP working with 2.6.27.* domUs? Has anyone achieved this?

Thanks,
-Chris
Christopher S. Aker
2008-Nov-04 20:13 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Christopher S. Aker wrote:
>> I just compiled 2.6.27 (pae) using my .config from 2.6.26. It boots
>> fine, but only one CPU shows up, despite "vcpus = 4" and all of the
>> SMP goodness enabled. This happens on both Xen 3.2.x and 3.3.x (64
>> bit hypervisor).
>>
>> beefcake:2.6.27-linode14# grep SMP .config
>> CONFIG_X86_SMP=y
>> CONFIG_X86_32_SMP=y
>> CONFIG_USE_GENERIC_SMP_HELPERS=y
>> CONFIG_SMP=y
>> CONFIG_X86_FIND_SMP_CONFIG=y
>> # CONFIG_X86_VSMP is not set
>> CONFIG_PM_SLEEP_SMP=y
>>
>> root@ubuntu:~# uname -a
>> Linux ubuntu 2.6.27-linode14 #1 SMP Sun Oct 12 20:34:47 EDT 2008 i686
>> GNU/Linux
>> root@ubuntu:~# cat /proc/cpuinfo
>> processor : 0
>> ...
>> (and that's it -- just one CPU)
>>
>> I must be missing something obvious, so I'd appreciate another set of
>> eyes on this. My .config, boot log, and xen.conf are located here:
>>
>> http://www.theshore.net/~caker/xen/2.6.27-linode14/config.txt
>> http://www.theshore.net/~caker/xen/2.6.27-linode14/bootlog.txt
>> http://www.theshore.net/~caker/xen/2.6.27-linode14/xen.conf
>
> Same with 2.6.27.4 and a defconfig + the paravirt stuff enabled. So,
> what's the trick for getting SMP working with 2.6.27.* domUs? Has
> anyone achieved this?

Same thing with x86_64 domUs. I'll try reverting back to the -rcs...

-Chris
Christopher S. Aker
2008-Nov-05 15:31 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Same thing with x86_64 domUs. I'll try reverting back to the -rcs...

Here are the test results from the past few kernel versions:

2.6.26 - Brought up 4 CPUs
2.6.26.7 - Brought up 4 CPUs
2.6.27-rc1 - Brought up 1 CPUs
2.6.27 - Brought up 1 CPUs
2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
http://p.linode.com/1408

Hope that helps. Let me know if there's anything else I can do!

-Chris
Jeremy Fitzhardinge
2008-Nov-05 22:38 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Christopher S. Aker wrote:
>> Same thing with x86_64 domUs. I'll try reverting back to the -rcs...
>
> Here are the test results from the past few kernel versions:
>
> 2.6.26 - Brought up 4 CPUs
> 2.6.26.7 - Brought up 4 CPUs
> 2.6.27-rc1 - Brought up 1 CPUs
> 2.6.27 - Brought up 1 CPUs
> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
> http://p.linode.com/1408
>
> Hope that helps. Let me know if there's anything else I can do!

I couldn't repro it with 64 bit on -rc1. Trying 32 bit and -rc2 now.

    J
Jeremy Fitzhardinge
2008-Nov-05 22:45 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Christopher S. Aker wrote:
>> Same thing with x86_64 domUs. I'll try reverting back to the -rcs...
>
> Here are the test results from the past few kernel versions:
>
> 2.6.26 - Brought up 4 CPUs
> 2.6.26.7 - Brought up 4 CPUs
> 2.6.27-rc1 - Brought up 1 CPUs
> 2.6.27 - Brought up 1 CPUs
> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
> http://p.linode.com/1408
>
> Hope that helps. Let me know if there's anything else I can do!

Hm, couldn't repro. What's your .config?

    J
Christopher S. Aker
2008-Nov-05 22:57 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Jeremy Fitzhardinge wrote:
> Christopher S. Aker wrote:
>> Christopher S. Aker wrote:
>>> Same thing with x86_64 domUs. I'll try reverting back to the -rcs...
>>
>> Here are the test results from the past few kernel versions:
>>
>> 2.6.26 - Brought up 4 CPUs
>> 2.6.26.7 - Brought up 4 CPUs
>> 2.6.27-rc1 - Brought up 1 CPUs
>> 2.6.27 - Brought up 1 CPUs
>> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
>> http://p.linode.com/1408
>>
>> Hope that helps. Let me know if there's anything else I can do!
>
> I couldn't repro it with 64 bit on -rc1. Trying 32 bit and -rc2 now.

Hmm .. well, add 2.6.27.4-x86_64 to the bad list. My setup is as follows: Xen 3.3 and change, 64 bit, PAE dom0.

If you get a chance and think it would be useful, stick some domU kernels up somewhere and I'll give them a go. I'm very curious to find out what the heck is going on.

Thanks,
-Chris
Christopher S. Aker
2008-Nov-05 23:05 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Jeremy Fitzhardinge wrote:
> Hm, couldn't repro. What's your .config?

Kernel binary and config, xen config, and boot log are located here:

http://www.theshore.net/~caker/xen/2.6.27-linode14/

-Chris
Jeremy Fitzhardinge
2008-Nov-06 00:23 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Jeremy Fitzhardinge wrote:
>> Hm, couldn't repro. What's your .config?
>
> Kernel binary and config, xen config, and boot log are located here:
>
> http://www.theshore.net/~caker/xen/2.6.27-linode14/

Hm, odd. I built 2.6.27 with your config and it worked fine. But I see only 1 cpu with your kernel image...

Um, I'm at a bit of a loss. Toolchain issue? I'm using Fedora 8's gcc 4.1.2 cross-building in a 64-bit environment.

    J
Christopher S. Aker
2008-Nov-06 02:00 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Jeremy Fitzhardinge wrote:
> Um, I'm at a bit of a loss. Toolchain issue? I'm using Fedora 8's gcc
> 4.1.2 cross-building in a 64-bit environment.

Well crap...

gcc 4.0.3 (4.0.3-1ubuntu5) (Ubuntu 6.06 LTS) == Brought up 1 CPUs
gcc 4.2.4 (4.2.4-1ubuntu3) (Ubuntu 8.04 LTS) == Brought up 4 CPUs

-Chris
Stefan de Konink
2008-Nov-06 02:23 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Jeremy Fitzhardinge wrote:
>> Um, I'm at a bit of a loss. Toolchain issue? I'm using Fedora 8's
>> gcc 4.1.2 cross-building in a 64-bit environment.
>
> Well crap...
>
> gcc 4.0.3 (4.0.3-1ubuntu5) (Ubuntu 6.06 LTS) == Brought up 1 CPUs
> gcc 4.2.4 (4.2.4-1ubuntu3) (Ubuntu 8.04 LTS) == Brought up 4 CPUs

I'm looking forward to a 4.3 test :) If this is just Linux with 64bit support in Xen I'll give it a go myself.

Stefan (seen more evil things with gcc vs xen)
Jeremy Fitzhardinge
2008-Nov-06 03:43 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Christopher S. Aker wrote:
> Jeremy Fitzhardinge wrote:
>> Um, I'm at a bit of a loss. Toolchain issue? I'm using Fedora 8's
>> gcc 4.1.2 cross-building in a 64-bit environment.
>
> Well crap...
>
> gcc 4.0.3 (4.0.3-1ubuntu5) (Ubuntu 6.06 LTS) == Brought up 1 CPUs
> gcc 4.2.4 (4.2.4-1ubuntu3) (Ubuntu 8.04 LTS) == Brought up 4 CPUs

Are you still seeing that other crash?

And I know it's a bit of a pain, but if you can work out where 4.0.3 stopped compiling properly, it would be useful to post to lkml for reference.

    J
On Wed, 2008-11-05 at 10:31 -0500, Christopher S. Aker wrote:
> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
> http://p.linode.com/1408

I've been seeing this too. I bisected it down to:

ab00fee30cddf975200b3c97aef25bea144a0d89 is first bad commit
commit ab00fee30cddf975200b3c97aef25bea144a0d89
Author: Jan Beulich <jbeulich@novell.com>
Date:   Thu Oct 30 10:37:21 2008 +0000

    i386/PAE: fix pud_page()

    Impact: cleanup

    To the unsuspecting user it is quite annoying that this broken and
    inconsistent with x86-64 definition still exists.

    Signed-off-by: Jan Beulich <jbeulich@novell.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

:040000 040000 3b49a9d3792e9f02dd799ad4deb69922d2a085d0 f0136498ef53b36172dca595f11a784f43bebcea M arch

It's late so figuring out how it broke can wait for tomorrow.

The interesting bit from the link given is below.

Ian.

1 multicall(s) failed: cpu 0
Pid: 1, comm: swapper Not tainted 2.6.28-rc3-test1 #1
Call Trace:
 [<c0103e41>] xen_mc_flush+0xb1/0x180
 [<c010478f>] xen_do_pin+0x3f/0x90
 [<c0104e3f>] __xen_pgd_pin+0xcf/0x140
 [<c0104f0d>] xen_activate_mm+0x1d/0x30
 [<c018f12c>] flush_old_exec+0x29c/0x740
 [<c018e2db>] kernel_read+0x3b/0x60
 [<c01bd228>] load_elf_binary+0x198/0x16c0
 [<c0104159>] xen_set_pte+0x19/0x30
 [<c0173ef6>] handle_mm_fault+0xa46/0xc70
 [<c01719be>] vm_normal_page+0x4e/0xa0
 [<c0171cd9>] follow_page+0x2c9/0x320
 [<c0174245>] __get_user_pages+0x125/0x3e0
 [<c018e0ca>] get_arg_page+0x4a/0xb0
 [<c01bd090>] load_elf_binary+0x0/0x16c0
 [<c018fb32>] search_binary_handler+0xa2/0x230
 [<c018fe88>] do_execve+0x1c8/0x210
 [<c010685f>] sys_execve+0x2f/0x50
 [<c01085b6>] syscall_call+0x7/0xb
 [<c058007b>] sctp_setsockopt+0xd2b/0x1060
 [<c01800d8>] sys_swapon+0x308/0xaf0
 [<c010c4fc>] kernel_execve+0x1c/0x30
 [<c0102292>] init_post+0xb2/0x100
 [<c01092f3>] kernel_thread_helper+0x7/0x10
  call 1/8: op=14 arg=[d5963000] result=0
  call 2/8: op=14 arg=[d5964000] result=0
  call 3/8: op=14 arg=[d5965000] result=0
  call 4/8: op=14 arg=[d5968000] result=0
  call 5/8: op=26 arg=[c12d5880] result=0
  call 6/8: op=14 arg=[d5962000] result=0
  call 7/8: op=14 arg=[c12b2000] result=0
  call 8/8: op=26 arg=[c12d5890] result=-22
BUG: unable to handle kernel paging request at c12b2d0c
IP: [<c0106550>] xen_spin_unlock+0x0/0x10
*pdpt = 00000002ccfb6027
Oops: 0003 [#1] SMP
last sysfs file:
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.28-rc3-test1 #1)
EIP: 0061:[<c0106550>] EFLAGS: 00010002 CPU: 0
EIP is at xen_spin_unlock+0x0/0x10
EAX: c12b2d0c EBX: 00000001 ECX: 00000000 EDX: c12d5a80
ESI: 00000001 EDI: c12d5a80 EBP: c12d5080 ESP: d603fd38
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: e021
Process swapper (pid: 1, ti=d603e000 task=d603d8a0 task.ti=d603e000)
Stack:
 c0104245 c0103eb3 c062b9a4 00000008 00000008 0000001a c12d5890 ffffffea
 00000000 00000003 00000000 00000000 d5960e40 c0104e3f 15966001 00000000
 d5962000 d5960e40 d5960e84 d603d8a0 d603dbe4 c0104f0d d5960e40 c0699d00
Call Trace:
 [<c0104245>] xen_pte_unlock+0x5/0x10
 [<c0103eb3>] xen_mc_flush+0x123/0x180
 [<c0104e3f>] __xen_pgd_pin+0xcf/0x140
 [<c0104f0d>] xen_activate_mm+0x1d/0x30
 [<c018f12c>] flush_old_exec+0x29c/0x740
 [<c018e2db>] kernel_read+0x3b/0x60
 [<c01bd228>] load_elf_binary+0x198/0x16c0
 [<c0104159>] xen_set_pte+0x19/0x30
 [<c0173ef6>] handle_mm_fault+0xa46/0xc70
 [<c01719be>] vm_normal_page+0x4e/0xa0
 [<c0171cd9>] follow_page+0x2c9/0x320
 [<c0174245>] __get_user_pages+0x125/0x3e0
 [<c018e0ca>] get_arg_page+0x4a/0xb0
 [<c01bd090>] load_elf_binary+0x0/0x16c0
 [<c018fb32>] search_binary_handler+0xa2/0x230
 [<c018fe88>] do_execve+0x1c8/0x210
 [<c010685f>] sys_execve+0x2f/0x50
 [<c01085b6>] syscall_call+0x7/0xb
 [<c058007b>] sctp_setsockopt+0xd2b/0x1060
 [<c01800d8>] sys_swapon+0x308/0xaf0
 [<c010c4fc>] kernel_execve+0x1c/0x30
 [<c0102292>] init_post+0xb2/0x100
 [<c01092f3>] kernel_thread_helper+0x7/0x10
Code: 6d c0 e8 d4 51 2c 00 83 f8 0f 89 c1 7f 1a 8b 04 8d 80 95 6d c0 39 34 03 75 e1 5b ba 03 00 00 00 89 c8 5e e9 73 c2 2d 00 5b 5e c3 <c6> 00 00 66 83 78 02 00 75 01 c3 eb b3 8d 76 00 0f 0b eb fe 8d
EIP: [<c0106550>] xen_spin_unlock+0x0/0x10 SS:ESP e021:d603fd38
---[ end trace 72dbea1e75327c37 ]---
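A side note on reading the trace: the last multicall reports result=-22, and the stack dump contains the 32-bit word ffffffea right after c12d5890, the argument of that same call. The two are the same number seen as signed and as raw hex. A minimal sketch of that decoding follows, assuming the usual Linux errno numbering in which EINVAL is 22. The helper name as_signed32 is invented for this illustration and is not a kernel or Xen function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel): interpret a raw 32-bit
 * stack word as a two's-complement signed value without relying on
 * implementation-defined conversions.
 */
int32_t as_signed32(uint32_t word)
{
    if (word & 0x80000000u)
        return -(int32_t)(~word) - 1;   /* negative range */
    return (int32_t)word;               /* non-negative range */
}

/* Returns 1 if the raw word decodes to -EINVAL, 0 otherwise. */
int word_is_einval(uint32_t word)
{
    return as_signed32(word) == -EINVAL;
}
```

With the words from the dump, as_signed32(0xffffffea) gives -22 and as_signed32(0x0000001a) gives 26, which line up with "op=26 ... result=-22" in the multicall listing.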
Jeremy Fitzhardinge
2008-Nov-06 19:15 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Ian Campbell wrote:
> On Wed, 2008-11-05 at 10:31 -0500, Christopher S. Aker wrote:
>
>> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
>> http://p.linode.com/1408
>>
>
> I've been seeing this too. I bisected it down to:
>
> ab00fee30cddf975200b3c97aef25bea144a0d89 is first bad commit
> commit ab00fee30cddf975200b3c97aef25bea144a0d89
> Author: Jan Beulich <jbeulich@novell.com>
> Date: Thu Oct 30 10:37:21 2008 +0000
>
> i386/PAE: fix pud_page()
>
> Impact: cleanup
>
> To the unsuspecting user it is quite annoying that this broken and
> inconsistent with x86-64 definition still exists.
>
> Signed-off-by: Jan Beulich <jbeulich@novell.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>
> :040000 040000 3b49a9d3792e9f02dd799ad4deb69922d2a085d0 f0136498ef53b36172dca595f11a784f43bebcea M arch
>
> It's late so figuring out how it broke can wait for tomorrow.
>
> The interesting bit from the link given is below.
>

Ah, OK.

Ingo, Jan:

Did this patch actually fix anything, or was it just a cleanup? It seems to have broken 32-bit Xen in some way, so if it's just a cleanup it would be best to drop it until we've worked out what's going on.

Thanks,
    J

> Ian.
>
>
> 1 multicall(s) failed: cpu 0
> Pid: 1, comm: swapper Not tainted 2.6.28-rc3-test1 #1
> Call Trace:
> [<c0103e41>] xen_mc_flush+0xb1/0x180
> [<c010478f>] xen_do_pin+0x3f/0x90
> [<c0104e3f>] __xen_pgd_pin+0xcf/0x140
> [<c0104f0d>] xen_activate_mm+0x1d/0x30
> [<c018f12c>] flush_old_exec+0x29c/0x740
> [<c018e2db>] kernel_read+0x3b/0x60
> [<c01bd228>] load_elf_binary+0x198/0x16c0
> [<c0104159>] xen_set_pte+0x19/0x30
> [<c0173ef6>] handle_mm_fault+0xa46/0xc70
> [<c01719be>] vm_normal_page+0x4e/0xa0
> [<c0171cd9>] follow_page+0x2c9/0x320
> [<c0174245>] __get_user_pages+0x125/0x3e0
> [<c018e0ca>] get_arg_page+0x4a/0xb0
> [<c01bd090>] load_elf_binary+0x0/0x16c0
> [<c018fb32>] search_binary_handler+0xa2/0x230
> [<c018fe88>] do_execve+0x1c8/0x210
> [<c010685f>] sys_execve+0x2f/0x50
> [<c01085b6>] syscall_call+0x7/0xb
> [<c058007b>] sctp_setsockopt+0xd2b/0x1060
> [<c01800d8>] sys_swapon+0x308/0xaf0
> [<c010c4fc>] kernel_execve+0x1c/0x30
> [<c0102292>] init_post+0xb2/0x100
> [<c01092f3>] kernel_thread_helper+0x7/0x10
> call 1/8: op=14 arg=[d5963000] result=0
> call 2/8: op=14 arg=[d5964000] result=0
> call 3/8: op=14 arg=[d5965000] result=0
> call 4/8: op=14 arg=[d5968000] result=0
> call 5/8: op=26 arg=[c12d5880] result=0
> call 6/8: op=14 arg=[d5962000] result=0
> call 7/8: op=14 arg=[c12b2000] result=0
> call 8/8: op=26 arg=[c12d5890] result=-22
> BUG: unable to handle kernel paging request at c12b2d0c
> IP: [<c0106550>] xen_spin_unlock+0x0/0x10
> *pdpt = 00000002ccfb6027
> Oops: 0003 [#1] SMP
> last sysfs file:
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted (2.6.28-rc3-test1 #1)
> EIP: 0061:[<c0106550>] EFLAGS: 00010002 CPU: 0
> EIP is at xen_spin_unlock+0x0/0x10
> EAX: c12b2d0c EBX: 00000001 ECX: 00000000 EDX: c12d5a80
> ESI: 00000001 EDI: c12d5a80 EBP: c12d5080 ESP: d603fd38
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: e021
> Process swapper (pid: 1, ti=d603e000 task=d603d8a0 task.ti=d603e000)
> Stack:
> c0104245 c0103eb3 c062b9a4 00000008 00000008 0000001a c12d5890 ffffffea
> 00000000 00000003 00000000 00000000 d5960e40 c0104e3f 15966001 00000000
> d5962000 d5960e40 d5960e84 d603d8a0 d603dbe4 c0104f0d d5960e40 c0699d00
> Call Trace:
> [<c0104245>] xen_pte_unlock+0x5/0x10
> [<c0103eb3>] xen_mc_flush+0x123/0x180
> [<c0104e3f>] __xen_pgd_pin+0xcf/0x140
> [<c0104f0d>] xen_activate_mm+0x1d/0x30
> [<c018f12c>] flush_old_exec+0x29c/0x740
> [<c018e2db>] kernel_read+0x3b/0x60
> [<c01bd228>] load_elf_binary+0x198/0x16c0
> [<c0104159>] xen_set_pte+0x19/0x30
> [<c0173ef6>] handle_mm_fault+0xa46/0xc70
> [<c01719be>] vm_normal_page+0x4e/0xa0
> [<c0171cd9>] follow_page+0x2c9/0x320
> [<c0174245>] __get_user_pages+0x125/0x3e0
> [<c018e0ca>] get_arg_page+0x4a/0xb0
> [<c01bd090>] load_elf_binary+0x0/0x16c0
> [<c018fb32>] search_binary_handler+0xa2/0x230
> [<c018fe88>] do_execve+0x1c8/0x210
> [<c010685f>] sys_execve+0x2f/0x50
> [<c01085b6>] syscall_call+0x7/0xb
> [<c058007b>] sctp_setsockopt+0xd2b/0x1060
> [<c01800d8>] sys_swapon+0x308/0xaf0
> [<c010c4fc>] kernel_execve+0x1c/0x30
> [<c0102292>] init_post+0xb2/0x100
> [<c01092f3>] kernel_thread_helper+0x7/0x10
> Code: 6d c0 e8 d4 51 2c 00 83 f8 0f 89 c1 7f 1a 8b 04 8d 80 95 6d c0 39
> 34 03 75 e1 5b ba 03 00 00 00 89 c8 5e e9 73 c2 2d 00 5b 5e c3 <c6> 00
> 00 66 83 78 02 00 75 01 c3 eb b3 8d 76 00 0f 0b eb fe 8d
> EIP: [<c0106550>] xen_spin_unlock+0x0/0x10 SS:ESP e021:d603fd38
> ---[ end trace 72dbea1e75327c37 ]---
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Ian Campbell wrote:
>> On Wed, 2008-11-05 at 10:31 -0500, Christopher S. Aker wrote:
>>
>>> 2.6.28-rc3 - Brought up 1 CPUs, eventually dies with:
>>> http://p.linode.com/1408
>>>
>>
>> I've been seeing this too. I bisected it down to:
>>
>> ab00fee30cddf975200b3c97aef25bea144a0d89 is first bad commit
>> commit ab00fee30cddf975200b3c97aef25bea144a0d89
>> Author: Jan Beulich <jbeulich@novell.com>
>> Date: Thu Oct 30 10:37:21 2008 +0000
>> i386/PAE: fix pud_page()
>> Impact: cleanup
>> To the unsuspecting user it is quite annoying that this broken and
>> inconsistent with x86-64 definition still exists.
>> Signed-off-by: Jan Beulich <jbeulich@novell.com>
>> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>> :040000 040000 3b49a9d3792e9f02dd799ad4deb69922d2a085d0
>> f0136498ef53b36172dca595f11a784f43bebcea M arch
>>
>> It's late so figuring out how it broke can wait for tomorrow.
>>
>> The interesting bit from the link given is below.
>>
>
> Ah, OK.
>
> Ingo, Jan:
>
> Did this patch actually fix anything, or was it just a cleanup? It
> seems to have broken 32-bit Xen in some way, so if it's just a cleanup it
> would be best to drop it until we've worked out what's going on.

no, it was pure cleanup. The impact line shows this:

>> Impact: cleanup

a "cleanup" impact line is only added if the change is not intended to have any side-effects whatsoever.

We can drop it but it would be really nice to figure out what's going on. In a very quick late-night look i cannot see anything particularly weird about it, but based on the type of changes it does there are three leading candidates: lost high 32 bits, zero extend problem, or incorrect types.

    Ingo
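To make the first of those three candidates concrete: on 32-bit PAE a page-table entry is 64 bits wide, so passing it through a 32-bit variable silently drops the upper half. The sketch below is only an illustration of that failure class, not a claim about what this patch actually did. The function names and the 12-bit flag mask are simplifications invented for the example. The sample value is the "*pdpt = 00000002ccfb6027" entry from the oops.

```c
#include <assert.h>
#include <stdint.h>

/* Simplification for this sketch: treat the low 12 bits as flag bits. */
#define TOY_FLAGS_MASK 0xfffULL

/* Full-width path: keep all 64 bits of the entry, then strip the flags. */
uint64_t entry_to_phys_64(uint64_t entry)
{
    return entry & ~TOY_FLAGS_MASK;
}

/*
 * Lossy path: the entry is squeezed through a 32-bit variable first,
 * as would happen with a plain 'unsigned long' on i386.
 */
uint64_t entry_to_phys_via_32(uint64_t entry)
{
    uint32_t narrowed = (uint32_t)entry;    /* high 32 bits discarded here */
    return (uint64_t)(narrowed & ~(uint32_t)TOY_FLAGS_MASK);
}
```

For 0x00000002ccfb6027 the full-width path yields 0x2ccfb6000, while the narrowed path yields 0xccfb6000, a different physical address below 4GB.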
Jeremy Fitzhardinge
2008-Nov-06 21:20 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Ingo Molnar wrote:
> a "cleanup" impact line is only added if the change is not intended to
> have any side-effects whatsoever.
>
> We can drop it but it would be really nice to figure out what's going
> on. In a very quick late-night look i cannot see anything particularly
> weird about it, but based on the type of changes it does there are
> three leading candidates: lost high 32 bits, zero extend problem, or
> incorrect types.

Yeah, I couldn't see anything either. It's a reasonable cleanup (I never did understand that struct page * cast), but it's always nicer when cleanups don't break working code ;).

    J
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Ingo Molnar wrote:
>> a "cleanup" impact line is only added if the change is not intended to
>> have any side-effects whatsoever.
>>
>> We can drop it but it would be really nice to figure out what's going
>> on. In a very quick late-night look i cannot see anything particularly
>> weird about it, but based on the type of changes it does there are
>> three leading candidates: lost high 32 bits, zero extend problem, or
>> incorrect types.
>
> Yeah, I couldn't see anything either. It's a reasonable cleanup (I
> never did understand that struct page * cast), but it's always nicer
> when cleanups don't break working code ;).

Would be nice to have a look at the vmlinux delta with the patch reverted, on the .config that breaks. By all means the object code should be the same.

    Ingo
Jeremy Fitzhardinge
2008-Nov-06 21:28 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Ingo Molnar wrote:
> a "cleanup" impact line is only added if the change is not intended to
> have any side-effects whatsoever.
>
> We can drop it but it would be really nice to figure out what's going
> on. In a very quick late-night look i cannot see anything particularly
> weird about it, but based on the type of changes it does there are
> three leading candidates: lost high 32 bits, zero extend problem, or
> incorrect types.
>

Interestingly, the Xen code appears to be the *only* user of pud_page - and only via pgd_page in PAE mode.

    J
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Ingo Molnar wrote:
>> a "cleanup" impact line is only added if the change is not intended to
>> have any side-effects whatsoever.
>>
>> We can drop it but it would be really nice to figure out what's going
>> on. In a very quick late-night look i cannot see anything particularly
>> weird about it, but based on the type of changes it does there are
>> three leading candidates: lost high 32 bits, zero extend problem, or
>> incorrect types.
>
> Interestingly, the Xen code appears to be the *only* user of
> pud_page - and only via pgd_page in PAE mode.

where exactly is that use? My grep didn't show any users of pud_page(). pud_page() was changed in an incompatible way, all users of it must be updated.

    Ingo
Jeremy Fitzhardinge
2008-Nov-06 21:48 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Ingo Molnar wrote:
> where exactly is that use? My grep didn't show any users of pud_page().
> pud_page() was changed in an incompatible way, all users of it must be
> updated.
>

pgd_page() uses it in pgtable-nopud.h, so any users of pgd_page() also need to be looked at. It so happens the only user is arch/x86/xen/mmu.c, which expects it to return the vaddr. Fixed below.

    J

Subject: xen: fix use of pgd_page now that it really does return a page

On 32-bit PAE, pud_page, for no good reason, didn't really return a struct page *. Since Jan Beulich's fix "i386/PAE: fix pud_page()", pud_page does return a struct page *.

Because PAE has 3 pagetable levels, the pud level is folded into the pgd level, so pgd_page() is the same as pud_page(), and now returns a struct page *. Update the xen/mmu.c code which uses pgd_page() accordingly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

==================================================================
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -877,7 +877,7 @@
 #else /* CONFIG_X86_32 */
 #ifdef CONFIG_X86_PAE
 	/* Need to make sure unshared kernel PMD is pinnable */
-	xen_pin_page(mm, virt_to_page(pgd_page(pgd[pgd_index(TASK_SIZE)])),
+	xen_pin_page(mm, pgd_page(pgd[pgd_index(TASK_SIZE)]),
 		     PT_PMD);
 #endif
 	xen_do_pin(MMUEXT_PIN_L3_TABLE, PFN_DOWN(__pa(pgd)));
@@ -994,7 +994,7 @@
 #ifdef CONFIG_X86_PAE
 	/* Need to make sure unshared kernel PMD is unpinned */
-	xen_unpin_page(mm, virt_to_page(pgd_page(pgd[pgd_index(TASK_SIZE)])),
+	xen_unpin_page(mm, pgd_page(pgd[pgd_index(TASK_SIZE)]),
 		       PT_PMD);
 #endif
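The reason the old call sites broke can be shown with a toy model. The sketch below is not kernel code: every toy_ name, the base addresses, and the 32-byte struct page size are invented for illustration, and addresses are plain integers. It assumes, per the patch description, that pgd_page() used to yield something virt_to_page() could consume (a virtual address) and now yields the struct page pointer itself.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
#define TOY_DIRECT_BASE  0xc0000000u  /* made-up start of the toy direct mapping */
#define TOY_MEMMAP_BASE  0xc1000000u  /* made-up address of the toy struct page array */
#define TOY_PAGE_STRUCT  32u          /* made-up sizeof(struct page) */

/* Toy pfn -> address of that frame's struct page. */
uint32_t toy_pfn_to_page(uint32_t pfn)
{
    return TOY_MEMMAP_BASE + pfn * TOY_PAGE_STRUCT;
}

/* Toy virt_to_page(): only meaningful for direct-map virtual addresses. */
uint32_t toy_virt_to_page(uint32_t vaddr)
{
    return toy_pfn_to_page((vaddr - TOY_DIRECT_BASE) >> TOY_PAGE_SHIFT);
}

/* Assumed old behaviour: pgd_page() hands back a virtual address. */
uint32_t toy_pgd_page_old(uint32_t pfn)
{
    return TOY_DIRECT_BASE + (pfn << TOY_PAGE_SHIFT);
}

/* New behaviour: pgd_page() hands back the struct page address directly. */
uint32_t toy_pgd_page_new(uint32_t pfn)
{
    return toy_pfn_to_page(pfn);
}
```

In this model toy_virt_to_page(toy_pgd_page_old(3)) and toy_pgd_page_new(3) both give the struct page for frame 3, but the stale combination toy_virt_to_page(toy_pgd_page_new(3)) converts twice and lands on an unrelated struct page. Pinning the wrong page in this way would be consistent with the failed hypercall in the earlier trace, though the model does not prove that link.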
* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Ingo Molnar wrote:
>> where exactly is that use? My grep didn't show any users of pud_page().
>> pud_page() was changed in an incompatible way, all users of it must be
>> updated.
>>
>
> pgd_page() uses it in pgtable-nopud.h, so any users of pgd_page()
> also need to be looked at. It so happens the only user is
> arch/x86/xen/mmu.c, which expects it to return the vaddr. Fixed
> below.

ah! asm-generic was missed by my grep. (and i suspect Jan missed it too)

> Subject: xen: fix use of pgd_page now that it really does return a page

applied to tip/x86/urgent, thanks!

    Ingo
Jeremy Fitzhardinge
2008-Nov-06 22:29 UTC
Re: [Xen-devel] 2.6.27 - SMP enabled, but only 1 CPU
Ingo Molnar wrote:
> ah! asm-generic was missed by my grep. (and i suspect Jan missed it
> too)
>

cscope is your friend.

>> Subject: xen: fix use of pgd_page now that it really does return a page
>>
>
> applied to tip/x86/urgent, thanks!
>

Thanks,
    J

>>> Ingo Molnar <mingo@elte.hu> 06.11.08 23:20 >>>
>
>* Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>> Ingo Molnar wrote:
>>> where exactly is that use? My grep didn't show any users of pud_page().
>>> pud_page() was changed in an incompatible way, all users of it must be
>>> updated.
>>>
>>
>> pgd_page() uses it in pgtable-nopud.h, so any users of pgd_page()
>> also need to be looked at. It so happens the only user is
>> arch/x86/xen/mmu.c, which expects it to return the vaddr. Fixed
>> below.
>
>ah! asm-generic was missed by my grep. (and i suspect Jan missed it
>too)

Indeed - broken as it was I never even considered this could be used somewhere in generic code.

>> Subject: xen: fix use of pgd_page now that it really does return a page
>
>applied to tip/x86/urgent, thanks!

And my thanks, too, Jeremy, for the quick spotting of the problem.

Jan