Hello...

I've tried to run 2.6.23 on Xen 3.1.0; my hardware is a dual Athlon MP.
My DomU is running 2.6.23 SMP (vcpus = 1). Without SMP it hangs during
boot; with SMP it panics. I've attached my config, in case somebody thinks
I've left something out.

Does anybody else see this:

------------[ cut here ]------------
kernel BUG at arch/i386/xen/multicalls.c:68!
invalid opcode: 0000 [#1]
SMP
CPU:    0
EIP:    0061:[<c01019a6>]    Not tainted VLI
EFLAGS: 00010002   (2.6.23 #26)
EIP is at xen_mc_flush+0xa6/0xb0
eax: 00000001   ebx: 00000003   ecx: 00000001   edx: c10c1060
esi: 00000000   edi: 00000001   ebp: c5c8d000   esp: c1101d28
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: e021
Process swapper (pid: 1, ti=c1100000 task=c10eda90 task.ti=c1100000)
Stack: 000218ee c10c10a0 c10c1460 c0101ede c5c8ce40 c02b90e0 c10eda90 00000000
       c0101ff3 c5c8ce40 c015caca c5d3b49c 00000080 c5eadaa0 00000000 c10eed80
       c01584d0 c02b9a80 c5cfd560 c02a53d2 c1142440 c015c077 c1101d84 c5eadaa0
Call Trace:
 [<c0101ede>] xen_pgd_pin+0x9e/0x100
 [<c0101ff3>] xen_activate_mm+0x13/0x20
 [<c015caca>] flush_old_exec+0x3ca/0x7f0
 [<c01584d0>] do_sync_read+0x0/0x120
 [<c015c077>] kernel_read+0x37/0x50
 [<c018308e>] load_elf_binary+0x2fe/0x1af0
 [<c013ec27>] __alloc_pages+0x57/0x310
 [<c0155387>] kmem_cache_alloc+0x47/0x90
 [<c0147610>] handle_mm_fault+0x540/0x710
 [<c01585a5>] do_sync_read+0xd5/0x120
 [<c0145e10>] vm_normal_page+0x10/0x70
 [<c0145e10>] vm_normal_page+0x10/0x70
 [<c0146471>] follow_page+0xf1/0x170
 [<c01478ca>] get_user_pages+0xea/0x2e0
 [<c015bbe2>] get_arg_page+0x42/0xa0
 [<c015bdc6>] copy_strings+0x186/0x1a0
 [<c015be64>] search_binary_handler+0x54/0x110
 [<c015d83b>] do_execve+0x14b/0x170
 [<c010425f>] sys_execve+0x2f/0x90
 [<c0105c72>] syscall_call+0x7/0xb
 [<c010a364>] kernel_execve+0x14/0x20
 [<c0100173>] init_post+0xa3/0xf0
 [<c02d594f>] kernel_init+0x20f/0x310
 [<c0118464>] schedule_tail+0x34/0x90
 [<c0105b16>] ret_from_fork+0x6/0x20
 [<c0105c7b>] syscall_exit+0x5/0x1b
 [<c02d5740>] kernel_init+0x0/0x310
 [<c02d5740>] kernel_init+0x0/0x310
 [<c0106e77>] kernel_thread_helper+0x7/0x10
 ======================
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Hi there,

I'm going to move this discussion to xen-devel since it sounds like a bug.

Cheers,
Mark

On Thursday 11 October 2007, Morten Bøgeskov wrote:
> Hello...
> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> boot, with it panics. I've attached my config, if somebody thinks I've
> left something out
>
> Does anybody else see this:
>
> ------------[ cut here ]------------
> kernel BUG at arch/i386/xen/multicalls.c:68!
> [...]

-- 
Dave: Just a question. What use is a unicycle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
_______________________________________________
I'm bringing this discussion onto Xen-devel as it smells like it needs some
more specific developer input than I can give.

> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> boot, with it panics. I've attached my config, if somebody thinks I've
> left something out

Do you mean that it hangs during boot if SMP is not compiled into the guest
kernel?  That seems strange; I'll try that out myself and see what happens.

How far into the boot does it manage to get?  Has it started running userspace
apps (the normal startup messages starting essential services), or is it still
during the kernel initialisation?

> Does anybody else see this:

I've booted a kernel successfully on a UP AMD64 box with an SMP 2.6.23 kernel,
vcpus = 1, but I didn't bother giving it a virtual disk, so I don't know if
userspace worked.  I'll give it a try shortly...

Cheers,
Mark

> ------------[ cut here ]------------
> kernel BUG at arch/i386/xen/multicalls.c:68!
> [...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
> > I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> > my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> > boot, with it panics. I've attached my config, if somebody thinks I've
> > left something out
>
> Do you mean that it hangs during boot if SMP is not compiled into the guest
> kernel?  That seems strange, I'll try that out myself and see what happens.

Where does it hang for you?

I disabled SMP in my kernel config and found that the guest hung during the
kernel messages, just after:

installing Xen timer for CPU 0

Is this similar to what you observed?

Cheers,
Mark
_______________________________________________
Mark Williamson wrote:
> I'm bringing this discussion onto Xen-devel as it smells like it needs some
> more specific developer input than I can give.
>
>> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>> boot, with it panics. I've attached my config, if somebody thinks I've
>> left something out

Oh, I just fixed this.  Try these patches.

    J
_______________________________________________
Quoting Mark Williamson <mark.williamson@cl.cam.ac.uk>:

>> Do you mean that it hangs during boot if SMP is not compiled into the guest
>> kernel?  That seems strange, I'll try that out myself and see what happens.
>
> Where does it hang for you?
>
> I disabled SMP in my kernel config and found that the guest hung during the
> kernel messages, just after:
>
> installing Xen timer for CPU 0
>
> Is this similar to what you observed?

That is exactly what I experienced. I tracked it down to:

xen_vcpuop_set_next_event(...)
    ret = HYPERVISOR_vcpu_op(VCPUOP_set_singleshot_timer, cpu, &single);

which always returns -ETIME, resulting in an infinite loop in

tick_setup_periodic(...)

where this never succeeds:

    if (!clockevents_program_event(dev, next, ktime_get()))
        return;

Now my brain needs a rest. I never thought I would have to go head first into
the Linux kernel ;-) Can't claim that I got any wiser ;-)
_______________________________________________
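The hang described above can be sketched in user space. This is only a simplified model, not the kernel code: program_event() and setup_periodic() are hypothetical names, time is a plain integer, and "now" is held fixed, whereas the real loop re-reads the clock on every attempt. It shows why a backend that answers -ETIME unconditionally leaves the retry loop with no way out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a one-shot timer backend: it rejects any
 * expiry that is not strictly in the future, which is what -ETIME is
 * normally taken to mean. */
int program_event(int64_t expires_ns, int64_t now_ns)
{
	return (expires_ns <= now_ns) ? -ETIME : 0;
}

/*
 * Simplified model of the retry loop: push the next tick out by one
 * period until the backend accepts it. 'broken' models a backend that
 * returns -ETIME no matter what it is given. Returns the number of
 * attempts used, or -1 once max_tries is exhausted. The bound is an
 * addition for this sketch; without it a broken backend spins forever.
 */
int setup_periodic(int64_t now_ns, int64_t period_ns, int broken, int max_tries)
{
	int64_t next = now_ns;	/* first candidate is "now", already too late */
	int tries;

	for (tries = 1; tries <= max_tries; tries++) {
		int ret = broken ? -ETIME : program_event(next, now_ns);

		if (ret == 0)
			return tries;
		assert(ret == -ETIME);
		next += period_ns;	/* try one period later */
	}
	return -1;
}
```

With a sane backend the second attempt succeeds, because one period ahead is in the future. With the broken one, no amount of advancing helps, which matches a guest that stops right after "installing Xen timer for CPU 0".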
> > Is this similar to what you observed?
>
> That is exactly what I experienced. I tracked it down to:

Awesome!  Thanks for helping out here.

> xen_vcpuop_set_next_event(...)
>     ret = HYPERVISOR_vcpu_op(VCPUOP_set_singleshot_timer, cpu, &single);
>
> always returns -ETIME

Which, AFAICS, has the expected meaning that the requested time is in the past.

> resulting in a infinite loop in
> tick_setup_periodic(...)
> Where this never succeeds
>     if (!clockevents_program_event(dev, next, ktime_get()))
>         return;

in kernel/time/tick-common.c, right?  I see what you mean, but it's not
immediately obvious to me what's going wrong.  I don't think the kernel that
mainline Xen uses even has clockevents, anyway, so I've not seen it before :-)

> Now my brain needs a rest. I never thought I had to go head first into
> the linux-kernel ;-) can't claim that I got any wiser ;-)

Every little helps.  Dip your head into tepid water just to forestall any
overheating.

I may be able to take a look at this later tonight if Jeremy doesn't beat me to
it.  I'd like to get a bit more familiar with our patches to mainline.

Cheers,
Mark
_______________________________________________
Quoting Jeremy Fitzhardinge <jeremy@goop.org>:

> Mark Williamson wrote:
>> I'm bringing this discussion onto Xen-devel as it smells like it needs some
>> more specific developer input than I can give.
>>
>>> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>>> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>>> boot, with it panics. I've attached my config, if somebody thinks I've
>>> left something out
>
> Oh, I just fixed this.  Try these patches.
>
>     J

I've applied both, and tried with and without SMP, with exactly the same
result as before.

UP hangs after "installing Xen timer for CPU 0".
SMP oopses the same way, except the line number is now 78.

Morten Bøgeskov
_______________________________________________
Morten Bøgeskov wrote:
> I've applied both, and tried with and without SMP, with exactly the
> same result as before.
>
> UP hangs after "installing Xen timer for CPU 0"

Ah, OK.  I'd overlooked this one.  Hm, I probably haven't tried UP in a
while.

> SMP the oops'es the same way except the line no. is now 78.

Oh, that's odd.  Could you resend your original bug report?  What kind
of load does it fail under?

    J
_______________________________________________
Quoting Jeremy Fitzhardinge <jeremy@goop.org>:

> Morten Bøgeskov wrote:
>> I've applied both, and tried with and without SMP, with exactly the
>> same result as before.
>>
>> UP hangs after "installing Xen timer for CPU 0"
>
> Ah, OK.  I'd overlooked this one.  Hm, I probably haven't tried UP in a
> while.
>
>> SMP the oops'es the same way except the line no. is now 78.
>
> Oh, that's odd.  Could you resend your original bug report?  What kind
> of load does it fail under?

I can't really say what load. It doesn't get that far.
I've included .config

config:
kernel = "/usr/src/linux-2.6.23/vmlinux"
memory = 96
name = "foo"
vif = [ 'mac=00:16:3e:00:00:00, bridge=br1' ]
vcpus = 1
disk = [ 'phy:/dev/VOL/foo,hda1,w' ]
root = "/dev/xvda1 ro"
extra = "ip=192.168.0.2::192.168.0.1:255.255.255.0:foo:eth0:"

Output:
Using config file "/etc/xen/foo".
Started domain foo
Reserving virtual address space above 0xfbffe000
Linux version 2.6.23 (root@hobbes) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #32 SMP Thu Oct 11 21:02:46 CEST 2007
BIOS-provided physical RAM map:
 Xen: 0000000000000000 - 0000000006000000 (usable)
0MB HIGHMEM available.
96MB LOWMEM available.
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->    24576
  HighMem     24576 ->    24576
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0:        0 ->    24576
DMI not present or invalid.
Allocating PCI resources starting at 10000000 (gap: 06000000:fa000000)
Built 1 zonelists in Zone order.  Total pages: 24384
Kernel command line: root=/dev/xvda1 ro ip=192.168.0.2::192.168.0.1:255.255.255.0:foo:eth0:
Local APIC disabled by BIOS -- you can enable it with "lapic"
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 512 (order: 9, 2048 bytes)
Detected 1666.723 MHz processor.
console [hvc0] enabled
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 94884k/98304k available (1543k kernel code, 3360k reserved, 343k data, 176k init, 0k highmem)
virtual kernel memory layout:
    fixmap  : 0xfbf9d000 - 0xfbffd000   ( 384 kB)
    pkmap   : 0xfb800000 - 0xfbc00000   (4096 kB)
    vmalloc : 0xc6800000 - 0xfb7fe000   ( 847 MB)
    lowmem  : 0xc0000000 - 0xc6000000   (  96 MB)
      .init : 0xc02dd000 - 0xc0309000   ( 176 kB)
      .data : 0xc0281ed6 - 0xc02d7d6c   ( 343 kB)
      .text : 0xc0100000 - 0xc0281ed6   (1543 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
installing Xen timer for CPU 0
Calibrating delay using timer specific routine.. 3341.81 BogoMIPS (lpj=5567710)
Mount-cache hash table entries: 512
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Compat vDSO mapped to fbffc000.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 9k freed
Brought up 1 CPUs
Booting paravirtualized kernel on Xen
Hypervisor signature: xen-3.0-x86_32
Grant table initialized
NET: Registered protocol family 16
Setting up standard PCI resources
NET: Registered protocol family 2
Time: xen clocksource has been installed.
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 3, 49152 bytes)
TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
SGI XFS with no debug enabled
io scheduler noop registered
io scheduler deadline registered (default)
Initialising Xen virtual ethernet driver.
blkfront: xvda1: barriers enabled
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
XENBUS: Device with no driver: device/console/0
IP-Config: Complete:
      device=eth0, addr=192.168.0.2, mask=255.255.255.0, gw=192.168.0.1,
      host=foo, domain=, nis-domain=(none),
      bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath
blkfront: xvda1: write barrier op failed
blkfront: xvda1: barriers disabled
Filesystem "xvda1": Disabling barriers, trial barrier write failed
XFS mounting filesystem xvda1
VFS: Mounted root (xfs filesystem) readonly.
Freeing unused kernel memory: 176k freed
------------[ cut here ]------------
kernel BUG at arch/i386/xen/multicalls.c:78!
invalid opcode: 0000 [#1]
SMP
CPU:    0
EIP:    0061:[<c0101a82>]    Not tainted VLI
EFLAGS: 00010002   (2.6.23 #32)
EIP is at xen_mc_flush+0xd2/0xe0
eax: 00000000   ebx: c10c1060   ecx: 00000003   edx: 00000003
esi: c10c1060   edi: 00000000   ebp: 00000001   esp: c10efd1c
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: e021
Process swapper (pid: 1, ti=c10ee000 task=c10eba90 task.ti=c10ee000)
Stack: 000d9b10 c10c10a0 c10c1460 c130b000 c01020d0 c1329e40 c02c1100 c10eba90
       00000000 c01021f3 c1329e40 c015e498 c132849c 00000080 c12e0aa0 00000000
       c10ece60 c0159b50 c02c1ac0 c12fa560 c1253c20 c1144440 c015da4d c10efd7c
Call Trace:
 [<c01020d0>] xen_pgd_pin+0xb0/0x120
 [<c01021f3>] xen_activate_mm+0x13/0x20
 [<c015e498>] flush_old_exec+0x3c8/0x7e0
 [<c0159b50>] do_sync_read+0x0/0x120
 [<c015da4d>] kernel_read+0x3d/0x60
 [<c0185ed6>] load_elf_binary+0x316/0x1aa0
 [<c013fe27>] __alloc_pages+0x57/0x2f0
 [<c0156796>] kmem_cache_alloc+0x56/0xb0
 [<c01488eb>] handle_mm_fault+0x52b/0x6f0
 [<c014710b>] vm_normal_page+0x1b/0x80
 [<c014710b>] vm_normal_page+0x1b/0x80
 [<c0147786>] follow_page+0x106/0x180
 [<c0148bb7>] get_user_pages+0x107/0x2d0
 [<c015d59b>] get_arg_page+0x4b/0xb0
 [<c015d774>] copy_strings+0x174/0x190
 [<c015d824>] search_binary_handler+0x54/0x110
 [<c015f236>] do_execve+0x166/0x190
 [<c010460f>] sys_execve+0x2f/0x90
 [<c010605e>] syscall_call+0x7/0xb
 [<c010a3ac>] kernel_execve+0x1c/0x30
 [<c0100173>] init_post+0xa3/0xf0
 [<c02dd942>] kernel_init+0x222/0x330
 [<c0118173>] schedule_tail+0x33/0xa0
 [<c0105f1a>] ret_from_fork+0x6/0x1c
 [<c0106067>] syscall_exit+0x5/0x1b
 [<c02dd720>] kernel_init+0x0/0x330
 [<c02dd720>] kernel_init+0x0/0x330
 [<c0106c33>] kernel_thread_helper+0x7/0x10
 ======================
Code: 89 d8 72 e9 85 ed c7 86 08 07 00 00 00 00 00 00 75 19 5b 5e 5f 5d c3 0f 0b eb fe 8b 96 04 07 00 00 31 ed 85 d2 74 ac 0f 0b eb fe <0f> 0b eb fe 8d 76 00 8d bc 00 00 00 00 83 ec 0c 89 1c 24 89
EIP: [<c0101a82>] xen_mc_flush+0xd2/0xe0 SS:ESP e021:c10efd1c
Kernel panic - not syncing: Attempted to kill init!
_______________________________________________
Morten Bøgeskov wrote:
> Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
>
>> Morten Bøgeskov wrote:
>>> I've applied both, and tried with and without SMP, with exactly the
>>> same result as before.
>>>
>>> UP hangs after "installing Xen timer for CPU 0"
>>
>> Ah, OK.  I'd overlooked this one.  Hm, I probably haven't tried UP in a
>> while.
>>
>>> SMP the oops'es the same way except the line no. is now 78.
>>
>> Oh, that's odd.  Could you resend your original bug report?  What kind
>> of load does it fail under?
>
> I can't really say what load. It doesn't get that far.

Odd.

> I've included .config

It seems a bit small.  Can you send your whole .config so I can rebuild
your kernel?

    J
_______________________________________________
> Morten Bøgeskov wrote:
> > I've applied both, and tried with and without SMP, with exactly the
> > same result as before.
> >
> > UP hangs after "installing Xen timer for CPU 0"
>
> Ah, OK.  I'd overlooked this one.  Hm, I probably haven't tried UP in a
> while.

Actually, I think the kernel is hanging in an infinite loop whilst trying to
measure loops per jiffy...  For some reason the UP kernel isn't getting
timer interrupts from Xen at this point, whereas the SMP kernel is.  I've not
figured out why this could be happening, yet, but that's where I am at the
moment.

Cheers,
Mark

> > SMP the oops'es the same way except the line no. is now 78.
>
> Oh, that's odd.  Could you resend your original bug report?  What kind
> of load does it fail under?
>
>     J
_______________________________________________
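To illustrate the suspicion above, here is a rough user-space sketch of a calibration step that has to wait for the tick counter to change before it can time anything. It is an assumption-laden model, not the kernel's calibration code: struct tick_source, read_jiffies() and wait_for_tick() are made-up names, and the "timer interrupt" is simulated by bumping a counter every few polls.

```c
#include <assert.h>

/* Hypothetical tick source: the counter advances once every
 * 'polls_per_tick' reads, or never when polls_per_tick is 0, which
 * models a guest that receives no timer interrupts. */
struct tick_source {
	unsigned long jiffies;
	unsigned long polls;
	unsigned long polls_per_tick;
};

unsigned long read_jiffies(struct tick_source *ts)
{
	ts->polls++;
	if (ts->polls_per_tick && ts->polls % ts->polls_per_tick == 0)
		ts->jiffies++;
	return ts->jiffies;
}

/*
 * Spin until the tick counter differs from its starting value.
 * Returns the number of polls spent waiting, or -1 if the counter
 * never moved within max_polls. The bound exists only in this sketch;
 * an unbounded version of the same wait would never return if no
 * tick ever arrives.
 */
long wait_for_tick(struct tick_source *ts, long max_polls)
{
	unsigned long start = read_jiffies(ts);
	long n;

	for (n = 1; n <= max_polls; n++)
		if (read_jiffies(ts) != start)
			return n;
	return -1;
}
```

If ticks arrive, the wait ends after a handful of polls. If they do not, the wait can only end by hitting the artificial bound, which is the behaviour one would expect from a guest that stalls during delay calibration.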
Mark Williamson wrote:
> Actually, I think the kernel is hanging in an infinite loop whilst trying to
> measure loops per jiffies...  For some reason the UP kernel isn't getting
> timer interrupts from Xen at this point, whereas the SMP kernel is.  I've not
> figured out why this could be happening, yet, but that's where I am at the
> moment.

I can't reproduce that specific symptom, but this may help.

diff -r 25f5c8bdd699 arch/i386/xen/enlighten.c
--- a/arch/i386/xen/enlighten.c	Thu Oct 11 18:46:33 2007 -0700
+++ b/arch/i386/xen/enlighten.c	Thu Oct 11 23:20:29 2007 -0700
@@ -115,7 +115,7 @@ static void __init xen_vcpu_setup(int cp
 	info.mfn = virt_to_mfn(vcpup);
 	info.offset = offset_in_page(vcpup);
 
-	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %x, offset %d\n",
+	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %llx, offset %d\n",
 	       cpu, vcpup, info.mfn, info.offset);
 
 	/* Check to see if the hypervisor will put the vcpu_info
diff -r 25f5c8bdd699 include/xen/interface/vcpu.h
--- a/include/xen/interface/vcpu.h	Thu Oct 11 18:46:33 2007 -0700
+++ b/include/xen/interface/vcpu.h	Thu Oct 11 23:20:29 2007 -0700
@@ -160,8 +160,9 @@ struct vcpu_set_singleshot_timer {
  */
 #define VCPUOP_register_vcpu_info   10 /* arg == struct vcpu_info */
 struct vcpu_register_vcpu_info {
-    uint32_t mfn;               /* mfn of page to place vcpu_info */
-    uint32_t offset;            /* offset within page */
+    uint64_t mfn;    /* mfn of page to place vcpu_info */
+    uint32_t offset; /* offset within page */
+    uint32_t rsvd;   /* unused */
 };
 
 #endif /* __XEN_PUBLIC_VCPU_H__ */

    J
_______________________________________________
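For readers wondering why the struct change in the patch above matters, here is a small sketch comparing the two layouts. The `_old` and `_new` suffixes and the reinterpret_old_as_new() helper are made up for illustration, and the sketch assumes the consumer of the structure reads it with the patched layout, which the patch implies but the thread does not state outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout before the patch (the '-' lines of the diff). */
struct vcpu_register_vcpu_info_old {
	uint32_t mfn;
	uint32_t offset;
};

/* Layout after the patch (the '+' lines of the diff). */
struct vcpu_register_vcpu_info_new {
	uint64_t mfn;
	uint32_t offset;
	uint32_t rsvd;
};

/* The two layouts disagree on both total size and field position. */
static_assert(sizeof(struct vcpu_register_vcpu_info_old) == 8, "old size");
static_assert(sizeof(struct vcpu_register_vcpu_info_new) == 16, "new size");

/*
 * Fill in a request using the old layout, then read the same bytes
 * back through the new layout, as a consumer expecting the new layout
 * would. The buffer is zero-padded so the read stays within bounds.
 */
struct vcpu_register_vcpu_info_new
reinterpret_old_as_new(uint32_t mfn, uint32_t offset)
{
	struct vcpu_register_vcpu_info_old old = { mfn, offset };
	struct vcpu_register_vcpu_info_new seen;
	unsigned char buf[sizeof(struct vcpu_register_vcpu_info_new)] = { 0 };

	memcpy(buf, &old, sizeof(old));
	memcpy(&seen, buf, sizeof(seen));
	return seen;
}
```

Under the old layout, `offset` sits at byte 4; under the new one it sits at byte 8. A reader using the new layout would therefore fold the caller's `offset` into the 64-bit `mfn` and see whatever follows in memory as the offset, so neither field would carry the value the guest intended.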
Quoting Jeremy Fitzhardinge <jeremy@goop.org>:> Morten Bøgeskov wrote: >> Quoting Jeremy Fitzhardinge <jeremy@goop.org>: >> >>> Morten Bøgeskov wrote: >>>> I''ve applied both, and tried with and without SMP, with exactly the >>>> same result as before. >>>> >>>> UP hangs after "installing Xen timer for CPU 0" >>> >>> Ah, OK. I''d overlooked this one. Hm, I probably haven''t tried UP in a >>> while. >>> >>>> SMP the oops''es the same way except the line no. is now 78. >>> >>> Oh, that''s odd. Could you resend your original bug report? What kind >>> of load does it fail under? >> >> I can''t really say what load. It doesn''t get that far. > > Odd. > >> I''ve included .config > > It seems a bit small. Can you send your whole .config so I can rebuild > your kernel? >Ah, you don''t have PAE enabled. Could you try it with (HIGHMEM64G)? I do technically support non-PAE kernels, but they''re a bit tricky to test (stock Xen doesn''t really support non-PAE any more). I''ll look at the UP issue too. J New compiled xen-3.1.1-rc3 PAE, 2.6.18-xen (domU) PAE & 2.6.23 PAE (config included) Still no difference. TCP cubic registered NET: Registered protocol family 1 NET: Registered protocol family 17 Using IPI Shortcut mode blkfront: xvda1: barriers enabled XENBUS: Device with no driver: device/console/0 IP-Config: Complete: device=eth0, addr=192.168.0.2, mask=255.255.255.0, gw=192.168.0.1, host=foo, domain=, nis-domain=(none), bootserver=255.255.255.255, rootserver=255.255.255.255, rootpathblkfront: xvda1: write barrier op failed blkfront: xvda1: barriers disabled Filesystem "xvda1": Disabling barriers, trial barrier write failed XFS mounting filesystem xvda1 VFS: Mounted root (xfs filesystem) readonly. Freeing unused kernel memory: 180k freed ------------[ cut here ]------------ kernel BUG at arch/i386/xen/multicalls.c:78! 
invalid opcode: 0000 [#1]
SMP
CPU:    0
EIP:    0061:[<c0101a92>]    Not tainted VLI
EFLAGS: 00010002   (2.6.23 #33)
EIP is at xen_mc_flush+0xd2/0xe0
eax: 00000000   ebx: c10c1060   ecx: 00000007   edx: 00000007
esi: c10c1060   edi: 00000000   ebp: 00000001   esp: c10efd1c
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: e021
Process swapper (pid: 1, ti=c10ee000 task=c10eba90 task.ti=c10ee000)
Stack: 000d991c c10c1120 c10c1460 c12d6000 c0102400 c12d5e40 c02c5100 c10eba90
       00000000 c01025d3 c12d5e40 c0161728 c125f49c 00000080 c1240aa0 00000000
       c10ece60 c015cdd0 c02c5ac0 c131d560 c134cbc0 c1141440 c0160ccd c10efd7c
Call Trace:
 [<c0102400>] xen_pgd_pin+0xb0/0x120
 [<c01025d3>] xen_activate_mm+0x13/0x20
 [<c0161728>] flush_old_exec+0x3c8/0x7e0
 [<c015cdd0>] do_sync_read+0x0/0x120
 [<c0160ccd>] kernel_read+0x3d/0x60
 [<c01891e6>] load_elf_binary+0x316/0x1aa0
 [<c0159a16>] kmem_cache_alloc+0x56/0xb0
 [<c01f610c>] xfs_file_aio_read+0x6c/0x80
 [<c0148b3a>] vm_normal_page+0x2a/0xb0
 [<c01492ef>] follow_page+0x1af/0x230
 [<c014ac25>] get_user_pages+0x105/0x360
 [<c016081b>] get_arg_page+0x4b/0xb0
 [<c01609f4>] copy_strings+0x174/0x190
 [<c0160aa4>] search_binary_handler+0x54/0x110
 [<c01624c6>] do_execve+0x166/0x190
 [<c0104bef>] sys_execve+0x2f/0x90
 [<c010663e>] syscall_call+0x7/0xb
 [<c010a98c>] kernel_execve+0x1c/0x30
 [<c0100173>] init_post+0xa3/0xf0
 [<c02e1942>] kernel_init+0x222/0x330
 [<c0119453>] schedule_tail+0x33/0xa0
 [<c01064fa>] ret_from_fork+0x6/0x1c
 [<c0106647>] syscall_exit+0x5/0x1b
 [<c02e1720>] kernel_init+0x0/0x330
 [<c02e1720>] kernel_init+0x0/0x330
 [<c0107213>] kernel_thread_helper+0x7/0x10
 ======================
Code: 89 d8 72 e9 85 ed c7 86 08 07 00 00 00 00 00 00 75 19 5b 5e 5f 5d c3 0f 0b eb fe 8b 96 04 07 00 00 31 ed 85 d2 74 ac 0f 0b eb fe <0f> 0b eb fe 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89
EIP: [<c0101a92>] xen_mc_flush+0xd2/0xe0 SS:ESP e021:c10efd1c
Kernel panic - not syncing: Attempted to kill init!

Morten Bøgeskov wrote:
> New compiled xen-3.1.1-rc3 PAE, 2.6.18-xen (domU) PAE &
> 2.6.23 PAE (config included) Still no difference.

I just realized I hadn't been reading your backtrace closely enough,
since it looks similar to the bug I'd been working on. Turns out having
an xfs rootfs is what triggers your bug - I can repro it now, so I'll
see if I can work out what's going on.

BTW, did last night's little patch help with the UP time issue?

    J

> I just realized I hadn't been reading your backtrace closely enough,
> since it looks similar to the bug I'd been working on. Turns out having
> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
> see if I can work out what's going on.
>
> BTW, did last night's little patch help with the UP time issue?

I've not had a chance to try it out yet... I'll try and take a look.

But I didn't entirely understand the semantic significance of the change?
Could you possibly elaborate?

Cheers,
Mark

--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

On 12/10/07 16:34, "Mark Williamson" <mark.williamson@cl.cam.ac.uk> wrote:
>> I just realized I hadn't been reading your backtrace closely enough,
>> since it looks similar to the bug I'd been working on. Turns out having
>> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
>> see if I can work out what's going on.
>>
>> BTW, did last night's little patch help with the UP time issue?
>
> I've not had a chance to try it out yet... I'll try and take a look.
>
> But I didn't entirely understand the semantic significance of the change?
> Could you possibly elaborate?

It fixed the layout of the structure passed to VCPUOP_register_vcpu_info. I
would expect that to improve stability!

 -- Keir

Mark Williamson wrote:
>> I just realized I hadn't been reading your backtrace closely enough,
>> since it looks similar to the bug I'd been working on. Turns out having
>> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
>> see if I can work out what's going on.
>>
>> BTW, did last night's little patch help with the UP time issue?
>>
>
> I've not had a chance to try it out yet... I'll try and take a look.
>
> But I didn't entirely understand the semantic significance of the change?
> Could you possibly elaborate?
>

There was version drift in the register_vcpu_info hypercall arg
structure, and the version of the structure being used by the kernel was
smaller than the one that Xen was expecting. That meant that the mfn
argument was OK, but the offset was being corrupted, and so the
vcpu_info structure could have been placed anywhere, corrupting kernel
memory. For me it manifested as an oops, but it could also have
corrupted the timing parameters - or at the very least, reading the time
from the vcpu_info structure wouldn't work.

So I think there's a good chance this change would fix the UP problem.
It doesn't hit in the same way in SMP because the per-cpu data area is
elsewhere, but it could still have caused havoc.

    J