Nakajima, Jun
2005-Nov-17 06:46 UTC
[Xen-devel] [PATCH] Fixing PAE SMP dom0 hang at boot time (take 2)
Nakajima, Jun wrote:
>> And why would we need to take interrupts between loading esp0 and
>> LDT?
>>
>>         load_esp0(t, thread);
>>
>> +       local_irq_enable();
>> +
>>         load_LDT(&init_mm.context);
>
> I thought it's required to get IPI working (for load_LDT and the other
> on-going flush-TLB activities), but it looks bogus after sleeping on it.
> I'm pretty sure that it resolves the hang, and that it's hiding an
> underlying bug.

I've finally root caused it. It's much deeper than I expected... Here is
what's happening:

void arch_do_createdomain(struct vcpu *v)
{
    ...
    l1_pgentry_t gdt_l1e;
    ...
    d->arch.mm_perdomain_pt = alloc_xenheap_page();
    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE);
    ...
    for ( vcpuid = 0; vcpuid < MAX_VIRT_CPUS; vcpuid++ )
        d->arch.mm_perdomain_pt[(vcpuid << PDPT_VCPU_SHIFT) +
                                FIRST_RESERVED_GDT_PAGE] = gdt_l1e;

The max value of (vcpuid << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE
is 1006 (< 1024), but the size of each entry is 8 bytes for PAE (and
x86_64), so alloc_xenheap_page() (i.e. a single page) was not
sufficient, and the loop was corrupting the next page, which contains
the vcpu_info areas holding evtchn_upcall_pending for the vcpus. That
affected vcpu 7 (and 23) on my machine. At load_LDT we check for
pending events in hypercall_preempt_check(); the flag was already set
for vcpu 7, but it is never cleared by hypercall4_create_continuation()
because nobody actually raised such an event... So it was looping
there.

int do_mmuext_op(
    struct mmuext_op *uops,
    ...
{
    ...
    for ( i = 0; i < count; i++ )
    {
        if ( hypercall_preempt_check() )
        {
            rc = hypercall4_create_continuation(
                __HYPERVISOR_mmuext_op, uops,
                (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
            break;
        }

Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>

----

diff -r 9c7aeec94f8a xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Tue Nov 15 19:46:48 2005 +0100
+++ b/xen/arch/x86/domain.c     Wed Nov 16 23:23:44 2005 -0700
@@ -252,6 +252,8 @@
     struct domain *d = v->domain;
     l1_pgentry_t gdt_l1e;
     int vcpuid;
+    physaddr_t size;
+    int order;
 
     if ( is_idle_task(d) )
         return;
@@ -265,9 +267,11 @@
     SHARE_PFN_WITH_DOMAIN(virt_to_page(d->shared_info), d);
     set_pfn_from_mfn(virt_to_phys(d->shared_info) >> PAGE_SHIFT,
                      INVALID_M2P_ENTRY);
-
-    d->arch.mm_perdomain_pt = alloc_xenheap_page();
-    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE);
+    size = ((((MAX_VIRT_CPUS - 1) << PDPT_VCPU_SHIFT) +
+             FIRST_RESERVED_GDT_PAGE) * sizeof (l1_pgentry_t));
+    order = get_order_from_bytes(size);
+    d->arch.mm_perdomain_pt = alloc_xenheap_pages(order);
+    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE << order);
     set_pfn_from_mfn(virt_to_phys(d->arch.mm_perdomain_pt) >> PAGE_SHIFT,
                      INVALID_M2P_ENTRY);
     v->arch.perdomain_ptes = d->arch.mm_perdomain_pt;

Jun
---
Intel Open Source Technology Center

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2005-Nov-17 11:59 UTC
Re: [Xen-devel] [PATCH] Fixing PAE SMP dom0 hang at boot time (take 2)
On 17 Nov 2005, at 06:46, Nakajima, Jun wrote:

> The max value of (vcpuid << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE
> is 1006 (< 1024), but the size of each entry is 8 bytes for PAE (and
> x86_64), so alloc_xenheap_page() (i.e. a single page) was not
> sufficient, and it's corrupting the next page which contains the areas
> for vcpu_info, which contains evtchn_upcall_pending for vcpus. That
> affected vcpu 7 (and 23) on my machine, and at load_LDT, we check the
> pending events at hypercall_preempt_check(), and it's already on for
> vcpu 7, but it's never cleared by hypercall4_create_continuation()
> because nobody set such events... So it was looping there.

Thanks Jun! I've fixed your patch a little (e.g., to deallocate the
correct number of pages) and checked it into our staging tree.
Hopefully I haven't broken it again. :-)

 -- Keir