Some AMD processor systems assign the boot processor (cpu 0) to apicid 4. Standard Linux handles this cleanly by keeping track of the correspondences amongst cpu id, apic id, and acpi id. dom0 Xen does not have the code to do that, and AMD is seeing some strange behavior on our 4 socket quad-core systems. Specifically, when we try to get ACPI information for cpu 0, the correspondences break down and the request fails. I know some of the code needs to be added to mpparse-xen.c, but it looks like the relevant code was #ifdef''d out in the first place. Does anyone know why the code to match cpuids to apicids was removed and what would need to be done to restore it? -Mark Langsdorf Operating System Research Center AMD _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 8/2/08 21:05, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote:> Some AMD processor systems assign the boot processor > (cpu 0) to apicid 4. Standard Linux handles this > cleanly by keeping track of the correspondences amongst > cpu id, apic id, and acpi id. dom0 Xen does not have > the code to do that, and AMD is seeing some strange > behavior on our 4 socket quad-core systems. > > Specifically, when we try to get ACPI information for > cpu 0, the correspondences break down and the request > fails. > > I know some of the code needs to be added to mpparse-xen.c, > but it looks like the relevant code was #ifdef''d out in > the first place. Does anyone know why the code to match > cpuids to apicids was removed and what would need to be > done to restore it?Probably most of the things that depend on it are actually managed by Xen itself. Possibly this assumption is wrong and the code needs to be re-added, or perhaps the problem is actually within Xen? Can you point out the specific troubling code in our Linux tree (e.g., the specific #if0''ed code, and also the place where the kernel ultimately gets confused and the request fails)? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 8/2/08 21:19, "Keir Fraser" <Keir.Fraser@cl.cam.ac.uk> wrote:>> I know some of the code needs to be added to mpparse-xen.c, >> but it looks like the relevant code was #ifdef''d out in >> the first place. Does anyone know why the code to match >> cpuids to apicids was removed and what would need to be >> done to restore it? > > Probably most of the things that depend on it are actually managed by Xen > itself. Possibly this assumption is wrong and the code needs to be re-added, > or perhaps the problem is actually within Xen? > > Can you point out the specific troubling code in our Linux tree (e.g., the > specific #if0''ed code, and also the place where the kernel ultimately gets > confused and the request fails)?Also, a link to the upstream patch, or area of code in an upstream kernel, that you think needs to be added would be useful too. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> -----Original Message----- > From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] > Sent: Friday, February 08, 2008 3:20 PM > To: Langsdorf, Mark; xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] dom0 and apicid not equal to cpuid > > On 8/2/08 21:05, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote: > > > Some AMD processor systems assign the boot processor > > (cpu 0) to apicid 4. Standard Linux handles this > > cleanly by keeping track of the correspondences amongst > > cpu id, apic id, and acpi id. dom0 Xen does not have > > the code to do that, and AMD is seeing some strange > > behavior on our 4 socket quad-core systems. > > > > Specifically, when we try to get ACPI information for > > cpu 0, the correspondences break down and the request > > fails. > > > > I know some of the code needs to be added to mpparse-xen.c, > > but it looks like the relevant code was #ifdef''d out in > > the first place. Does anyone know why the code to match > > cpuids to apicids was removed and what would need to be > > done to restore it? > > Can you point out the specific troubling code in our Linux > tree (e.g., the specific #if0''ed code, and also the place > where the kernel ultimately gets confused and the request fails)?Sure. in arch/x86_64/kernel/mpparse-xen.c, there is no code in mp_register_lapic_address(), so dom0 has no idea who the boot CPU is. Later, when mp_register_lapic() is called, it still doesn''t have this information, so it can''t pass this information to MP_processor_info(). MP_processor_info() doesn''t get the cpuid to apicid information from the previous functions, so it sets the cpu_to_apicidp[] table using some other method such that apicid is assumed to be cpuid. However, acpiid is not the same as apicid, so later when drivers/acpi/processor_core.c tries to recover _PSS objects, it can''t always get the _PSS objects because it starts searching at cpu id 4 (instead of 0) and fails when looking for cpu id 16+ (on a system with 16 cores). It appears that the call to GET_APIC_ID() in mp_register_lapic_address() isn''t legal for dom0, but the failure to make that call results in an incorrect cpu_to_apicid[] table and that means the acpiid_to_apicid[] table is also wrong. -Mark Langsdorf Operating System Research Center AMD _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 11/2/08 22:38, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote:> It appears that the call to GET_APIC_ID() in > mp_register_lapic_address() isn''t legal for dom0, but the failure > to make that call results in an incorrect cpu_to_apicid[] table > and that means the acpiid_to_apicid[] table is also wrong.Can acpiid_to_apicid[] be filled in just from information available in BIOS tables? Obviously we have no guaranteed correspondence between vcpuid and apicid unless we specify dom0_vcpus_pin, so working out acpiid_to_apicid transitively via cpuids isn''t going to work very well. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 11/2/08 22:38, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote:> It appears that the call to GET_APIC_ID() in > mp_register_lapic_address() isn''t legal for dom0, but the failure > to make that call results in an incorrect cpu_to_apicid[] table > and that means the acpiid_to_apicid[] table is also wrong.The code in processor_core.c ultimately wants to convert acpiid to cpuid. I think we''re going to be in a confusing mess if we set up the acpiid-apicid-cpuid relationships for anything other than a dom0_vcpu_pin''ed system. So perhaps we should expose that configuration option to dom0 (so it can tell whether it is pinned or not). If so, it can set up its mapping arrays just like a native kernel (we can reinstate the code for this purpose, and backport any fixes to it as necessary). In the non-pinned case perhaps we should initialise all those arrays to -1 for sanity''s sake. The reason I think this is a rathole for the non-pinned case is that the cpuid space of a non-pinned dom0 kernel has no consistent relationship with underlying physical CPUs. So when the kernel decides to make use of that cpu-apicid-acpiid relationship information, e.g., to change power states, it will all go horribly wrong. Setting the mapping arrays to -1 is a way to try and nobble the ACPI code paths in a reasonably well-defined manner. Do you think you could do the Linux side of this in a reasonably clean manner? You could provide a Linux kernel command-line option to specify pinned/not-pinned for now if you like, and I would do the proper plumbing through from Xen. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> > It appears that the call to GET_APIC_ID() in > > mp_register_lapic_address() isn''t legal for dom0, but the failure > > to make that call results in an incorrect cpu_to_apicid[] table > > and that means the acpiid_to_apicid[] table is also wrong. > > The code in processor_core.c ultimately wants to convert > acpiid to cpuid. I think we''re going to be in a confusing > mess if we set up the acpiid-apicid-cpuid relationships for > anything other than a dom0_vcpu_pin''ed system.Agreed. That''s all I really need it for anyway.> So perhaps we should expose that configuration option > to dom0 (so it can tell whether it is pinned or not)...Good idea.> In the non-pinned case perhaps we should initialise all those > arrays to -1 for sanity''s sake.If this had been done initially, I''d think it was a good idea, but I''m worried about introducing that kind of change now.> Do you think you could do the Linux side of this in a reasonably clean > manner? You could provide a Linux kernel command-line option > to specify pinned/not-pinned for now if you like, and I would do the > proper plumbing through from Xen.Sure, I''ll try to get something out soon. -Mark Langsdorf Operating System Research Center AMD _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 12/2/08 15:33, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote:>> In the non-pinned case perhaps we should initialise all those >> arrays to -1 for sanity''s sake. > > If this had been done initially, I''d think it was a good idea, > but I''m worried about introducing that kind of change now.It''s fine not to change this case if you don''t want to. It would probably logically belong in a separate patch anyway. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> On 12/2/08 15:33, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote: > >> In the non-pinned case perhaps we should initialise all those > >> arrays to -1 for sanity''s sake. > > > > If this had been done initially, I''d think it was a good idea, > > but I''m worried about introducing that kind of change now. > > It''s fine not to change this case if you don''t want to. It > would probably logically belong in a separate patch anyway.Is there an easy way to get Xen to pass the "cpufreq=dom0-kernel" hypervisor argument onto to the Linux command line? I can add a separate parament to Linux easily enough, but I''d like it if the end-user didn''t need to add it twice. -Mark Langsdorf Operating System Research Center AMD _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> "Langsdorf, Mark" <mark.langsdorf@amd.com> 12.02.08 20:09 >>> >> On 12/2/08 15:33, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote: >> >> In the non-pinned case perhaps we should initialise all those >> >> arrays to -1 for sanity''s sake. >> > >> > If this had been done initially, I''d think it was a good idea, >> > but I''m worried about introducing that kind of change now. >> >> It''s fine not to change this case if you don''t want to. It >> would probably logically belong in a separate patch anyway. > >Is there an easy way to get Xen to pass the "cpufreq=dom0-kernel" >hypervisor argument onto to the Linux command line? I can >add a separate parament to Linux easily enough, but I''d like it >if the end-user didn''t need to add it twice.You could do this along with the other similar stuff in xen/arch/x86/setup.c:__start_xen() (following the comment "Grab the DOM0 command line.") Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 12/2/08 19:09, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote:>> It''s fine not to change this case if you don''t want to. It >> would probably logically belong in a separate patch anyway. > > Is there an easy way to get Xen to pass the "cpufreq=dom0-kernel" > hypervisor argument onto to the Linux command line? I can > add a separate parament to Linux easily enough, but I''d like it > if the end-user didn''t need to add it twice.I''m happy to fix this aspect myself when I check your patch in. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 13/2/08 09:23, "Jan Beulich" <jbeulich@novell.com> wrote:>> Is there an easy way to get Xen to pass the "cpufreq=dom0-kernel" >> hypervisor argument onto to the Linux command line? I can >> add a separate parament to Linux easily enough, but I''d like it >> if the end-user didn''t need to add it twice. > > You could do this along with the other similar stuff in > xen/arch/x86/setup.c:__start_xen() (following the comment "Grab > the DOM0 command line.")I think it''s cleaner to define a new hypervisor CPUID flag and/or XENFEAT_ flag for this. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Wednesday 13 February 2008, Keir Fraser wrote:> > On 12/2/08 19:09, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote: > > >> It''s fine not to change this case if you don''t want to. It > >> would probably logically belong in a separate patch anyway. > > > > Is there an easy way to get Xen to pass the "cpufreq=dom0-kernel" > > hypervisor argument onto to the Linux command line? I can > > add a separate parament to Linux easily enough, but I''d like it > > if the end-user didn''t need to add it twice. > > I''m happy to fix this aspect myself when I check your patch in.I''m not exactly sure if this is what you wanted or not, but it''s the minimum patch that resolves the issue on my system and it should be the correct logic path. Replace the hard coded values in the dom0_vcpus_pinned case with with the correct logic to get Xen to do the right thing. I''m happy to add a platform call if that''s what you wanted instead. -Mark Langsdorf Operating System Research Center AMD Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> diff -r 08e85e79c65d arch/x86_64/kernel/setup-xen.c --- a/arch/x86_64/kernel/setup-xen.c Mon Feb 11 11:05:27 2008 +0000 +++ b/arch/x86_64/kernel/setup-xen.c Wed Feb 13 16:39:58 2008 -0600 @@ -107,6 +107,9 @@ DEFINE_PER_CPU(int, nr_multicall_ents); /* Raw start-of-day parameters from the hypervisor. */ start_info_t *xen_start_info; EXPORT_SYMBOL(xen_start_info); + +int dom0_vcpus_pinned; +EXPORT_SYMBOL(dom0_vcpus_pinned); #endif /* @@ -412,6 +415,9 @@ static __init void parse_cmdline_early ( skip_ioapic_setup = 0; ioapic_force = 1; } +#else + if (fullarg(from, "cpufreq=dom0-kernel") || fullarg(from, "pin_dom0_vcpus")) + dom0_vcpus_pinned = 1; #endif if (!memcmp(from, "mem=", 4)) diff -r 08e85e79c65d drivers/xen/core/smpboot.c --- a/drivers/xen/core/smpboot.c Mon Feb 11 11:05:27 2008 +0000 +++ b/drivers/xen/core/smpboot.c Wed Feb 13 16:39:58 2008 -0600 @@ -70,6 +70,8 @@ unsigned int maxcpus = NR_CPUS; unsigned int maxcpus = NR_CPUS; #endif +extern int dom0_vcpus_pinned; + void __init prefill_possible_map(void) { int i, rc; @@ -267,9 +269,13 @@ void __init smp_prepare_cpus(unsigned in boot_cpu_data.apicid = 0; cpu_data[0] = boot_cpu_data; - cpu_2_logical_apicid[0] = 0; - x86_cpu_to_apicid[0] = 0; - + if (dom0_vcpus_pinned) { + cpu_2_logical_apicid[0] = 4; + x86_cpu_to_apicid[0] = 4; + } else { + cpu_2_logical_apicid[0] = 0; + x86_cpu_to_apicid[0] = 0; + } current_thread_info()->cpu = 0; for (cpu = 0; cpu < NR_CPUS; cpu++) { @@ -315,8 +321,13 @@ void __init smp_prepare_cpus(unsigned in cpu_data[cpu] = boot_cpu_data; cpu_data[cpu].apicid = cpu; - cpu_2_logical_apicid[cpu] = cpu; - x86_cpu_to_apicid[cpu] = cpu; + if (dom0_vcpus_pinned) { + cpu_2_logical_apicid[cpu] = cpu + 4; + x86_cpu_to_apicid[cpu] = cpu + 4; + } else { + cpu_2_logical_apicid[cpu] = cpu; + x86_cpu_to_apicid[cpu] = cpu; + } idle = fork_idle(cpu); if (IS_ERR(idle)) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 13/2/08 22:52, "Mark Langsdorf" <mark.langsdorf@amd.com> wrote:> I''m not exactly sure if this is what you wanted or not, but it''s > the minimum patch that resolves the issue on my system and it > should be the correct logic path. > > Replace the hard coded values in the dom0_vcpus_pinned case with > with the correct logic to get Xen to do the right thing.It''s pretty obviously not the correct thing to do since it is not generally the case that apicid == cpuid+4. The latter id space does not even have a guaranteed ordering across physical cores and sockets! This is why you were discussing reviving code in mpparse.c, because this relationship must be established dynamically, right? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Friday 08 February 2008, Keir Fraser wrote:> On 8/2/08 21:19, "Keir Fraser" <Keir.Fraser@cl.cam.ac.uk> wrote: > > >> I know some of the code needs to be added to mpparse-xen.c, > >> but it looks like the relevant code was #ifdef''d out in > >> the first place. Does anyone know why the code to match > >> cpuids to apicids was removed and what would need to be > >> done to restore it? > > > > Probably most of the things that depend on it are actually managed by Xen > > itself. Possibly this assumption is wrong and the code needs to be re-added, > > or perhaps the problem is actually within Xen? > > > > Can you point out the specific troubling code in our Linux tree (e.g., the > > specific #if0''ed code, and also the place where the kernel ultimately gets > > confused and the request fails)? > > Also, a link to the upstream patch, or area of code in an upstream kernel, > that you think needs to be added would be useful too.Simpler solution that the one I proposed above: Fix the APIC to ACPI mappings in drivers/xen/core/smpboot.c. That doesn''t require pinning vcpus or knowing about the underlying APIC/ACPI/CPU ID interaction. Patch for this solution below: Xen dom0 arbitrarily assigns APIC ID x to CPU ID x. Make dom0 also assign the APIC ID to ACPI ID mapping in the same way. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> diff -r 08e85e79c65d drivers/xen/core/smpboot.c --- a/drivers/xen/core/smpboot.c Mon Feb 11 11:05:27 2008 +0000 +++ b/drivers/xen/core/smpboot.c Thu Feb 14 16:12:41 2008 -0600 @@ -269,6 +269,7 @@ void __init smp_prepare_cpus(unsigned in cpu_2_logical_apicid[0] = 0; x86_cpu_to_apicid[0] = 0; + x86_acpiid_to_apicid[0] = 0; current_thread_info()->cpu = 0; @@ -317,6 +318,7 @@ void __init smp_prepare_cpus(unsigned in cpu_2_logical_apicid[cpu] = cpu; x86_cpu_to_apicid[cpu] = cpu; + x86_acpiid_to_apicid[cpu] = cpu; idle = fork_idle(cpu); if (IS_ERR(idle)) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14/2/08 22:16, "Mark Langsdorf" <mark.langsdorf@amd.com> wrote:> Simpler solution that the one I proposed above: Fix the APIC to ACPI mappings > in drivers/xen/core/smpboot.c. That doesn''t require pinning vcpus or knowing > about the underlying APIC/ACPI/CPU ID interaction.That''s more like it. :-) -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel