Debian has decided to take Jeremy's microcode patch [0] as an interim
measure for their next release. (TL;DR -- Debian is shipping pvops Linux
3.2 and Xen 4.1 in the next release. See http://bugs.debian.org/693053
and https://lists.debian.org/debian-devel/2012/11/msg00141.html for some
more background).

However the patch is a bit old and predates the introduction of
separate firmware files for AMD family >= 15h. Looking at the SuSE
forward ported classic Xen patches it seems like the following patch is
all that is required. But it seems a little too simple to be true and I
don't have any such processors to test on.

Jan, can you recall if it really is that easy on the kernel side ;-)

Ian.

[0] http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=refs/heads/upstream/microcode

commit 109cf37876567ef346c0ecde8b473e7ad1e74e07
Author: Ian Campbell <ijc@hellion.org.uk>
Date:   Mon Nov 26 09:41:02 2012 +0000

    microcode_xen: Add support for AMD family >= 15h

    Signed-off-by: Ian Campbell <ijc@hellion.org.uk>

diff --git a/arch/x86/kernel/microcode_xen.c b/arch/x86/kernel/microcode_xen.c
index 9d2a06b..2b8a78a 100644
--- a/arch/x86/kernel/microcode_xen.c
+++ b/arch/x86/kernel/microcode_xen.c
@@ -74,7 +74,11 @@ static enum ucode_state xen_request_microcode_fw(int cpu, struct device *device)
 		break;
 
 	case X86_VENDOR_AMD:
-		snprintf(name, sizeof(name), "amd-ucode/microcode_amd.bin");
+		/* Beginning with family 15h AMD uses family-specific firmware files. */
+		if (c->x86 >= 0x15)
+			snprintf(name, sizeof(name), "amd-ucode/microcode_amd_fam%.2xh.bin", c->x86);
+		else
+			snprintf(name, sizeof(name), "amd-ucode/microcode_amd.bin");
 		break;
 
 	default:

-- 
Ian Campbell
Current Noise: Dew-Scented - Metal Militia

Now KEN and BARBIE are PERMANENTLY ADDICTED to MIND-ALTERING DRUGS ...
>>> On 26.11.12 at 14:21, Ian Campbell <ijc@hellion.org.uk> wrote:
> However the patch is a bit old and predates the introduction of
> separate firmware files for AMD family >= 15h. Looking at the SuSE
> forward ported classic Xen patches it seems like the following patch is
> all that is required. But it seems a little too simple to be true and I
> don't have any such processors to test on.
>
> Jan, can you recall if it really is that easy on the kernel side ;-)

While so far I didn't myself run anything on post-Fam10 systems
either, it really ought to be that easy - the patch format didn't
change, it's just that they decided to split the files by family to
keep them manageable.

The only other thing to check for is that you don't have any
artificial size restriction left in that code (I think patch files early
on were limited to 4k in size, and that got lifted during the last
couple of years).

The hypervisor is really going to take care of all other aspects
here.

Jan
On Mon, 2012-11-26 at 13:44 +0000, Jan Beulich wrote:
> While so far I didn't myself run anything on post-Fam10 systems
> either, it really ought to be that easy - the patch format didn't
> change, it's just that they decided to split the files by family to
> keep them manageable.
>
> The only other thing to check for is that you don't have any
> artificial size restriction left in that code (I think patch files early
> on were limited to 4k in size, and that got lifted during the last
> couple of years).

I can't find one by inspection, it uses the standard request_firmware
interface and stashes the result in a valloc'd buffer, neither of which
suffers from any 4K related limitations AFAIK.

I'll try and track something more recent down to test but the worst
downside of applying this patch seems to be that something which doesn't
work still doesn't work.

> The hypervisor is really going to take care of all other aspects
> here.

Sweet, thanks.

Ian.
On 11/26/2012 09:13 AM, Ian Campbell wrote:
> I can't find one by inspection, it uses the standard request_firmware
> interface and stashes the result in a valloc'd buffer, neither of which
> suffer from any 4K related limitations AFAIK.
>
> I'll try and track something more recent down to test but the worst
> downside of applying this patch seems to be that something which doesn't
> work still doesn't work.

I submitted a fix for fam 16h to Linux right before the Thanksgiving
break in US and was planning to look at Xen as well. Give me a day or
two to test it.

-boris
On 11/26/2012 09:58 AM, Boris Ostrovsky wrote:
> I submitted a fix for fam 16h to Linux right before the Thanksgiving
> break in US and was planning to look at Xen as well. Give me a day or
> two to test it.

It works fine, no issues with size (which is different from other families).

-boris
On Mon, 2012-11-26 at 23:47 +0000, Boris Ostrovsky wrote:
> It works fine, no issues with size (which is different from other families).

I've just tried this on a fam 15h and I get:

(XEN) microcode: collect_cpu_info: patch_id=0x6000626
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: CPU0 found a matching microcode update with version 0x6000629 (current=0x6000626)
(XEN) microcode: CPU0 updated from revision 0x6000626 to 0x6000629

(XEN) microcode: collect_cpu_info: patch_id=0x6000629
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: size 5260, block size 2592, offset 2660
(XEN) microcode: CPU1 patch does not match (patch is 6101, cpu base id is 6012)

(XEN) microcode: collect_cpu_info: patch_id=0x6000626
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: CPU2 found a matching microcode update with version 0x6000629 (current=0x6000626)
(XEN) microcode: CPU2 updated from revision 0x6000626 to 0x6000629

(XEN) microcode: collect_cpu_info: patch_id=0x6000629
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: size 5260, block size 2592, offset 2660
(XEN) microcode: CPU3 patch does not match (patch is 6101, cpu base id is 6012)

(XEN) microcode: collect_cpu_info: patch_id=0x6000626
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: CPU4 found a matching microcode update with version 0x6000629 (current=0x6000626)
(XEN) microcode: CPU4 updated from revision 0x6000626 to 0x6000629

(XEN) microcode: collect_cpu_info: patch_id=0x6000629
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: size 5260, block size 2592, offset 2660
(XEN) microcode: CPU5 patch does not match (patch is 6101, cpu base id is 6012)

(XEN) microcode: collect_cpu_info: patch_id=0x6000626
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: CPU6 found a matching microcode update with version 0x6000629 (current=0x6000626)
(XEN) microcode: CPU6 updated from revision 0x6000626 to 0x6000629

....

It seems like it is applying successfully on only the even numbered
cpus. Is this because the odd and even ones share some execution units
and therefore share microcode updates too? IOW update CPU0 also updates
CPU1 under the hood.

If so then we probably want to teach Xen about this, although at least
for now it would mean that the microcode is actually getting applied
despite the messages.

Ian.
On Mon, 2012-11-26 at 13:44 +0000, Jan Beulich wrote:
> The only other thing to check for is that you don't have any
> artificial size restriction left in that code (I think patch files early
> on were limited to 4k in size, and that got lifted during the last
> couple of years).

I managed to find a machine and try this and it turns out that all that
was missing from the kernel side was:

@@ -58,7 +58,7 @@ static enum ucode_state xen_request_microcode_fw(int cpu, struct device *device)
 {
-	char name[30];
+	char name[36];
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 	const struct firmware *firmware;
 	struct ucode_cpu_info *uci = ucode_cpu_info + cpu;

> The hypervisor is really going to take care of all other aspects
> here.

There may be some other issue here (I replied to Boris about it) but it
does seem like the kernel side is now correct.

Ian.
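
To make the sizing problem concrete: "amd-ucode/microcode_amd_fam15h.bin" is 34 characters, so it needs 35 bytes including the terminating NUL, and a 30-byte buffer silently truncates it. The following is only a minimal user-space sketch of that check. The helper name amd_ucode_name() and its return convention are made up for illustration and are not part of the kernel patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not kernel code): build the firmware path for an
 * AMD CPU family. Returns 0 on success, -1 if the name did not fit.
 */
static int amd_ucode_name(char *buf, size_t len, unsigned int family)
{
    int n;

    assert(buf != NULL && len > 0);

    /* Same format strings as in the patch under discussion. */
    if (family >= 0x15)
        n = snprintf(buf, len, "amd-ucode/microcode_amd_fam%.2xh.bin", family);
    else
        n = snprintf(buf, len, "amd-ucode/microcode_amd.bin");

    /*
     * snprintf() returns the length the full string would have had,
     * excluding the NUL, so n >= len means the output was truncated.
     */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the snprintf() return value like this would have turned the truncated name into an explicit error rather than a failed firmware lookup for a file that does not exist.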
FWIW, there's a bug in this original implementation. See Konrad's "misc"
tree for the fix:

http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=commit;h=f6c958ff0d00ffbf1cdc8fcf2f2a82f06fbbb5f4

Here is the original thread where I submitted the fix:

http://markmail.org/message/i2dc4vbqrujkwhu7

On Mon, Nov 26, 2012 at 8:21 AM, Ian Campbell <ijc@hellion.org.uk> wrote:
> Debian has decided to take Jeremy's microcode patch [0] as an interim
> measure for their next release. (TL;DR -- Debian is shipping pvops Linux
> 3.2 and Xen 4.1 in the next release. See http://bugs.debian.org/693053
> and https://lists.debian.org/debian-devel/2012/11/msg00141.html for some
> more background).
On Wed, 2012-12-05 at 12:43 +0000, Ian Campbell wrote:
> It seems like it is applying successfully on only the even numbered
> cpus. Is this because the odd and even ones share some execution units
> and therefore share microcode updates too? IOW update CPU0 also
> updates CPU1 under the hood.

I added some debug and it does seem like the odd CPUs have already been
updated when we get to them.

Ian.
On 12/05/2012 07:43 AM, Ian Campbell wrote:
> It seems like it is applying successfully on only the even numbered
> cpus. Is this because the odd and even ones share some execution units
> and therefore share microcode updates too? IOW update CPU0 also updates
> CPU1 under the hood.
>
> If so then we probably want to teach Xen about this, although at least
> for now though it would mean that the microcode is actually getting
> applied despite the messages.

On fam15h cores are grouped in pairs into compute units (CUs) and cores
in a CU share a microcode engine. So yes, you are right --- when we apply
a patch to one core, the other one sees the update.

I believe at some point we thought about making the code smarter and
applying the patch only on one core in a CU but then decided against it
because of some corner cases. For example, there are parts with
single-core CUs and it is not out of the question that some BIOSes may
not enumerate them correctly. Yes, we can figure this all out in the code
but we didn't feel that adding complexity was worth it.

-boris
>>> On 05.12.12 at 17:48, Boris Ostrovsky <boris.ostrovsky@amd.com> wrote:
> On fam15h cores are grouped in pairs into compute units (CUs) and cores
> in CUs share microcode engine. So yes, you are right --- when we apply a
> patch to one core, the other one sees the update.
>
> I believe at some point we thought about making code smarter and
> applying patch only on one core in a CU but then decided against it
> because of some corner cases, For example, there are parts with
> single-core CUs and it is not out of question that some BIOSes may not
> enumerate them correctly. Yes, we can figure this all out in the code
> but we didn't feel that adding complexity was worth it.

But all of this shouldn't lead to equivalent ID mismatches, should
it? It ought to simply find nothing to update...

Jan
On Wed, 2012-12-05 at 16:48 +0000, Boris Ostrovsky wrote:
> On fam15h cores are grouped in pairs into compute units (CUs) and cores
> in CUs share microcode engine. So yes, you are right --- when we apply a
> patch to one core, the other one sees the update.
>
> I believe at some point we thought about making code smarter and
> applying patch only on one core in a CU but then decided against it
> because of some corner cases, For example, there are parts with
> single-core CUs and it is not out of question that some BIOSes may not
> enumerate them correctly. Yes, we can figure this all out in the code
> but we didn't feel that adding complexity was worth it.

It looks to me like Linux silently avoids updating the microcode on a
core if it detects that it already has that version, which silently
avoids this issue without the possibility of missing a core out in a
corner case.

I looked at trying to apply the same logic to the Xen side of things but
it is different enough that I can't immediately see how.
microcode_fits() would seem to be the place to do it, but I'm not at all
sure what this equiv table stuff is all about.

Ian.
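
The "skip if the core already has that version" idea can be illustrated with a toy model. This is a sketch only, under the assumption that both cores of a compute unit report the same patch level. The types and functions here (struct cpu_model, read_patch_level(), update_core()) are invented for illustration and do not correspond to Linux or Xen code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CUS      4  /* hypothetical: four compute units */
#define CORES_PER_CU 2  /* fam15h pairs cores into compute units */

/* Toy model: one microcode patch level per compute unit. */
struct cpu_model {
    uint32_t cu_patch_level[NUM_CUS];
};

/* Both cores of a compute unit see the same patch level. */
static uint32_t read_patch_level(const struct cpu_model *m, unsigned int core)
{
    return m->cu_patch_level[core / CORES_PER_CU];
}

/* Returns 1 if the patch was loaded, 0 if it was silently skipped. */
static int update_core(struct cpu_model *m, unsigned int core, uint32_t patch_id)
{
    assert(m != NULL && core < NUM_CUS * CORES_PER_CU);

    /* Nothing to do if this core already runs this (or a newer) patch. */
    if (read_patch_level(m, core) >= patch_id)
        return 0;

    m->cu_patch_level[core / CORES_PER_CU] = patch_id;
    return 1;
}
```

With every compute unit starting at 0x6000626, calling update_core() for core 0 with 0x6000629 loads the patch, and the same call for core 1 is skipped because the sibling core already reports the new level. No knowledge of the CU topology is needed in the caller, which is why this approach cannot miss a core in a corner case.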
On 12/05/2012 12:02 PM, Jan Beulich wrote:
> But all of this shouldn't lead to equivalent ID mismatches, should
> it? It ought to simply find nothing to update...

The patch file (/lib/firmware/amd-ucode/microcode_amd_fam15h.bin) may
contain more than one patch. The driver goes over this file patch by
patch and tries to see whether to apply it.

I think what happened in Ian's case was that the patch file contained
two patches --- one for this processor (ID 6012) and another for a
different processor (ID 6101). (Both are family 15h but different revs).

The driver applied the first patch on core 0. Then, on core 1, the code
tried the first patch (at file offset 60) and noticed that it is already
applied. So it continued to the next patch (at offset 2660) which is not
meant for this processor, thus generating the "does not match" message.

So we have at least a problem in how the error is reported to the log --
it is confusing. I'll try to make it more understandable. And maybe
core 1 shouldn't go into the second patch in the first place because it
already found a patch for this processor (but decided that it is not
needed based on patch ID).

-boris
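
One way the scan could behave, following the suggestion that core 1 should stop once it has found a patch meant for this processor, is sketched below. This is an assumption-laden illustration and not the Xen implementation: struct patch_hdr, enum scan_result and scan_patches() are hypothetical names, and the real container format carries more than these two fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one patch inside the firmware file. */
struct patch_hdr {
    uint16_t equiv_id;  /* equivalent processor ID the patch targets */
    uint32_t patch_id;  /* patch level, e.g. 0x6000629 */
};

enum scan_result {
    SCAN_APPLY,       /* *chosen holds the index of the patch to load */
    SCAN_UP_TO_DATE,  /* a patch for this CPU exists but is not newer */
    SCAN_NO_PATCH     /* no patch in the file targets this CPU */
};

static enum scan_result scan_patches(const struct patch_hdr *p, size_t n,
                                     uint16_t cpu_equiv_id, uint32_t current,
                                     size_t *chosen)
{
    size_t i;

    assert(p != NULL || n == 0);
    assert(chosen != NULL);

    for (i = 0; i < n; i++) {
        /* A patch for another processor is skipped without complaint. */
        if (p[i].equiv_id != cpu_equiv_id)
            continue;

        /* Found our patch but already running it: stop scanning here. */
        if (p[i].patch_id <= current)
            return SCAN_UP_TO_DATE;

        *chosen = i;
        return SCAN_APPLY;
    }
    return SCAN_NO_PATCH;
}
```

In this model core 0 (current level 0x6000626) gets SCAN_APPLY for the 0x6012 patch, while core 1 (already at 0x6000629) gets SCAN_UP_TO_DATE and never looks at the 0x6101 patch, so no "does not match" message would be produced.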
On 12/05/2012 12:05 PM, Ian Campbell wrote:
>
> I looked at trying to apply the same logic to the Xen side of things but
> it is different enough that I can't immediately see how.
> microcode_fits() would seem to be the place to do it, but I'm not at all
> sure what this equiv table stuff is all about.
>

Because more than one processor revision may require the same patch we
group processors into "equivalence classes". The mapping is stored in
the patch file header. The Equivalent Processor ID is verified by HW
when the patch is being loaded.

-boris
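As a rough illustration of the equivalence-class mapping described above, here is a minimal C sketch of a table lookup. The struct layout, the zero terminator and the name `find_equiv_id()` are assumptions made for the example; the real table in the patch file header may carry more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one row maps a raw CPUID signature to a class ID. */
struct equiv_entry {
    uint32_t cpu_signature;  /* raw CPUID signature; 0 terminates the table */
    uint16_t equiv_id;       /* equivalence class shared by several revisions */
};

/* Return the equivalent processor ID for sig, or 0 if the table has no entry. */
static uint16_t find_equiv_id(const struct equiv_entry *table, uint32_t sig)
{
    assert(table != NULL);
    for (; table->cpu_signature != 0; table++)
        if (table->cpu_signature == sig)
            return table->equiv_id;
    return 0;
}
```

The driver would then compare the returned ID against the ID stored in each patch, which is the comparison behind the "patch is 6101, cpu base id is 6012" message elsewhere in this thread.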
On Wed, 2012-12-05 at 17:27 +0000, Boris Ostrovsky wrote:
> On 12/05/2012 12:02 PM, Jan Beulich wrote:
> > But all of this shouldn't lead to equivalent ID mismatches, should
> > it? It ought to simply find nothing to update...
>
>
> The patch file (/lib/firmware/amd-ucode/microcode_amd_fam15h.bin) may
> contain more than one patch. The driver goes over this file patch by
> patch and tries to see whether to apply it.
>
> I think what happened in Ian's case was that the patch file contained
> two patches --- one for this processor (ID 6012) and another for a
> different processor (ID 6101). (Both are family 15h but different revs).
>
> The driver applied the first patch on core 0. Then, on core 1, the code
> tried the first patch (at file offset 60) and noticed that it is already
> applied. So it continued to the next patch (at offset 2660) which is not
> meant for this processor, thus generating the "does not match" message.

I added some debugging and can confirm this is what happens:

(XEN) microcode: collect_cpu_info: CPU0 patch_id=0x6000626
(XEN) CPU0: current patch level 0x6000626
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) microcode: CPU0 found a matching microcode update with version 0x6000629 (current=0x6000626)
(XEN) CPU0: apply_microcodeA: current patch level 0x6000626. Patch is 0x6000629
(XEN) CPU0: apply_microcodeB: new patch level 0x6000629. Patch is 0x6000629
(XEN) microcode: CPU0 updated from revision 0x6000626 to 0x6000629
(XEN) microcode: collect_cpu_info: CPU1 patch_id=0x6000629
(XEN) CPU1: current patch level 0x6000629
(XEN) microcode: size 5260, block size 2592, offset 60
(XEN) CPU1: microcode_fits: older patch 0x6000629 <= 0x6000629, returning
(XEN) microcode: size 5260, block size 2592, offset 2660
(XEN) microcode: CPU1 patch does not match (patch is 6101, cpu base id is 6012)

> So we have at least a problem in how the error is reported to the log --
> it is confusing.
> I'll try to make it more understandable.

FWIW it also results in an error from the hypercall overall as well as
the logging stuff.

> And maybe core 1 shouldn't go into the second patch in the first place
> because it already found a patch for this processor (but decided that it
> is not needed based on patch ID).

--
Ian Campbell

* PerlGeek is really a space alien
* Knghtktty believes PerlGeek
On Wed, Dec 05, 2012 at 12:46:39PM +0000, Ian Campbell wrote:
> On Mon, 2012-11-26 at 13:44 +0000, Jan Beulich wrote:
> > >>> On 26.11.12 at 14:21, Ian Campbell <ijc@hellion.org.uk> wrote:
> > > Debian has decided to take Jeremy's microcode patch [0] as an interim
> > > measure for their next release. (TL;DR -- Debian is shipping pvops Linux
> > > 3.2 and Xen 4.1 in the next release. See http://bugs.debian.org/693053
> > > and https://lists.debian.org/debian-devel/2012/11/msg00141.html for some
> > > more background).
> > >
> > > However the patch is a bit old and predates the introduction of
> > > separate firmware files for AMD family >= 15h. Looking at the SuSE
> > > forward ported classic Xen patches it seems like the following patch is
> > > all that is required. But it seems a little too simple to be true and I
> > > don't have any such processors to test on.
> > >
> > > Jan, can you recall if it really is that easy on the kernel side ;-)
> >
> > While so far I didn't myself run anything on post-Fam10 systems
> > either, it really ought to be that easy - the patch format didn't
> > change, it's just that they decided to split the files by family to
> > keep them manageable.
> >
> > The only other thing to check for is that you don't have any
> > artificial size restriction left in that code (I think patch files early
> > on were limited to 4k in size, and that got lifted during the last
> > couple of years).
>
> I managed to find a machine and try this and it turns out that all that
> was missing from the kernel side was:
>
> @@ -58,7 +58,7 @@
>
>  static enum ucode_state xen_request_microcode_fw(int cpu, struct device *device)
>  {
> -    char name[30];
> +    char name[36];
>      struct cpuinfo_x86 *c = &cpu_data(cpu);
>      const struct firmware *firmware;
>      struct ucode_cpu_info *uci = ucode_cpu_info + cpu;

Do you want to prep a patch that I can stick in my 'microcode' branch?
.. That I will at some point try to upstream.
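The buffer-size change quoted above is presumably needed because the family-specific file name no longer fits in 30 bytes: "amd-ucode/microcode_amd_fam15h.bin" is 34 characters plus the terminating NUL. A small C sketch of the effect, using the format string from the patch at the top of the thread (the helper name `format_fw_name()` is made up for the example):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the family-specific firmware name into buf. Like snprintf(), the
 * return value is the length the full name needs, so a result >= size
 * means the name stored in buf was truncated.
 */
static int format_fw_name(char *buf, size_t size, unsigned int family)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "amd-ucode/microcode_amd_fam%.2xh.bin", family);
}
```

For family 0x15 this returns 34 whatever the buffer size. A 30-byte buffer ends up holding the truncated "amd-ucode/microcode_amd_fam15", which would not name an existing firmware file, while a 36-byte buffer holds the full name.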
(trim quote please...)

On Wed, 2012-12-05 at 21:47 +0000, Konrad Rzeszutek Wilk wrote:
> Do you want to prep a patch that I can stick in my 'microcode' branch?
> .. That I will at some point try to upstream.

You might want to look back at the archives when Jeremy first tried to
upstream this work, it was a vehement "No" and the resulting thread was
not pretty.

Now that we have early loading via the hypervisor in 4.2 and Linux is
finally in the process of growing its own early microcode loading
solution I suspect the No would be even firmer.

It is on xenbits if you want it anyway:

git://xenbits.xen.org/people/ianc/linux-2.6.git debian/wheezy/microcode

About the only argument I can see for continuing to try upstreaming this
stuff is that in
http://www.gossamer-threads.com/lists/linux/kernel/1583630 Fenghua says:

        Note, however, that Linux users have gotten used to being able
        to install a microcode patch in the field without having a
        reboot; we support that model too.

i.e. this is an argument for keeping the previous scheme in parallel,
which I suppose is an argument for supporting the same under Xen (I
don't know if it's a good one though).

Ian.

--
Ian Campbell

All the existing 2.0.x kernels are to buggy for 2.1.x to be the
main goal.
                -- Alan Cox
(Putting debian-kernel to bcc, since I don't imagine they are interested
in the details of this discussion, I'll reraise the result with the
Debian Xen maintainer when we have one)

On Wed, 2012-12-05 at 17:53 +0000, Ian Campbell wrote:
> On Wed, 2012-12-05 at 17:27 +0000, Boris Ostrovsky wrote:
> > On 12/05/2012 12:02 PM, Jan Beulich wrote:
> > > But all of this shouldn't lead to equivalent ID mismatches, should
> > > it? It ought to simply find nothing to update...
> >
> >
> > The patch file (/lib/firmware/amd-ucode/microcode_amd_fam15h.bin) may
> > contain more than one patch. The driver goes over this file patch by
> > patch and tries to see whether to apply it.
> >
> > I think what happened in Ian's case was that the patch file contained
> > two patches --- one for this processor (ID 6012) and another for a
> > different processor (ID 6101). (Both are family 15h but different revs).
> >
> > The driver applied the first patch on core 0. Then, on core 1, the code
> > tried the first patch (at file offset 60) and noticed that it is already
> > applied. So it continued to the next patch (at offset 2660) which is not
> > meant for this processor, thus generating the "does not match" message.

OOI what would have happened if the two patches were in the opposite
order? Would CPU0 have seen the ID 6101 patch first and aborted?

Ian.
>>> On 06.12.12 at 09:34, Ian Campbell <ijc@hellion.org.uk> wrote:
> (trim quote please...)
> On Wed, 2012-12-05 at 21:47 +0000, Konrad Rzeszutek Wilk wrote:
>> Do you want to prep a patch that I can stick in my 'microcode' branch?
>> .. That I will at some point try to upstream.
>
> You might want to look back at the archives when Jeremy first tried to
> upstream this work, it was a vehement "No" and the resulting thread was
> not pretty.
>
> Now that we have early loading via the hypervisor in 4.2 and Linux is
> finally in the process of growing its own early microcode loading
> solution I suspect the No would be even firmer.
>
> It is on xenbits if you want it anyway:
>
> git://xenbits.xen.org/people/ianc/linux-2.6.git debian/wheezy/microcode
>
> About the only argument I can see for continuing to try upstreaming this
> stuff is that in
> http://www.gossamer-threads.com/lists/linux/kernel/1583630 Fenghua says:
>
>         Note, however, that Linux users have gotten used to being able
>         to install a microcode patch in the field without having a
>         reboot; we support that model too.
>
> i.e. this is an argument for keeping the previous scheme in parallel,
> which I suppose is an argument for supporting the same under Xen (I
> don't know if it's a good one though).

Another counter argument would be that the kernel really is only
relaying things in the Xen case. Which means the user mode tool
could as well interface with Xen directly.

Jan
>>> On 06.12.12 at 11:08, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> (Putting debian-kernel to bcc, since I don't imagine they are interested
> in the details of this discussion, I'll reraise the result with the
> Debian Xen maintainer when we have one)
>
> On Wed, 2012-12-05 at 17:53 +0000, Ian Campbell wrote:
>> On Wed, 2012-12-05 at 17:27 +0000, Boris Ostrovsky wrote:
>> > On 12/05/2012 12:02 PM, Jan Beulich wrote:
>> > > But all of this shouldn't lead to equivalent ID mismatches, should
>> > > it? It ought to simply find nothing to update...
>> >
>> >
>> > The patch file (/lib/firmware/amd-ucode/microcode_amd_fam15h.bin) may
>> > contain more than one patch. The driver goes over this file patch by
>> > patch and tries to see whether to apply it.
>> >
>> > I think what happened in Ian's case was that the patch file contained
>> > two patches --- one for this processor (ID 6012) and another for a
>> > different processor (ID 6101). (Both are family 15h but different revs).
>> >
>> > The driver applied the first patch on core 0. Then, on core 1, the code
>> > tried the first patch (at file offset 60) and noticed that it is already
>> > applied. So it continued to the next patch (at offset 2660) which is not
>> > meant for this processor, thus generating the "does not match" message.
>
> OOI what would have happened if the two patches were in the opposite
> order? Would CPU0 have seen the ID 6101 patch first and aborted?

That would work well.

The problem is that cpu_request_microcode() returns the result
of its last call to microcode_fits(), no matter whether a prior one
already returned success (>= 0).

Something like

                                               &offset)) == 0 )
     {
         error = microcode_fits(mc_amd, cpu);
-        if (error <= 0)
+        if (error < 0)
+            error = 0;
+        if (error == 0)
             continue;

         error = apply_microcode(cpu);

would apparently be needed. Or we could of course make
microcode_fits() return a bool_t in the first place.
But then again it would probably be nice to indeed return
failure from cpu_request_microcode() if _no_ suitable
microcode was found at all. Question is whether one blob
can contain more than one update for a given equivalent ID.
If not, bailing from the loop even if microcode_fits() returns
zero might be the right solution (and presumably the latter
then shouldn't return zero when no equivalent ID was
found).

But no matter what solution we pick, we need to review this
carefully in the context of the earlier regressions we had in
this area.

Jan
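To make the return-value problem concrete, here is a small self-contained C model of the loop, not the actual Xen code. It assumes that the fit check returns a negative value for a patch meant for another processor, zero when nothing needs doing, and a positive value when the patch should be applied; `struct patch`, `struct cpu`, `fits()` and `request_microcode()` are hypothetical stand-ins. The `squash_mismatch` flag switches between the old behaviour and the one in the proposed diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct patch { uint16_t equiv_id; uint32_t level; };
struct cpu   { uint16_t equiv_id; uint32_t level; };

/* Assumed convention: <0 wrong processor, 0 nothing to do, >0 apply it. */
static int fits(const struct patch *p, const struct cpu *c)
{
    if (p->equiv_id != c->equiv_id)
        return -1;
    if (p->level <= c->level)
        return 0;
    return 1;
}

/* Walk every patch; squash_mismatch selects the behaviour of the proposed change. */
static int request_microcode(const struct patch *patches, size_t count,
                             struct cpu *cpu, int squash_mismatch)
{
    int error = 0;

    assert(patches != NULL && cpu != NULL);
    for (size_t i = 0; i < count; i++) {
        error = fits(&patches[i], cpu);
        if (error < 0 && squash_mismatch)
            error = 0;                  /* a patch for another part is not a failure */
        if (error <= 0)
            continue;
        cpu->level = patches[i].level;  /* stand-in for apply_microcode() */
        return 0;
    }
    return error;                       /* verdict on the last patch examined */
}
```

In this model, a core whose sibling has already loaded the first patch gets a negative result without the flag, because the last patch examined belongs to a different processor, and zero with it.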
On Thu, Dec 06, 2012 at 11:13:09AM +0000, Jan Beulich wrote:
> >>> On 06.12.12 at 11:08, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > (Putting debian-kernel to bcc, since I don't imagine they are interested
> > in the details of this discussion, I'll reraise the result with the
> > Debian Xen maintainer when we have one)
> >
> > On Wed, 2012-12-05 at 17:53 +0000, Ian Campbell wrote:
> >> On Wed, 2012-12-05 at 17:27 +0000, Boris Ostrovsky wrote:
> >> > On 12/05/2012 12:02 PM, Jan Beulich wrote:
> >> > > But all of this shouldn't lead to equivalent ID mismatches, should
> >> > > it? It ought to simply find nothing to update...
> >> >
> >> >
> >> > The patch file (/lib/firmware/amd-ucode/microcode_amd_fam15h.bin) may
> >> > contain more than one patch. The driver goes over this file patch by
> >> > patch and tries to see whether to apply it.
> >> >
> >> > I think what happened in Ian's case was that the patch file contained
> >> > two patches --- one for this processor (ID 6012) and another for a
> >> > different processor (ID 6101). (Both are family 15h but different revs).
> >> >
> >> > The driver applied the first patch on core 0. Then, on core 1, the code
> >> > tried the first patch (at file offset 60) and noticed that it is already
> >> > applied. So it continued to the next patch (at offset 2660) which is not
> >> > meant for this processor, thus generating the "does not match" message.
> >
> > OOI what would have happened if the two patches were in the opposite
> > order? Would CPU0 have seen the ID 6101 patch first and aborted?
>
> That would work well.
>
> The problem is that cpu_request_microcode() returns the result
> of its last call to microcode_fits(), no matter whether a prior one
> already returned success (>= 0).
>
> Something like
>
>                                                &offset)) == 0 )
>      {
>          error = microcode_fits(mc_amd, cpu);
> -        if (error <= 0)
> +        if (error < 0)
> +            error = 0;
> +        if (error == 0)
>              continue;
>
>          error = apply_microcode(cpu);
>
> would apparently be needed.
> Or we could of course make
> microcode_fits() return a bool_t in the first place.
>
> But then again it would probably be nice to indeed return
> failure from cpu_request_microcode() if _no_ suitable
> microcode was found at all. Question is whether one blob
> can contain more than one update for a given equivalent ID.
> If not, bailing from the loop even if microcode_fits() returns
> zero might be the right solution (and presumably the latter
> then shouldn't return zero when no equivalent ID was
> found).

I would argue that cpu_request_microcode() should not return an error if
no suitable microcode is available. In most cases BIOS will already have
the right version of microcode and so the driver not loading anything is
really considered a normal scenario.

One could even say that being able to load the microcode in the driver
indicates stale BIOS and so *that* is the error. I am not suggesting
returning an error on that but perhaps raising log level from KERN_INFO
to KERN_WARNING.

As for whether a container can have more than one update --- typically no
but I would like to be able to support this.

>
> But no matter what solution we pick, we need to review this
> carefully in the context of the earlier regressions we had in
> this area.

I will probably have a patch for review either later today or tomorrow ---
I need to test more patch file configurations.

-boris
>>> On 06.12.12 at 14:08, Boris Ostrovsky <boris.ostrovsky@amd.com> wrote:
> As for whether a container can have more than one update --- typically no but I
> would like to be able to support this.

In which case further changes would be necessary.

Jan
On Thu, Dec 06, 2012 at 08:34:31AM +0000, Ian Campbell wrote:
> (trim quote please...)
> On Wed, 2012-12-05 at 21:47 +0000, Konrad Rzeszutek Wilk wrote:
> > Do you want to prep a patch that I can stick in my 'microcode' branch?
> > .. That I will at some point try to upstream.
>
> You might want to look back at the archives when Jeremy first tried to
> upstream this work, it was a vehement "No" and the resulting thread was
> not pretty.
>
> Now that we have early loading via the hypervisor in 4.2 and Linux is
> finally in the process of growing its own early microcode loading
> solution I suspect the No would be even firmer.
>
> It is on xenbits if you want it anyway:
>
> git://xenbits.xen.org/people/ianc/linux-2.6.git debian/wheezy/microcode

Thx. Pulled it in my stable/misc branch.

>
> About the only argument I can see for continuing to try upstreaming this
> stuff is that in
> http://www.gossamer-threads.com/lists/linux/kernel/1583630 Fenghua says:
>
>         Note, however, that Linux users have gotten used to being able
>         to install a microcode patch in the field without having a
>         reboot; we support that model too.
>
> i.e. this is an argument for keeping the previous scheme in parallel,
> which I suppose is an argument for supporting the same under Xen (I
> don't know if it's a good one though).
>
> Ian.
>
> --
> Ian Campbell
>
>
> All the existing 2.0.x kernels are to buggy for 2.1.x to be the
> main goal.
>                 -- Alan Cox
>