Ross Philipson
2008-Sep-15 21:34 UTC
[Xen-devel] Crash in update microcode changes - change set 18475
The changes for CPU microcode loading that were done recently (change set 18475 in unstable staging) seem to be causing a crash. I am using an Intel system and I get an assertion in Xen during the DOM0 boot. This is what I believe is going on.

In xen/arch/x86/microcode.c the routine do_microcode_update() is dispatching the update work to each CPU with on_each_cpu() which in turn uses an IPI to dispatch the callback vector on each CPU. The microcode update routine passed in is called in the IPI context on the target CPU (including irq_enter() before calling the ucode function). Within the ucode function the calls eventually get down to the Intel specific calls in microcode_intel.c. Specifically:

  do_microcode_update_one()
  microcode_update_cpu()
  cpu_request_microcode()
  get_next_ucode_from_buffer()

Within the last call, vmalloc() is called and eventually _xmalloc() asserts on ASSERT(!in_irq()). I checked and the earlier code, though it dispatched work to different CPUs with IPIs, did not try to dynamically allocate memory. I am not sure how to fix this easily without redoing how the whole new microcode framework works. Also I didn't look closely at AMD but it may have the same issue. I can take a crack at fixing it but maybe someone will see a simple solution.

Thanks
Ross

Ross Philipson
Senior Software Engineer
Citrix Systems, Inc
14 Crosby Drive
Bedford, MA 01730
781-301-7949
ross.philipson@citrix.com

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
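The failure described above can be reproduced in miniature outside Xen. The sketch below is a user-space model only, not Xen code: every name prefixed with model_ is hypothetical, and the allocator returns NULL where Xen's _xmalloc() would trip ASSERT(!in_irq()), so that the effect can be observed without aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-CPU interrupt nesting count. */
static int model_irq_depth;

void model_irq_enter(void) { model_irq_depth++; }

void model_irq_exit(void)
{
    assert(model_irq_depth > 0);   /* exit must pair with an enter */
    model_irq_depth--;
}

int model_in_irq(void) { return model_irq_depth > 0; }

/* Models an allocator that must not run in interrupt context.
 * Assumption: returning NULL stands in for the assertion failure. */
void *model_xmalloc(size_t size)
{
    if (model_in_irq())
        return NULL;
    return malloc(size);
}

/* Models a per-CPU callback that allocates a scratch chunk. */
int model_update_one(void)
{
    void *chunk = model_xmalloc(64);

    if (chunk == NULL)
        return -1;
    free(chunk);
    return 0;
}

/* Models IPI dispatch: the callback runs between enter and exit. */
int model_run_in_ipi(int (*fn)(void))
{
    int rc;

    model_irq_enter();
    rc = fn();
    model_irq_exit();
    return rc;
}
```

Calling model_update_one() directly succeeds, while model_run_in_ipi(model_update_one) fails, which mirrors why the same call chain is harmless from ordinary context and fatal when dispatched by IPI.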
Keir Fraser
2008-Sep-16 06:38 UTC
Re: [Xen-devel] Crash in update microcode changes - change set 18475
I'll take a look. Probably not hard to fix.

-- Keir
Ross Philipson
2008-Sep-16 12:37 UTC
RE: [Xen-devel] Crash in update microcode changes - change set 18475
Actually perhaps it is easy to fix. The copying of chunks out of the larger microcode buffer seems to just be a convenience. Perhaps just returning offset pointers into the original buffer from the get_next_ucode_from_buffer() function would get rid of the vmalloc() call. Just a thought as I looked at it further.

Thanks
Ross
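As an illustration of the offset-pointer idea, here is a minimal sketch that walks a buffer and hands back pointers into it instead of copies. It is not the code from microcode_intel.c: the function name next_chunk() and the chunk layout (a 4-byte total size in host byte order at the start of each chunk) are assumptions made for the example, not the real Intel microcode format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: each chunk begins with a 4-byte total size
 * (header included), in host byte order. */
#define CHUNK_HDR_SIZE 4u

/*
 * Finds the chunk starting at 'offset'. On success, *chunk points INTO
 * buf (nothing is allocated or copied), *chunk_len is its size, and the
 * return value is the offset of the following chunk. Returns 0 when the
 * buffer is exhausted or the size field is malformed.
 */
size_t next_chunk(const uint8_t *buf, size_t len, size_t offset,
                  const uint8_t **chunk, size_t *chunk_len)
{
    uint32_t total;

    assert(buf != NULL && chunk != NULL && chunk_len != NULL);

    if (offset > len || len - offset < CHUNK_HDR_SIZE)
        return 0;

    /* memcpy avoids an unaligned read of the size field. */
    memcpy(&total, buf + offset, sizeof(total));
    if (total < CHUNK_HDR_SIZE || total > len - offset)
        return 0;

    *chunk = buf + offset;
    *chunk_len = total;
    return offset + total;
}
```

A caller would start at offset 0 and loop until the function returns 0. The trade-off is lifetime: the returned pointers are only valid while the original buffer is, so anything that must outlive the buffer still needs a copy made in a context where allocation is allowed.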
Keir Fraser
2008-Sep-16 12:44 UTC
Re: [Xen-devel] Crash in update microcode changes - change set 18475
The better fix was not to do the update in interrupt context at all, but instead 'continue' the hypercall on each online processor in turn. This is what I implemented in c/s 18487. It'd be nice to know that this works for you (and also for Christoph).

In earlier changesets I've also cleaned up the microcode quite a lot and reformatted for Xen coding style. My thinking is that microcode-update logic is not all that complex, and the original code not really all that great, so I'm not that bothered about keeping closely in sync with the Linux original version. In this case I'd rather have it in a format I'm happy to hack on myself.

-- Keir
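The shape of "continue on each online processor in turn" can be sketched as a chain of steps, each of which does its work in ordinary context and then names the next CPU. The following is a user-space model under stated assumptions, not the code from c/s 18487: the struct, the function names and the 8-CPU limit are all hypothetical, and a plain loop stands in for the hypervisor rescheduling the request onto another CPU.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8

/* Hypothetical per-request state carried from one CPU to the next. */
struct update_state {
    uint32_t online_mask;             /* bit n set => CPU n is online */
    int      visited[MODEL_NR_CPUS];  /* order in which CPUs ran a step */
    int      nr_visited;
};

/* Next online CPU strictly after 'cpu', or -1 if there is none. */
int next_online_cpu(uint32_t mask, int cpu)
{
    for (int c = cpu + 1; c < MODEL_NR_CPUS; c++)
        if (mask & (UINT32_C(1) << c))
            return c;
    return -1;
}

/* One step: record the work for 'cpu' (a placeholder for the actual
 * update), then report which CPU the request should continue on. */
int update_step(struct update_state *st, int cpu)
{
    assert(st->nr_visited < MODEL_NR_CPUS);
    st->visited[st->nr_visited++] = cpu;
    return next_online_cpu(st->online_mask, cpu);
}

/* Drives the chain from the first online CPU to the last and returns
 * the number of CPUs visited. */
int run_update_chain(struct update_state *st)
{
    int cpu = next_online_cpu(st->online_mask, -1);

    st->nr_visited = 0;
    while (cpu >= 0)
        cpu = update_step(st, cpu);
    return st->nr_visited;
}
```

Because each step runs outside interrupt context, it is free to allocate memory, which is the property the IPI-based dispatch lacked. The cost in this model is that CPUs are handled one after another rather than in parallel.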
Ross Philipson
2008-Sep-16 12:51 UTC
RE: [Xen-devel] Crash in update microcode changes - change set 18475
Right thanks. I will pull/try it shortly and give you feedback. I will only be able to try it on my Intel systems - no AMDs at the moment.

Thanks
Ross
Ross Philipson
2008-Sep-16 14:15 UTC
RE: [Xen-devel] Crash in update microcode changes - change set 18475
It seems to be working just fine on Intel. Thanks.