Ke, Liping
2008-Dec-04 07:32 UTC
[Xen-devel] [Doc] writeup for error handling usage in XEN
Hi, all

Recently we spent some effort reviewing the use of severe error handlers
(panic, BUG_ON, BUG, ASSERT) in XEN. We had several rounds of internal
discussion as well as several mail threads with Keir. Below is a writeup of
those discussions.

If it is agreed after review, we would like to place it in the XEN document
folder or on the XEN wiki, since we think it might be helpful to developers.

Thanks a lot for your help!
Regards,
Criping

[Background]
We found that error handling [panic/BUG_ON/ASSERT/BUG] greatly impacts VM
running/service time, so we investigated its usage in current XEN and
discussed it with Keir. This writeup records the results. It might be useful
to anyone interested in XEN's error handling.

[Current error handlers in XEN]
We have five error handlers in XEN:
  1) domain_crash
  2) panic
  3) BUG_ON
  4) ASSERT
  5) BUG
domain_crash only impacts the crashed domain, while the other four handlers
halt or reboot the whole system/machine.
panic/BUG_ON/ASSERT/BUG differ slightly:
  1) ASSERT only takes effect when DEBUG=y, while the other three handlers
     take effect even if DEBUG=y is not used.
  2) panic halts or restarts the machine depending on the boot option.
  3) BUG prints more information than panic.
  4) BUG_ON is BUG with an "if" condition added.
So panic, BUG and BUG_ON actually have similar functions.

[Error handler usage guideline]
  1) domain_crash vs. BUG_ON?
     a) Keep the severity/scope of the bug in mind. If the bug only affects
        one domain, use domain_crash to kill that domain instead of
        panicking the whole machine.
     b) When an error impacts the HV's overall consistency, even if it only
        impacts one domain, we prefer BUG_ON. Using
        [panic/BUG_ON/ASSERT/BUG] helps the different linked software
        modules stay aware of the HV's consistency constraints. Below is an
        illustrative example we discussed with Keir:
          I8254.c/hvm.c (c:\upstream\xen\xen\arch\x86\hvm):
            BUG_ON(bytes != 1);
        We want to make sure the handler for a single I/O port is never
        reached by a multi-byte I/O port access. Although the illegal
        access is not that fatal, it still violates the HV's consistency
        constraints, so we choose BUG_ON.
  2) How to choose between ASSERT and panic/BUG_ON/BUG?
     a) In order to collect more error reports and save debugging effort,
        ASSERT is preferred when BUG_ON would cause too much overhead in a
        non-debug build.
     b) For consistency and simplicity, BUG_ON should be used instead of
        panic/BUG, as they all behave similarly.
  3) Be cautious when deciding to use BUG_ON, and add the necessary comments
     where possible. Use it only for severe errors or when the HV's
     consistency constraints are broken.
  4) Don't use BUG_ON to check for expected BIOS issues/settings such as an
     invalid ACPI table. We can turn off the specific features in the VMM
     instead. For example, if the VT-d table in the BIOS is incorrect,
     disable VT-d in the VMM instead of using BUG_ON.

[Current Status]
We searched for [panic/BUG_ON/ASSERT/BUG] occurrences in the XEN code
(cs 18498) and agreed that current usage is basically reasonable. Keir also
mentioned that at check-in he tries to make sure each use is justified. As
Keir put it, XEN is an inter-linked set of software modules, and
BUG_ON/ASSERT give some explicit description and checking of some of the more
subtle interface constraints between them. These error handlers save us
tremendous debugging effort.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
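The usage guideline in the writeup above can be sketched in a small user-space model. This is only an illustration under stated assumptions: the macros and functions below are simplified stand-ins, not Xen's real definitions, and struct domain, handle_guest_request, single_port_read and setup_optional_feature are hypothetical names invented for the sketch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative stand-ins only; Xen's real macros and functions differ. */
#define BUG() do {                                              \
    fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__);      \
    abort();               /* models halting the whole host */  \
} while (0)

/* BUG_ON is BUG with the "if" folded in. */
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

#ifdef DEBUG
#define ASSERT(cond) do { if (!(cond)) BUG(); } while (0)
#else
#define ASSERT(cond) ((void)0)   /* compiled out of non-debug builds */
#endif

struct domain {                  /* hypothetical, much reduced */
    int  id;
    bool crashed;
};

/* Models domain_crash: only the offending domain is marked dead. */
void domain_crash(struct domain *d)
{
    fprintf(stderr, "domain %d crashed\n", d->id);
    d->crashed = true;
}

/* Guideline 1a: a bad request only hurts the guest that issued it. */
int handle_guest_request(struct domain *d, int request)
{
    ASSERT(d != NULL);           /* cheap sanity check, debug builds only */
    if (request < 0) {
        domain_crash(d);
        return -1;
    }
    return 0;
}

/* Guideline 1b: callers must never route a multi-byte access here, so a
 * violation is an internal consistency failure and gets BUG_ON. */
unsigned int single_port_read(unsigned int bytes)
{
    BUG_ON(bytes != 1);
    return 0xffu;
}

/* Guideline 4: bad firmware data turns the feature off instead of
 * bringing the machine down. Returns whether the feature is enabled. */
bool setup_optional_feature(bool firmware_table_valid)
{
    if (!firmware_table_valid) {
        fprintf(stderr, "invalid firmware table, feature disabled\n");
        return false;
    }
    return true;
}
```

In this model a negative request marks only that domain as crashed, single_port_read(2) would abort the whole process (the stand-in for halting the host), and an invalid firmware table simply leaves the feature off.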
Keir Fraser
2008-Dec-04 08:24 UTC
[Xen-devel] Re: [Doc] writeup for error handling usage in XEN
Looks okay to me. Do you want me to clean it up and check it in?

 -- Keir
Ke, Liping
2008-Dec-04 08:32 UTC
[Xen-devel] RE: [Doc] writeup for error handling usage in XEN
Hi, Keir
It would be very nice. Thanks a lot!
Criping
Jan Beulich
2008-Dec-04 08:36 UTC
Re: [Xen-devel] [Doc] writeup for error handling usage in XEN
Would be nice if it also mentioned BUILD_BUG_ON().

Jan
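For readers who have not met BUILD_BUG_ON(): it is a compile-time check, so a violated condition stops the build rather than the running machine. A minimal sketch of the idea follows. The macro body is the classic negative-array-size trick and may not match Xen's actual definition; struct channel_state and check_layout are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative definition: a true condition yields char[-1], which the
 * compiler rejects. Xen's real macro may be written differently. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical structure whose layout other code depends on. */
struct channel_state {
    uint8_t  mode;
    uint8_t  status;
    uint16_t count;
};

/* The checks cost nothing at run time; they only constrain the build. */
unsigned int check_layout(void)
{
    BUILD_BUG_ON(sizeof(struct channel_state) != 4);
    BUILD_BUG_ON(offsetof(struct channel_state, count) != 2);
    return (unsigned int)sizeof(struct channel_state);
}
```

Changing the structure so that either condition becomes true would make the file fail to compile, which is the earliest possible point to catch a broken layout assumption.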
Keir Fraser
2008-Dec-04 09:09 UTC
Re: [Xen-devel] [Doc] writeup for error handling usage in XEN
Yes, I'll add something about it.

 -- Keir
Dan Magenheimer
2008-Dec-04 15:17 UTC
RE: [Xen-devel] [Doc] writeup for error handling usage in XEN
Thanks for posting this, Criping. Since you've started this discussion,
I'd like to add a suggestion for future use:

It would be nice if ASSERT could be enabled at runtime rather
than just at compile time. If there were a global flag
"enable_asserts" that could be enabled by a Xen grub command
line option, and the ASSERT macro always tested that global
flag before testing the assert-condition, then additional
debug/checking code could be easily enabled with a very
small runtime cost. (The global variable would be checked
frequently enough that it would always be in cache, and
since it only changes once -- at boot time -- there would be
no cache-synchronization costs.)
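The suggestion above could look roughly like the following user-space sketch. Only the flag name enable_asserts comes from the message; assert_failed, assert_failures and clamp_index are hypothetical, and the sketch counts failures instead of halting so that it can run outside a hypervisor.

```c
#include <stdbool.h>
#include <stdio.h>

/* Would be set once, while parsing the boot command line. */
bool enable_asserts = false;

/* Hypothetical: a real hypervisor would halt here; the sketch only
 * counts failures so the behaviour can be observed in user space. */
unsigned int assert_failures = 0;

void assert_failed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "Assertion '%s' failed at %s:%d\n", expr, file, line);
    assert_failures++;
}

/* The flag is tested first, so a disabled assertion costs one
 * well-predicted branch and never evaluates its condition. */
#define ASSERT(cond) do {                                   \
    if (enable_asserts && !(cond))                          \
        assert_failed(#cond, __FILE__, __LINE__);           \
} while (0)

/* Example caller: the assertion documents the expected range, and the
 * code still behaves sensibly when assertions are switched off. */
int clamp_index(int index, int size)
{
    ASSERT(index >= 0 && index < size);
    if (index < 0)
        return 0;
    if (index >= size)
        return size - 1;
    return index;
}
```

With enable_asserts left false, clamp_index(9, 4) quietly returns 3; after setting the flag, the same call also reports the failed assertion. Because of the short-circuit, conditions with side effects are skipped while the flag is off, so assertion conditions should stay side-effect free.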
Keir Fraser
2008-Dec-04 15:35 UTC
Re: [Xen-devel] [Doc] writeup for error handling usage in XEN
On 04/12/2008 15:17, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> It would be nice if ASSERT could be enabled at runtime rather
> than just at compile time. If there were a global flag
> "enable_asserts" that could be enabled by a Xen grub command
> line option, and the ASSERT macro always tested that global
> flag before testing the assert-condition, then additional
> debug/checking code could be easily enabled with a very
> small runtime cost. (The global variable would be checked
> frequently enough that it would always be in cache, and
> since it only changes once -- at boot time -- there would be
> no cache-synchronization costs.)

A patch to make runtime-selectable assertions a feature selectable at
compile time might be acceptable. I'd suggest just shipping debug and
non-debug hypervisors though, if you want extra boot-time selectable
debugging in the field. Or just have a private patch to always-enable
assertions, if you consider the cost low enough.

 -- Keir