Keir Fraser
2008-Sep-22 08:57 UTC
[Xen-devel] Re: Questions about current panic/BUG_ON/BUG usage in XEN
On 22/9/08 09:01, "Ke, Liping" <liping.ke@intel.com> wrote:

> 1) Lots of parameter's checking using BUG_ON where it would be nicer if we use
> ASSERT instead?

I tend to use ASSERT() where I think the error condition is *really* unlikely or could incur an unwanted overhead on non-debug builds. I don't use it where I have an inkling that a BUG_ON() might fire when I don't want it to.

> 2) Some errors which only impact a device/domain cause whole machine panic
> such as
> I8254.c (c:\upstream\xen\xen\arch\x86\hvm): BUG_ON(bytes != 1);
> Hvm.c (c:\upstream\xen\xen\arch\x86\hvm): BUG_ON(bytes != 1);

Both valid because they are handlers for a single I/O port, and code in intercept.c will prevent multi-byte I/O port accesses from reaching the handler.

> Just want to know whether we need to do some clean up jobs and made some panic
> criteria?

No, if you are seeing BUG_ON()s firing then we only remove the BUG_ON() or panic() if its assumptions are invalid. Also panic/BUG_ON is not great if we are relying on correct BIOS tables or timely operation of asynchronous hardware (I'm thinking programming of VT-d engines, for example). Assertion/BUG_ON/panic about self-consistency of the hypervisor itself should absolutely stay as they are.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
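The distinction drawn in this message between an always-on BUG_ON and a debug-only ASSERT can be sketched in plain C. This is only an illustration under stated assumptions, not Xen's code: SKETCH_BUG_ON, SKETCH_ASSERT, sketch_tick, sketch_port_read and sketch_dispatch_read are hypothetical names, and the macros are stand-ins built on the C library's abort() and assert() rather than Xen's own definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in, NOT Xen's macro: checked in every build, stops on violation. */
#define SKETCH_BUG_ON(cond)                                        \
    do {                                                           \
        if (cond) {                                                \
            fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
            abort();                                               \
        }                                                          \
    } while (0)

/* Stand-in, NOT Xen's macro: the standard assert() is compiled out when
 * NDEBUG is defined, so the check costs nothing in non-debug builds. */
#define SKETCH_ASSERT(cond) assert(cond)

/* Hypothetical hot-path helper: called very often, so check in debug only. */
void sketch_tick(unsigned long *count)
{
    SKETCH_ASSERT(count != NULL);
    (*count)++;
}

/* Hypothetical handler for one single-byte I/O port. The caller is expected
 * to filter other sizes; the check documents that contract in every build. */
uint32_t sketch_port_read(unsigned int bytes)
{
    SKETCH_BUG_ON(bytes != 1);
    return 0x5a; /* arbitrary register value for the sketch */
}

/* Hypothetical dispatcher: refuses multi-byte accesses before the handler
 * runs. Returns 1 if the access was handled, 0 if it was refused. */
int sketch_dispatch_read(unsigned int bytes, uint32_t *val)
{
    if (bytes != 1)
        return 0;
    *val = sketch_port_read(bytes);
    return 1;
}
```

Because the dispatcher filters sizes first, the check inside the handler should never fire; it is there to catch a later change that breaks the contract between the two functions.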
Ke, Liping
2008-Oct-06 02:35 UTC
[Xen-devel] RE: Questions about current panic/BUG_ON/BUG usage in XEN
Keir Fraser wrote:
> On 22/9/08 09:01, "Ke, Liping" <liping.ke@intel.com> wrote:
>
>> 1) Lots of parameter's checking using BUG_ON where it would be nicer
>> if we use ASSERT instead?
>
> I tend to use ASSERT() where I think the error condition is *really*
> unlikely or could incur an unwanted overhead on non-debug builds. I
> don't use it where I have an inkling that a BUG_ON() might fire when
> I don't want it to.
>
>> 2) Some errors which only impact a device/domain cause whole machine
>> panic such as I8254.c (c:\upstream\xen\xen\arch\x86\hvm):
>> BUG_ON(bytes != 1);
>> Hvm.c (c:\upstream\xen\xen\arch\x86\hvm): BUG_ON(bytes != 1);
>
> Both valid because they are handlers for a single I/O port, and code
> in intercept.c will prevent multi-byte I/O port accesses from
> reaching the handler.
>

Hi, Keir

Here you mean that this BUG_ON would never happen? So would it be better to remove such BUG_ON or replace them with ASSERT?

Another example is
Intercept.c (c:\upstream\xen\xen\arch\x86\hvm): BUG_ON(num >= MAX_IO_HANDLER);

I remember I meet the error when do HVM Sx test. When register same handler repeatedly several times, this bug will be triggered. So just want to know the criteria here: Even if it is not good to register repeatedly, must we panic the whole machine? After all, it is only an error contained in a hvm domain, is it?

Thanks a lot for your help -:)

Regards,
Criping

>> Just want to know whether we need to do some clean up jobs and made
>> some panic criteria?
>
> No, if you are seeing BUG_ON()s firing then we only remove the
> BUG_ON() or panic() if its assumptions are invalid. Also panic/BUG_ON
> is not great if we are relying on correct BIOS tables or timely
> operation of asynchronous hardware (I'm thinking programming of VT-d
> engines, for example). Assertion/BUG_ON/panic about self-consistency
> of the hypervisor itself should absolutely stay as they are.
>
> -- Keir
Keir Fraser
2008-Oct-06 06:13 UTC
[Xen-devel] Re: Questions about current panic/BUG_ON/BUG usage in XEN
On 6/10/08 03:35, "Ke, Liping" <liping.ke@intel.com> wrote:

>> Both valid because they are handlers for a single I/O port, and code
>> in intercept.c will prevent multi-byte I/O port accesses from
>> reaching the handler.
>>
>
> Hi, Keir
> Here you mean that this BUG_ON would never happen?
> So would it be better to remove such BUG_ON or replace them with ASSERT?

Could go either way. My choice between BUG_ON and ASSERT is quite arbitrary in some cases. I guess I prefer BUG_ON in general, unless I think the check will be too expensive or the function is called very often.

We don't expect *any* BUG_ON or ASSERT in Xen to trigger, but that does not mean we should remove them! Xen is an inter-linked set of software modules, and BUG_ON/ASSERT gives some explicit description and checking of some of the more subtle interface constraints between them. They save our bacon quite often when changes in one part of the hypervisor forget their responsibilities to other parts of the hypervisor.

> Another example is
> Intercept.c (c:\upstream\xen\xen\arch\x86\hvm): BUG_ON(num >= MAX_IO_HANDLER);
>
> I remember I meet the error when do HVM Sx test. When register same handler
> repeatedly several times,
> this bug will be triggered. So just want to know the criteria here: Even if it
> is not good to register repeatedly,
> must we panic the whole machine? After all, it is only an error contained in a
> hvm domain, is it?

It's an error in usage of this (admittedly rather weak) interface. The interface is that this function may only be called up to MAX_IO_HANDLER times per HVM guest. If Xen exceeds this then it is an internal error within Xen. The correct fix is not to remove the BUG_ON, nor even to directly crash the HVM guest, but to:
 * Avoid registering so many I/O handlers; or
 * Increase MAX_IO_HANDLER; or
 * Maintain a handler linked list rather than static array

Also possibly return an error code to callers and make them deal with EBUSY or ENOMEM errors.
-- Keir
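The last suggestion in this message, returning an error code to callers instead of hitting a BUG_ON when the handler table is full, might look roughly like the C sketch below. Everything here is an assumption made for illustration: the names (sketch_io_table, sketch_register_io_handler, sketch_null_action, SKETCH_MAX_IO_HANDLER), the table size of 4, the choice of -ENOMEM, and the decision to treat a repeated registration of the same handler as a no-op. None of it is taken from Xen's intercept.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_IO_HANDLER 4 /* illustrative size, not Xen's value */

/* Hypothetical handler signature for the sketch. */
typedef int (*sketch_io_action_t)(unsigned int port, unsigned int bytes);

/* Hypothetical per-guest table of registered handlers (static array). */
struct sketch_io_table {
    size_t num;
    struct {
        unsigned int port;
        sketch_io_action_t action;
    } entry[SKETCH_MAX_IO_HANDLER];
};

/* Trivial handler so the registration function has something to store. */
int sketch_null_action(unsigned int port, unsigned int bytes)
{
    (void)port;
    (void)bytes;
    return 0;
}

/* Returns 0 on success or -ENOMEM when the table is full. */
int sketch_register_io_handler(struct sketch_io_table *t,
                               unsigned int port, sketch_io_action_t action)
{
    size_t i;

    /* Caller bugs (NULL arguments) are still treated as internal errors. */
    assert(t != NULL && action != NULL);

    /* Assumption of this sketch: re-registering the same handler is a
     * no-op, so repeated registration does not consume extra slots. */
    for (i = 0; i < t->num; i++)
        if (t->entry[i].port == port && t->entry[i].action == action)
            return 0;

    /* Table full: report it to the caller instead of stopping the machine. */
    if (t->num >= SKETCH_MAX_IO_HANDLER)
        return -ENOMEM;

    t->entry[t->num].port = port;
    t->entry[t->num].action = action;
    t->num++;
    return 0;
}
```

With this shape the capacity limit becomes part of the function's contract, and each caller has to decide what a failed registration means for the guest it is setting up.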