I'm wondering about something in Xen I don't understand, related to interrupts.

Basically, in hardware systems you have a processor running with interrupts enabled or disabled, which in Unix and its descendants is generally called a priority level; let's just call it hi or lo. In many OSes (Plan 9 included) to block interrupts you splhi, to enable them you spllo. In Linux it is CLI to block and STI to enable.

When running at lo, you can take an interrupt and you land in the kernel at splhi; or you can take a fault and land in the kernel without changing to hi or lo. As a general rule, the CPU saves your state including your interrupt enable flags (x86 does this in eflags). For page faults, typically you run in the kernel fault handler at whatever level you were at before; for hardware interrupts typically you run at splhi, regardless of where you were before.

In the handler, you do your work and (e.g.) execute an iret. Regardless of the level you are running at in the fault handler, after the iret you are running at the previous priority level, i.e. hi or lo. So you can, for example, page fault in an interrupt routine. Hardware saves your interrupt-enabled flag and restores it automagically when you execute the iret.

On interrupt, the IF flag is cleared, so no more interrupts can happen. On page fault, the IF flag is not touched. In any event, it's not the job of the interrupt handler to restore the state of the IF flag that was in play before the interrupt or exception -- hardware does that; specifically, the IRET popping EFLAGS on the x86 restores the IF.

My question is, why doesn't Xen work that way? In other words, Xen will set the mask for the event channel, but restoring it is up to the OS. If Xen worked the way the hardware does, Xen would save the mask, set the mask, and then restore the mask to its previous value when the OS returns from the trap. Is there a reason that the OS has to do the restoration of the mask?
I am wondering because I just fixed a bug in Plan 9 that boiled down to adding an spllo() to the Plan 9 trap handler, since Xen sets the mask on page fault and does not clear it when the OS returns.

Why doesn't Xen do the equivalent of this:

    x = splhi();
    trap_to_os();
    splx(x);

instead of what it does now, which is pretty much this:

    splhi();
    trap_to_os();
    /* OS restores IF */

There's an awful lot of complexity in the trap handler in the OS to deal with the problem of the OS setting spllo() before doing the iret, since as soon as the OS sets spllo() it can take an interrupt -- while in the interrupt handler.

Thanks for any clarity you can lend to my brain :-)

ron

-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://productguide.itmanagersjournal.com/
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
> I am wondering because I just fixed a bug in Plan 9 that boiled
> down to adding an spllo() to the plan 9 trap handler, since Xen sets the
> mask on page fault and does not clear it when the OS returns.

This isn't true. Xen only sets the mask when it delivers an event-channel notification. It doesn't set the mask for page faults unless you specified a "virtual interrupt gate" instead of a "virtual trap gate" during the set_trap_table() hypercall -- that is, for exception vector 14 you must have set bit 2 of the flag byte of the trap_info_t struct.

> Why doesn't Xen do the equivalent of this:
>  x = splhi();
>  trap_to_os();
>  splx(x);
> instead of what is does now, which is pretty much this:
>  splhi();
>  trap_to_os();
>  /* OS restores IF */
>
> There's an awful lot of complexity in the trap handler in the OS to deal
> with the problem of the OS setting spllo() before doing the iret, since as
> soon as the OS sets spllo() it can take an interrupt -- while in the
> interrupt handler.

It avoids the need to reenter Xen to do tail work. Instead the OS does it (with some added complexity, not on the fast path) and avoids reentering ring 0.

Thus, in the extremely common case, an interrupt will cause ring transitions 3->0->1->3, instead of 3->0->1->0->3.

For people who don't care about the extra cost, we could provide a "virtual IRET" hypercall which would "atomically" reenable events and IRET.

 -- Keir
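[Editor's sketch] To make the "bit 2 of the flag byte" remark concrete, here is a small C model of the check Keir describes. It is only an illustration under stated assumptions: `struct trap_entry`, `TRAP_FLAG_MASK_EVENTS`, `VECTOR_PAGE_FAULT` and `entry_masks_events()` are hypothetical names invented for this note, not Xen's actual header definitions; only the bit position (bit 2) and the vector number (14) come from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one trap-table entry; NOT Xen's real trap_info_t. */
struct trap_entry {
    uint8_t vector;  /* exception vector number, e.g. 14 for page fault */
    uint8_t flags;   /* flag byte; bit 2 selects "virtual interrupt gate" */
};

/* Assumed names for illustration only. */
#define TRAP_FLAG_MASK_EVENTS (1u << 2)  /* bit 2 of the flag byte */
#define VECTOR_PAGE_FAULT     14

/* Returns nonzero if delivering this trap would also set the event mask,
 * i.e. the entry was registered as a virtual interrupt gate. */
int entry_masks_events(const struct trap_entry *e)
{
    assert(e != NULL);
    return (e->flags & TRAP_FLAG_MASK_EVENTS) != 0;
}
```

Under this model, a page-fault entry registered with a zero flag byte behaves like a trap gate (mask untouched), while the same entry with bit 2 set behaves like an interrupt gate (mask set on delivery).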
On Tue, 14 Dec 2004, Keir Fraser wrote:

> This isn't true. Xen only sets the mask when it delivers an
> event-channel notification. It doesn't set the mask for page faults
> unless you specified a "virtual interrupt gate" instead of a "virtual
> trap gate" during the set_trap_table() hypercall -- that is, for
> exception vector 14 you must have set bit 2 of the flag byte of the
> trap_info_t struct.

You're right, I said it incorrectly; for some reason I switched in my Last-1th attempt to figure out what was up with the last one. Oops.

The Plan 9 trap handler assumes that there is a hardware restore of INTF, which is generally true of modern CPUs. So what was happening was that at the end of the Plan 9 trap handler there was an splhi for a critical section (two lines or some such), then a return. There was no spllo(), as they relied on hardware.

I just realized the easy way to fix this, which is to take my own medicine:

    x = splhi();
    stuff
    splx(x);

I just need to talk to the Plan 9 guys about this and fix up their trap code if they're willing.

> It avoids the need to reenter Xen to do tail work. Instead the OS does
> it (with some added complexity, not on the fast path) and avoids
> reentering ring 0.

That's what I figured. I was wondering if you all knew the tradeoff of adding complexity at the OS trap level vs. just dropping back to ring 0.

> Thus, in the extremely common case, an interrupt will cause ring
> transitions 3->0->1->3, instead of 3->0->1->0->3.

OK. I was assuming this was the reason. So the idea is that the tradeoff of NOT doing this:

    0->1->0->3

is worth the complexity of the trap return code which I see in the various kernels. Has this tradeoff been measured, and is it known to be the one we want? I'm more curious than anything.

Thanks Keir!

ron
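[Editor's sketch] The fix ron describes (pair the splhi with an splx of the saved level, rather than relying on hardware to restore the flag at return) can be modelled in a few lines of C. This is a toy simulation, not Plan 9 or Xen code: the global `event_mask` stands in for the virtual event mask, and `trap_tail_unpaired()` / `trap_tail_paired()` are hypothetical names for the old and fixed ends of the trap handler. The splhi/spllo/splx signatures follow the usual convention of returning or taking the previous level.

```c
#include <assert.h>

/* Toy model: 0 = events enabled (lo), 1 = events masked (hi). */
enum { SPL_LO = 0, SPL_HI = 1 };

int event_mask = SPL_LO;  /* stands in for the virtual event mask */

/* Mask events; return the level that was in effect before. */
int splhi(void) { int old = event_mask; event_mask = SPL_HI; return old; }

/* Unmask events; return the level that was in effect before. */
int spllo(void) { int old = event_mask; event_mask = SPL_LO; return old; }

/* Restore a level previously returned by splhi() or spllo(). */
void splx(int level)
{
    assert(level == SPL_LO || level == SPL_HI);
    event_mask = level;
}

/* Old handler tail: raises the level for a short critical section and
 * returns, expecting the return path to restore the flag. Nothing in
 * this model does that, so the mask stays set. */
void trap_tail_unpaired(void)
{
    splhi();
    /* ... critical section ... */
}

/* Fixed handler tail: saves the level and puts it back explicitly. */
void trap_tail_paired(void)
{
    int x = splhi();
    /* ... critical section ... */
    splx(x);
}
```

The paired version also behaves correctly when entered at hi: it restores hi rather than unconditionally dropping to lo, which is the advantage of splx(x) over a bare spllo().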