Mukesh Rathor
2011-Jul-21 01:30 UTC
[Xen-devel] anomaly in irq check in fixup_page_fault()
Hi,

This is a bit confusing. This is for a PVOPs kernel; I've not looked at older PV kernels to see what they do yet. The VCPU starts with evtchn_upcall_mask set and eflags.IF enabled. However, during kernel boot memory mapping, a lot of faults are getting fixed up by Xen in:

fixup_page_fault():
    /* No fixups in interrupt context or when interrupts are disabled. */
    if ( in_irq() || !(regs->eflags & X86_EFLAGS_IF) )     <------
        return 0;

The guest is running under the assumption that INTs are disabled during init_memory_mapping, and the first enable happens much later. So this check seems redundant, at least for a PVOPs kernel.

Now for my hybrid, the guest is running with IF disabled during initial boot, so the fixup doesn't like that. I'm not sure whether permanently disabling the (eflags & X86_EFLAGS_IF) check for hybrid would be a good idea for me.

thanks,
Mukesh

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
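The early-exit condition quoted in this message can be tried out in isolation. What follows is only a standalone sketch, not Xen code: `struct regs_model`, `fixup_allowed()` and the `in_irq` argument are hypothetical stand-ins invented for illustration. The one architectural fact it relies on is that EFLAGS.IF is bit 9 (0x200).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFLAGS.IF is bit 9 on x86. */
#define X86_EFLAGS_IF 0x00000200u

/* Hypothetical stand-in for the saved register frame (assumption). */
struct regs_model {
    uint64_t eflags;
};

/*
 * Models the quoted condition: no fixup in interrupt context or when
 * the saved EFLAGS has IF clear. The interrupt-context state is passed
 * in as a plain flag because this model has no real interrupt state.
 */
static bool fixup_allowed(const struct regs_model *regs, bool in_irq)
{
    assert(regs != NULL);

    if (in_irq || !(regs->eflags & X86_EFLAGS_IF))
        return false;   /* corresponds to "return 0" in the snippet */

    return true;
}
```

With this model, a frame saved with IF=0 is refused a fixup regardless of the interrupt-context flag, which is the situation described for the hybrid guest during early boot.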
Keir Fraser
2011-Jul-21 06:35 UTC
Re: [Xen-devel] anomaly in irq check in fixup_page_fault()
On 21/07/2011 02:30, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> Hi,
>
> This is a bit confusing. This is for a PVOPs kernel; I've not looked at older
> PV kernels to see what they do yet. The VCPU starts with
> evtchn_upcall_mask set and eflags.IF enabled. However, during kernel
> boot memory mapping, a lot of faults are getting fixed up by Xen in:
>
> fixup_page_fault():
>     /* No fixups in interrupt context or when interrupts are disabled. */
>     if ( in_irq() || !(regs->eflags & X86_EFLAGS_IF) )     <------
>         return 0;

A PV guest never has EF.IF=0, so the early exit should never be triggered by a guest fault.

Your best bet is to fake this out in your HVM container wrapper. Just write an EFLAGS into the saved regs that has EF.IF=1, as would always be the case for a normal PV guest. Rather that than fragile eis_hvm_pv() checks scattered around.

The setting of EF.IF shouldn't matter much for your guest as you'll be doing PV event delivery anyway, but I wonder how it ends up with EF.IF=0 -- is that deliberate?

 -- Keir
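The "fake it in the wrapper" suggestion can be sketched as follows. This is an illustration under stated assumptions rather than the actual hybrid code: `struct saved_regs_model` and `hybrid_present_pv_eflags()` are hypothetical names, and where such a helper would be called from is not specified in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFLAGS.IF is bit 9 on x86. */
#define X86_EFLAGS_IF 0x00000200u

/* Hypothetical stand-in for the saved guest register frame (assumption). */
struct saved_regs_model {
    uint64_t eflags;
};

/*
 * Make the saved frame look like that of a normal PV guest by forcing
 * IF=1. Returns the guest's real IF value so a caller could keep track
 * of it separately if it needs to.
 */
static bool hybrid_present_pv_eflags(struct saved_regs_model *regs)
{
    assert(regs != NULL);

    bool real_if = (regs->eflags & X86_EFLAGS_IF) != 0;

    regs->eflags |= X86_EFLAGS_IF;
    return real_if;
}
```

Doing this once at the boundary keeps generic code paths free of per-guest-type checks, which is the design point made in the message.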
Mukesh Rathor
2011-Jul-21 19:28 UTC
Re: [Xen-devel] anomaly in irq check in fixup_page_fault()
On Thu, 21 Jul 2011 07:35:00 +0100 Keir Fraser <keir.xen@gmail.com> wrote:

> On 21/07/2011 02:30, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
> > Hi,
> >
> > This is a bit confusing. This is for a PVOPs kernel; I've not looked at
> > older PV kernels to see what they do yet. The VCPU starts with
> > evtchn_upcall_mask set and eflags.IF enabled. However, during kernel
> > boot memory mapping, a lot of faults are getting fixed up by Xen in:
> >
> > fixup_page_fault():
> >     /* No fixups in interrupt context or when interrupts are disabled. */
> >     if ( in_irq() || !(regs->eflags & X86_EFLAGS_IF) )     <------
> >         return 0;
>
> A PV guest never has EF.IF=0, so the early exit should never be
> triggered by a guest fault.
>
> Your best bet is to fake this out in your HVM container wrapper. Just
> write an EFLAGS into the saved regs that has EF.IF=1, as would always
> be the case for a normal PV guest. Rather that than fragile
> eis_hvm_pv() checks scattered around.

Ok. In my prototype, I've got the check, but I'll do the wrapper. I realize now that the above check is more about the hypervisor not taking a fault with interrupts disabled than about the guest doing so.

> The setting of EF.IF shouldn't matter much for your guest as you'll
> be doing PV event delivery anyway, but I wonder how it ends up with
> EF.IF=0 -- is that deliberate?

Yeah, I set IF=0 initially to make sure events are not delivered until the guest is ready and does irq enable. For PV, vcpu-mask=1 ensures this. Unlike PV, the hybrid changes IF in enable/disable to make "interrupt window exiting" work, BTW.

thanks,
Mukesh
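The difference in how event delivery is held off in the two cases can be modelled roughly as below. The field names `evtchn_upcall_pending` and `evtchn_upcall_mask` follow Xen's per-VCPU info; everything else (`struct vcpu_model`, `enum guest_kind`, `can_deliver_event()`) is a hypothetical simplification, and the hybrid branch is only an assumption based on the description in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFLAGS.IF is bit 9 on x86. */
#define X86_EFLAGS_IF 0x00000200u

/* Simplified per-VCPU state for illustration only (assumption). */
struct vcpu_model {
    uint8_t  evtchn_upcall_pending;
    uint8_t  evtchn_upcall_mask;
    uint64_t eflags;
};

enum guest_kind { GUEST_PV, GUEST_HYBRID };

/*
 * PV: delivery is gated by the virtual mask (mask=1 blocks events).
 * Hybrid, as described in the message: delivery is gated by the real
 * IF bit, which starts out clear until the guest enables irqs.
 */
static bool can_deliver_event(const struct vcpu_model *v, enum guest_kind kind)
{
    assert(v != NULL);

    if (!v->evtchn_upcall_pending)
        return false;

    if (kind == GUEST_PV)
        return v->evtchn_upcall_mask == 0;

    return (v->eflags & X86_EFLAGS_IF) != 0;
}
```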
Keir Fraser
2011-Jul-22 06:22 UTC
Re: [Xen-devel] anomaly in irq check in fixup_page_fault()
On 21/07/2011 20:28, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> On Thu, 21 Jul 2011 07:35:00 +0100
> Keir Fraser <keir.xen@gmail.com> wrote:
>
>> On 21/07/2011 02:30, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>>
>> A PV guest never has EF.IF=0, so the early exit should never be
>> triggered by a guest fault.
>>
>> Your best bet is to fake this out in your HVM container wrapper. Just
>> write an EFLAGS into the saved regs that has EF.IF=1, as would always
>> be the case for a normal PV guest. Rather that than fragile
>> eis_hvm_pv() checks scattered around.
>
> Ok. In my prototype, i've the check, but I'll do the wrapper. I realize
> now the above check is more for hyp not taking fault disabled than the
> guest doing so.
>
>> The setting of EF.IF shouldn't matter much for your guest as you'll
>> be doing PV event delivery anyway, but I wonder how it ends up with
>> EF.IF=0 -- is that deliberate?
>
> Yeah, I change IF=0 initially to make sure events are not delivered
> until the guest is ready and does irq enable. For PV, the vcpu-mask=1
> assures this. Unlike PV, the hybrid changes IF in
> enable/disable to make "interrupt window exiting" work, BTW.

I hope this can be an optional extension in the final version.

 -- Keir
Ian Campbell
2011-Jul-22 07:39 UTC
Re: [Xen-devel] anomaly in irq check in fixup_page_fault()
On Thu, 2011-07-21 at 07:35 +0100, Keir Fraser wrote:

> On 21/07/2011 02:30, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
> > fixup_page_fault():
> >     /* No fixups in interrupt context or when interrupts are disabled. */
> >     if ( in_irq() || !(regs->eflags & X86_EFLAGS_IF) )     <------
> >         return 0;
>
> A PV guest never has EF.IF=0, so the early exit should never be triggered by
> a guest fault.

When I was playing with PV in HVM prototypes way back I noticed that, for a pvops kernel at least, we seem to accidentally rely on the fact that trying to clear EFLAGS.IF from RING>0 silently ignores the change (as it does for any privileged bits in EFLAGS). This meant that on vmexit I would sometimes discover that IF was cleared. Originally I made this shoot the guest (it must be misbehaving, right!) but in the end I decided to be pragmatic and always |=EFLAGS_IF on the vmexit path.

I _think_ this was the original reason I discovered the issue that I fixed with the short series I reposted at http://marc.info/?l=linux-kernel&m=130987084009107 (IOW I think kernel_eflags ended up with IF incorrect under pvops Xen, but it was a long time ago so perhaps I'm misremembering).

I also vaguely recall suspecting that the optimisation used in Xen's implementation of xen_save_fl and xen_save_fl_direct was implicated in IF getting turned off. That optimisation basically only guarantees that the bit at EFLAGS_IF is valid in the value returned (compared with native_save_fl, which returns a full set of EFLAGS). You can pretty easily (at the expense of the optimisation) make those hooks return the real eflags with ~evtchn_upcall_mask in the EFLAGS_IF bit.

At the time I sprinkled assertions around the guest kernel to help debug the issue, patch (against 2.6.32, so ancient) attached FWIW.

Ian.
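The two flavours of the save-flags hook described in this message can be contrasted in a small model. This is a sketch under assumptions, not the kernel's implementation: `struct vcpu_info_model`, `save_fl_if_only()` and `save_fl_full()` are hypothetical names, and the real flags value is passed in as a parameter rather than read from the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EFLAGS.IF is bit 9 on x86. */
#define X86_EFLAGS_IF 0x00000200u

/* Only the field needed for this illustration (assumption). */
struct vcpu_info_model {
    uint8_t evtchn_upcall_mask;
};

/* Optimised flavour: only the IF bit of the result is meaningful. */
static uint64_t save_fl_if_only(const struct vcpu_info_model *vi)
{
    assert(vi != NULL);
    return vi->evtchn_upcall_mask ? 0 : X86_EFLAGS_IF;
}

/*
 * Full flavour: keep every other bit of the real flags and place the
 * inverse of the event mask in the IF position.
 */
static uint64_t save_fl_full(const struct vcpu_info_model *vi,
                             uint64_t real_eflags)
{
    assert(vi != NULL);

    uint64_t flags = real_eflags & ~(uint64_t)X86_EFLAGS_IF;

    if (!vi->evtchn_upcall_mask)
        flags |= X86_EFLAGS_IF;

    return flags;
}
```

The full flavour costs an extra read of the real flags, which is the optimisation being traded away.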