I''m having the impression that this is significantly flawed at the moment, and due to the number of issues I don''t feel comfortable to submit a patch without clarification on at least some of the aspects: 1) The outermost conditional in do_iret() compares against VCPU_TRAP_NMI, which I would assume is a copy-and-paste mistake and should really be VCPU_TRAP_MCE. 2) While c/s 19422 handled Dom0 only (and introduced some hardcoded references to dom0), c/s 19871 relaxed the check to permit any domain, but apparently failed to clean up the dom0 references. 3) I don''t think the trap priority adjustment works properly at all, as there''s only provision for a single level of nesting (due to the old value being stored in the vcpu structure), but the entry.S code would happily nest an MCE inside an NMI. I think these rather need to be - just like in physical CPUs - individual flags that tell whether an exception of that kind is currently being processed. And I don''t think either should mask the other, the flags should just be used to prevent nested injection of the same type of exception (but there needs to be a way to tell which of the two was injected first, so the iret can clear the right flag). 4) The code in do_iret() doesn''t seem to be 64-bit specific at all, i.e. I''d think this should really be a common subroutine called from all three do_iret() handlers (perhaps even including the trap priority and affinity handling). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Thanks for your checking very much. See comments below please. Jan Beulich wrote:> I''m having the impression that this is significantly flawed at > the moment, > and due to the number of issues I don''t feel comfortable to submit a > patch without clarification on at least some of the aspects: > > 1) The outermost conditional in do_iret() compares against > VCPU_TRAP_NMI, which I would assume is a copy-and-paste mistake > and should really be VCPU_TRAP_MCE.Yes, it should be.> > 2) While c/s 19422 handled Dom0 only (and introduced some hardcoded > references to dom0), c/s 19871 relaxed the check to permit any domain, > but apparently failed to clean up the dom0 references. > > 3) I don''t think the trap priority adjustment works properly at all, > as there''s only provision for a single level of nesting (due to > the old value > being stored in the vcpu structure), but the entry.S code would > happily nest an MCE inside an NMI. I think these rather need to be - > just like in > physical CPUs - individual flags that tell whether an exception of > that kind is currently being processed. And I don''t think either > should mask the other, the flags should just be used to prevent > nested injection of the same type of exception (but there needs to be > a way to tell which of the two was injected first, so the iret can > clear the right flag).I remember Ke Liping and I discussed this before, but I forgot the conclusion. So I''d continue discussion here. I''m not sure if it''s always correct for the " don''t think either should mask the other". Followed is quote from Intel''s SDM section 6.9, PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND INTERRUPTS : "The processor first services a pending exception or interrupt from the class which has the highest priority, transferring execution to the first instruction of the handler. Lower priority exceptions are discarded; lower priority interrupts are held pending. Discarded exceptions are re-generated when the interrupt handler returns execution to the point in the program or task where the exceptions and/or interrupts occurred.". Because this section is talking about "more than one exception or interrupt is pending at an instruction boundary", so I''m not sure how nested will happen. Can you share me where you get the idea of nested? In fact, I suspect if we really need nested for these trap (currently only NMI and MCE). Do you know when will an NMI happen to a guest, does we support NMI watchdog to guest? How about simply killing the guest if any nest among these traps is ok if they are all rare situation.> > 4) The code in do_iret() doesn''t seem to be 64-bit specific at > all, i.e. I''d > think this should really be a common subroutine called from all three > do_iret() handlers (perhaps even including the trap priority > and affinity > handling).As stated above, I have no idea of NMI usage, but for MCE, I suspect if we need care about 32bit at all. --jyh> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 26.11.09 10:55 >>> >Jan Beulich wrote: >> 3) I don''t think the trap priority adjustment works properly at all, >> as there''s only provision for a single level of nesting (due to >> the old value >> being stored in the vcpu structure), but the entry.S code would >> happily nest an MCE inside an NMI. I think these rather need to be - >> just like in >> physical CPUs - individual flags that tell whether an exception of >> that kind is currently being processed. And I don''t think either >> should mask the other, the flags should just be used to prevent >> nested injection of the same type of exception (but there needs to be >> a way to tell which of the two was injected first, so the iret can >> clear the right flag). > >I remember Ke Liping and I discussed this before, but I forgot the conclusion. So I''d continue discussion here. > >I''m not sure if it''s always correct for the " don''t think either should mask the other". Followed is quote from Intel''s SDM >section 6.9, PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND INTERRUPTS : "The processor first services a >pending exception or interrupt from the class which has the highest priority, transferring execution to the first >instruction of the handler. Lower priority exceptions are discarded; lower priority interrupts are held pending. >Discarded exceptions are re-generated when the interrupt handler returns execution to the point in the program or >task where the exceptions and/or interrupts occurred.". Because this section is talking about "more than one exception >or interrupt is pending at an instruction boundary", so I''m not sure how nested will happen. Can you share me where >you get the idea of nested?No, I''m not talking about both exceptions happening simultaneously, but rather about one happening while the other is being serviced (i.e. before the corresponding HYPERVISOR_iret was executed).>In fact, I suspect if we really need nested for these trap (currently only NMI and MCE). Do you know when will an NMI happen >to a guest, does we support NMI watchdog to guest? How about simply killing the guest if any nest among these traps is ok if >they are all rare situation.Dom0 can get hardware generated NMIs, and any domain can get software injected ones. And no, killing the VM unless an unhandleable, catastrophic situation makes it necessary doesn''t seem like a good course of action.>> >> 4) The code in do_iret() doesn''t seem to be 64-bit specific at >> all, i.e. I''d >> think this should really be a common subroutine called from all three >> do_iret() handlers (perhaps even including the trap priority >> and affinity >> handling). > >As stated above, I have no idea of NMI usage, but for MCE, I suspect if we need care about 32bit at all.As long as it doesn''t require significant effort, I think 32-bit guests shouldn''t be made second class citizens (and even less if they''re running on 64-bit Xen). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich wrote:>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 26.11.09 10:55 >>> >> Jan Beulich wrote: >>> 3) I don''t think the trap priority adjustment works properly at all, >>> as there''s only provision for a single level of nesting (due to >>> the old value >>> being stored in the vcpu structure), but the entry.S code would >>> happily nest an MCE inside an NMI. I think these rather need to be >>> - just like in physical CPUs - individual flags that tell whether >>> an exception of that kind is currently being processed. And I don''t >>> think either should mask the other, the flags should just be used >>> to prevent nested injection of the same type of exception (but >>> there needs to be a way to tell which of the two was injected >>> first, so the iret can clear the right flag). >> >> I remember Ke Liping and I discussed this before, but I > forgot the conclusion. So I''d continue discussion here. >> >> I''m not sure if it''s always correct for the " don''t think > either should mask the other". Followed is quote from Intel''s SDM >> section 6.9, PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND > INTERRUPTS : "The processor first services a >> pending exception or interrupt from the class which has the > highest priority, transferring execution to the first >> instruction of the handler. Lower priority exceptions are > discarded; lower priority interrupts are held pending. >> Discarded exceptions are re-generated when the interrupt > handler returns execution to the point in the program or >> task where the exceptions and/or interrupts occurred.". > Because this section is talking about "more than one exception >> or interrupt is pending at an instruction boundary", so I''m > not sure how nested will happen. Can you share me where >> you get the idea of nested? > > No, I''m not talking about both exceptions happening simultaneously, > but rather about one happening while the other is being serviced (i.e. > before the corresponding HYPERVISOR_iret was executed). > >> In fact, I suspect if we really need nested for these trap > (currently only NMI and MCE). Do you know when will an NMI happen >> to a guest, does we support NMI watchdog to guest? How about > simply killing the guest if any nest among these traps is ok if >> they are all rare situation. > > Dom0 can get hardware generated NMIs, and any domain can get > software injected ones. > > And no, killing the VM unless an unhandleable, catastrophic situation > makes it necessary doesn''t seem like a good course of action.If NMI is used commonly, we of course should not kill them simply because the nesting.> >>> >>> 4) The code in do_iret() doesn''t seem to be 64-bit specific at >>> all, i.e. I''d >>> think this should really be a common subroutine called from all >>> three do_iret() handlers (perhaps even including the trap priority >>> and affinity >>> handling). >> >> As stated above, I have no idea of NMI usage, but for MCE, I > suspect if we need care about 32bit at all. > > As long as it doesn''t require significant effort, I think 32-bit > guests shouldn''t be made second class citizens (and even less if > they''re running on 64-bit Xen).Oops, mistaken that I thought it is xen hypervisor :( Yes, do_iret for 32 bit guest should be supported also. --jyh> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel