On Mon, Jun 13, 2022 at 04:26:01PM +0800, Lai Jiangshan wrote:
> On Wed, Jun 8, 2022 at 10:48 PM Peter Zijlstra <peterz at infradead.org> wrote:
> >
> > Now that arch_cpu_idle() is expected to return with IRQs disabled,
> > avoid the useless STI/CLI dance.
> >
> > Per the specs this is supposed to work, but nobody has yet relied upon
> > this behaviour so broken implementations are possible.
>
> I'm totally newbie here.
>
> The point of safe_halt() is that STI must be used and be used
> directly before HLT to enable IRQ during the halting and stop
> the halting if there is any IRQ.

Correct; on real hardware. But this is virt...

> In TDX case, STI must be used directly before the hypercall.
> Otherwise, no IRQ can come and the vcpu would be stalled forever.
>
> Although the hypercall has an "irq_disabled" argument.
> But the hypervisor doesn't (and can't) touch the IRQ flags no matter
> what the "irq_disabled" argument is. The IRQ is not enabled during
> the halting if the IRQ is disabled before the hypercall even if
> irq_disabled=false.

All we need the VMM to do is wake the vCPU, and it can do that,
irrespective of the guest's IF.

So the VMM can (and does) know if there's an interrupt pending, and
that's all that's needed to wake from this hypercall. Once the vCPU is
back up and running again, we'll eventually set IF again and the pending
interrupt will get delivered and all's well.

Think of this like MWAIT with ECX[0] set if you will.
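
To make the ordering concrete, here is a toy C model of the sequence
described above. It is only a sketch and not kernel or VMM code: the
struct and every toy_* function are made up for illustration, and it
assumes the VMM tracks a single "interrupt pending" bit per vCPU. It
shows the wake-up depending only on that pending bit, with delivery
deferred until the guest sets IF.

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel or VMM API. */
struct toy_vcpu {
	bool guest_if;       /* guest's interrupt flag (IF) */
	bool irq_pending;    /* VMM side: an interrupt is queued for this vCPU */
	bool halted;         /* vCPU is blocked in the halt hypercall */
	int  irqs_delivered; /* interrupts actually taken by the guest */
};

/* Guest issues the halt hypercall; IF is left untouched (may be clear). */
static void toy_halt_hypercall(struct toy_vcpu *v)
{
	/* Assumption: the VMM only blocks the vCPU if nothing is pending. */
	v->halted = !v->irq_pending;
}

/* VMM queues an interrupt and wakes the vCPU, irrespective of guest IF. */
static void toy_vmm_raise_irq(struct toy_vcpu *v)
{
	v->irq_pending = true;
	v->halted = false;
}

/* Guest eventually sets IF; only now is the pending interrupt delivered. */
static void toy_guest_enable_irqs(struct toy_vcpu *v)
{
	v->guest_if = true;
	if (!v->halted && v->irq_pending) {
		v->irq_pending = false;
		v->irqs_delivered++;
	}
}

/* Halt with IF clear, get woken, then enable IRQs. Returns delivered count. */
static int toy_idle_scenario(void)
{
	struct toy_vcpu v = { 0 };

	toy_halt_hypercall(&v);   /* blocks with IF=0 */
	toy_vmm_raise_irq(&v);    /* wakes although IF=0 */
	if (v.halted || v.irqs_delivered != 0)
		return -1;        /* must be awake, nothing delivered yet */

	toy_guest_enable_irqs(&v);
	return v.irqs_delivered;
}
```

In this model toy_idle_scenario() goes through the halt with IF clear,
is woken by the pending interrupt, and takes it only after IF is set.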