Jason,

On Mon, Sep 13 2021 at 13:53, Jason Wang wrote:
> This patch tries to make sure the virtio interrupt handler for INTX
> won't be called after a reset and before virtio_device_ready(). We
> can't use IRQF_NO_AUTOEN since we're using shared interrupt
> (IRQF_SHARED). So this patch tracks the INTX enabling status in a new
> intx_soft_enabled variable and toggles it in
> vp_disable/enable_vectors(). The INTX interrupt handler will check
> intx_soft_enabled before processing the actual interrupt.

Ah, there it is :)

Cc'ed our memory ordering wizards as I might be wrong as usual.

> -	if (vp_dev->intx_enabled)
> +	if (vp_dev->intx_enabled) {
> +		vp_dev->intx_soft_enabled = false;
> +		/* ensure vp_interrupt sees this intx_soft_enabled value */
> +		smp_wmb();
>  		synchronize_irq(vp_dev->pci_dev->irq);

As you are synchronizing the interrupt here anyway, what is the value of
the barrier?

     vp_dev->intx_soft_enabled = false;
     synchronize_irq(vp_dev->pci_dev->irq);

is sufficient because of:

     synchronize_irq()
       do {
           raw_spin_lock(desc->lock);
           in_progress = check_inprogress(desc);
           raw_spin_unlock(desc->lock);
       } while (in_progress);

raw_spin_lock() has ACQUIRE semantics, so the store to intx_soft_enabled
can complete after the lock has been acquired, which is uninteresting.

raw_spin_unlock() has RELEASE semantics, so the store to intx_soft_enabled
has to be completed before the unlock completes.

So if the interrupt is in flight then it might or might not see
intx_soft_enabled == false. But that's true for your barrier construct
as well.

The important part is that any interrupt for this line arriving after
synchronize_irq() has completed is guaranteed to see
intx_soft_enabled == false.
That is what you want to achieve, right?

>  	for (i = 0; i < vp_dev->msix_vectors; ++i)
>  		disable_irq(pci_irq_vector(vp_dev->pci_dev, i));
> @@ -43,8 +47,12 @@ void vp_enable_vectors(struct virtio_device *vdev)
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>  	int i;
>
> -	if (vp_dev->intx_enabled)
> +	if (vp_dev->intx_enabled) {
> +		vp_dev->intx_soft_enabled = true;
> +		/* ensure vp_interrupt sees this intx_soft_enabled value */
> +		smp_wmb();

For the enable case the barrier is pointless vs. intx_soft_enabled:

    CPU 0                                CPU 1

                                         interrupt
    vp_enable_vectors()                    vp_interrupt()
                                             if (!vp_dev->intx_soft_enabled)
                                                 return IRQ_NONE;
      vp_dev->intx_soft_enabled = true;

IOW, the concurrent interrupt might or might not see the store. That's
not a problem for legacy PCI interrupts. If it did not see the store and
the interrupt originated from that device, then it will be accounted as
one spurious interrupt, which will get raised again because those
interrupts are level triggered and nothing acknowledged it at the device
level.

Now, what's more interesting is that it has to be guaranteed that the
interrupt which observes

     vp_dev->intx_soft_enabled == true

also observes all preceding stores, i.e. those which make the interrupt
handler capable of handling the interrupt.

That's the real problem, and for that your barrier is at the wrong place
because you want to make sure that those stores are visible before the
store to intx_soft_enabled becomes visible, i.e. this should be:

     /* Ensure that all preceding stores are visible before intx_soft_enabled */
     smp_wmb();
     vp_dev->intx_soft_enabled = true;

Now Michael is not really enthusiastic about the barrier in the interrupt
handler hotpath, which is understandable.
As the device startup is not really happening often, it's sensible to do
the following:

     disable_irq();
     vp_dev->intx_soft_enabled = true;
     enable_irq();

because:

     disable_irq()
       synchronize_irq()

acts as a barrier for the preceding stores:

     disable_irq()
       raw_spin_lock(desc->lock);
       __disable_irq(desc);
       raw_spin_unlock(desc->lock);

       synchronize_irq()
         do {
             raw_spin_lock(desc->lock);
             in_progress = check_inprogress(desc);
             raw_spin_unlock(desc->lock);
         } while (in_progress);

     intx_soft_enabled = true;

     enable_irq();

In this case synchronize_irq() prevents the subsequent store to
intx_soft_enabled from leaking into the __disable_irq(desc) section, which
in turn makes it impossible for an interrupt handler to observe
intx_soft_enabled == true before the prerequisites which precede the
call to disable_irq() are visible.

Of course the memory ordering wizards might disagree, but if they do,
then we have a massive chase of ordering problems vs. similar constructs
all over the tree ahead of us.

From the interrupt perspective the sequence:

     disable_irq();
     vp_dev->intx_soft_enabled = true;
     enable_irq();

is perfectly fine as well. Any interrupt arriving during the disabled
section will be reraised on enable_irq() in hardware because it's a
level interrupt. Any resulting failure is either a hardware or a
hypervisor bug.

Thanks,

        tglx
On Mon, Sep 13, 2021 at 11:36:24PM +0200, Thomas Gleixner wrote:
> From the interrupt perspective the sequence:
>
>     disable_irq();
>     vp_dev->intx_soft_enabled = true;
>     enable_irq();
>
> is perfectly fine as well. Any interrupt arriving during the disabled
> section will be reraised on enable_irq() in hardware because it's a
> level interrupt. Any resulting failure is either a hardware or a
> hypervisor bug.

Yes, but it's a shared interrupt. What happens if multiple callers do
this in parallel?
On 2021/9/14 05:36, Thomas Gleixner wrote:
> Jason,
>
> On Mon, Sep 13 2021 at 13:53, Jason Wang wrote:
>> This patch tries to make sure the virtio interrupt handler for INTX
>> won't be called after a reset and before virtio_device_ready(). We
>> can't use IRQF_NO_AUTOEN since we're using shared interrupt
>> (IRQF_SHARED). So this patch tracks the INTX enabling status in a new
>> intx_soft_enabled variable and toggles it in
>> vp_disable/enable_vectors(). The INTX interrupt handler will check
>> intx_soft_enabled before processing the actual interrupt.
>
> Ah, there it is :)
>
> Cc'ed our memory ordering wizards as I might be wrong as usual.
>
>> -	if (vp_dev->intx_enabled)
>> +	if (vp_dev->intx_enabled) {
>> +		vp_dev->intx_soft_enabled = false;
>> +		/* ensure vp_interrupt sees this intx_soft_enabled value */
>> +		smp_wmb();
>>  		synchronize_irq(vp_dev->pci_dev->irq);
>
> As you are synchronizing the interrupt here anyway, what is the value of
> the barrier?
>
>      vp_dev->intx_soft_enabled = false;
>      synchronize_irq(vp_dev->pci_dev->irq);
>
> is sufficient because of:
>
>      synchronize_irq()
>        do {
>            raw_spin_lock(desc->lock);
>            in_progress = check_inprogress(desc);
>            raw_spin_unlock(desc->lock);
>        } while (in_progress);
>
> raw_spin_lock() has ACQUIRE semantics, so the store to intx_soft_enabled
> can complete after the lock has been acquired, which is uninteresting.
>
> raw_spin_unlock() has RELEASE semantics, so the store to intx_soft_enabled
> has to be completed before the unlock completes.
>
> So if the interrupt is in flight then it might or might not see
> intx_soft_enabled == false. But that's true for your barrier construct
> as well.
>
> The important part is that any interrupt for this line arriving after
> synchronize_irq() has completed is guaranteed to see
> intx_soft_enabled == false.
> That is what you want to achieve, right?

Right.

>>  	for (i = 0; i < vp_dev->msix_vectors; ++i)
>>  		disable_irq(pci_irq_vector(vp_dev->pci_dev, i));
>> @@ -43,8 +47,12 @@ void vp_enable_vectors(struct virtio_device *vdev)
>>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>  	int i;
>>
>> -	if (vp_dev->intx_enabled)
>> +	if (vp_dev->intx_enabled) {
>> +		vp_dev->intx_soft_enabled = true;
>> +		/* ensure vp_interrupt sees this intx_soft_enabled value */
>> +		smp_wmb();
>
> For the enable case the barrier is pointless vs. intx_soft_enabled:
>
>     CPU 0                                CPU 1
>
>                                          interrupt
>     vp_enable_vectors()                    vp_interrupt()
>                                              if (!vp_dev->intx_soft_enabled)
>                                                  return IRQ_NONE;
>       vp_dev->intx_soft_enabled = true;
>
> IOW, the concurrent interrupt might or might not see the store. That's
> not a problem for legacy PCI interrupts. If it did not see the store and
> the interrupt originated from that device, then it will be accounted as
> one spurious interrupt, which will get raised again because those
> interrupts are level triggered and nothing acknowledged it at the device
> level.

I see.

> Now, what's more interesting is that it has to be guaranteed that the
> interrupt which observes
>
>      vp_dev->intx_soft_enabled == true
>
> also observes all preceding stores, i.e. those which make the interrupt
> handler capable of handling the interrupt.
>
> That's the real problem, and for that your barrier is at the wrong place
> because you want to make sure that those stores are visible before the
> store to intx_soft_enabled becomes visible, i.e. this should be:
>
>      /* Ensure that all preceding stores are visible before intx_soft_enabled */
>      smp_wmb();
>      vp_dev->intx_soft_enabled = true;

Yes, I see.

> Now Michael is not really enthusiastic about the barrier in the interrupt
> handler hotpath, which is understandable.
> As the device startup is not really happening often, it's sensible to do
> the following:
>
>      disable_irq();
>      vp_dev->intx_soft_enabled = true;
>      enable_irq();
>
> because:
>
>      disable_irq()
>        synchronize_irq()
>
> acts as a barrier for the preceding stores:
>
>      disable_irq()
>        raw_spin_lock(desc->lock);
>        __disable_irq(desc);
>        raw_spin_unlock(desc->lock);
>
>        synchronize_irq()
>          do {
>              raw_spin_lock(desc->lock);
>              in_progress = check_inprogress(desc);
>              raw_spin_unlock(desc->lock);
>          } while (in_progress);
>
>      intx_soft_enabled = true;
>
>      enable_irq();
>
> In this case synchronize_irq() prevents the subsequent store to
> intx_soft_enabled from leaking into the __disable_irq(desc) section, which
> in turn makes it impossible for an interrupt handler to observe
> intx_soft_enabled == true before the prerequisites which precede the
> call to disable_irq() are visible.
>
> Of course the memory ordering wizards might disagree, but if they do,
> then we have a massive chase of ordering problems vs. similar constructs
> all over the tree ahead of us.
>
> From the interrupt perspective the sequence:
>
>      disable_irq();
>      vp_dev->intx_soft_enabled = true;
>      enable_irq();
>
> is perfectly fine as well. Any interrupt arriving during the disabled
> section will be reraised on enable_irq() in hardware because it's a
> level interrupt. Any resulting failure is either a hardware or a
> hypervisor bug.

Thanks a lot for the detailed clarifications. Will switch to using
disable_irq()/enable_irq() if there is no objection from the memory
ordering wizards.

> Thanks,
>
>         tglx
On Mon, Sep 13, 2021 at 11:36:24PM +0200, Thomas Gleixner wrote:
[...]
> As the device startup is not really happening often, it's sensible to do
> the following:
>
>      disable_irq();
>      vp_dev->intx_soft_enabled = true;
>      enable_irq();
>
> because:
>
>      disable_irq()
>        synchronize_irq()
>
> acts as a barrier for the preceding stores:
>
>      disable_irq()
>        raw_spin_lock(desc->lock);
>        __disable_irq(desc);
>        raw_spin_unlock(desc->lock);
>
>        synchronize_irq()
>          do {
>              raw_spin_lock(desc->lock);
>              in_progress = check_inprogress(desc);
>              raw_spin_unlock(desc->lock);
>          } while (in_progress);
>
>      intx_soft_enabled = true;
>
>      enable_irq();
>
> In this case synchronize_irq() prevents the subsequent store to
> intx_soft_enabled from leaking into the __disable_irq(desc) section, which
> in turn makes it impossible for an interrupt handler to observe
> intx_soft_enabled == true before the prerequisites which precede the
> call to disable_irq() are visible.

Right. In our memory model, raw_spin_unlock(desc->lock) +
raw_spin_lock(desc->lock) provides the so-called RCtso ordering, that
is, for the following code:

	A
	...
	raw_spin_unlock(desc->lock);
	...
	raw_spin_lock(desc->lock);
	...
	B

memory accesses A and B will not be reordered unless A is a store and B
is a load. Such an ordering guarantee fulfils the requirement here.

For more information, see the LOCKING section of
tools/memory-model/Documentation/explanation.txt

Regards,
Boqun

> Of course the memory ordering wizards might disagree, but if they do,
> then we have a massive chase of ordering problems vs. similar constructs
> all over the tree ahead of us.
>
> From the interrupt perspective the sequence:
>
>      disable_irq();
>      vp_dev->intx_soft_enabled = true;
>      enable_irq();
>
> is perfectly fine as well. Any interrupt arriving during the disabled
> section will be reraised on enable_irq() in hardware because it's a
> level interrupt. Any resulting failure is either a hardware or a
> hypervisor bug.
>
> Thanks,
>
> tglx
On Mon, Sep 13, 2021 at 11:36:24PM +0200, Thomas Gleixner wrote:
> That's the real problem, and for that your barrier is at the wrong place
> because you want to make sure that those stores are visible before the
> store to intx_soft_enabled becomes visible, i.e. this should be:
>
>      /* Ensure that all preceding stores are visible before intx_soft_enabled */
>      smp_wmb();
>      vp_dev->intx_soft_enabled = true;

That arguably wants to be smp_store_release() instead of smp_wmb() :-)

> Now Michael is not really enthusiastic about the barrier in the interrupt
> handler hotpath, which is understandable.
>
> As the device startup is not really happening often, it's sensible to do
> the following:
>
>      disable_irq();
>      vp_dev->intx_soft_enabled = true;
>      enable_irq();
>
> because:
>
>      disable_irq()
>        synchronize_irq()
>
> acts as a barrier for the preceding stores:
>
>      disable_irq()
>        raw_spin_lock(desc->lock);
>        __disable_irq(desc);
>        raw_spin_unlock(desc->lock);
>
>        synchronize_irq()
>          do {
>              raw_spin_lock(desc->lock);
>              in_progress = check_inprogress(desc);
>              raw_spin_unlock(desc->lock);
>          } while (in_progress);

Here you rely on the UNLOCK+LOCK pattern because we have two adjacent
critical sections (or rather, the same one twice), which provides RCtso
ordering, which is sufficient to make the below store:

>      intx_soft_enabled = true;

a RELEASE. Still, I would suggest writing it at least using WRITE_ONCE()
with a comment on it:

	disable_irq();
	/*
	 * The above disable_irq() provides TSO ordering and as such
	 * promotes the below store to a store-release.
	 */
	WRITE_ONCE(intx_soft_enabled, true);
	enable_irq();

> In this case synchronize_irq() prevents the subsequent store to
> intx_soft_enabled from leaking into the __disable_irq(desc) section, which
> in turn makes it impossible for an interrupt handler to observe
> intx_soft_enabled == true before the prerequisites which precede the
> call to disable_irq() are visible.
> Of course the memory ordering wizards might disagree, but if they do,
> then we have a massive chase of ordering problems vs. similar constructs
> all over the tree ahead of us.

Your case, UNLOCK s + LOCK s, is fully documented to provide RCtso
ordering. The more general case, UNLOCK r + LOCK s, will shortly appear
in documentation near you. Meaning we can forget about the details and
state as a blanket rule that any UNLOCK followed by a LOCK (on the same
CPU) will provide TSO ordering.