Ryan
2006-Apr-24 13:52 UTC
[Xen-devel] [PATCH] pciback: deferred handling of pci configuration space accesses
Some of the Linux PCI functions called by the virtual configuration space
handlers were making calls into ACPI code which uses semaphores. Since
semaphores cannot be acquired in atomic context (because acquiring them may
sleep), I changed the way the PCI backend responds to requests from the
frontend.

Previously, the virtual configuration space handlers ran in the same context
as the event channel interrupt handler (which was often, if not always,
atomic). Now the interrupt handler schedules a callback function (a bottom
half) in the system work queue (keventd) that will get called in process
context at a slightly later time. This allows the handlers in the virtual
configuration space to run in process context and to call any core PCI
function regardless of whether it will sleep or not.

This patch was previously submitted as part of a larger patch. There have
been no significant changes; I just renamed a variable and made this code
independent of the other patch.

Signed-off-by: Ryan Wilson <hap9@epoch.ncsc.mil>
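For readers unfamiliar with the pattern, the following is a minimal sketch of
the defer-to-keventd approach described above, using the pre-2.6.20 workqueue
API (three-argument INIT_WORK). The structure and names (pciback_device,
pciback_handle_event, and so on) are assumptions for illustration, not the
code in the actual patch.

    #include <linux/workqueue.h>
    #include <linux/interrupt.h>

    /* Hypothetical per-device state; field names are illustrative only. */
    struct pciback_device {
            struct work_struct op_work;  /* bottom half for config-space ops */
            /* ... shared request structure, event channel, etc. ... */
    };

    /*
     * Bottom half, run by keventd in process context, so the virtual
     * config-space handlers may call core PCI/ACPI code that sleeps.
     */
    static void pciback_handle_event(void *data)
    {
            /* data is the struct pciback_device passed to INIT_WORK() below.
             * Here we would read the op from shared memory, dispatch to the
             * config-space handlers, write back the result and notify the
             * frontend. */
    }

    /* Event-channel interrupt handler: just defer to the system work queue. */
    static irqreturn_t pciback_handle_event_irq(int irq, void *dev_id,
                                                struct pt_regs *regs)
    {
            struct pciback_device *pdev = dev_id;

            schedule_work(&pdev->op_work);  /* queued onto keventd */
            return IRQ_HANDLED;
    }

    /* During device setup: bind the work item to its handler and device. */
    static void pciback_init_op_work(struct pciback_device *pdev)
    {
            INIT_WORK(&pdev->op_work, pciback_handle_event, pdev);
    }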
Keir Fraser
2006-Apr-25 08:15 UTC
Re: [Xen-devel] [PATCH] pciback: deferred handling of pci configuration space accesses
On 24 Apr 2006, at 14:52, Ryan wrote:

> Previously, the virtual configuration space handlers ran in the same
> context as the event channel interrupt handler (which was often, if not
> always, atomic). Now the interrupt handler schedules a callback function
> (a bottom half) in the system work queue (keventd) that will get called
> in process context at a slightly later time. This allows the handlers in
> the virtual configuration space to run in process context and to call
> any core PCI function regardless of whether it will sleep or not.

This is okay in principle, but I found the op_in_progress counter rather
confusing and I'm not sure why it's needed. If it's to prevent a double
schedule_work() call on a single PCI request, then I'm not sure that it's
watertight. Does it need to be?

 -- Keir
Keir Fraser
2006-Apr-25 09:17 UTC
Re: [Xen-devel] [PATCH] pciback: deferred handling of pci configuration space accesses
On 25 Apr 2006, at 09:15, Keir Fraser wrote:

>> Previously, the virtual configuration space handlers ran in the same
>> context as the event channel interrupt handler (which was often, if not
>> always, atomic). Now the interrupt handler schedules a callback function
>> (a bottom half) in the system work queue (keventd) that will get called
>> in process context at a slightly later time. This allows the handlers in
>> the virtual configuration space to run in process context and to call
>> any core PCI function regardless of whether it will sleep or not.
>
> This is okay in principle, but I found the op_in_progress counter rather
> confusing and I'm not sure why it's needed. If it's to prevent a double
> schedule_work() call on a single PCI request, then I'm not sure that it's
> watertight. Does it need to be?

Let me be a bit more specific here: I think that if an interrupt is
delivered after the work function has incremented op_in_progress, but
before it clears _PCIF_active, then work can be scheduled erroneously
because the IRQ handler will see atomic_dec_and_test() return TRUE.

If serialised execution of PCI requests is important, and it looks like it
is, I think the simplest solution is simply to create your own
single-threaded workqueue and queue_work() onto that. Personally I get
worried about using the shared workqueues anyway, as they're another shared
resource to deadlock on.

 -- Keir
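For comparison, here is a minimal sketch of the dedicated single-threaded
workqueue Keir suggests, again using the pre-2.6.20 API. The queue name and
the pciback_device/op_work fields are assumptions carried over from the
earlier sketch, not code from the driver.

    #include <linux/workqueue.h>
    #include <linux/errno.h>

    /* Hypothetical per-device state, as in the earlier sketch. */
    struct pciback_device {
            struct work_struct op_work;
    };

    /* Driver-private queue backed by a single kernel thread. */
    static struct workqueue_struct *pciback_wq;

    static int pciback_wq_init(void)
    {
            /* Work queued here runs strictly one item at a time, in order. */
            pciback_wq = create_singlethread_workqueue("pciback");
            return pciback_wq ? 0 : -ENOMEM;
    }

    /* In the event-channel interrupt handler, instead of schedule_work(): */
    static void pciback_kick(struct pciback_device *pdev)
    {
            queue_work(pciback_wq, &pdev->op_work);
    }

    static void pciback_wq_exit(void)
    {
            flush_workqueue(pciback_wq);
            destroy_workqueue(pciback_wq);
    }

The trade-off is one extra kernel thread in exchange for serialised request
handling and isolation from whatever else keventd happens to be running.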
Ryan
2006-Apr-25 13:32 UTC
Re: [Xen-devel] [PATCH] pciback: deferred handling of pci configuration space accesses
On Tue, 2006-04-25 at 10:17 +0100, Keir Fraser wrote:

> On 25 Apr 2006, at 09:15, Keir Fraser wrote:
>
> > This is okay in principle, but I found the op_in_progress counter rather
> > confusing and I'm not sure why it's needed. If it's to prevent a double
> > schedule_work() call on a single PCI request, then I'm not sure that
> > it's watertight. Does it need to be?
>
> Let me be a bit more specific here: I think that if an interrupt is
> delivered after the work function has incremented op_in_progress, but
> before it clears _PCIF_active, then work can be scheduled erroneously
> because the IRQ handler will see atomic_dec_and_test() return TRUE.

Maybe you're seeing something that I'm missing, so let me explain what I
think about this section of code. Please point out where you think I'm
mistaken or have made an incorrect assumption.

I added the counter to act as a lock, to guarantee that requests from the
frontend complete one at a time. This way, the virtual configuration space
handlers are guaranteed never to run concurrently on the same device, and
they never should, because there is only one op structure in the shared
memory.

When I do the atomic_inc after completing the configuration space access, I
don't believe it matters if the frontend triggers an interrupt again (before
_PCIF_active is cleared). In that case, the backend would process another
request and overwrite the return value from the previous op. I don't think
work would be scheduled erroneously here because, by triggering an interrupt
with the _PCIF_active bit still set, the frontend is essentially placing a
second request (and is only affecting itself and its device by having the
request executed again).

I can't put the atomic_inc after I clear the _PCIF_active bit, as then the
frontend could immediately turn around and send a second legitimate request
that would fail because op_in_progress may not have been decremented yet.

> If serialised execution of PCI requests is important, and it looks like it
> is, I think the simplest solution is simply to create your own
> single-threaded workqueue and queue_work() onto that. Personally I get
> worried about using the shared workqueues anyway, as they're another
> shared resource to deadlock on.

Can you give me an example of how you think I could be deadlocking on the
workqueue? I believe I'm interacting with it correctly, and a quick glance
at the kernel code for workqueues shows only a single lock per workqueue.
Granted, other driver or kernel code running in the workqueue could deadlock
that thread, but I don't see how this is any different from other kernel
bugs that could take down the kernel (I hope that someday the PCI backend
will run isolated in its own driver domain).

It seemed like overkill to me to create my own work queue and kernel thread
that would sit and wait for PCI configuration space requests when they will
most likely be few and far between for most devices.

Ryan
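To make the scheme under discussion concrete, here is a rough reconstruction
of the counter-based logic as it reads from the thread. It is a guess at the
shape of the code, not the actual patch: the flag bit number, the sh_flags
plumbing and the function names are invented for illustration. The counter
starts at 1 (idle); the IRQ handler claims the pending request by
decrementing it to 0, and the work function increments it back after
servicing the request.

    #include <linux/workqueue.h>
    #include <linux/interrupt.h>
    #include <asm/atomic.h>
    #include <asm/bitops.h>

    #define _XEN_PCIF_active 0              /* illustrative bit number */

    struct pciback_device {
            struct work_struct op_work;
            atomic_t op_in_progress;        /* 1 = idle, 0 = request being handled */
            unsigned long *sh_flags;        /* stand-in for the shared-page flags word */
    };

    static irqreturn_t pciback_handle_event_irq(int irq, void *dev_id,
                                                struct pt_regs *regs)
    {
            struct pciback_device *pdev = dev_id;

            /* Schedule only if a request is pending and nothing is in flight. */
            if (test_bit(_XEN_PCIF_active, pdev->sh_flags) &&
                atomic_dec_and_test(&pdev->op_in_progress))
                    schedule_work(&pdev->op_work);

            return IRQ_HANDLED;
    }

    static void pciback_handle_event(void *data)
    {
            struct pciback_device *pdev = data;

            /* ... perform the config-space access and fill in the result ... */

            atomic_inc(&pdev->op_in_progress);      /* back to "idle" */
            /*
             * The window Keir points at: an interrupt landing here sees the
             * counter back at 1 while _XEN_PCIF_active is still set, so the
             * same request can be queued and executed a second time.
             */
            clear_bit(_XEN_PCIF_active, pdev->sh_flags);
            /* ... notify the frontend via the event channel ... */
    }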
Keir Fraser
2006-Apr-25 14:09 UTC
Re: [Xen-devel] [PATCH] pciback: deferred handling of pci configuration space accesses
On 25 Apr 2006, at 14:32, Ryan wrote:

> I can't put the atomic_inc after I clear the _PCIF_active bit, as then the
> frontend could immediately turn around and send a second legitimate
> request that would fail because op_in_progress may not have been
> decremented yet.

Okay, I think I see what you mean. But I think the code would be clearer
*and* robust against spurious interrupts (event-channel callbacks) by doing
the following:

 * turn op_in_progress into a proper boolean, but with type 'unsigned long'.

 * in the IRQ handler do:

       if (test_bit(_XEN_PCIF_active, &pdev->...) &&
           !test_and_set_bit(0, &pdev->op_in_progress))
               schedule_work(...);

 * at the very end of pciback_handle_event() do:

       smp_wmb();   /* finish the response /then/ flag the IRQ handler */
       pdev->op_in_progress = 0;
       smp_mb();    /* flag the IRQ handler /then/ check for more work */
       if (test_bit(_XEN_PCIF_active, &pdev->...) &&
           !test_and_set_bit(0, &pdev->op_in_progress))
               schedule_work(...);

I think this is better than the current implementation of op_in_progress as
a counter with zero meaning in-progress, as it gives clear one-shot-setting
semantics for op_in_progress for each queued PCI request.

> > If serialised execution of PCI requests is important, and it looks like
> > it is, I think the simplest solution is simply to create your own
> > single-threaded workqueue and queue_work() onto that. Personally I get
> > worried about using the shared workqueues anyway, as they're another
> > shared resource to deadlock on.
>
> Can you give me an example of how you think I could be deadlocking on the
> workqueue?

No, it was more an aside really. I think the keventd workqueues are totally
fine in this case, but I have achieved deadlock loops involving the shared
workqueues in my own code in the past (arguably due to stupid use of
semaphores, though).

 -- Keir
Ryan
2006-Apr-25 17:02 UTC
Re: [Xen-devel] [PATCH] pciback: deferred handling of pci configuration space accesses
On Tue, 2006-04-25 at 15:09 +0100, Keir Fraser wrote:

> On 25 Apr 2006, at 14:32, Ryan wrote:
>
> > I can't put the atomic_inc after I clear the _PCIF_active bit, as then
> > the frontend could immediately turn around and send a second legitimate
> > request that would fail because op_in_progress may not have been
> > decremented yet.
>
> Okay, I think I see what you mean. But I think the code would be clearer
> *and* robust against spurious interrupts (event-channel callbacks) by
> doing the following:
>
>  * turn op_in_progress into a proper boolean, but with type 'unsigned
>    long'.
>
>  * in the IRQ handler do:
>
>        if (test_bit(_XEN_PCIF_active, &pdev->...) &&
>            !test_and_set_bit(0, &pdev->op_in_progress))
>                schedule_work(...);
>
>  * at the very end of pciback_handle_event() do:
>
>        smp_wmb();   /* finish the response /then/ flag the IRQ handler */
>        pdev->op_in_progress = 0;
>        smp_mb();    /* flag the IRQ handler /then/ check for more work */
>        if (test_bit(_XEN_PCIF_active, &pdev->...) &&
>            !test_and_set_bit(0, &pdev->op_in_progress))
>                schedule_work(...);
>
> I think this is better than the current implementation of op_in_progress
> as a counter with zero meaning in-progress, as it gives clear
> one-shot-setting semantics for op_in_progress for each queued PCI request.

I agree: setting and clearing a bit is definitely clearer than the atomic
type I was using. I've fixed that in the attached patch.

I also see how your method is an improvement over mine in that the
_XEN_PCIF_active flag isn't cleared while the backend is still processing.
I was willing to accept that as a consequence of the frontend misbehaving,
but your method is more robust and guarantees that the request will only run
once and that the flag will be set and cleared appropriately.

I've fixed up my code and am attaching a patch. I also added a comment about
the need to defer (to avoid sleeping in atomic context).

--
Some of the Linux PCI functions called by the virtual configuration space
handlers were making calls into ACPI code which uses semaphores. Since
semaphores cannot be acquired in atomic context (because acquiring them may
sleep), I changed the way the PCI backend responds to requests from the
frontend.

Previously, the virtual configuration space handlers ran in the same context
as the event channel interrupt handler (which was often, if not always,
atomic). Now the interrupt handler schedules a callback function (a bottom
half) in the system work queue (keventd) that will get called in process
context at a slightly later time. This allows the handlers in the virtual
configuration space to run in process context and to call any core PCI
function regardless of whether it will sleep or not.

Signed-off-by: Ryan Wilson <hap9@epoch.ncsc.mil>
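For completeness, here is a sketch of how the revised handlers might fit
together once the bit-based scheme above is folded in. It is assembled from
the snippets in this thread for illustration; the pciback_device fields, the
flag bit number and the helper names are assumptions, not the code in the
patch that was actually attached.

    #include <linux/workqueue.h>
    #include <linux/interrupt.h>
    #include <asm/bitops.h>
    #include <asm/system.h>             /* smp_wmb()/smp_mb() on 2.6-era kernels */

    #define _XEN_PCIF_active 0          /* illustrative bit number */

    struct pciback_device {
            struct work_struct op_work;
            unsigned long op_in_progress;   /* bit 0 set while a request is queued/running */
            unsigned long *sh_flags;        /* stand-in for the shared-page flags word */
    };

    /* One-shot: only the caller that flips op_in_progress 0->1 queues the work. */
    static void pciback_check_and_schedule(struct pciback_device *pdev)
    {
            if (test_bit(_XEN_PCIF_active, pdev->sh_flags) &&
                !test_and_set_bit(0, &pdev->op_in_progress))
                    schedule_work(&pdev->op_work);
    }

    static irqreturn_t pciback_handle_event_irq(int irq, void *dev_id,
                                                struct pt_regs *regs)
    {
            pciback_check_and_schedule(dev_id);
            return IRQ_HANDLED;
    }

    static void pciback_handle_event(void *data)
    {
            struct pciback_device *pdev = data;

            /*
             * The config-space access runs here, in process context, because
             * the core PCI/ACPI code it calls may sleep; that is the reason
             * the work is deferred out of the interrupt handler at all.
             */
            /* ... service the request and respond to the frontend ... */

            smp_wmb();  /* finish the response /then/ flag the IRQ handler */
            pdev->op_in_progress = 0;
            smp_mb();   /* flag the IRQ handler /then/ check for more work */
            pciback_check_and_schedule(pdev);
    }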