Wei Liu
2011-Jun-23 07:21 UTC
[Xen-devel] [RFC] implement "trap-process-return"-like behavior with grant ring + evtchn
Hi, all

As you all know, I'm implementing pure-Xen virtio. I am mainly paying attention to its transport layer. In the HVM case, the transport layer is implemented as virtual PCI. Now I'm about to replace this layer with a grant ring + evtchn implementation.

In the HVM case with the virtual PCI transport layer, virtio works as follows:
1. guest writes to PCI cfg space and gets trapped by the hypervisor
2. backend dispatches and processes the request
3. return to guest
This is a *synchronous* communication.

However, evtchn is by design *asynchronous*, which means that if we write our requests to the ring and notify the other end, we have no idea when they will get processed.

This can cause race conditions. Say:

1. FE changes config: that is, it writes to configuration space and notifies the other end via evtchn;
2. FE immediately reads configuration space; the change may not yet have been acked or recorded by BE.
As a result, FE / BE states are inconsistent.

NOTE: Here by "configuration space" I mean the internal state of devices, not limited to PCI configuration space.

Stefano and IanC suggested that the FE spin-wait for an answer from BE. I have come up with a rough design and would like to ask you for advice.

This is how I would do it (see the sketch after this message for one possible concrete layout).

Initialization:
1. set up an evtchn (cfg-evtchn) and two grant rings (fe-to-be / be-to-fe).
2. zero out both rings.

FE
1. puts a request on the fe-to-be ring, then notifies BE via cfg-evtchn.
2. spin-waits for exactly one answer in the be-to-fe ring, otherwise BUG().
3. consumes that answer and resets the be-to-fe ring.

BE
1. gets notified.
2. checks that there is exactly one request in the ring, otherwise BUG().
3. consumes the request in the fe-to-be ring.
4. writes back the answer in the be-to-fe ring.

As you can see, cfg-evtchn is only used for fe-to-be notification. This allows BE to relax.

This is only a rough design; I haven't implemented it yet. If anyone has a better idea or spots any problem with this design, please let me know. Your advice is always welcome.

Wei.
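[A minimal sketch of the single-slot layout and the FE spin-wait step described above, written Linux-frontend style. All names (cfg_slot, cfg_shared_page, fe_cfg_request, cfg_irq) are illustrative assumptions for this example, not an existing Xen interface.]

#include <linux/kernel.h>
#include <xen/events.h>

struct cfg_slot {
    uint32_t pending;          /* 0 = empty, 1 = exactly one message present */
    uint32_t op;               /* e.g. config read / write */
    uint32_t offset;           /* offset into the device's config state */
    uint32_t len;
    uint8_t  data[64];
};

struct cfg_shared_page {       /* lives in the granted page(s) */
    struct cfg_slot fe_to_be;  /* requests */
    struct cfg_slot be_to_fe;  /* answers */
};

/* FE: post exactly one request, kick BE via cfg-evtchn, busy-wait for
 * exactly one answer, then reset the slots for the next transaction. */
static void fe_cfg_request(struct cfg_shared_page *shr, int cfg_irq,
                           const struct cfg_slot *req, struct cfg_slot *ans)
{
    BUG_ON(shr->fe_to_be.pending || shr->be_to_fe.pending);

    shr->fe_to_be = *req;
    wmb();                            /* request body visible before the flag */
    shr->fe_to_be.pending = 1;
    notify_remote_via_irq(cfg_irq);   /* cfg-evtchn bound to an irq */

    while (!ACCESS_ONCE(shr->be_to_fe.pending))
        cpu_relax();
    rmb();                            /* read the answer only after the flag */

    *ans = shr->be_to_fe;
    shr->be_to_fe.pending = 0;        /* reset be-to-fe ring */
    shr->fe_to_be.pending = 0;
}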
Ian Campbell
2011-Jun-23 07:42 UTC
[Xen-devel] Re: [RFC] implement "trap-process-return"-like behavior with grant ring + evtchn
On Thu, 2011-06-23 at 08:21 +0100, Wei Liu wrote:
> Hi, all
>
> As you all know, I'm implementing pure-Xen virtio. I am mainly paying
> attention to its transport layer. In the HVM case, the transport layer is
> implemented as virtual PCI. Now I'm about to replace this layer with a
> grant ring + evtchn implementation.
>
> In the HVM case with the virtual PCI transport layer, virtio works as follows:
> 1. guest writes to PCI cfg space and gets trapped by the hypervisor
> 2. backend dispatches and processes the request
> 3. return to guest
> This is a *synchronous* communication.
>
> However, evtchn is by design *asynchronous*, which means that if we write
> our requests to the ring and notify the other end, we have no idea
> when they will get processed.
>
> This can cause race conditions. Say:
>
> 1. FE changes config: that is, it writes to configuration space and
> notifies the other end via evtchn;
> 2. FE immediately reads configuration space; the change may not yet have
> been acked or recorded by BE.
> As a result, FE / BE states are inconsistent.
>
> NOTE: Here by "configuration space" I mean the internal state of
> devices, not limited to PCI configuration space.
>
> Stefano and IanC suggested that the FE spin-wait for an answer from BE.
> I have come up with a rough design and would like to ask you for advice.
>
> This is how I would do it.
>
> Initialization:
> 1. set up an evtchn (cfg-evtchn) and two grant rings (fe-to-be / be-to-fe).
> 2. zero out both rings.
>
> FE
> 1. puts a request on the fe-to-be ring, then notifies BE via cfg-evtchn.
> 2. spin-waits for exactly one answer in the be-to-fe ring, otherwise BUG().
> 3. consumes that answer and resets the be-to-fe ring.
>
> BE
> 1. gets notified.
> 2. checks that there is exactly one request in the ring, otherwise BUG().
> 3. consumes the request in the fe-to-be ring.
> 4. writes back the answer in the be-to-fe ring.
>
> As you can see, cfg-evtchn is only used for fe-to-be
> notification. This allows BE to relax.

You don't need to spin in the frontend (Stefano just suggested this was
the simplest possible thing to implement); really you want the backend
to notify you (via the same evtchn) when it has finished processing, and
arrange for the frontend to wait for that return signal before reading
the response.

If these CFG space operations are happening in a sleeping context (as
far as the guest kernel is concerned) then you can simply treat this
returning evtchn as an IRQ and use one of the kernel's
queuing/waiting/completion type primitives to sleep until your IRQ
handler wakes you up.

If you cannot sleep in the context these activities are happening in
then I think you can use the evtchn poll hypercall to block until the
evtchn is triggered -- this is no worse than the existing blocking
behaviour while the PCI CFG space accesses are trapped, it's just
explicit in the code instead of hidden behind a magic I/O instruction
emulation.

> This is only a rough design; I haven't implemented it yet. If anyone has
> a better idea or spots any problem with this design, please let me know.
> Your advice is always welcome.
>
> Wei.
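[A rough illustration of the sleeping-context option described above: bind the returning cfg-evtchn to an IRQ (e.g. with bind_evtchn_to_irqhandler()) and block on a completion until the handler fires. Only a sketch; cfg_done, cfg_evtchn_handler and fe_cfg_transaction are made-up names for this example.]

#include <linux/completion.h>
#include <linux/interrupt.h>
#include <xen/events.h>

static DECLARE_COMPLETION(cfg_done);

/* Handler for the returning cfg-evtchn, bound to an irq by the frontend. */
static irqreturn_t cfg_evtchn_handler(int irq, void *dev_id)
{
    complete(&cfg_done);              /* backend has written its answer */
    return IRQ_HANDLED;
}

/* One config transaction; must be called from a context that may sleep. */
static void fe_cfg_transaction(int cfg_irq)
{
    /* ... write the request into the fe-to-be ring here ... */

    INIT_COMPLETION(cfg_done);        /* re-arm (reinit_completion() on newer kernels) */
    notify_remote_via_irq(cfg_irq);   /* kick the backend */
    wait_for_completion(&cfg_done);   /* sleep until the IRQ handler runs */

    /* ... read the answer from the be-to-fe ring here ... */
}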
Wei Liu
2011-Jun-23 08:01 UTC
[Xen-devel] Re: [RFC] implement "trap-process-return"-like behavior with grant ring + evtchn
On Thu, Jun 23, 2011 at 3:42 PM, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> You don't need to spin in the frontend (Stefano just suggested this was
> the simplest possible thing to implement); really you want the backend
> to notify you (via the same evtchn) when it has finished processing, and
> arrange for the frontend to wait for that return signal before reading
> the response.
>
> If these CFG space operations are happening in a sleeping context (as
> far as the guest kernel is concerned) then you can simply treat this
> returning evtchn as an IRQ and use one of the kernel's
> queuing/waiting/completion type primitives to sleep until your IRQ
> handler wakes you up.

I would rather not sleep in the transport layer. I'm not sure whether it
will be called from a context that cannot sleep.

> If you cannot sleep in the context these activities are happening in
> then I think you can use the evtchn poll hypercall to block until the
> evtchn is triggered -- this is no worse than the existing blocking
> behaviour while the PCI CFG space accesses are trapped, it's just
> explicit in the code instead of hidden behind a magic I/O instruction
> emulation.

This evtchn poll hypercall seems like the right answer to me. I hadn't
noticed this hypercall before. I will investigate further.

Thank you.

Wei.
Wei Liu
2011-Jun-23 08:40 UTC
[Xen-devel] Re: [RFC] implement "trap-process-return"-like behavior with grant ring + evtchn
On Thu, Jun 23, 2011 at 3:42 PM, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> If you cannot sleep in the context these activities are happening in
> then I think you can use the evtchn poll hypercall to block until the
> evtchn is triggered

I looked through do_event_channel_op and didn't see a polling
operation. Which hypercall are you referring to?

Thanks.

Wei.
Ian Campbell
2011-Jun-23 08:57 UTC
[Xen-devel] Re: [RFC] implement "trap-process-return"-like behavior with grant ring + evtchn
On Thu, 2011-06-23 at 09:40 +0100, Wei Liu wrote:
> On Thu, Jun 23, 2011 at 3:42 PM, Ian Campbell
> <Ian.Campbell@eu.citrix.com> wrote:
> > If you cannot sleep in the context these activities are happening in
> > then I think you can use the evtchn poll hypercall to block until the
> > evtchn is triggered
>
> I looked through do_event_channel_op and didn't see a polling
> operation. Which hypercall are you referring to?

Oh, I forgot -- it's a schedop; see SCHEDOP_poll in
xen/include/public/sched.h.

Ian.
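[For reference, a minimal sketch of what polling that port could look like from a Linux guest, based on SCHEDOP_poll / struct sched_poll in xen/include/public/sched.h. The wrapper name wait_for_cfg_response is made up, and the caller would normally mask the event channel first (as the kernel's xen_poll_irq() does) so the pending bit is observed by the poll rather than delivered as an interrupt.]

#include <xen/interface/sched.h>
#include <xen/interface/event_channel.h>
#include <asm/xen/hypercall.h>

/* Block the calling vCPU until the given event channel is pending, or
 * return immediately if it already is. timeout == 0 means no timeout. */
static void wait_for_cfg_response(evtchn_port_t port)
{
    struct sched_poll poll;

    set_xen_guest_handle(poll.ports, &port);
    poll.nr_ports = 1;
    poll.timeout = 0;

    HYPERVISOR_sched_op(SCHEDOP_poll, &poll);
}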