Hello Simon,

I took the source as per your message: http://marc.info/?l=xen-devel&m=124748015304566&w=4

compiled and ran it on an Intel-DQ35JO, Fedora-10.

When I try to pass a pci device through at boot time in the configuration file, there's a race between xend and qemu accessing xenstore.

Xend waits in signalDeviceModel(...) for qemu to declare 'running', then writes the devices to be passed through to the dm-command pipe.

On the qemu side, it sets a watch on /local/domain/0/device-model/2/command and expects the dm-command from there, by calling xs_watch(...). xs_watch(...) causes xenstored to run do_watch(...) and, at the end, run add_event(...) with the following comment:
/* We fire once up front: simplifies clients and restart. */

The problem shows when xend is faster: it detects qemu's 'running' state, calls xstransact.Store and writes to the command pipe before qemu can call main_loop_wait(...) and run one empty loop on the command pipe. This write causes xenstored to run a fires_watch, thus another add_event(...).
The problem shows in the qemu log as an extra dm-command, which uses the wrong parameter and fails to initialize, for instance:

...
xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
read_message: msg type reply pci-ins
dm-command: hot insert pass-through pci dev
read_message: msg type reply 0000:00:1b.0@100
register_real_device: Assigning real physical device 00:1b.0 ...
pt_register_regions: IO region registered (size=0x00004000 base_addr=0x90420004)
pt_msi_setup: msi mapped with pirq ff
register_real_device: Real physical device 00:1b.0 registered successfuly!
IRQ type = MSI-INTx
read_message: msg type reply OK
read_message: msg type reply OK
xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
read_message: msg type reply pci-ins
dm-command: hot insert pass-through pci dev
read_message: msg type reply 0x20
hot add pci devfn -1 exceed.
read_message: msg type reply OK
...

On the xend side:

...
(bdf_str, vdevfn))
VmError: Cannot pass-through PCI function '0000:00:1b.0@100'. Device model reported an error: no free hotplug devfn
[2009-10-13 10:45:10 4174] ERROR (XendDomainInfo:471) VM start failed
Traceback (most recent call last):
...

Thank you.
Phung-Te

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
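
The race described in this message can be reproduced in a toy model. The Python sketch below is not xenstored or libxenstore code: `ToyStore`, the `parameter` node name and the `drain` handler are hypothetical stand-ins. The step where the handler overwrites the parameter with `0x20` is an assumption, made only to mirror the second `read_message` line in the log above.

```python
from collections import deque

CMD = "/local/domain/0/device-model/3/command"
PARAM = "/local/domain/0/device-model/3/parameter"  # hypothetical node name


class ToyStore:
    """In-memory stand-in for xenstore; for illustration only."""

    def __init__(self):
        self.nodes = {}
        self.watchers = {}  # watched path -> queue of pending events

    def watch(self, path):
        # Like the quoted do_watch() comment: fire once up front.
        events = deque([path])
        self.watchers[path] = events
        return events

    def write(self, path, value):
        self.nodes[path] = value
        if path in self.watchers:
            self.watchers[path].append(path)  # a write fires the watch

    def read(self, path):
        return self.nodes.get(path)


def drain(store, events):
    """Naive device model: treats every watch event as a new command."""
    executed = []
    while events:
        events.popleft()
        command = store.read(CMD)
        if command is None:
            continue  # the "empty loop" on the initial event
        executed.append((command, store.read(PARAM)))
        # Assumption: a successful insert leaves a slot number behind.
        store.write(PARAM, "0x20")
    return executed


def simulate(xend_is_faster):
    store = ToyStore()
    events = store.watch(CMD)  # qemu registers its watch
    executed = []
    if not xend_is_faster:
        # qemu gets to consume the initial event before xend writes.
        executed += drain(store, events)
    store.write(PARAM, "0000:00:1b.0@100")
    store.write(CMD, "pci-ins")
    executed += drain(store, events)
    return executed


if __name__ == "__main__":
    print(simulate(xend_is_faster=False))
    print(simulate(xend_is_faster=True))
```

When qemu drains the initial event first, the command runs once. When xend writes first, two events are pending against the same command node, so the toy handler runs `pci-ins` twice and the second run picks up the stale parameter.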
Stefano Stabellini
2009-Oct-14 13:34 UTC
Re: [Xen-devel] pci device hotplug, race accessing xenstore
On Tue, 13 Oct 2009, Phung Te Ha wrote:
> Hello Simon,
>
> I took the source as per your message: http://marc.info/?l=xen-devel&m=124748015304566&w=4
>
> compiled and ran it on an Intel-DQ35JO, Fedora-10.
>
> When I try to pass a pci device through at boot time in the configuration file, there's a race between xend and qemu accessing
> xenstore.
>
> Xend waits in signalDeviceModel(...) for qemu to declare 'running', then writes the devices to be passed through to the
> dm-command pipe.
>
> On the qemu side, it sets a watch on /local/domain/0/device-model/2/command and expects the dm-command from there, by
> calling xs_watch(...). xs_watch(...) causes xenstored to run do_watch(...) and, at the end, run add_event(...) with the
> following comment:
> /* We fire once up front: simplifies clients and restart. */
>
>
> The problem shows when xend is faster: it detects qemu's 'running' state, calls xstransact.Store and writes to the
> command pipe before qemu can call main_loop_wait(...) and run one empty loop on the command pipe. This write causes
> xenstored to run a fires_watch, thus another add_event(...).
> The problem shows in the qemu log as an extra dm-command, which uses the wrong parameter and fails to initialize, for instance:
>
> ...
> xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> read_message: msg type reply pci-ins
> dm-command: hot insert pass-through pci dev
> read_message: msg type reply 0000:00:1b.0@100
> register_real_device: Assigning real physical device 00:1b.0 ...
> pt_register_regions: IO region registered (size=0x00004000 base_addr=0x90420004)
> pt_msi_setup: msi mapped with pirq ff
> register_real_device: Real physical device 00:1b.0 registered successfuly!
> IRQ type = MSI-INTx
> read_message: msg type reply OK
> read_message: msg type reply OK
> xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> read_message: msg type reply pci-ins
> dm-command: hot insert pass-through pci dev
> read_message: msg type reply 0x20
> hot add pci devfn -1 exceed.
> read_message: msg type reply OK
> ...
>
> On the xend side:
>
> ...
> (bdf_str, vdevfn))
> VmError: Cannot pass-through PCI function '0000:00:1b.0@100'. Device model reported an error: no free hotplug devfn
> [2009-10-13 10:45:10 4174] ERROR (XendDomainInfo:471) VM start failed
> Traceback (most recent call last):
> ...
>
>

I think we should take this chance to make the pci-insert protocol more reliable.
In particular we are missing the following things:

- qemu shouldn't accept any dm-command unless it is in state "running";

- xend should remove the command node on xenstore after reading state "pci-inserted" and before writing state "running" again.

This way when the second xenstore watch fires the pci-ins command is never executed for a second time because either qemu is not in the right state (pci-inserted instead of running) or the command node doesn't contain any data (it has been removed by xend).

Another problem is that nothing else can happen while xend waits for the device model to be in state running; this also prevents pci coldplug from working with stubdoms.
Is it possible to run signalDeviceModel in a new xend Thread?
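
The two rules proposed in this reply can be checked against a toy model. This is a minimal sketch under stated assumptions, not the xend or qemu implementation: `ToyStore`, `handle_event`, `xend_insert`, `xend_finish` and the `state` and `parameter` node names are all hypothetical, and the model assumes that removing a watched node also fires the watch.

```python
from collections import deque

BASE = "/local/domain/0/device-model/3"
CMD = BASE + "/command"
PARAM = BASE + "/parameter"  # hypothetical node name
STATE = BASE + "/state"      # hypothetical node name


class ToyStore:
    """In-memory stand-in for xenstore with a single watch."""

    def __init__(self):
        self.nodes = {}
        self.events = deque()
        self.watched = None

    def watch(self, path):
        self.watched = path
        self.events.append(path)  # fire once up front

    def write(self, path, value):
        self.nodes[path] = value
        if path == self.watched:
            self.events.append(path)

    def remove(self, path):
        self.nodes.pop(path, None)
        if path == self.watched:
            self.events.append(path)  # assumed: removal fires the watch

    def read(self, path):
        return self.nodes.get(path)


def handle_event(store, executed):
    """Device-model side: consume one watch event."""
    store.events.popleft()
    if store.read(STATE) != "running":
        return  # rule 1: no dm-command unless in state "running"
    command = store.read(CMD)
    if command is None:
        return  # rule 2 took effect: xend removed the command node
    executed.append((command, store.read(PARAM)))
    store.write(STATE, "pci-inserted")


def xend_insert(store, bdf):
    store.write(PARAM, bdf)
    store.write(CMD, "pci-ins")


def xend_finish(store):
    # Rule 2: remove the command node before writing "running" again.
    assert store.read(STATE) == "pci-inserted"
    store.remove(CMD)
    store.write(STATE, "running")


def simulate(finish_before_second_event):
    store = ToyStore()
    store.write(STATE, "running")
    store.watch(CMD)
    xend_insert(store, "0000:00:1b.0@100")  # xend wins the race
    executed = []
    handle_event(store, executed)  # first event runs pci-ins
    if finish_before_second_event:
        xend_finish(store)
    while store.events:
        handle_event(store, executed)
    if not finish_before_second_event:
        xend_finish(store)
        while store.events:
            handle_event(store, executed)
    return executed
```

In both orderings the duplicate event is dropped: either the state is still `pci-inserted`, or the state is `running` again but the command node is gone. Either rule alone leaves one of the two orderings uncovered, which is why the reply asks for both.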
Simon Horman
2009-Oct-14 22:49 UTC
Re: [Xen-devel] pci device hotplug, race accessing xenstore
On Wed, Oct 14, 2009 at 02:34:35PM +0100, Stefano Stabellini wrote:
> On Tue, 13 Oct 2009, Phung Te Ha wrote:
> > Hello Simon,
> >
> > I took the source as per your message: http://marc.info/?l=xen-devel&m=124748015304566&w=4
> >
> > compiled and ran it on an Intel-DQ35JO, Fedora-10.
> >
> > When I try to pass a pci device through at boot time in the configuration file, there's a race between xend and qemu accessing
> > xenstore.
> >
> > Xend waits in signalDeviceModel(...) for qemu to declare 'running', then writes the devices to be passed through to the
> > dm-command pipe.
> >
> > On the qemu side, it sets a watch on /local/domain/0/device-model/2/command and expects the dm-command from there, by
> > calling xs_watch(...). xs_watch(...) causes xenstored to run do_watch(...) and, at the end, run add_event(...) with the
> > following comment:
> > /* We fire once up front: simplifies clients and restart. */
> >
> >
> > The problem shows when xend is faster: it detects qemu's 'running' state, calls xstransact.Store and writes to the
> > command pipe before qemu can call main_loop_wait(...) and run one empty loop on the command pipe. This write causes
> > xenstored to run a fires_watch, thus another add_event(...).
> > The problem shows in the qemu log as an extra dm-command, which uses the wrong parameter and fails to initialize, for instance:
> >
> > ...
> > xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> > read_message: msg type reply pci-ins
> > dm-command: hot insert pass-through pci dev
> > read_message: msg type reply 0000:00:1b.0@100
> > register_real_device: Assigning real physical device 00:1b.0 ...
> > pt_register_regions: IO region registered (size=0x00004000 base_addr=0x90420004)
> > pt_msi_setup: msi mapped with pirq ff
> > register_real_device: Real physical device 00:1b.0 registered successfuly!
> > IRQ type = MSI-INTx
> > read_message: msg type reply OK
> > read_message: msg type reply OK
> > xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> > read_message: msg type reply pci-ins
> > dm-command: hot insert pass-through pci dev
> > read_message: msg type reply 0x20
> > hot add pci devfn -1 exceed.
> > read_message: msg type reply OK
> > ...
> >
> > On the xend side:
> >
> > ...
> > (bdf_str, vdevfn))
> > VmError: Cannot pass-through PCI function '0000:00:1b.0@100'. Device model reported an error: no free hotplug devfn
> > [2009-10-13 10:45:10 4174] ERROR (XendDomainInfo:471) VM start failed
> > Traceback (most recent call last):
> > ...
> >
> >
>
> I think we should take this chance to make the pci-insert protocol more
> reliable.
> In particular we are missing the following things:
>
> - qemu shouldn't accept any dm-command unless it is in state "running";
>
> - xend should remove the command node on xenstore after reading
> state "pci-inserted" and before writing state "running" again.
>
> This way when the second xenstore watch fires the pci-ins command is
> never executed for a second time because either qemu is not in the right
> state (pci-inserted instead of running) or the command node doesn't
> contain any data (it has been removed by xend).

My memory of that code is a bit hazy, but that sounds like a good idea.

> Another problem is that nothing else can happen while xend waits for the
> device model to be in state running; this also prevents pci coldplug
> from working with stubdoms.
> Is it possible to run signalDeviceModel in a new xend Thread?

I'm interested to hear a comment on what the status of the OCaml replacement for xend is. It seems silly to spend time fixing up the python code - there is ample scope for fixing - if a replacement is in the wings. In particular, I'm referring to the toolstack.git XCI tree.
Stefano Stabellini
2009-Oct-16 12:37 UTC
Re: [Xen-devel] pci device hotplug, race accessing xenstore
On Wed, 14 Oct 2009, Simon Horman wrote:
> >
> > I think we should take this chance to make the pci-insert protocol more
> > reliable.
> > In particular we are missing the following things:
> >
> > - qemu shouldn't accept any dm-command unless it is in state "running";
> >
> > - xend should remove the command node on xenstore after reading
> > state "pci-inserted" and before writing state "running" again.
> >
> > This way when the second xenstore watch fires the pci-ins command is
> > never executed for a second time because either qemu is not in the right
> > state (pci-inserted instead of running) or the command node doesn't
> > contain any data (it has been removed by xend).
>
> My memory of that code is a bit hazy, but that sounds like a good idea.

Do you think you'll find the time to fix the first two issues, or should someone else do it?
I should be able to find a solution for the stubdom coldplug problem myself.

> > Another problem is that nothing else can happen while xend waits for the
> > device model to be in state running; this also prevents pci coldplug
> > from working with stubdoms.
> > Is it possible to run signalDeviceModel in a new xend Thread?
>
> I'm interested to hear a comment on what the status of the OCaml
> replacement for xend is. It seems silly to spend time fixing up the
> python code - there is ample scope for fixing - if a replacement
> is in the wings. In particular, I'm referring to the toolstack.git
> XCI tree.
>
>

Surely no replacement is going to be ready for the 3.5 release, and I think if anything is going to happen it will probably be a smooth transition rather than an abrupt change.
Simon Horman
2009-Oct-16 13:06 UTC
Re: [Xen-devel] pci device hotplug, race accessing xenstore
On Fri, Oct 16, 2009 at 01:37:11PM +0100, Stefano Stabellini wrote:
> On Wed, 14 Oct 2009, Simon Horman wrote:
> > >
> > > I think we should take this chance to make the pci-insert protocol more
> > > reliable.
> > > In particular we are missing the following things:
> > >
> > > - qemu shouldn't accept any dm-command unless it is in state "running";
> > >
> > > - xend should remove the command node on xenstore after reading
> > > state "pci-inserted" and before writing state "running" again.
> > >
> > > This way when the second xenstore watch fires the pci-ins command is
> > > never executed for a second time because either qemu is not in the right
> > > state (pci-inserted instead of running) or the command node doesn't
> > > contain any data (it has been removed by xend).
> >
> > My memory of that code is a bit hazy, but that sounds like a good idea.
>
> Do you think you'll find the time to fix the first two issues, or
> should someone else do it?

I will try and find some time to address these issues. Though I am travelling for the next ~2 weeks, so it's unlikely to be during that time.

> I should be able to find a solution for the stubdom coldplug problem
> myself.
>
> > > Another problem is that nothing else can happen while xend waits for the
> > > device model to be in state running; this also prevents pci coldplug
> > > from working with stubdoms.
> > > Is it possible to run signalDeviceModel in a new xend Thread?
> >
> > I'm interested to hear a comment on what the status of the OCaml
> > replacement for xend is. It seems silly to spend time fixing up the
> > python code - there is ample scope for fixing - if a replacement
> > is in the wings. In particular, I'm referring to the toolstack.git
> > XCI tree.
> >
> >
>
> Surely no replacement is going to be ready for the 3.5 release, and I
> think if anything is going to happen it will probably be a smooth
> transition rather than an abrupt change.

Sure, but changes like adding new threads sound like fairly major surgery to me. Perhaps I am wrong there.

My experience with the current xm/xend pass-through code is that it is very fragile and refactoring is hard to do without introducing regressions.
Stefano Stabellini
2009-Oct-16 14:55 UTC
Re: [Xen-devel] pci device hotplug, race accessing xenstore
On Fri, 16 Oct 2009, Simon Horman wrote:
> Sure, but changes like adding new threads sound like fairly
> major surgery to me. Perhaps I am wrong there.
>
> My experience with the current xm/xend pass-through code is that it is very
> fragile and refactoring is hard to do without introducing regressions.
>

True.
In fact now I am exploring a different approach: adding a wait for the device model to be ready right after the creation of the stubdom.
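
The message above does not say how such a wait would be implemented. One possible shape, sketched here purely as an assumption, is a bounded poll on the device model state. `wait_for_state` and its `read_state` callback are hypothetical names, not xend functions, and a real implementation might block on a xenstore watch instead of polling.

```python
import time


def wait_for_state(read_state, wanted="running", timeout=10.0, interval=0.1,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll read_state() until it returns `wanted` or `timeout` seconds pass.

    read_state is a caller-supplied callable (hypothetical here) that
    returns the current device-model state, or None if it is not set yet.
    Returns True if the wanted state was seen, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if read_state() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Fake device model that becomes ready on the third poll.
    states = iter([None, "starting", "running"])
    print(wait_for_state(lambda: next(states), interval=0.01))
```

The timeout matters because a device model that never reaches the wanted state would otherwise stall domain creation indefinitely; on timeout the caller can report the failure instead.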