Yosuke Iwamatsu
2008-Feb-21 11:00 UTC
[Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
Hi all,

This patch set adds passthrough PCI device hotplug support for PV
driver domains.

I began working on this some time ago and developed it independently
of the HVM PCI device hotplug code. Now that HVM PCI device hotplug has
been checked in, some modifications are needed to apply this patch to
current xen-unstable.

Please see below for detailed descriptions. Comments are welcome.

[Usage]
  To hot add a new PCI device '0000:00:1d.0',
      xm pci-attach <PV domain> 0000:00:1d.0
  To hot remove a PCI device '0000:00:1d.0',
      xm pci-detach <PV domain> 0000:00:1d.0

[Implementation details]
- Interface changes
  New xenbus states "Reconfiguring" and "Reconfigured" are introduced.

- Xenstore changes
  Substate (state-#) and virtual PCI slot (vdev-#) entries are added
  for the pciback driver. For example, when PCI devices 00:1d.0 and
  00:1d.1 are connected, xenstore-ls will show information like this:

    pci = ""
     3 = ""
      0 = ""
       ...
       state = "4"
       dev-0 = "0000:00:1d.00"
       vdev-0 = "0000:00:00.00"
       state-0 = "3"
       dev-1 = "0000:00:1d.1"
       vdev-1 = "0000:00:00.01"
       state-1 = "3"
       ...

- Attach sequence
  1) User executes xm pci-attach.
  2) If there's no pcidev, Xend creates a new one. Otherwise, Xend does
     the following:
     2a) write a new dev-# entry to xenstore
     2b) write a new state-# entry and set it to "Initialising"
     2c) enable io resources
     2d) switch pciback state from "Connected" to "Reconfiguring"
  3) pcifront detects the backend change and switches its state from
     "Connected" to "Reconfiguring".
  4) pciback detects the frontend change and:
     4a) scans xenstore and finds the device being attached
     4b) exports the device, writes the vdev-# entry to xenstore and
         changes state-# to "Initialised"
     4c) switches its state from "Reconfiguring" to "Reconfigured"
  5) pcifront detects the backend change and:
     5a) rescans the PCI bus for the attached device and enables it
     5b) switches its state from "Reconfiguring" to "Connected"
  6) pciback detects the pcifront change and switches its state from
     "Reconfigured" to "Connected".
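The attach sequence above can be sketched as a short simulation. This is
only an illustration under stated assumptions: two Python dicts stand in
for the backend and frontend xenstore directories, the attach() helper
and the trace format are hypothetical, and the numeric values 7 and 8
for the two new states are assumed rather than taken from the patches.

```python
from enum import IntEnum


class XenbusState(IntEnum):
    # 1-6 follow the existing xenbus state numbering; 7 and 8 are
    # assumed values for the two states proposed in this patch set.
    INITIALISING = 1
    INIT_WAIT = 2
    INITIALISED = 3
    CONNECTED = 4
    CLOSING = 5
    CLOSED = 6
    RECONFIGURING = 7
    RECONFIGURED = 8


def attach(backend, frontend, bdf, vslot):
    """Replay attach steps 2-6 on two dicts standing in for xenstore."""
    S = XenbusState
    trace = []

    def set_state(name, node, new):
        node["state"] = new
        trace.append((name, new.name))

    # Next free index; "vdev-#" keys do not match the "dev-" prefix.
    idx = sum(1 for k in backend if k.startswith("dev-"))
    # Step 2 (xend): publish the physical device, then kick pciback.
    backend[f"dev-{idx}"] = bdf
    backend[f"state-{idx}"] = S.INITIALISING
    set_state("pciback", backend, S.RECONFIGURING)
    # Step 3 (pcifront): follow the backend.
    set_state("pcifront", frontend, S.RECONFIGURING)
    # Step 4 (pciback): export the device and publish its virtual slot.
    backend[f"vdev-{idx}"] = vslot
    backend[f"state-{idx}"] = S.INITIALISED
    set_state("pciback", backend, S.RECONFIGURED)
    # Step 5 (pcifront): rescan the bus (not modelled), then reconnect.
    set_state("pcifront", frontend, S.CONNECTED)
    # Step 6 (pciback): return to the steady state.
    set_state("pciback", backend, S.CONNECTED)
    return trace


backend = {"state": XenbusState.CONNECTED}
frontend = {"state": XenbusState.CONNECTED}
trace = attach(backend, frontend, "0000:00:1d.0", "0000:00:00.0")
for who, state in trace:
    print(f"{who:8s} -> {state}")
```

Both ends start and finish in "Connected"; the per-device state-# entry
carries the progress of the individual device while the main state node
only signals that a reconfiguration is in flight.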

- Detach sequence
  1) User executes xm pci-detach.
  2) If the specified device exists, Xend does the following:
     2a) change the device's substate (state-#) to "Closing"
     2b) switch pciback state from "Connected" to "Reconfiguring"
  3) pcifront detects the backend change and:
     3a) scans xenstore and finds the device being detached
         (the virtual PCI slot can be identified by the vdev-# entry)
     3b) removes the device
     3c) switches its state from "Connected" to "Reconfiguring"
  4) pciback detects the frontend change and:
     4a) scans xenstore and finds the device being detached
     4b) removes the device
     4c) switches its state from "Reconfiguring" to "Reconfigured"
  5) pcifront detects the backend change and switches its state from
     "Reconfiguring" to "Connected".
  6) pciback detects the pcifront change and switches its state from
     "Reconfigured" to "Connected".
  7) Xend, which has been watching the xenbus state, detects the
     backend change and:
     7a) cleans up xenstore entries
     7b) disables io resources
     7c) if there's no device left, destroys pcidev

- Limitations
  Hotplug currently only works when pciback is compiled with
  CONFIG_XEN_PCIDEV_BACKEND_VPCI or CONFIG_XEN_PCIDEV_BACKEND_SLOT.

Thanks,
-------------------
Yosuke Iwamatsu
        NEC Corporation

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Feb-21 11:09 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
On 21/2/08 11:00, "Yosuke Iwamatsu" <y-iwamatsu@ab.jp.nec.com> wrote:

> - Interface changes
>   New xenbus states "Reconfiguring" and "Reconfigured" are introduced.

Not sure about this. If this is just to flag changes to individual devices,
could pcifront watch individual device nodes? Or at least I think a separate
configuration xenstore node would be sensible, with its own state-machine
enumeration. I'm reluctant to mess with the main state node as we currently
have something that works!

> - Xenstore changes
>   Substates(state-#) and virtual pci slots(vdev-#) entries are added
>   for pciback driver. For example, when PCI devices 00:1d.0 and 00:1d.1
>   are connected, xenstore-ls will show information like this.

Why vdev-#? Aren't the dev-# names already the virtual names (mapped to
physical slots transparently by pciback)?

 -- Keir
Yosuke Iwamatsu
2008-Feb-21 12:37 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
Keir Fraser wrote:
> On 21/2/08 11:00, "Yosuke Iwamatsu" <y-iwamatsu@ab.jp.nec.com> wrote:
>
>> - Interface changes
>>   New xenbus states "Reconfiguring" and "Reconfigured" are introduced.
>
> Not sure about this. If this is just to flag changes to individual devices,
> could pcifront watch individual device nodes? Or at least I think a separate
> configuration xenstore node would be sensible, with its own state-machine
> enumeration. I'm reluctant to mess with the main state node as we currently
> have something that works!

PCI device attach/detach may result in changing several nodes such as
"num_devs", "root-#" and "root_num". So I thought switching the main
state might be suitable here.

If you are unwilling to change the main state node,
watching individual device nodes may be a possible solution.
In that case, we prepare pci slots for hotplug and make pciback/pcifront
watch all the states of these slots. But I wonder if it is acceptable
that pv drivers watch lots of xenstore nodes.

I'm not sure about the idea to have a separate configuration node
per device. Does that mean having pv drivers for each pci device?

>> - Xenstore changes
>>   Substates(state-#) and virtual pci slots(vdev-#) entries are added
>>   for pciback driver. For example, when PCI devices 00:1d.0 and 00:1d.1
>>   are connected, xenstore-ls will show information like this.
>
> Why vdev-#? Aren't the dev-# names already the virtual names (mapped to
> physical slots transparently by pciback)?

dev-# names are the physical names.
vdev-# becomes the same as dev-# when compiled with PCIDEV_BACKEND_PASS,
but different when compiled with other options (VPCI, SLOT).
vdev-# names are necessary for pcifront to recognize which devices are
to be detached.

Thanks,
Yosuke
Keir Fraser
2008-Feb-21 12:50 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
On 21/2/08 12:37, "Yosuke Iwamatsu" <y-iwamatsu@ab.jp.nec.com> wrote:

> If you are unwilling to change the main state node,
> watching individual device nodes may be a possible solution.
> In that case, we prepare pci slots for hotplug and make pciback/pcifront
> watch all the states of these slots. But I wonder if it is acceptable
> that pv drivers watch lots of xenstore nodes.
>
> I'm not sure about the idea to have a separate configuration node
> per device. Does that mean having pv drivers for each pci device?

I mean have an extra global node called e.g., reconfigure. Set to 1 when
pciback has updated hotplug info, causes pcifront to set reconfigure back to
0 and then re-scan the xenstore directory. Would that work?

> dev-# names are the physical names.
> vdev-# becomes the same as dev-# when compiled with PCIDEV_BACKEND_PASS,
> but different when compiled with other options (VPCI, SLOT).
> vdev-# names are necessary for pcifront to recognize which devices are
> to be detached.

Why would pcifront need to know the physical name? I would think that dev-#
should be the virtual name.

 -- Keir
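The single "reconfigure" node suggested above might look roughly like
the following. It is a sketch only: ToyStore is a hypothetical in-memory
stand-in for xenstore with a very simplified watch mechanism, and
register_frontend() is an illustrative name, not a pcifront function.

```python
class ToyStore:
    """Hypothetical in-memory key/value store with per-key watches."""

    def __init__(self):
        self.nodes = {}
        self.watches = {}

    def watch(self, key, callback):
        self.watches.setdefault(key, []).append(callback)

    def write(self, key, value):
        self.nodes[key] = value
        # Fire every watch registered on this key.
        for callback in self.watches.get(key, []):
            callback(key, value)


def register_frontend(store, seen):
    """Make the frontend side re-scan whenever reconfigure becomes "1"."""

    def on_reconfigure(key, value):
        if value != "1":
            return  # ignore our own acknowledgement write
        store.write(key, "0")  # acknowledge before re-scanning
        seen.clear()
        seen.update(
            {k: v for k, v in store.nodes.items() if k.startswith("vdev-")}
        )

    store.watch("reconfigure", on_reconfigure)


store = ToyStore()
seen = {}
register_frontend(store, seen)

# Backend side: update the hotplug information, then raise the flag.
store.write("dev-0", "0000:00:1d.0")
store.write("vdev-0", "0000:00:00.0")
store.write("reconfigure", "1")

print(store.nodes["reconfigure"], seen)
```

With this scheme the frontend needs one watch regardless of how many
slots exist, which addresses the concern about watching lots of nodes.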
Yosuke Iwamatsu
2008-Feb-21 13:25 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
Keir Fraser wrote:
> I mean have an extra global node called e.g., reconfigure. Set to 1 when
> pciback has updated hotplug info, causes pcifront to set reconfigure back to
> 0 and then re-scan the xenstore directory. Would that work?

Understood. Perhaps I'll give it a try in that manner.

>> dev-# names are the physical names.
>> vdev-# becomes the same as dev-# when compiled with PCIDEV_BACKEND_PASS,
>> but different when compiled with other options (VPCI, SLOT).
>> vdev-# names are necessary for pcifront to recognize which devices are
>> to be detached.
>
> Why would pcifront need to know the physical name? I would think that dev-#
> should be the virtual name.

dev-#, vdev-# and state-# are all backend nodes.
Xend writes physical names to dev-# to let pciback know which device
should be exported (this is the original behaviour).
Then pciback publishes the corresponding virtual name in vdev-#.
At the time of detachment, pcifront scans the backend nodes and finds
which device should be removed by looking at vdev-#.
Indeed, pcifront doesn't need to know the physical name.

----
Yosuke
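The lookup described above, where pcifront finds the virtual slot to
remove without ever using the physical name, can be sketched as follows.
The function name and the dict layout are hypothetical, and "5" is
assumed to be the numeric value of the xenbus "Closing" state.

```python
CLOSING = "5"  # assumed numeric value of the xenbus "Closing" state


def slots_to_detach(backend_nodes):
    """Return the virtual slots (vdev-#) whose substate is Closing."""
    slots = []
    for key, value in backend_nodes.items():
        if key.startswith("state-") and value == CLOSING:
            index = key[len("state-"):]
            # Only the virtual name is read; dev-# is never consulted.
            slots.append(backend_nodes[f"vdev-{index}"])
    return slots


backend_nodes = {
    "state": "7",
    "dev-0": "0000:00:1d.0",
    "vdev-0": "0000:00:00.0",
    "state-0": "3",
    "dev-1": "0000:00:1d.1",
    "vdev-1": "0000:00:00.1",
    "state-1": CLOSING,
}
print(slots_to_detach(backend_nodes))
```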
Keir Fraser
2008-Feb-21 13:45 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
On 21/2/08 13:25, "Yosuke Iwamatsu" <y-iwamatsu@ab.jp.nec.com> wrote:

> Keir Fraser wrote:
>> I mean have an extra global node called e.g., reconfigure. Set to 1 when
>> pciback has updated hotplug info, causes pcifront to set reconfigure back to
>> 0 and then re-scan the xenstore directory. Would that work?
>
> Understood. Perhaps I'll give it a try in that manner.

> dev-#, vdev-# and state-# are all backend nodes.
> Xend writes physical names on dev-# to let pciback know which device
> should be exported (This is the original behaviour).
> Then pciback publishes the corresponding virtual name on vdev-#.
> At the time of detachment, pcifront scans backend nodes and finds
> which device should be removed by seeing vdev-#.
> pcifront doesn't need to know the physical name indeed.

This seems a bit different from how initial probe happens (by reading the
root-%d xenstore nodes). Could we not just watch those in pcifront and
re-parse them for changes?

Your state machine was also quite complicated, perhaps primarily so that
pcifront could handshake when devices are removed. Is this handshake
necessary? (e.g., in particular, is there any equivalent notion for real PCI
hot-unplug?). If not, we can get equivalent hotplug functionality between
HVM and PV without needing a complicated xend->frontend->backend->xend
handshaking ring:
 * xend updates dev-% nodes
 * pciback sees this and updates root-%d nodes
 * pcifront sees this and re-parses root-%d nodes

 -- Keir
Yosuke Iwamatsu
2008-Feb-22 02:03 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
Keir Fraser wrote:
> On 21/2/08 13:25, "Yosuke Iwamatsu" <y-iwamatsu@ab.jp.nec.com> wrote:
>> dev-#, vdev-# and state-# are all backend nodes.
>> Xend writes physical names on dev-# to let pciback know which device
>> should be exported (This is the original behaviour).
>> Then pciback publishes the corresponding virtual name on vdev-#.
>> At the time of detachment, pcifront scans backend nodes and finds
>> which device should be removed by seeing vdev-#.
>> pcifront doesn't need to know the physical name indeed.
>
> This seems a bit different from how initial probe happens (by reading the
> root-%d xenstore nodes). Could we not just watch those in pcifront and
> re-parse them for changes?
>
> Your state machine was also quite complicated, perhaps primarily so that
> pcifront could handshake when devices are removed. Is this handshake
> necessary? (e.g., in particular, is there any equivalent notion for real PCI
> hot-unplug?). If not, we can get equivalent hotplug functionality between
> HVM and PV without needing a complicated xend->frontend->backend->xend
> handshaking ring:
>  * xend updates dev-% nodes
>  * pciback sees this and updates root-%d nodes
>  * pcifront sees this and re-parses root-%d nodes

For hot add, this method (xend->pciback->pcifront) works, and that is
what my patch does.

For hot remove, however, I'm afraid it's not safe.
xend->pciback->pcifront means that we first disable io/mmio port access,
destroy the backend config space emulation and only then notify the
guest OS of the removal. The guest OS would see something like a virtual
surprise-style removal, that is, a device suddenly disappears while it
is in use.

I confess that I have never done a real PCI hot-unplug myself, but
several specs (acpi spec 3.0b 6.3, pcihp spec 1.1) say that a
recommended hot removal sequence is like:
  1. the user notifies the OS of the desire to remove a slot
  2. the OS logically removes the device (unload driver, poweroff, etc)
     and reports to the user that it's ready
  3. the user removes the slot

So the reason I added vdev-# and the reason I chose the
xend->frontend->backend->xend cycle are the same: let the guest OS
cleanly shut down the device first and then actually remove it.
I think it's quite essential.

Thanks,
Yosuke
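The ordering argument above can be made concrete with a small check.
This is only an illustration: the step names are hypothetical labels
for the actions described in the message, not identifiers from xend,
pciback or pcifront.

```python
# Backend-side actions that make the device unusable for the guest.
TEARDOWN = {"disable_io", "destroy_config_space"}


def first_unsafe_step(events):
    """Return the first teardown step that runs before the guest has
    released the device, or None if the order is safe."""
    released = False
    for step in events:
        if step == "guest_releases_device":
            released = True
        elif step in TEARDOWN and not released:
            return step
    return None


# xend -> pciback -> pcifront: the guest hears about it last.
backend_first = [
    "disable_io",
    "destroy_config_space",
    "notify_guest",
    "guest_releases_device",
]

# xend -> pcifront -> pciback -> xend: the guest lets go first.
guest_first = [
    "notify_guest",
    "guest_releases_device",
    "disable_io",
    "destroy_config_space",
]

print(first_unsafe_step(backend_first))
print(first_unsafe_step(guest_first))
```

The first ordering is flagged at its very first step, which corresponds
to the surprise-style removal; the second matches the recommended
notify, release, remove sequence.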
Keir Fraser
2008-Feb-22 08:01 UTC
Re: [Xen-devel] [PATCH 0/3][RFC] PV Passthrough PCI Device Hotplug Support
On 22/2/08 02:03, "Yosuke Iwamatsu" <y-iwamatsu@ab.jp.nec.com> wrote:

> I confess that I have never done a real PCI hot-unplug myself, but
> several specs (acpi spec 3.0b 6.3, pcihp spec 1.1) say that a
> recommended hot removal sequence is like:
> 1. the user notifies the OS of the desire to remove a slot
> 2. the OS logically removes the device (unload driver, poweroff, etc)
>    and report the user it's ready
> 3. the user removes the slot
> So the reason I added vdev-# and I chose xend->frontend->backend->xend
> cycle is the same; let the guest OS cleanly shutdown the device first
> and then actually remove it. I think it's quite essential.

Okay, and it does look like Linux makes sensible use of the notification.
I've dug into your patches a bit and actually I think they're fine as is. I
don't think there's much that can be done that would significantly simplify
them. I can even live with the extra xenbus states!

I'll check the patches in today.

 Thanks,
 Keir