Alex Williamson
2007-Jun-10  18:24 UTC
[Xen-devel] [RFC][PATCH] "Controller" pcibackend and frontend extensions
On ia64, we've run into the case where the I/O hierarchies are more
complicated than the current set of driver domain backends can describe.
Some platforms make use of translation offsets for I/O port and MMIO
ranges.  Without knowledge of these translation offsets, devices are
unusable by driver domains.  For instance, here's an example of a tulip
card that lives under a PCI root bus making use of an I/O port
translation:
# lspci -v -s 02:05.0
02:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
        Subsystem: Hewlett-Packard Company Unknown device 125a
        Flags: bus master, medium devsel, latency 128, IRQ 67
        I/O ports at 2001100 [size=128]
        Memory at 90102000 (32-bit, non-prefetchable) [size=1K]
        Expansion ROM at 90080000 [disabled] [size=256K]
# setpci -s 02:05.0 BASE_ADDRESS_0
00001101
# cat /proc/ioports 
...
02000000-0200ffff : PCI Bus 0000:01
  02001000-02001fff : PCI Bus #02
    02001100-0200117f : 0000:02:05.0
      02001100-0200117f : tulip
...
# cat /proc/iomem
...
80100000000-80103ffffff : PCI Bus 0000:01 I/O Ports 02000000-0200ffff
...
   I/O port spaces are of course limited to 64k, but on this system
multiple I/O port spaces are available (one per PCI root bridge in this
case).  On ia64, I/O port spaces are typically a sparse encoding of an
MMIO range.  The legacy I/O port range is decoded directly by the
processor; additional ranges are decoded by the I/O hardware.  To access
I/O port 0x1100 on this device, the driver needs to do an inb/outb to
address 0x2001100.  The kernel will then swizzle the bits to create an
MMIO transaction within the MMIO range for that set of I/O ports.
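The bit swizzling described above can be sketched as follows.  This is
only an illustration under stated assumptions: the 24-bit split between
space number and port, and the sparse encoding that gives each group of
four ports its own 4 KB page, are modelled on the Linux ia64 I/O port
macros, while io_space_nr(), io_space_port(), sparse_encode() and
port_to_mmio() are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the low 24 bits of a port number are the port within an
 * I/O port space, the bits above select which space is used. */
#define IO_SPACE_BITS 24

/* Which I/O port space (one per root bridge here) a port belongs to. */
static unsigned int io_space_nr(uint64_t port)
{
	return (unsigned int)(port >> IO_SPACE_BITS);
}

/* The port number inside its space. */
static uint64_t io_space_port(uint64_t port)
{
	return port & ((UINT64_C(1) << IO_SPACE_BITS) - 1);
}

/* Sparse encoding: every group of four ports gets its own 4 KB page. */
static uint64_t sparse_encode(uint64_t port)
{
	assert(port < 0x10000);	/* a single port space is limited to 64k */
	return ((port >> 2) << 12) | (port & 0xfff);
}

/* mmio_base is the (hypothetical) base of the MMIO range backing the
 * I/O port space that the port falls into. */
static uint64_t port_to_mmio(uint64_t mmio_base, uint64_t port)
{
	return mmio_base | sparse_encode(io_space_port(port));
}
```

Under these assumptions, port 0x2001100 falls into space 2 as port
0x1100, and port_to_mmio(0x80100000000, 0x2001100) gives 0x80100440100,
which lies inside the 80100000000-80103ffffff range shown in /proc/iomem
above.  A real kernel would use an uncached virtual mapping of that
range rather than the physical address.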
   To support this, I've created the "controller" backend as shown
below.  This is unfortunately an ia64-specific backend, but I don't see
any mechanism to generically support the kinds of things this backend
needs to do.  PCI controllers on ia64 are created to represent the PCI
root bridges found in ACPI.  These root bridge ACPI nodes have _CRS
(Current Resource Setting) methods that describe the address ranges
consumed by the bus below the root bridge.  Address ranges described
with a translation attribute make use of a translation offset to reach
the desired address.  This information must be provided to a driver
domain guest to allow it to access the devices.
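As a rough sketch of what a translation offset means for a guest: an
address on the bus side of the root bridge becomes a processor-side
address by adding the offset.  The struct below is a simplified stand-in
for the few fields that matter; it is not the real
struct acpi_resource_address64, and addr_window and bus_to_cpu() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for one translated range from _CRS. */
struct addr_window {
	uint64_t minimum;		/* lowest address on the bus side */
	uint64_t maximum;		/* highest address on the bus side */
	uint64_t translation_offset;	/* added to reach the processor side */
};

/* Convert a bus-side address inside the window to the address the
 * processor has to use to reach it. */
static uint64_t bus_to_cpu(const struct addr_window *w, uint64_t bus_addr)
{
	assert(bus_addr >= w->minimum && bus_addr <= w->maximum);
	return bus_addr + w->translation_offset;
}
```

A guest that only sees the bus-side value has no way to recover the
offset on its own, which is why the window information has to be passed
to it explicitly.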
   Given this architecture, the obvious choice is to create virtual PCI
buses based on controllers.  All devices physically under the same
controller are virtualized under the same domain:bus.  Within a bus,
device slots are virtualized much like the slot backend.  The tricky
part comes with how to describe the address translation for a controller
to the guest driver domain.  For this, I chose to store the information
in xenbus.  We already make use of the following keys for driver
domains:
root_num	/* Number of PCI roots exposed */
root-X		/* domain:bus information for root X */
To this, I've added:
root-X-resources	/* number of resources for root X */
root-X-resource-Y	/* resource number Y for root X */
root-resource-magic	/* synchronization/versioning for resource info */
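The virtual topology and the key naming can be sketched together.  This
is a simplified model, not the backend code from the patch: controllers
are identified by a plain integer, everything lives in domain 0, and
vtopology, vtopo_add() and resource_key() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_ROOTS 8	/* arbitrary limit for this sketch */
#define MAX_SLOTS 32	/* device numbers 0..31 on one PCI bus */

struct vroot {
	int controller;		/* hypothetical id of the physical root */
	unsigned int domain, bus, next_slot;
};

struct vtopology {
	struct vroot roots[MAX_ROOTS];
	int nr_roots;
};

/* Place a device that sits under `controller`.  Returns 0 and fills in
 * the virtual domain:bus:slot, or -1 if the sketch's limits are hit. */
static int vtopo_add(struct vtopology *t, int controller,
		     unsigned int *domain, unsigned int *bus,
		     unsigned int *slot)
{
	struct vroot *r = NULL;
	int i;

	/* Reuse the virtual bus if this controller was seen before. */
	for (i = 0; i < t->nr_roots; i++)
		if (t->roots[i].controller == controller)
			r = &t->roots[i];

	if (!r) {
		if (t->nr_roots == MAX_ROOTS)
			return -1;
		r = &t->roots[t->nr_roots];
		r->controller = controller;
		r->domain = 0;
		r->bus = (unsigned int)t->nr_roots;
		r->next_slot = 0;
		t->nr_roots++;
	}

	if (r->next_slot >= MAX_SLOTS)
		return -1;	/* virtual bus is full */

	*domain = r->domain;
	*bus = r->bus;
	*slot = r->next_slot++;
	return 0;
}

/* Build the key name for resource `res` of root `root`.  Returns the
 * length of the name, or -1 if it does not fit in `buf`. */
static int resource_key(char *buf, size_t len, int root, int res)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "root-%d-resource-%d", root, res);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Two devices under the same controller land on virtual bus 00 in slots 0
and 1, and a device under a second controller starts virtual bus 01 at
slot 0.  Its first translated range would then be stored under a key
such as root-1-resource-0.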
   I debated for a while how to expose the root-X-resource-Y information
and came up with a simple ASCII dump of the struct acpi_resource
returned from the ACPI _CRS method.  This isn't quite as silly as it
sounds because the structure is a fixed size regardless of word length,
and its contents are largely based on fixed tables found in the ACPI
spec.  This makes it relatively immune to frequent changes.  The PCI
backend stores the ASCII byte stream of the controller resources into
xenbus, the PCI frontend then extracts the byte stream, and decodes it
back into a struct acpi_resource for use.
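The encode and decode steps amount to a hex dump and its inverse.  A
minimal user-space sketch follows; hex_encode() and hex_decode() are
hypothetical helpers written for illustration, not the functions used in
the patch, and they work on any fixed-size blob rather than on
struct acpi_resource specifically.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode `len` raw bytes as 2*len lowercase hex digits plus a NUL.
 * `out` must have room for 2*len + 1 characters. */
static void hex_encode(const void *src, size_t len, char *out)
{
	static const char digits[] = "0123456789abcdef";
	const unsigned char *p = src;
	size_t i;

	assert(src != NULL && out != NULL);
	for (i = 0; i < len; i++) {
		out[2 * i] = digits[p[i] >> 4];
		out[2 * i + 1] = digits[p[i] & 0xf];
	}
	out[2 * len] = '\0';
}

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode a hex string back into exactly `len` raw bytes.  Returns 0 on
 * success, -1 if the string has the wrong length or is malformed. */
static int hex_decode(const char *in, void *dst, size_t len)
{
	unsigned char *p = dst;
	size_t i;

	if (strlen(in) != 2 * len)
		return -1;

	for (i = 0; i < len; i++) {
		int hi = hex_nibble(in[2 * i]);
		int lo = hex_nibble(in[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		p[i] = (unsigned char)((hi << 4) | lo);
	}
	return 0;
}
```

A raw dump like this only round-trips when both ends agree on the size,
layout and byte order of the structure, so the receiving side should
check the expected length before decoding.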
   The only changes to the existing code to support the frontend are a
trivial addition of passing the bus number to pcifront_init_sd() and a
hook to set up the root windows after the bus is scanned.  No changes are
required for the controller backend.
   I would like to see this backend become the default backend for ia64,
but it probably needs testing on other systems before we can make the
switch.  I would appreciate testing and/or review feedback.  To make use
of extended I/O port spaces, you'll need the patch I sent to
xen-ia64-devel last week to register the I/O port spaces with Xen.
Thanks,
	Alex
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---
 b/linux-2.6-xen-sparse/drivers/xen/pciback/controller.c |  404 ++++++++++++++++
 linux-2.6-xen-sparse/arch/ia64/pci/pci.c                |   28 +
 linux-2.6-xen-sparse/drivers/xen/Kconfig                |   18 
 linux-2.6-xen-sparse/drivers/xen/pciback/Makefile       |    1 
 linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c      |    4 
 linux-2.6-xen-sparse/drivers/xen/pcifront/pcifront.h    |    3 
 linux-2.6-xen-sparse/include/xen/pcifront.h             |  135 +++++
 7 files changed, 582 insertions(+), 11 deletions(-)
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/arch/ia64/pci/pci.c
--- a/linux-2.6-xen-sparse/arch/ia64/pci/pci.c	Mon Jun 04 14:17:54 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/ia64/pci/pci.c	Sun Jun 10 12:12:33 2007 -0600
@@ -834,3 +839,31 @@ int pci_vector_resources(int last, int n
 
 	return count;
 }
+
+#ifdef CONFIG_XEN
+void __devinit xen_add_resource(struct pci_controller *controller,
+				unsigned int domain, unsigned int bus,
+				struct acpi_resource *resource)
+{
+	struct pci_root_info info;
+	char *name;
+
+	name = kmalloc(16, GFP_KERNEL);
+	if (!name)
+		return;
+
+	sprintf(name, "PCI Bus %04x:%02x", domain, bus);
+	info.controller = controller;
+	info.name = name;
+
+	add_window(resource, &info);
+}
+EXPORT_SYMBOL(xen_add_resource);
+
+void __devinit xen_pcibios_setup_root_windows(struct pci_bus *bus,
+					      struct pci_controller *controller)
+{
+	pcibios_setup_root_windows(bus, controller);
+}
+EXPORT_SYMBOL(xen_pcibios_setup_root_windows);
+#endif
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/drivers/xen/Kconfig
--- a/linux-2.6-xen-sparse/drivers/xen/Kconfig	Mon Jun 04 14:17:54 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/Kconfig	Sat Jun 09 12:52:03 2007 -0600
@@ -109,7 +109,8 @@ choice
 choice
 	prompt "PCI Backend Mode"
 	depends on XEN_PCIDEV_BACKEND
-	default XEN_PCIDEV_BACKEND_VPCI
+	default XEN_PCIDEV_BACKEND_VPCI if !IA64
+	default XEN_PCIDEV_BACKEND_CONTROLLER if IA64
 
 config XEN_PCIDEV_BACKEND_VPCI
 	bool "Virtual PCI"
@@ -138,6 +139,21 @@ config XEN_PCIDEV_BACKEND_SLOT
 	  For example, a device at 03:05.2 will be re-assigned to 00:00.0. A
 	  second device at 02:1a.1 will be re-assigned to 00:01.0.
 
+config XEN_PCIDEV_BACKEND_CONTROLLER
+	bool "Controller"
+	depends on IA64
+	---help---
+	  This PCI backend virtualizes the PCI bus topology by providing a
+	  virtual bus per PCI root device.  Devices which are physically under
+	  the same root bus will appear on the same virtual bus.  For systems
+	  with complex I/O addressing, this is the only backend which supports
+	  extended I/O port spaces and MMIO translation offsets.  This backend
+	  also supports slot virtualization.  For example, a device at
+	  0000:01:02.1 will be re-assigned to 0000:00:00.0.  A second device
+	  at 0000:02:05.0 (behind a P2P bridge on bus 0000:01) will be
+	  re-assigned to 0000:00:01.0.  A third device at 0000:16:05.0 (under
+	  a different PCI root bus) will be re-assigned to 0000:01:00.0.
+
 endchoice
 
 config XEN_PCIDEV_BE_DEBUG
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/drivers/xen/pciback/Makefile
--- a/linux-2.6-xen-sparse/drivers/xen/pciback/Makefile	Mon Jun 04 14:17:54 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/pciback/Makefile	Sat Jun 09 12:31:55 2007 -0600
@@ -9,6 +9,7 @@ pciback-$(CONFIG_XEN_PCIDEV_BACKEND_VPCI
 pciback-$(CONFIG_XEN_PCIDEV_BACKEND_VPCI) += vpci.o
 pciback-$(CONFIG_XEN_PCIDEV_BACKEND_SLOT) += slot.o
 pciback-$(CONFIG_XEN_PCIDEV_BACKEND_PASS) += passthrough.o
+pciback-$(CONFIG_XEN_PCIDEV_BACKEND_CONTROLLER) += controller.o
 
 ifeq ($(CONFIG_XEN_PCIDEV_BE_DEBUG),y)
 EXTRA_CFLAGS += -DDEBUG
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/drivers/xen/pciback/controller.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/linux-2.6-xen-sparse/drivers/xen/pciback/controller.c	Sun Jun 10 12:07:42 2007 -0600
@@ -0,0 +1,404 @@
+/*
+ * Copyright (C) 2007 Hewlett-Packard Development Company, L.P.
+ *      Alex Williamson <alex.williamson@hp.com>
+ *
+ * PCI "Controller" Backend - virtualize PCI bus topology based on PCI
+ * controllers.  Devices under the same PCI controller are exposed on the
+ * same virtual domain:bus.  Within a bus, device slots are virtualized
+ * to compact the bus.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+#include <linux/acpi.h>
+#include <linux/list.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include "pciback.h"
+
+#define PCI_MAX_BUSSES	255
+#define PCI_MAX_SLOTS	32
+
+struct controller_dev_entry {
+	struct list_head list;
+	struct pci_dev *dev;
+	unsigned int devfn;
+};
+
+struct controller_list_entry {
+	struct list_head list;
+	struct pci_controller *controller;
+	unsigned int domain;
+	unsigned int bus;
+	unsigned int next_devfn;
+	struct list_head dev_list;
+};
+
+struct controller_dev_data {
+	struct list_head list;
+	unsigned int next_domain;
+	unsigned int next_bus;
+	spinlock_t lock;
+};
+
+struct walk_info {
+	struct pciback_device *pdev;
+	int resource_count;
+	int root_num;
+};
+
+struct pci_dev *pciback_get_pci_dev(struct pciback_device *pdev,
+				    unsigned int domain, unsigned int bus,
+				    unsigned int devfn)
+{
+	struct controller_dev_data *dev_data = pdev->pci_dev_data;
+	struct controller_dev_entry *dev_entry;
+	struct controller_list_entry *cntrl_entry;
+	struct pci_dev *dev = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev_data->lock, flags);
+
+	list_for_each_entry(cntrl_entry, &dev_data->list, list) {
+		if (cntrl_entry->domain != domain ||
+		    cntrl_entry->bus != bus)
+			continue;
+
+		list_for_each_entry(dev_entry, &cntrl_entry->dev_list, list) {
+			if (devfn == dev_entry->devfn) {
+				dev = dev_entry->dev;
+				goto found;
+			}
+		}
+	}
+found:
+	spin_unlock_irqrestore(&dev_data->lock, flags);
+
+	return dev;
+}
+
+int pciback_add_pci_dev(struct pciback_device *pdev, struct pci_dev *dev)
+{
+	struct controller_dev_data *dev_data = pdev->pci_dev_data;
+	struct controller_dev_entry *dev_entry;
+	struct controller_list_entry *cntrl_entry;
+	struct pci_controller *dev_controller = PCI_CONTROLLER(dev);
+	unsigned long flags;
+	int ret = 0, found = 0;
+
+	spin_lock_irqsave(&dev_data->lock, flags);
+
+	/* Look to see if we already have a domain:bus for this controller */
+	list_for_each_entry(cntrl_entry, &dev_data->list, list) {
+		if (cntrl_entry->controller == dev_controller) {
+			found = 1;
+			break;
+		}
+	}
+
+	if (!found) {
+		cntrl_entry = kmalloc(sizeof(*cntrl_entry), GFP_ATOMIC);
+		if (!cntrl_entry) {
+			ret =  -ENOMEM;
+			goto out;
+		}
+
+		cntrl_entry->controller = dev_controller;
+		cntrl_entry->next_devfn = PCI_DEVFN(0, 0);
+
+		cntrl_entry->domain = dev_data->next_domain;
+		cntrl_entry->bus = dev_data->next_bus++;
+		if (dev_data->next_bus > PCI_MAX_BUSSES) {
+			dev_data->next_domain++;
+			dev_data->next_bus = 0;
+		}
+
+		INIT_LIST_HEAD(&cntrl_entry->dev_list);
+
+		list_add_tail(&cntrl_entry->list, &dev_data->list);
+	}
+
+	if ((cntrl_entry->next_devfn >> 3) >= PCI_MAX_SLOTS) {
+		/*
+		 * While it seems unlikely, this can actually happen if
+		 * a controller has P2P bridges under it.
+		 */
+		xenbus_dev_fatal(pdev->xdev, -ENOSPC, "Virtual bus %04x:%02x"
+				 " is full, no room to export %02x:%02x.%x", 
+				 cntrl_entry->domain,
+				 cntrl_entry->bus, dev->bus->number,
+				 PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
+		ret = -ENOSPC;
+		goto out;
+	}
+
+	dev_entry = kmalloc(sizeof(*dev_entry), GFP_ATOMIC);
+	if (!dev_entry) {
+		if (list_empty(&cntrl_entry->dev_list)) {
+			list_del(&cntrl_entry->list);
+			kfree(cntrl_entry);
+		}
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	dev_entry->dev = dev;
+	dev_entry->devfn = cntrl_entry->next_devfn;
+
+	list_add_tail(&dev_entry->list, &cntrl_entry->dev_list);
+
+	cntrl_entry->next_devfn += PCI_DEVFN(1, 0);
+
+out:
+	spin_unlock_irqrestore(&dev_data->lock, flags);
+	return ret;
+}
+
+void pciback_release_pci_dev(struct pciback_device *pdev, struct pci_dev *dev)
+{
+	struct controller_dev_data *dev_data = pdev->pci_dev_data;
+	struct controller_list_entry *cntrl_entry;
+	struct controller_dev_entry *dev_entry = NULL;
+	struct pci_dev *found_dev = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev_data->lock, flags);
+
+	list_for_each_entry(cntrl_entry, &dev_data->list, list) {
+		if (cntrl_entry->controller != PCI_CONTROLLER(dev))
+			continue;
+
+		list_for_each_entry(dev_entry, &cntrl_entry->dev_list, list) {
+			if (dev_entry->dev == dev) {
+				found_dev = dev_entry->dev;
+				goto found;
+			}
+		}
+	}
+found:
+	if (!found_dev) {
+		spin_unlock_irqrestore(&dev_data->lock, flags);
+		return;
+	}
+
+	list_del(&dev_entry->list);
+	kfree(dev_entry);
+
+	if (list_empty(&cntrl_entry->dev_list)) {
+		list_del(&cntrl_entry->list);
+		kfree(cntrl_entry);
+	}
+
+	spin_unlock_irqrestore(&dev_data->lock, flags);
+	pcistub_put_pci_dev(found_dev);
+}
+
+int pciback_init_devices(struct pciback_device *pdev)
+{
+	struct controller_dev_data *dev_data;
+
+	dev_data = kmalloc(sizeof(*dev_data), GFP_KERNEL);
+	if (!dev_data)
+		return -ENOMEM;
+
+	spin_lock_init(&dev_data->lock);
+
+	INIT_LIST_HEAD(&dev_data->list);
+
+	/* Starting domain:bus numbers */
+	dev_data->next_domain = 0;
+	dev_data->next_bus = 0;
+
+	pdev->pci_dev_data = dev_data;
+
+	return 0;
+}
+
+static acpi_status write_xenbus_resource(struct acpi_resource *res, void *data)
+{
+	struct walk_info *info = data;
+	struct acpi_resource_address64 addr;
+	acpi_status status;
+	int i, len, err;
+	char str[32], tmp[3];
+	unsigned char *ptr, *buf;
+
+	status = acpi_resource_to_address64(res, &addr);
+
+	/* Do we care about this range?  Let's check. */
+	if (!ACPI_SUCCESS(status) ||
+	    !(addr.resource_type == ACPI_MEMORY_RANGE ||
+	      addr.resource_type == ACPI_IO_RANGE) ||
+	    !addr.address_length || addr.producer_consumer != ACPI_PRODUCER)
+		return AE_OK;
+
+	/*
+	 * Furthermore, we really only care to tell the guest about
+	 * address ranges that require address translation of some sort.
+	 */
+	if (!(addr.resource_type == ACPI_MEMORY_RANGE &&
+	      addr.info.mem.translation) &&
+	    !(addr.resource_type == ACPI_IO_RANGE &&
+	      addr.info.io.translation))
+		return AE_OK;
+	   
+	/* Store the resource in xenbus for the guest */
+	len = snprintf(str, sizeof(str), "root-%d-resource-%d",
+		       info->root_num, info->resource_count);
+	if (unlikely(len >= (sizeof(str) - 1)))
+		return AE_OK;
+
+	buf = kzalloc((sizeof(*res) * 2) + 1, GFP_KERNEL);
+	if (!buf)
+		return AE_OK;
+
+	/* Clean out resource_source */
+	res->data.address64.resource_source.index = 0xFF;
+	res->data.address64.resource_source.string_length = 0;
+	res->data.address64.resource_source.string_ptr = NULL;
+
+	ptr = (unsigned char *)res;
+
+	/* Turn the acpi_resource into an ASCII byte stream */
+	for (i = 0; i < sizeof(*res); i++) {
+		snprintf(tmp, sizeof(tmp), "%02x", ptr[i]);
+		strncat(buf, tmp, 2);
+	}
+
+	err = xenbus_printf(XBT_NIL, info->pdev->xdev->nodename,
+			    str, "%s", buf);
+
+	if (!err)
+		info->resource_count++;
+
+	kfree(buf);
+
+	return AE_OK;
+}
+
+int pciback_publish_pci_roots(struct pciback_device *pdev,
+			      publish_pci_root_cb publish_root_cb)
+{
+	struct controller_dev_data *dev_data = pdev->pci_dev_data;
+	struct controller_list_entry *cntrl_entry;
+	int i, root_num, len, err = 0;
+	unsigned int domain, bus;
+	char str[64];
+	struct walk_info info;
+
+	spin_lock(&dev_data->lock);
+
+	list_for_each_entry(cntrl_entry, &dev_data->list, list) {
+		/* First publish all the domain:bus info */
+		err = publish_root_cb(pdev, cntrl_entry->domain,
+				      cntrl_entry->bus);
+		if (err)
+			goto out;
+
+		/*
+ 		 * Now figure out which root-%d this belongs to
+		 * so we can associate resources with it.
+		 */
+		err = xenbus_scanf(XBT_NIL, pdev->xdev->nodename,
+				   "root_num", "%d", &root_num);
+
+		if (err != 1)
+			goto out;
+
+		for (i = 0; i < root_num; i++) {
+			len = snprintf(str, sizeof(str), "root-%d", i);
+			if (unlikely(len >= (sizeof(str) - 1))) {
+				err = -ENOMEM;
+				goto out;
+			}
+
+			err = xenbus_scanf(XBT_NIL, pdev->xdev->nodename,
+					   str, "%x:%x", &domain, &bus);
+			if (err != 2)
+				goto out;
+
+			/* Is this the one we just published? */
+			if (domain == cntrl_entry->domain &&
+			    bus == cntrl_entry->bus)
+				break;
+		}
+
+		if (i == root_num)
+			goto out;
+
+		info.pdev = pdev;
+		info.resource_count = 0;
+		info.root_num = i;
+
+		/* Let ACPI do the heavy lifting on decoding resources */
+		acpi_walk_resources(cntrl_entry->controller->acpi_handle,
+				    METHOD_NAME__CRS, write_xenbus_resource,
+				    &info);
+
+		/* No resources.  OK.  On to the next one */
+		if (!info.resource_count)
+			continue;
+
+		/* Store the number of resources we wrote for this root-%d */
+		len = snprintf(str, sizeof(str), "root-%d-resources", i);
+		if (unlikely(len >= (sizeof(str) - 1))) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		err = xenbus_printf(XBT_NIL, pdev->xdev->nodename, str,
+				    "%d", info.resource_count);
+		if (err)
+			goto out;
+	}
+
+	/* Finally, write some magic to synchronize with the guest. */
+	len = snprintf(str, sizeof(str), "root-resource-magic");
+	if (unlikely(len >= (sizeof(str) - 1))) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	err = xenbus_printf(XBT_NIL, pdev->xdev->nodename, str,
+			    "%lx", (sizeof(struct acpi_resource) * 2) + 1);
+
+out:
+	spin_unlock(&dev_data->lock);
+
+	return err;
+}
+
+void pciback_release_devices(struct pciback_device *pdev)
+{
+	struct controller_dev_data *dev_data = pdev->pci_dev_data;
+	struct controller_list_entry *cntrl_entry, *c;
+	struct controller_dev_entry *dev_entry, *d;
+
+	list_for_each_entry_safe(cntrl_entry, c, &dev_data->list, list) {
+		list_for_each_entry_safe(dev_entry, d,
+					 &cntrl_entry->dev_list, list) {
+			list_del(&dev_entry->list);
+			pcistub_put_pci_dev(dev_entry->dev);
+			kfree(dev_entry);
+		}
+		list_del(&cntrl_entry->list);
+		kfree(cntrl_entry);
+	}
+
+	kfree(dev_data);
+	pdev->pci_dev_data = NULL;
+}
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c
--- a/linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c	Mon Jun 04 14:17:54 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c	Sun Jun 10 11:58:51 2007 -0600
@@ -207,7 +207,7 @@ int pcifront_scan_root(struct pcifront_d
 		err = -ENOMEM;
 		goto err_out;
 	}
-	pcifront_init_sd(sd, domain, pdev);
+	pcifront_init_sd(sd, domain, bus, pdev);
 
 	b = pci_scan_bus_parented(&pdev->xdev->dev, bus,
 				  &pcifront_bus_ops, sd);
@@ -217,6 +217,8 @@ int pcifront_scan_root(struct pcifront_d
 		err = -ENOMEM;
 		goto err_out;
 	}
+
+	pcifront_setup_root_resources(b, sd);
 	bus_entry->bus = b;
 
 	list_add(&bus_entry->list, &pdev->root_buses);
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/drivers/xen/pcifront/pcifront.h
--- a/linux-2.6-xen-sparse/drivers/xen/pcifront/pcifront.h	Mon Jun 04 14:17:54 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/pcifront/pcifront.h	Fri Jun 08 21:59:48 2007 -0600
@@ -10,7 +10,6 @@
 #include <linux/pci.h>
 #include <xen/xenbus.h>
 #include <xen/interface/io/pciif.h>
-#include <xen/pcifront.h>
 
 struct pci_bus_entry {
 	struct list_head list;
@@ -30,6 +29,8 @@ struct pcifront_device {
 	struct xen_pci_sharedinfo *sh_info;
 };
 
+#include <xen/pcifront.h>
+
 int pcifront_connect(struct pcifront_device *pdev);
 void pcifront_disconnect(struct pcifront_device *pdev);
 
diff -r 0cf6b75423e9 linux-2.6-xen-sparse/include/xen/pcifront.h
--- a/linux-2.6-xen-sparse/include/xen/pcifront.h	Mon Jun 04 14:17:54 2007 -0600
+++ b/linux-2.6-xen-sparse/include/xen/pcifront.h	Sun Jun 10 12:11:46 2007 -0600
@@ -12,7 +12,6 @@
 
 #ifndef __ia64__
 
-struct pcifront_device;
 struct pci_bus;
 
 struct pcifront_sd {
@@ -26,7 +25,8 @@ pcifront_get_pdev(struct pcifront_sd *sd
 	return sd->pdev;
 }
 
-static inline void pcifront_init_sd(struct pcifront_sd *sd, int domain,
+static inline void pcifront_init_sd(struct pcifront_sd *sd,
+				    unsigned int domain, unsigned int bus,
 				    struct pcifront_device *pdev)
 {
 	sd->domain = domain;
@@ -45,10 +45,21 @@ static inline int pci_proc_domain(struct
 }
 #endif /* CONFIG_PCI_DOMAINS */
 
+static inline void pcifront_setup_root_resources(struct pci_bus *bus,
+						 struct pcifront_sd *sd)
+{
+}
+
 #else /* __ia64__ */
 
+#include <linux/acpi.h>
 #include <asm/pci.h>
 #define pcifront_sd pci_controller
+
+extern void xen_add_resource(struct pci_controller *, unsigned int,
+			     unsigned int, struct acpi_resource *);
+extern void xen_pcibios_setup_root_windows(struct pci_bus *,
+					   struct pci_controller *);
 
 static inline struct pcifront_device *
 pcifront_get_pdev(struct pcifront_sd *sd)
@@ -56,16 +67,124 @@ pcifront_get_pdev(struct pcifront_sd *sd
 	return (struct pcifront_device *)sd->platform_data;
 }
 
-static inline void pcifront_init_sd(struct pcifront_sd *sd, int domain,
+static inline void pcifront_init_sd(struct pcifront_sd *sd,
+				    unsigned int domain, unsigned int bus,
 				    struct pcifront_device *pdev)
 {
+	int err, i, j, k, len, root_num, res_count;
+	struct acpi_resource res;
+	unsigned int d, b, byte;
+	unsigned long magic;
+	char str[64], tmp[3];
+	unsigned char *buf, *bufp;
+	u8 *ptr;
+
+	memset(sd, 0, sizeof(*sd));
+
 	sd->segment = domain;
-	sd->acpi_handle = NULL;
-	sd->iommu = NULL;
-	sd->node = -1;
-	sd->windows = 0;
-	sd->window = NULL;
+	sd->node = -1;	/* Revisit for NUMA */
 	sd->platform_data = pdev;
+
+	/* Look for resources for this controller in xenbus. */
+	err = xenbus_scanf(XBT_NIL, pdev->xdev->otherend, "root_num",
+			   "%d", &root_num);
+	if (err != 1)
+		return;
+
+	for (i = 0; i < root_num; i++) {
+		len = snprintf(str, sizeof(str), "root-%d", i);
+		if (unlikely(len >= (sizeof(str) - 1)))
+			return;
+
+		err = xenbus_scanf(XBT_NIL, pdev->xdev->otherend,
+				   str, "%x:%x", &d, &b);
+		if (err != 2)
+			return;
+
+		if (d == domain && b == bus)
+			break;
+	}
+
+	if (i == root_num)
+		return;
+
+	len = snprintf(str, sizeof(str), "root-resource-magic");
+
+	err = xenbus_scanf(XBT_NIL, pdev->xdev->otherend,
+			   str, "%lx", &magic);
+
+	if (err != 1)
+		return; /* No resources, nothing to do */
+
+	if (magic != (sizeof(res) * 2) + 1) {
+		printk(KERN_WARNING "pcifront: resource magic mismatch\n");
+		return;
+	}
+
+	len = snprintf(str, sizeof(str), "root-%d-resources", i);
+	if (unlikely(len >= (sizeof(str) - 1)))
+		return;
+
+	err = xenbus_scanf(XBT_NIL, pdev->xdev->otherend,
+			   str, "%d", &res_count);
+
+	if (err != 1)
+		return; /* No resources, nothing to do */
+
+	sd->window = kzalloc(sizeof(*sd->window) * res_count, GFP_KERNEL);
+	if (!sd->window)
+		return;
+
+	/* magic is also the size of the byte stream in xenbus */
+	buf = kmalloc(magic, GFP_KERNEL);
+	if (!buf) {
+		kfree(sd->window);
+		sd->window = NULL;
+		return;
+	}
+
+	/* Read the resources out of xenbus */
+	for (j = 0; j < res_count; j++) {
+		memset(&res, 0, sizeof(res));
+		memset(buf, 0, magic);
+
+		len = snprintf(str, sizeof(str), "root-%d-resource-%d", i, j);
+		if (unlikely(len >= (sizeof(str) - 1)))
+			break;	/* fall through to kfree(buf) below */
+
+		err = xenbus_scanf(XBT_NIL, pdev->xdev->otherend, str,
+				   "%s", buf);
+		if (err != 1) {
+			printk(KERN_WARNING "pcifront: error reading "
+			       "resource %d on bus %04x:%02x\n",
+			       j, domain, bus);
+			continue;
+		}
+
+		bufp = buf;
+		ptr = (u8 *)&res;
+		memset(tmp, 0, sizeof(tmp));
+
+		/* Copy ASCII byte stream into structure */
+		for (k = 0; k < magic - 1; k += 2) {
+			memcpy(tmp, bufp, 2);
+			bufp += 2;
+
+			sscanf(tmp, "%02x", &byte);
+			*ptr = byte;
+			ptr++;
+		}
+
+		xen_add_resource(sd, domain, bus, &res);
+		sd->windows++;
+	}
+	kfree(buf);
+}
+
+static inline void pcifront_setup_root_resources(struct pci_bus *bus,
+						 struct pcifront_sd *sd)
+{
+	xen_pcibios_setup_root_windows(bus, sd);
 }
 
 #endif /* __ia64__ */
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2007-Jun-10  19:13 UTC
[Xen-ia64-devel] Re: [Xen-devel] [RFC][PATCH] "Controller" pcibackend and frontend extensions
On 10/6/07 19:24, "Alex Williamson" <alex.williamson@hp.com> wrote:

> I would like to see this backend become the default backend for ia64,
> but it probably needs testing on other systems before we can make the
> switch.  I would appreciate testing and/or review feedback.  To make use
> of extended I/O port spaces, you'll need the patch I sent to
> xen-ia64-devel last week to register the I/O port spaces with Xen.

It would be nice to have just one backend type, with relevant bits of its
code made arch-specific. I'm sure x86 doesn't need vpci, passthru *and*
slot.

This is probably fine for now; it doesn't preclude merging or dropping other
backends down the line. Only thing I spotted is that you make an enormous
inline function in pcifront.h. That function will have to be moved elsewhere
into a .c file if it's getting that big.

 -- Keir

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel
Alex Williamson
2007-Jun-10  20:18 UTC
Re: [Xen-ia64-devel] Re: [Xen-devel] [RFC][PATCH] "Controller" pcibackend and frontend extensions
On Sun, 2007-06-10 at 20:13 +0100, Keir Fraser wrote:
> On 10/6/07 19:24, "Alex Williamson" <alex.williamson@hp.com> wrote:
>
> > I would like to see this backend become the default backend for ia64,
> > but it probably needs testing on other systems before we can make the
> > switch.  I would appreciate testing and/or review feedback.  To make use
> > of extended I/O port spaces, you'll need the patch I sent to
> > xen-ia64-devel last week to register the I/O port spaces with Xen.

Hi Keir,

   Thanks for looking at the patch.

> It would be nice to have just one backend type, with relevant bits of its
> code made arch-specific. I'm sure x86 doesn't need vpci, passthru *and*
> slot.

   I agree, but I wasn't ready to tackle that problem.

> This is probably fine for now; it doesn't preclude merging or dropping other
> backends down the line. Only thing I spotted is that you make an enormous
> inline function in pcifront.h. That function will have to be moved elsewhere
> into a .c file if it's getting that big.

   Right, I forgot about that.  It is rather unwieldy for a static
inline.  I guess pcifront/pci_op.c is the most appropriate place for it.
I'll tuck it in a #ifdef there unless you have a better suggestion.
Thanks,

	Alex

-- 
Alex Williamson                             HP Open Source & Linux Org.