Yuji Shimada
2008-Oct-08 23:43 UTC
[Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
This patch adds the function that reassign page-aligned memory resources, to dom0 linux. The function is useful when we assign I/O device to HVM domain using pci passthrough. When we assign a device to HVM domain using pci passthrough, the device needs to be assigned page-aligned memory resources. If the memory resource is not page-aligned, following error occurs. Error: pci: 0000:00:1d.7: non-page-aligned MMIO BAR found. On many system, BIOS assigns memory resources to the device and enables it. So my patch disables the device, and releases resources. Then it assigns page-aligned memory resource to the device. Note: we don''t need to re-enable device, because guest software will enable it. To reassign resources, please add boot parameters of dom0 linux as follows. reassign_resources reassigndev=00:1d.7,01:00.0 reassign_resources Enables reassigning resources. reassigndev= Specifies devices include I/O device and PCI-PCI bridge to reassign resources. PCI-PCI bridge can be specified, if resource windows need to be expanded. You can easily improve the way of specifying device to reassign, changing the code of reassigndev.c. There is a similar function enabled by pci-mem-align boot parameter. Currently it is kept. But if many people agree with me, I''d like to remove it from dom0 linux, because there are two problems. - pci-mem-align reassigns all devices'' memory resources if they are not page-aligned. This is not safe, because some devices are used by firmware. - pci-mem-align can''t expand resource window of PCI-PCI bridge. Thanks, -- Yuji Shimada Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp> diff -r b54652ee29ef drivers/pci/Makefile --- a/drivers/pci/Makefile Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/Makefile Wed Oct 08 12:12:27 2008 +0900 @@ -3,7 +3,8 @@ # obj-y += access.o bus.o probe.o remove.o pci.o quirks.o \ - pci-driver.o search.o pci-sysfs.o rom.o setup-res.o + pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \ + reassigndev.o obj-$(CONFIG_PROC_FS) += proc.o # Build PCI Express stuff if needed diff -r b54652ee29ef drivers/pci/pci.h --- a/drivers/pci/pci.h Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/pci.h Wed Oct 08 12:12:27 2008 +0900 @@ -99,3 +99,8 @@ return NULL; } +#define ROUND_UP_TO_PAGESIZE(size) ((size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)) + +extern int reassign_resources; +extern int is_reassigndev(struct pci_dev *dev); +extern void pci_update_bridge(struct pci_dev *dev, int resno); diff -r b54652ee29ef drivers/pci/quirks.c --- a/drivers/pci/quirks.c Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/quirks.c Wed Oct 08 12:12:27 2008 +0900 @@ -33,6 +33,19 @@ } __setup("pci-mem-align", set_pci_mem_align); + +int reassign_resources = 0; + +static int __init set_reassign_resources(char *str) +{ + /* resources reassign on */ + reassign_resources = 1; + printk(KERN_DEBUG "PCI: resource reassign ON.\n"); + + return 1; +} +__setup("reassign_resources", set_reassign_resources); + /* This quirk function enables us to force all memory resources which are * assigned to PCI devices, to be page-aligned. */ @@ -41,6 +54,42 @@ int i; struct resource *r; resource_size_t old_start; + + if (reassign_resources) { + if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL && + (dev->class >> 8) == PCI_CLASS_BRIDGE_HOST) { + /* PCI Host Bridge isn''t a target device */ + return; + } + if (is_reassigndev(dev)) { + printk(KERN_INFO + "PCI: Disable device and release resources" + " [%s].\n", pci_name(dev)); + pci_disable_device(dev); + + for (i=0; i < PCI_NUM_RESOURCES; i++) { + r = &dev->resource[i]; + if ((r == NULL) || + !(r->flags & IORESOURCE_MEM)) + continue; + + r->end = r->end - r->start; + r->start = 0; + + if (i < PCI_BRIDGE_RESOURCES) { + pci_update_resource(dev, r, i); + } else if (i == 8 || i == 9) { + /* need to update(clear) the Base/Limit + * register also, because PCI bridge is + * disabled and the resource is + * released. + */ + pci_update_bridge(dev, i); + } + } + } + return; + } if (!pci_mem_align) return; diff -r b54652ee29ef drivers/pci/reassigndev.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/drivers/pci/reassigndev.c Wed Oct 08 12:12:27 2008 +0900 @@ -0,0 +1,81 @@ +/* + * Copyright (c) 2008, NEC Corporation. + * Taro Nichiden <t-nichiden at ab jp nec com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/kernel.h> +#include <linux/pci.h> +#include <linux/string.h> +#include "pci.h" + + +#define REASSIGNDEV_PARAM_MAX (2048) +#define TOKEN_MAX (12) /* "SSSS:BB:DD.F" length is 12 */ + +static char param_reassigndev[REASSIGNDEV_PARAM_MAX] = {0}; + +static int __init reassigndev_setup(char *str) +{ + strncpy(param_reassigndev, str, REASSIGNDEV_PARAM_MAX); + param_reassigndev[REASSIGNDEV_PARAM_MAX - 1] = ''\0''; + return 1; +} +__setup("reassigndev=", reassigndev_setup); + +int is_reassigndev(struct pci_dev *dev) +{ + char dev_str[TOKEN_MAX+1]; + int seg, bus, slot, func; + int len; + char *p, *next_str; + + p = param_reassigndev; + for (; p; p = next_str + 1) { + next_str = strpbrk(p, ","); + if (next_str) { + len = next_str - p; + } else { + len = strlen(p); + } + if (len > 0 && len <= TOKEN_MAX) { + strncpy(dev_str, p, len); + *(dev_str + len) = ''\0''; + + if (sscanf(dev_str, "%x:%x:%x.%x", + &seg, &bus, &slot, &func) != 4) { + if (sscanf(dev_str, "%x:%x.%x", + &bus, &slot, &func) == 3) { + seg = 0; + } else { + /* failed to scan strings */ + seg = -1; + bus = -1; + } + } + if (seg == pci_domain_nr(dev->bus) && + bus == dev->bus->number && + slot == PCI_SLOT(dev->devfn) && + func == PCI_FUNC(dev->devfn)) { + /* It''s a target device */ + return 1; + } + } + if (!next_str) + break; + } + + return 0; +} diff -r b54652ee29ef drivers/pci/setup-bus.c --- a/drivers/pci/setup-bus.c Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/setup-bus.c Wed Oct 08 12:12:27 2008 +0900 @@ -26,6 +26,7 @@ #include <linux/cache.h> #include <linux/slab.h> +#include "pci.h" #define DEBUG_CONFIG 1 #if DEBUG_CONFIG @@ -344,7 +345,8 @@ list_for_each_entry(dev, &bus->devices, bus_list) { int i; - + int reassign = reassign_resources ? is_reassigndev(dev) : 0; + for (i = 0; i < PCI_NUM_RESOURCES; i++) { struct resource *r = &dev->resource[i]; unsigned long r_size; @@ -352,6 +354,11 @@ if (r->parent || (r->flags & mask) != type) continue; r_size = r->end - r->start + 1; + + if (reassign) { + r_size = ROUND_UP_TO_PAGESIZE(r_size); + } + /* For bridges size != alignment */ align = (i < PCI_BRIDGE_RESOURCES) ? r_size : r->start; order = __ffs(align) - 20; diff -r b54652ee29ef drivers/pci/setup-res.c --- a/drivers/pci/setup-res.c Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/setup-res.c Wed Oct 08 12:12:27 2008 +0900 @@ -117,19 +117,96 @@ } EXPORT_SYMBOL_GPL(pci_claim_resource); +void +pci_update_bridge(struct pci_dev *dev, int resno) +{ + struct resource *res = &dev->resource[resno]; + struct pci_bus_region region; + u32 l, dw, base_up32, limit_up32; + + if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE || + (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) { + return; + } + + if (!res->flags) + return; + + switch (resno) { + case 8 : /* MMIO Base/Limit */ + pcibios_resource_to_bus(dev, ®ion, res); + if (res->flags & IORESOURCE_MEM && + !(res->flags & IORESOURCE_PREFETCH)) { + l = (region.start >> 16) & 0xfff0; + l |= region.end & 0xfff00000; + } else { + l = 0x0000fff0; + } + pci_write_config_dword(dev, PCI_MEMORY_BASE, l); + + break; + + case 9 : /* Prefetchable MMIO Base/Limit */ + /* Clear out the upper 32 bits of PREF limit. + * If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily + * disables PREF range, which is ok. + */ + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 0); + + /* Get PREF 32/64 bits Addressing mode */ + pci_read_config_dword(dev, PCI_PREF_MEMORY_BASE, &dw); + + pcibios_resource_to_bus(dev, ®ion, res); + if (res->flags & IORESOURCE_MEM && + res->flags & IORESOURCE_PREFETCH) { + l = (region.start >> 16) & 0xfff0; + l |= region.end & 0xfff00000; + + if (dw & PCI_PREF_RANGE_TYPE_64) { + base_up32 = (region.start >> 32) & 0xffffffff; + limit_up32 = (region.end >> 32) & 0xffffffff; + } else { + base_up32 = 0; + limit_up32 = 0; + } + } else { + l = 0x0000fff0; + base_up32 = 0xffffffff; + limit_up32 = 0; + } + pci_write_config_dword(dev, PCI_PREF_MEMORY_BASE, l); + /* Set up the upper 32 bits of PREF base/limit. */ + pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, base_up32); + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, limit_up32); + break; + default : + BUG(); + break; + } +} + int pci_assign_resource(struct pci_dev *dev, int resno) { struct pci_bus *bus = dev->bus; struct resource *res = dev->resource + resno; resource_size_t size, min, align; int ret; + int reassigndev = reassign_resources ? is_reassigndev(dev) : 0; size = res->end - res->start + 1; min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM; /* The bridge resources are special, as their size != alignment. Sizing routines return required alignment in the "start" field. */ - align = (resno < PCI_BRIDGE_RESOURCES) ? size : res->start; + if (resno < PCI_BRIDGE_RESOURCES) { + align = size; + if ((reassigndev) && + (res->flags & IORESOURCE_MEM)) { + align = ROUND_UP_TO_PAGESIZE(align); + } + } else { + align = res->start; + } /* First, try exact prefetching match.. */ ret = pci_bus_alloc_resource(bus, res, size, align, min, @@ -154,6 +231,9 @@ resno, (unsigned long long)size, (unsigned long long)res->start, pci_name(dev)); } else if (resno < PCI_BRIDGE_RESOURCES) { + printk(KERN_DEBUG "PCI: Assign resource(%d) on %s " + "%016llx - %016llx\n", resno, pci_name(dev), + (u64)res->start, (u64)res->end); pci_update_resource(dev, res, resno); } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Neo Jia
2008-Oct-09 01:04 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
So, for a I/O device, if we don''t see the error message you mentioned before, it probably don''t need this, right? Thanks, Neo On Wed, Oct 8, 2008 at 4:43 PM, Yuji Shimada <shimada-yxb@necst.nec.co.jp> wrote:> This patch adds the function that reassign page-aligned memory > resources, to dom0 linux. The function is useful when we assign I/O > device to HVM domain using pci passthrough. > > > When we assign a device to HVM domain using pci passthrough, > the device needs to be assigned page-aligned memory resources. If the > memory resource is not page-aligned, following error occurs. > > Error: pci: 0000:00:1d.7: non-page-aligned MMIO BAR found. > > On many system, BIOS assigns memory resources to the device and > enables it. So my patch disables the device, and releases resources. > Then it assigns page-aligned memory resource to the device. > > Note: we don''t need to re-enable device, because guest software will > enable it. > > > To reassign resources, please add boot parameters of dom0 linux as follows. > > reassign_resources reassigndev=00:1d.7,01:00.0 > > reassign_resources > Enables reassigning resources. > > reassigndev= Specifies devices include I/O device and PCI-PCI > bridge to reassign resources. PCI-PCI bridge > can be specified, if resource windows need to > be expanded. > > You can easily improve the way of specifying device to reassign, > changing the code of reassigndev.c. > > > There is a similar function enabled by pci-mem-align boot > parameter. Currently it is kept. But if many people agree with me, I''d > like to remove it from dom0 linux, because there are two problems. > > - pci-mem-align reassigns all devices'' memory resources if they are > not page-aligned. This is not safe, because some devices are > used by firmware. > - pci-mem-align can''t expand resource window of PCI-PCI bridge. > > Thanks, > -- > Yuji Shimada > > > Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp> > > diff -r b54652ee29ef drivers/pci/Makefile > --- a/drivers/pci/Makefile Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/Makefile Wed Oct 08 12:12:27 2008 +0900 > @@ -3,7 +3,8 @@ > # > > obj-y += access.o bus.o probe.o remove.o pci.o quirks.o \ > - pci-driver.o search.o pci-sysfs.o rom.o setup-res.o > + pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \ > + reassigndev.o > obj-$(CONFIG_PROC_FS) += proc.o > > # Build PCI Express stuff if needed > diff -r b54652ee29ef drivers/pci/pci.h > --- a/drivers/pci/pci.h Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/pci.h Wed Oct 08 12:12:27 2008 +0900 > @@ -99,3 +99,8 @@ > return NULL; > } > > +#define ROUND_UP_TO_PAGESIZE(size) ((size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)) > + > +extern int reassign_resources; > +extern int is_reassigndev(struct pci_dev *dev); > +extern void pci_update_bridge(struct pci_dev *dev, int resno); > diff -r b54652ee29ef drivers/pci/quirks.c > --- a/drivers/pci/quirks.c Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/quirks.c Wed Oct 08 12:12:27 2008 +0900 > @@ -33,6 +33,19 @@ > } > __setup("pci-mem-align", set_pci_mem_align); > > + > +int reassign_resources = 0; > + > +static int __init set_reassign_resources(char *str) > +{ > + /* resources reassign on */ > + reassign_resources = 1; > + printk(KERN_DEBUG "PCI: resource reassign ON.\n"); > + > + return 1; > +} > +__setup("reassign_resources", set_reassign_resources); > + > /* This quirk function enables us to force all memory resources which are > * assigned to PCI devices, to be page-aligned. > */ > @@ -41,6 +54,42 @@ > int i; > struct resource *r; > resource_size_t old_start; > + > + if (reassign_resources) { > + if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL && > + (dev->class >> 8) == PCI_CLASS_BRIDGE_HOST) { > + /* PCI Host Bridge isn''t a target device */ > + return; > + } > + if (is_reassigndev(dev)) { > + printk(KERN_INFO > + "PCI: Disable device and release resources" > + " [%s].\n", pci_name(dev)); > + pci_disable_device(dev); > + > + for (i=0; i < PCI_NUM_RESOURCES; i++) { > + r = &dev->resource[i]; > + if ((r == NULL) || > + !(r->flags & IORESOURCE_MEM)) > + continue; > + > + r->end = r->end - r->start; > + r->start = 0; > + > + if (i < PCI_BRIDGE_RESOURCES) { > + pci_update_resource(dev, r, i); > + } else if (i == 8 || i == 9) { > + /* need to update(clear) the Base/Limit > + * register also, because PCI bridge is > + * disabled and the resource is > + * released. > + */ > + pci_update_bridge(dev, i); > + } > + } > + } > + return; > + } > > if (!pci_mem_align) > return; > diff -r b54652ee29ef drivers/pci/reassigndev.c > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 > +++ b/drivers/pci/reassigndev.c Wed Oct 08 12:12:27 2008 +0900 > @@ -0,0 +1,81 @@ > +/* > + * Copyright (c) 2008, NEC Corporation. > + * Taro Nichiden <t-nichiden at ab jp nec com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. > + */ > + > +#include <linux/kernel.h> > +#include <linux/pci.h> > +#include <linux/string.h> > +#include "pci.h" > + > + > +#define REASSIGNDEV_PARAM_MAX (2048) > +#define TOKEN_MAX (12) /* "SSSS:BB:DD.F" length is 12 */ > + > +static char param_reassigndev[REASSIGNDEV_PARAM_MAX] = {0}; > + > +static int __init reassigndev_setup(char *str) > +{ > + strncpy(param_reassigndev, str, REASSIGNDEV_PARAM_MAX); > + param_reassigndev[REASSIGNDEV_PARAM_MAX - 1] = ''\0''; > + return 1; > +} > +__setup("reassigndev=", reassigndev_setup); > + > +int is_reassigndev(struct pci_dev *dev) > +{ > + char dev_str[TOKEN_MAX+1]; > + int seg, bus, slot, func; > + int len; > + char *p, *next_str; > + > + p = param_reassigndev; > + for (; p; p = next_str + 1) { > + next_str = strpbrk(p, ","); > + if (next_str) { > + len = next_str - p; > + } else { > + len = strlen(p); > + } > + if (len > 0 && len <= TOKEN_MAX) { > + strncpy(dev_str, p, len); > + *(dev_str + len) = ''\0''; > + > + if (sscanf(dev_str, "%x:%x:%x.%x", > + &seg, &bus, &slot, &func) != 4) { > + if (sscanf(dev_str, "%x:%x.%x", > + &bus, &slot, &func) == 3) { > + seg = 0; > + } else { > + /* failed to scan strings */ > + seg = -1; > + bus = -1; > + } > + } > + if (seg == pci_domain_nr(dev->bus) && > + bus == dev->bus->number && > + slot == PCI_SLOT(dev->devfn) && > + func == PCI_FUNC(dev->devfn)) { > + /* It''s a target device */ > + return 1; > + } > + } > + if (!next_str) > + break; > + } > + > + return 0; > +} > diff -r b54652ee29ef drivers/pci/setup-bus.c > --- a/drivers/pci/setup-bus.c Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/setup-bus.c Wed Oct 08 12:12:27 2008 +0900 > @@ -26,6 +26,7 @@ > #include <linux/cache.h> > #include <linux/slab.h> > > +#include "pci.h" > > #define DEBUG_CONFIG 1 > #if DEBUG_CONFIG > @@ -344,7 +345,8 @@ > > list_for_each_entry(dev, &bus->devices, bus_list) { > int i; > - > + int reassign = reassign_resources ? is_reassigndev(dev) : 0; > + > for (i = 0; i < PCI_NUM_RESOURCES; i++) { > struct resource *r = &dev->resource[i]; > unsigned long r_size; > @@ -352,6 +354,11 @@ > if (r->parent || (r->flags & mask) != type) > continue; > r_size = r->end - r->start + 1; > + > + if (reassign) { > + r_size = ROUND_UP_TO_PAGESIZE(r_size); > + } > + > /* For bridges size != alignment */ > align = (i < PCI_BRIDGE_RESOURCES) ? r_size : r->start; > order = __ffs(align) - 20; > diff -r b54652ee29ef drivers/pci/setup-res.c > --- a/drivers/pci/setup-res.c Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/setup-res.c Wed Oct 08 12:12:27 2008 +0900 > @@ -117,19 +117,96 @@ > } > EXPORT_SYMBOL_GPL(pci_claim_resource); > > +void > +pci_update_bridge(struct pci_dev *dev, int resno) > +{ > + struct resource *res = &dev->resource[resno]; > + struct pci_bus_region region; > + u32 l, dw, base_up32, limit_up32; > + > + if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE || > + (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) { > + return; > + } > + > + if (!res->flags) > + return; > + > + switch (resno) { > + case 8 : /* MMIO Base/Limit */ > + pcibios_resource_to_bus(dev, ®ion, res); > + if (res->flags & IORESOURCE_MEM && > + !(res->flags & IORESOURCE_PREFETCH)) { > + l = (region.start >> 16) & 0xfff0; > + l |= region.end & 0xfff00000; > + } else { > + l = 0x0000fff0; > + } > + pci_write_config_dword(dev, PCI_MEMORY_BASE, l); > + > + break; > + > + case 9 : /* Prefetchable MMIO Base/Limit */ > + /* Clear out the upper 32 bits of PREF limit. > + * If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily > + * disables PREF range, which is ok. > + */ > + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 0); > + > + /* Get PREF 32/64 bits Addressing mode */ > + pci_read_config_dword(dev, PCI_PREF_MEMORY_BASE, &dw); > + > + pcibios_resource_to_bus(dev, ®ion, res); > + if (res->flags & IORESOURCE_MEM && > + res->flags & IORESOURCE_PREFETCH) { > + l = (region.start >> 16) & 0xfff0; > + l |= region.end & 0xfff00000; > + > + if (dw & PCI_PREF_RANGE_TYPE_64) { > + base_up32 = (region.start >> 32) & 0xffffffff; > + limit_up32 = (region.end >> 32) & 0xffffffff; > + } else { > + base_up32 = 0; > + limit_up32 = 0; > + } > + } else { > + l = 0x0000fff0; > + base_up32 = 0xffffffff; > + limit_up32 = 0; > + } > + pci_write_config_dword(dev, PCI_PREF_MEMORY_BASE, l); > + /* Set up the upper 32 bits of PREF base/limit. */ > + pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, base_up32); > + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, limit_up32); > + break; > + default : > + BUG(); > + break; > + } > +} > + > int pci_assign_resource(struct pci_dev *dev, int resno) > { > struct pci_bus *bus = dev->bus; > struct resource *res = dev->resource + resno; > resource_size_t size, min, align; > int ret; > + int reassigndev = reassign_resources ? is_reassigndev(dev) : 0; > > size = res->end - res->start + 1; > min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM; > /* The bridge resources are special, as their > size != alignment. Sizing routines return > required alignment in the "start" field. */ > - align = (resno < PCI_BRIDGE_RESOURCES) ? size : res->start; > + if (resno < PCI_BRIDGE_RESOURCES) { > + align = size; > + if ((reassigndev) && > + (res->flags & IORESOURCE_MEM)) { > + align = ROUND_UP_TO_PAGESIZE(align); > + } > + } else { > + align = res->start; > + } > > /* First, try exact prefetching match.. */ > ret = pci_bus_alloc_resource(bus, res, size, align, min, > @@ -154,6 +231,9 @@ > resno, (unsigned long long)size, > (unsigned long long)res->start, pci_name(dev)); > } else if (resno < PCI_BRIDGE_RESOURCES) { > + printk(KERN_DEBUG "PCI: Assign resource(%d) on %s " > + "%016llx - %016llx\n", resno, pci_name(dev), > + (u64)res->start, (u64)res->end); > pci_update_resource(dev, res, resno); > } > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel >-- I would remember that if researchers were not ambitious probably today we haven''t the technology we are using! _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Zhao, Yu
2008-Oct-09 03:21 UTC
RE: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
On Thursday, October 09, 2008 7:44 AM, Yuji Shimada wrote:>There is a similar function enabled by pci-mem-align boot >parameter. Currently it is kept. But if many people agree with me, I''d >like to remove it from dom0 linux, because there are two problems. > > - pci-mem-align reassigns all devices'' memory resources if they are > not page-aligned. This is not safe, because some devices are > used by firmware. > - pci-mem-align can''t expand resource window of PCI-PCI bridge.Why the resource windows can''t be expended? I think the pci_bus_size_bridges takes care of this. So it shouldn''t be a problem and your pci_update_bridge appears unnecessary. Regarding the first problem you mentioned, it could be easily fixed by merging your "rassigndev=" parameter to quirk_align_mem_resources so the function can selective twist the devices. Thanks, Yu _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Yuji Shimada
2008-Oct-09 07:20 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
I am resending the patch, because I sent it with a wrong copyright. This patch adds the function that reassign page-aligned memory resources, to dom0 linux. The function is useful when we assign I/O device to HVM domain using pci passthrough. When we assign a device to HVM domain using pci passthrough, the device needs to be assigned page-aligned memory resources. If the memory resource is not page-aligned, following error occurs. Error: pci: 0000:00:1d.7: non-page-aligned MMIO BAR found. On many system, BIOS assigns memory resources to the device and enables it. So my patch disables the device, and releases resources, Then it assigns page-aligned memory resource to the device. Note: we don''t need to re-enable device, because guest software will enable it. To reassign resources, please add boot parameters of dom0 linux as follows. reassign_resources reassigndev=00:1d.7,01:00.0 reassign_resources Enables reassigning resources. reassigndev= Specifies devices include I/O device and PCI-PCI bridge to reassign resources. PCI-PCI bridge can be specified, if resource windows need to be expanded. You can easily improve the way of specifying device to reassign, changing the code of reassigndev.c. There is a similar function enabled by pci-mem-align boot parameter. Currently it is kept. But if many people agree with me, I''d like to remove it from dom0 linux, because there are two problems. - pci-mem-align reassigns all devices'' memory resources if they are not page-aligned. This is not safe, because some devices are used by firmware. - pci-mem-align can''t expand resource window of PCI-PCI bridge. Thanks, -- Yuji Shimada Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp> diff -r b54652ee29ef drivers/pci/Makefile --- a/drivers/pci/Makefile Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/Makefile Wed Oct 08 12:12:27 2008 +0900 @@ -3,7 +3,8 @@ # obj-y += access.o bus.o probe.o remove.o pci.o quirks.o \ - pci-driver.o search.o pci-sysfs.o rom.o setup-res.o + pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \ + reassigndev.o obj-$(CONFIG_PROC_FS) += proc.o # Build PCI Express stuff if needed diff -r b54652ee29ef drivers/pci/pci.h --- a/drivers/pci/pci.h Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/pci.h Wed Oct 08 12:12:27 2008 +0900 @@ -99,3 +99,8 @@ return NULL; } +#define ROUND_UP_TO_PAGESIZE(size) ((size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)) + +extern int reassign_resources; +extern int is_reassigndev(struct pci_dev *dev); +extern void pci_update_bridge(struct pci_dev *dev, int resno); diff -r b54652ee29ef drivers/pci/quirks.c --- a/drivers/pci/quirks.c Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/quirks.c Wed Oct 08 12:12:27 2008 +0900 @@ -33,6 +33,19 @@ } __setup("pci-mem-align", set_pci_mem_align); + +int reassign_resources = 0; + +static int __init set_reassign_resources(char *str) +{ + /* resources reassign on */ + reassign_resources = 1; + printk(KERN_DEBUG "PCI: resource reassign ON.\n"); + + return 1; +} +__setup("reassign_resources", set_reassign_resources); + /* This quirk function enables us to force all memory resources which are * assigned to PCI devices, to be page-aligned. */ @@ -41,6 +54,42 @@ int i; struct resource *r; resource_size_t old_start; + + if (reassign_resources) { + if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL && + (dev->class >> 8) == PCI_CLASS_BRIDGE_HOST) { + /* PCI Host Bridge isn''t a target device */ + return; + } + if (is_reassigndev(dev)) { + printk(KERN_INFO + "PCI: Disable device and release resources" + " [%s].\n", pci_name(dev)); + pci_disable_device(dev); + + for (i=0; i < PCI_NUM_RESOURCES; i++) { + r = &dev->resource[i]; + if ((r == NULL) || + !(r->flags & IORESOURCE_MEM)) + continue; + + r->end = r->end - r->start; + r->start = 0; + + if (i < PCI_BRIDGE_RESOURCES) { + pci_update_resource(dev, r, i); + } else if (i == 8 || i == 9) { + /* need to update(clear) the Base/Limit + * register also, because PCI bridge is + * disabled and the resource is + * released. + */ + pci_update_bridge(dev, i); + } + } + } + return; + } if (!pci_mem_align) return; diff -r b54652ee29ef drivers/pci/reassigndev.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/drivers/pci/reassigndev.c Wed Oct 08 12:12:27 2008 +0900 @@ -0,0 +1,80 @@ +/* + * Copyright (c) 2008, NEC Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/kernel.h> +#include <linux/pci.h> +#include <linux/string.h> +#include "pci.h" + + +#define REASSIGNDEV_PARAM_MAX (2048) +#define TOKEN_MAX (12) /* "SSSS:BB:DD.F" length is 12 */ + +static char param_reassigndev[REASSIGNDEV_PARAM_MAX] = {0}; + +static int __init reassigndev_setup(char *str) +{ + strncpy(param_reassigndev, str, REASSIGNDEV_PARAM_MAX); + param_reassigndev[REASSIGNDEV_PARAM_MAX - 1] = ''\0''; + return 1; +} +__setup("reassigndev=", reassigndev_setup); + +int is_reassigndev(struct pci_dev *dev) +{ + char dev_str[TOKEN_MAX+1]; + int seg, bus, slot, func; + int len; + char *p, *next_str; + + p = param_reassigndev; + for (; p; p = next_str + 1) { + next_str = strpbrk(p, ","); + if (next_str) { + len = next_str - p; + } else { + len = strlen(p); + } + if (len > 0 && len <= TOKEN_MAX) { + strncpy(dev_str, p, len); + *(dev_str + len) = ''\0''; + + if (sscanf(dev_str, "%x:%x:%x.%x", + &seg, &bus, &slot, &func) != 4) { + if (sscanf(dev_str, "%x:%x.%x", + &bus, &slot, &func) == 3) { + seg = 0; + } else { + /* failed to scan strings */ + seg = -1; + bus = -1; + } + } + if (seg == pci_domain_nr(dev->bus) && + bus == dev->bus->number && + slot == PCI_SLOT(dev->devfn) && + func == PCI_FUNC(dev->devfn)) { + /* It''s a target device */ + return 1; + } + } + if (!next_str) + break; + } + + return 0; +} diff -r b54652ee29ef drivers/pci/setup-bus.c --- a/drivers/pci/setup-bus.c Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/setup-bus.c Wed Oct 08 12:12:27 2008 +0900 @@ -26,6 +26,7 @@ #include <linux/cache.h> #include <linux/slab.h> +#include "pci.h" #define DEBUG_CONFIG 1 #if DEBUG_CONFIG @@ -344,7 +345,8 @@ list_for_each_entry(dev, &bus->devices, bus_list) { int i; - + int reassign = reassign_resources ? is_reassigndev(dev) : 0; + for (i = 0; i < PCI_NUM_RESOURCES; i++) { struct resource *r = &dev->resource[i]; unsigned long r_size; @@ -352,6 +354,11 @@ if (r->parent || (r->flags & mask) != type) continue; r_size = r->end - r->start + 1; + + if (reassign) { + r_size = ROUND_UP_TO_PAGESIZE(r_size); + } + /* For bridges size != alignment */ align = (i < PCI_BRIDGE_RESOURCES) ? r_size : r->start; order = __ffs(align) - 20; diff -r b54652ee29ef drivers/pci/setup-res.c --- a/drivers/pci/setup-res.c Thu Oct 02 11:29:02 2008 +0100 +++ b/drivers/pci/setup-res.c Wed Oct 08 12:12:27 2008 +0900 @@ -117,19 +117,96 @@ } EXPORT_SYMBOL_GPL(pci_claim_resource); +void +pci_update_bridge(struct pci_dev *dev, int resno) +{ + struct resource *res = &dev->resource[resno]; + struct pci_bus_region region; + u32 l, dw, base_up32, limit_up32; + + if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE || + (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) { + return; + } + + if (!res->flags) + return; + + switch (resno) { + case 8 : /* MMIO Base/Limit */ + pcibios_resource_to_bus(dev, ®ion, res); + if (res->flags & IORESOURCE_MEM && + !(res->flags & IORESOURCE_PREFETCH)) { + l = (region.start >> 16) & 0xfff0; + l |= region.end & 0xfff00000; + } else { + l = 0x0000fff0; + } + pci_write_config_dword(dev, PCI_MEMORY_BASE, l); + + break; + + case 9 : /* Prefetchable MMIO Base/Limit */ + /* Clear out the upper 32 bits of PREF limit. + * If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily + * disables PREF range, which is ok. + */ + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 0); + + /* Get PREF 32/64 bits Addressing mode */ + pci_read_config_dword(dev, PCI_PREF_MEMORY_BASE, &dw); + + pcibios_resource_to_bus(dev, ®ion, res); + if (res->flags & IORESOURCE_MEM && + res->flags & IORESOURCE_PREFETCH) { + l = (region.start >> 16) & 0xfff0; + l |= region.end & 0xfff00000; + + if (dw & PCI_PREF_RANGE_TYPE_64) { + base_up32 = (region.start >> 32) & 0xffffffff; + limit_up32 = (region.end >> 32) & 0xffffffff; + } else { + base_up32 = 0; + limit_up32 = 0; + } + } else { + l = 0x0000fff0; + base_up32 = 0xffffffff; + limit_up32 = 0; + } + pci_write_config_dword(dev, PCI_PREF_MEMORY_BASE, l); + /* Set up the upper 32 bits of PREF base/limit. */ + pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, base_up32); + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, limit_up32); + break; + default : + BUG(); + break; + } +} + int pci_assign_resource(struct pci_dev *dev, int resno) { struct pci_bus *bus = dev->bus; struct resource *res = dev->resource + resno; resource_size_t size, min, align; int ret; + int reassigndev = reassign_resources ? is_reassigndev(dev) : 0; size = res->end - res->start + 1; min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM; /* The bridge resources are special, as their size != alignment. Sizing routines return required alignment in the "start" field. */ - align = (resno < PCI_BRIDGE_RESOURCES) ? size : res->start; + if (resno < PCI_BRIDGE_RESOURCES) { + align = size; + if ((reassigndev) && + (res->flags & IORESOURCE_MEM)) { + align = ROUND_UP_TO_PAGESIZE(align); + } + } else { + align = res->start; + } /* First, try exact prefetching match.. */ ret = pci_bus_alloc_resource(bus, res, size, align, min, @@ -154,6 +231,9 @@ resno, (unsigned long long)size, (unsigned long long)res->start, pci_name(dev)); } else if (resno < PCI_BRIDGE_RESOURCES) { + printk(KERN_DEBUG "PCI: Assign resource(%d) on %s " + "%016llx - %016llx\n", resno, pci_name(dev), + (u64)res->start, (u64)res->end); pci_update_resource(dev, res, resno); } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Yuji Shimada
2008-Oct-09 08:04 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
Hi, Neo, On Wed, 8 Oct 2008 18:04:56 -0700 "Neo Jia" <neojia@gmail.com> wrote:> So, for a I/O device, if we don''t see the error message you mentioned > before, it probably don''t need this, right?You are right. If we don''t see the following error message, we don''t need to add dom0 boot parameters of my patch. Error: pci: 0000:00:1d.7: non-page-aligned MMIO BAR found. Thanks, -- Yuji Shimada _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Yuji Shimada
2008-Oct-09 08:54 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
On Thu, 9 Oct 2008 11:21:22 +0800 "Zhao, Yu" <yu.zhao@intel.com> wrote:> On Thursday, October 09, 2008 7:44 AM, Yuji Shimada wrote: > >There is a similar function enabled by pci-mem-align boot > >parameter. Currently it is kept. But if many people agree with me, I''d > >like to remove it from dom0 linux, because there are two problems. > > > > - pci-mem-align reassigns all devices'' memory resources if they are > > not page-aligned. This is not safe, because some devices are > > used by firmware. > > - pci-mem-align can''t expand resource window of PCI-PCI bridge. > > Why the resource windows can''t be expended? I think the > pci_bus_size_bridges takes care of this. So it shouldn''t be a > problem and your pci_update_bridge appears unnecessary.Current pci_bus_size_bridges calculates the size and minimal alignment of resource window based on actual resource size. The values of them will be not changed if pci-mem-align is enabled. But if we make memory resource page-aligned, we need more space. My patch rounds up resource size to page size, on calculating the size of resource windows.> Regarding the first problem you mentioned, it could be easily fixed > by merging your "rassigndev=" parameter to quirk_align_mem_resources > so the function can selective twist the devices.If my "rassigndev=" parameter is merged to quirk_align_mem_resources, problem will occurs. Current quirk_align_mem_resources shifts resources simply. Resources will conflict easily. If confliction occurs, kernel reassigns conflicting resources. As a result, the resources of device which are not specified are reassigned. Additionally, the reassigned resource is not page-aligned. Thanks, -- Yuji Shimada _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Zhao, Yu
2008-Oct-09 11:34 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
Yuji Shimada wrote:> On Thu, 9 Oct 2008 11:21:22 +0800 > "Zhao, Yu" <yu.zhao@intel.com> wrote: > >> On Thursday, October 09, 2008 7:44 AM, Yuji Shimada wrote: >>> There is a similar function enabled by pci-mem-align boot >>> parameter. Currently it is kept. But if many people agree with me, I''d >>> like to remove it from dom0 linux, because there are two problems. >>> >>> - pci-mem-align reassigns all devices'' memory resources if they are >>> not page-aligned. This is not safe, because some devices are >>> used by firmware. >>> - pci-mem-align can''t expand resource window of PCI-PCI bridge. >> Why the resource windows can''t be expended? I think the >> pci_bus_size_bridges takes care of this. So it shouldn''t be a >> problem and your pci_update_bridge appears unnecessary. > > Current pci_bus_size_bridges calculates the size and minimal alignment > of resource window based on actual resource size. The values of them > will be not changed if pci-mem-align is enabled.Yes, that''s a problem in pbus_size_mem, and should be fixed there -- say, make it consistent with pci_bus_alloc_resource by fixing up the alignment.> > But if we make memory resource page-aligned, we need more space. My > patch rounds up resource size to page size, on calculating the size > of resource windows.The calculation of the window size is incorrect because it''s still based on wrong values from pbus_size_mem. And the base addresses may be overwritten by pci_setup_bridge ultimately.> > >> Regarding the first problem you mentioned, it could be easily fixed >> by merging your "rassigndev=" parameter to quirk_align_mem_resources >> so the function can selective twist the devices. > > If my "rassigndev=" parameter is merged to quirk_align_mem_resources, > problem will occurs. Current quirk_align_mem_resources shifts > resources simply. Resources will conflict easily. If confliction > occurs, kernel reassigns conflicting resources. As a result, the > resources of device which are not specified are reassigned.The quirk function should deassign the resource (set resource start to 0) instead of shifting it. Then it could fall into same scenario as your patch.> Additionally, the reassigned resource is not page-aligned. > > Thanks, > -- > Yuji Shimada_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Zhao, Yu
2008-Oct-10 07:05 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
Yuji & Keir, I guess the comments I gave yesterday is obscure, so I wrote up more details this time. Please take a look. Thanks, Yu > To reassign resources, please add boot parameters of dom0 linux as follows. > > reassign_resources reassigndev=00:1d.7,01:00.0 > > reassign_resources > Enables reassigning resources. > > reassigndev= Specifies devices include I/O device and PCI-PCI > bridge to reassign resources. PCI-PCI bridge > can be specified, if resource windows need to > be expanded. Generally, it doesn''t work as claimed from my observation.> /* This quirk function enables us to force all memory resources which are > * assigned to PCI devices, to be page-aligned. > */ > @@ -41,6 +54,42 @@ > int i; > struct resource *r; > resource_size_t old_start; > + > + if (reassign_resources) { > + if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL && > + (dev->class >> 8) == PCI_CLASS_BRIDGE_HOST) { > + /* PCI Host Bridge isn''t a target device */The comments is wrong -- we get invalid device type here, not the host bridge. Host bridge doesn''t have a ''dev'' at all.> + return; > + } > + if (is_reassigndev(dev)) { > + printk(KERN_INFO > + "PCI: Disable device and release resources" > + " [%s].\n", pci_name(dev)); > + pci_disable_device(dev); > + > + for (i=0; i < PCI_NUM_RESOURCES; i++) { > + r = &dev->resource[i]; > + if ((r == NULL) ||How can ''r'' be NULL?> + !(r->flags & IORESOURCE_MEM)) > + continue; > + > + r->end = r->end - r->start; > + r->start = 0; > + > + if (i < PCI_BRIDGE_RESOURCES) { > + pci_update_resource(dev, r, i); > + } else if (i == 8 || i == 9) {The bridge even hasn''t been scanned yet so this will *never* execute as expected.> + /* need to update(clear) the Base/Limit > + * register also, because PCI bridge is > + * disabled and the resource is > + * released. > + */ > + pci_update_bridge(dev, i); > + } > + } > + } > + return; > + } > > if (!pci_mem_align) > return;...> diff -r b54652ee29ef drivers/pci/setup-bus.c > --- a/drivers/pci/setup-bus.c Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/setup-bus.c Wed Oct 08 12:12:27 2008 +0900 > @@ -26,6 +26,7 @@ > #include <linux/cache.h> > #include <linux/slab.h> > > +#include "pci.h" > > #define DEBUG_CONFIG 1 > #if DEBUG_CONFIG > @@ -344,7 +345,8 @@ > > list_for_each_entry(dev, &bus->devices, bus_list) { > int i;It most likely couldn''t get here because the x86 PCI low level code has allocated resources for the bridge according to the BIOS and the function returns early.> - > + int reassign = reassign_resources ? is_reassigndev(dev) : 0; > + > for (i = 0; i < PCI_NUM_RESOURCES; i++) { > struct resource *r = &dev->resource[i]; > unsigned long r_size; > @@ -352,6 +354,11 @@ > if (r->parent || (r->flags & mask) != type) > continue; > r_size = r->end - r->start + 1; > + > + if (reassign) { > + r_size = ROUND_UP_TO_PAGESIZE(r_size); > + } > + > /* For bridges size != alignment */ > align = (i < PCI_BRIDGE_RESOURCES) ? r_size : r->start; > order = __ffs(align) - 20; > diff -r b54652ee29ef drivers/pci/setup-res.c > --- a/drivers/pci/setup-res.c Thu Oct 02 11:29:02 2008 +0100 > +++ b/drivers/pci/setup-res.c Wed Oct 08 12:12:27 2008 +0900 > @@ -117,19 +117,96 @@ > } > EXPORT_SYMBOL_GPL(pci_claim_resource); > > +void > +pci_update_bridge(struct pci_dev *dev, int resno) > +{ > + struct resource *res = &dev->resource[resno]; > + struct pci_bus_region region; > + u32 l, dw, base_up32, limit_up32; > + > + if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE || > + (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) { > + return; > + } > + > + if (!res->flags) > + return; > + > + switch (resno) { > + case 8 : /* MMIO Base/Limit */ > + pcibios_resource_to_bus(dev, ®ion, res); > + if (res->flags & IORESOURCE_MEM && > + !(res->flags & IORESOURCE_PREFETCH)) { > + l = (region.start >> 16) & 0xfff0; > + l |= region.end & 0xfff00000; > + } else { > + l = 0x0000fff0; > + } > + pci_write_config_dword(dev, PCI_MEMORY_BASE, l); > + > + break; > + > + case 9 : /* Prefetchable MMIO Base/Limit */ > + /* Clear out the upper 32 bits of PREF limit. > + * If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily > + * disables PREF range, which is ok. > + */ > + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 0); > + > + /* Get PREF 32/64 bits Addressing mode */ > + pci_read_config_dword(dev, PCI_PREF_MEMORY_BASE, &dw); > + > + pcibios_resource_to_bus(dev, ®ion, res); > + if (res->flags & IORESOURCE_MEM && > + res->flags & IORESOURCE_PREFETCH) { > + l = (region.start >> 16) & 0xfff0; > + l |= region.end & 0xfff00000; > + > + if (dw & PCI_PREF_RANGE_TYPE_64) { > + base_up32 = (region.start >> 32) & 0xffffffff; > + limit_up32 = (region.end >> 32) & 0xffffffff; > + } else { > + base_up32 = 0; > + limit_up32 = 0; > + } > + } else { > + l = 0x0000fff0; > + base_up32 = 0xffffffff; > + limit_up32 = 0; > + } > + pci_write_config_dword(dev, PCI_PREF_MEMORY_BASE, l); > + /* Set up the upper 32 bits of PREF base/limit. */ > + pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, base_up32); > + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, limit_up32); > + break; > + default : > + BUG(); > + break; > + } > +} > +I don''t see how this function ''expend'' the resource window (even it was invoked). Actually it has many problems: 1, the dev->resource is not updated hence software doesn''t know the new resource window size that can be granted to subordinate devices. 2, the values might be overwritten later by pci_setup_bridge. 3, the registers are changed without checking if resource range are available, which means it may conflict with other bridge.> int pci_assign_resource(struct pci_dev *dev, int resno) > { > struct pci_bus *bus = dev->bus; > struct resource *res = dev->resource + resno; > resource_size_t size, min, align; > int ret; > + int reassigndev = reassign_resources ? is_reassigndev(dev) : 0; > > size = res->end - res->start + 1; > min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM; > /* The bridge resources are special, as their > size != alignment. Sizing routines return > required alignment in the "start" field. */ > - align = (resno < PCI_BRIDGE_RESOURCES) ? size : res->start; > + if (resno < PCI_BRIDGE_RESOURCES) { > + align = size; > + if ((reassigndev) && > + (res->flags & IORESOURCE_MEM)) { > + align = ROUND_UP_TO_PAGESIZE(align);Different MMIO resrouces of a device may require different alignments. Using page size for all may cause problem. The alignment should be a bigger value between page size and resource size. Or the best way is keep aligned resources of a device and just reassign the unaligned resources in the quirk_align_mem_resources.> + } > + } else { > + align = res->start; > + } > > /* First, try exact prefetching match.. */ > ret = pci_bus_alloc_resource(bus, res, size, align, min, > @@ -154,6 +231,9 @@ > resno, (unsigned long long)size, > (unsigned long long)res->start, pci_name(dev)); > } else if (resno < PCI_BRIDGE_RESOURCES) { > + printk(KERN_DEBUG "PCI: Assign resource(%d) on %s " > + "%016llx - %016llx\n", resno, pci_name(dev), > + (u64)res->start, (u64)res->end); > pci_update_resource(dev, res, resno); > } > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Yuji Shimada
2008-Oct-10 10:13 UTC
Re: [Xen-devel] [PATCH] dom0 linux: Reassign memory resources to device for pci passthrough.
Thank you so much for your reviewing. Please look my comments in below. On Fri, 10 Oct 2008 15:05:29 +0800 "Zhao, Yu" <yu.zhao@intel.com> wrote:> Yuji & Keir, > > I guess the comments I gave yesterday is obscure, so I wrote up more > details this time. Please take a look. > > Thanks, > Yu > > > To reassign resources, please add boot parameters of dom0 linux as > follows. > > > > reassign_resources reassigndev=00:1d.7,01:00.0 > > > > reassign_resources > > Enables reassigning resources. > > > > reassigndev= Specifies devices include I/O device and PCI-PCI > > bridge to reassign resources. PCI-PCI bridge > > can be specified, if resource windows need to > > be expanded. > > Generally, it doesn''t work as claimed from my observation.When I/O device is specified, my patch work fine. But when PCI-PCI bridge is specified, my patch does not work fine. I can fix it. I will submit the patch.> > /* This quirk function enables us to force all memory resources which are > > * assigned to PCI devices, to be page-aligned. > > */ > > @@ -41,6 +54,42 @@ > > int i; > > struct resource *r; > > resource_size_t old_start; > > + > > + if (reassign_resources) { > > + if (dev->hdr_type == PCI_HEADER_TYPE_NORMAL && > > + (dev->class >> 8) == PCI_CLASS_BRIDGE_HOST) { > > + /* PCI Host Bridge isn''t a target device */ > > The comments is wrong -- we get invalid device type here, not the host > bridge. Host bridge doesn''t have a ''dev'' at all.I think Host bridge has ''dev''. We can see host bridge using lspci. 00:00.0 Host bridge: Intel Corporation 3200/3210 Chipset DRAM Controller (rev 01) Subsystem: NEC Corporation Unknown device 835e Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort+ <MAbort+ >SERR- <PERR- Latency: 0 Capabilities: [e0] Vendor Specific Information 00: 86 80 f0 29 06 00 90 30 01 00 00 06 00 00 00 00 ^^^^^^^^ ^^ class hdr_type> > + return; > > + } > > + if (is_reassigndev(dev)) { > > + printk(KERN_INFO > > + "PCI: Disable device and release resources" > > + " [%s].\n", pci_name(dev)); > > + pci_disable_device(dev); > > + > > + for (i=0; i < PCI_NUM_RESOURCES; i++) { > > + r = &dev->resource[i]; > > + if ((r == NULL) || > > How can ''r'' be NULL?''r'' is always non-NULL. So "(r == NULL) ||" should be removed.> > + !(r->flags & IORESOURCE_MEM)) > > + continue; > > + > > + r->end = r->end - r->start; > > + r->start = 0; > > + > > + if (i < PCI_BRIDGE_RESOURCES) { > > + pci_update_resource(dev, r, i); > > + } else if (i == 8 || i == 9) { > > The bridge even hasn''t been scanned yet so this will *never* execute as > expected.You are right. pci_read_bridge_bases is called after quirk_align_mem_resources. We need to clear the Base/Limit register regardless of IORESOURCE_MEM flag. I will investigate further and submit the patch.> > + /* need to update(clear) the Base/Limit > > + * register also, because PCI bridge is > > + * disabled and the resource is > > + * released. > > + */ > > + pci_update_bridge(dev, i); > > + } > > + } > > + } > > + return; > > + } > > > > if (!pci_mem_align) > > return; > > ... > > > diff -r b54652ee29ef drivers/pci/setup-bus.c > > --- a/drivers/pci/setup-bus.c Thu Oct 02 11:29:02 2008 +0100 > > +++ b/drivers/pci/setup-bus.c Wed Oct 08 12:12:27 2008 +0900 > > @@ -26,6 +26,7 @@ > > #include <linux/cache.h> > > #include <linux/slab.h> > > > > +#include "pci.h" > > > > #define DEBUG_CONFIG 1 > > #if DEBUG_CONFIG > > @@ -344,7 +345,8 @@ > > > > list_for_each_entry(dev, &bus->devices, bus_list) { > > int i; > > It most likely couldn''t get here because the x86 PCI low level code has > allocated resources for the bridge according to the BIOS and the > function returns early.So, we have to clear Base/Limit register in quirk_align_mem_resources.> > - > > + int reassign = reassign_resources ? is_reassigndev(dev) : 0; > > + > > for (i = 0; i < PCI_NUM_RESOURCES; i++) { > > struct resource *r = &dev->resource[i]; > > unsigned long r_size; > > @@ -352,6 +354,11 @@ > > if (r->parent || (r->flags & mask) != type) > > continue; > > r_size = r->end - r->start + 1; > > + > > + if (reassign) { > > + r_size = ROUND_UP_TO_PAGESIZE(r_size); > > + } > > + > > /* For bridges size != alignment */ > > align = (i < PCI_BRIDGE_RESOURCES) ? r_size : r->start; > > order = __ffs(align) - 20; > > diff -r b54652ee29ef drivers/pci/setup-res.c > > --- a/drivers/pci/setup-res.c Thu Oct 02 11:29:02 2008 +0100 > > +++ b/drivers/pci/setup-res.c Wed Oct 08 12:12:27 2008 +0900 > > @@ -117,19 +117,96 @@ > > } > > EXPORT_SYMBOL_GPL(pci_claim_resource); > > > > +void > > +pci_update_bridge(struct pci_dev *dev, int resno) > > +{ > > + struct resource *res = &dev->resource[resno]; > > + struct pci_bus_region region; > > + u32 l, dw, base_up32, limit_up32; > > + > > + if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE || > > + (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI) { > > + return; > > + } > > + > > + if (!res->flags) > > + return; > > + > > + switch (resno) { > > + case 8 : /* MMIO Base/Limit */ > > + pcibios_resource_to_bus(dev, ®ion, res); > > + if (res->flags & IORESOURCE_MEM && > > + !(res->flags & IORESOURCE_PREFETCH)) { > > + l = (region.start >> 16) & 0xfff0; > > + l |= region.end & 0xfff00000; > > + } else { > > + l = 0x0000fff0; > > + } > > + pci_write_config_dword(dev, PCI_MEMORY_BASE, l); > > + > > + break; > > + > > + case 9 : /* Prefetchable MMIO Base/Limit */ > > + /* Clear out the upper 32 bits of PREF limit. > > + * If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily > > + * disables PREF range, which is ok. > > + */ > > + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, 0); > > + > > + /* Get PREF 32/64 bits Addressing mode */ > > + pci_read_config_dword(dev, PCI_PREF_MEMORY_BASE, &dw); > > + > > + pcibios_resource_to_bus(dev, ®ion, res); > > + if (res->flags & IORESOURCE_MEM && > > + res->flags & IORESOURCE_PREFETCH) { > > + l = (region.start >> 16) & 0xfff0; > > + l |= region.end & 0xfff00000; > > + > > + if (dw & PCI_PREF_RANGE_TYPE_64) { > > + base_up32 = (region.start >> 32) & 0xffffffff; > > + limit_up32 = (region.end >> 32) & 0xffffffff; > > + } else { > > + base_up32 = 0; > > + limit_up32 = 0; > > + } > > + } else { > > + l = 0x0000fff0; > > + base_up32 = 0xffffffff; > > + limit_up32 = 0; > > + } > > + pci_write_config_dword(dev, PCI_PREF_MEMORY_BASE, l); > > + /* Set up the upper 32 bits of PREF base/limit. */ > > + pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, base_up32); > > + pci_write_config_dword(dev, PCI_PREF_LIMIT_UPPER32, limit_up32); > > + break; > > + default : > > + BUG(); > > + break; > > + } > > +} > > + > > I don''t see how this function ''expend'' the resource window (even it was > invoked). Actually it has many problems: 1, the dev->resource is not > updated hence software doesn''t know the new resource window size that > can be granted to subordinate devices. 2, the values might be > overwritten later by pci_setup_bridge. 3, the registers are changed > without checking if resource range are available, which means it may > conflict with other bridge.pci_update_bridge does not expand the resource window. I use it to clear Base/Limit register.> > > int pci_assign_resource(struct pci_dev *dev, int resno) > > { > > struct pci_bus *bus = dev->bus; > > struct resource *res = dev->resource + resno; > > resource_size_t size, min, align; > > int ret; > > + int reassigndev = reassign_resources ? is_reassigndev(dev) : 0; > > > > size = res->end - res->start + 1; > > min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM; > > /* The bridge resources are special, as their > > size != alignment. Sizing routines return > > required alignment in the "start" field. */ > > - align = (resno < PCI_BRIDGE_RESOURCES) ? size : res->start; > > + if (resno < PCI_BRIDGE_RESOURCES) { > > + align = size; > > + if ((reassigndev) && > > + (res->flags & IORESOURCE_MEM)) { > > + align = ROUND_UP_TO_PAGESIZE(align); > > Different MMIO resrouces of a device may require different alignments. > Using page size for all may cause problem. The alignment should be a > bigger value between page size and resource size.My code do as you mentioned. If the value of "align" is bigger than page size, it keep the value.> Or the best way is > keep aligned resources of a device and just reassign the unaligned > resources in the quirk_align_mem_resources.I don''t think so. My code can use existing code for calculating the size of resource window and assigning resource. Thanks, -- Yuji Shimada _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel