Stefan Berger
2005-Aug-19 16:31 UTC
[Xen-devel] Experimental PCI Emulation Layer for XEN x86(32)
Hello! I have explored the idea of running QEMU''s PCI emulation layer in XEN''s (x86-32) exception handler a bit further and the results are very promising. The benefits of doing this are the following: - Compiling PCI code into a user domain works just fine and the PCI layer finds an empty PCI bus; the bootup messages (dmesg) can be seen under (*) as well as the devices that the sysfs shows - you need to set CONFIG_XEN_PRIVILEGED_GUEST=y and CONFIG_XEN_PHYSDEV_ACCESS=y to be queried about PCI when compiling a user domain kernel - The best: Running a domain 0 kernel as a user domain kernel also works ! The bootup messages (dmesg) can be seen under (**) below - With some further exploration this could be leading to a unified build process and kernel for domain 0 and user domains which I think would be ''good''. In the future I would look at this a bit further and try to do the following: - hide PCI devices in domain 0 by passing a command line argument in the grub.conf when booting a domain - learn as much about the hidden PCI device as possible by probing the PCI bus or reading it from exising device structures - making the information about the PCI device available to XEN through a hypercall and programm the PCI emulation layer of a driver domain to mimic this device''s existence - punch holes into the privilege bitmap of the driver domain to access the IO ports of that device and somehow map memory so the user domain can talk to this device (need help on this) About the emulation layer: - I recycled QEMU''s implementation and shrunk it to fit into XEN; it''s actually quite small also when compiled -rw-r--r-- 1 root root 8756 Aug 19 12:05 qemu_pci.o -rw-r--r-- 1 root root 2876 Aug 19 12:05 qemu_vl.o - The plugability of QEMU''s emulation layer has been preserved to register ''mimic''ed devices'' or maybe even the ones that QEMU has :-) - There''s a nob to turn: MAX_IOPORTS; If that is set to 65536 (for all inb''s and outb''s) then XEN will need to allocate a huge amount of memory; turning it down to 4096 is good enough to emulate all ports <=4095; this part needs some work -> a hash table which hashes by ioport number instead of a plain array The patch is attached and inlined. It works fine against Wendesday''s tree. Please let me know about comments or ideas! ------------------------- (*) dmesg of user domain kernel with PCI bus Linux version 2.6.12-xenU (root@jlfb-2.watson.ibm.com) (gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)) #6 SMP Thu Aug 18 17:55:21 EDT 2005 BIOS-provided physical RAM map: Xen: 0000000000000000 - 0000000002000000 (usable) 0MB HIGHMEM available. 32MB LOWMEM available. DMI not present. IRQ lockup detection disabled Allocating PCI resources starting at 02000000 (gap: 02000000:fe000000) Built 1 zonelists Kernel command line: root=/dev/ram0 console=tty ro ima=1 Initializing CPU#0 PID hash table entries: 256 (order: 8, 4096 bytes) Xen reported: 2799.900 MHz processor. Dentry cache hash table entries: 8192 (order: 3, 32768 bytes) Inode-cache hash table entries: 4096 (order: 2, 16384 bytes) vmalloc area: c2800000-fb7fe000, maxmem 34000000 Memory: 17644k/32768k available (1974k kernel code, 15060k reserved, 558k data, 168k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Mount-cache hash table entries: 512 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking ''hlt'' instruction... disabled CPU0: Intel(R) Xeon(TM) CPU 2.80GHz stepping 07 SMP motherboard not detected. Brought up 1 CPUs checking if image is initramfs... it is Freeing initrd memory: 11440k freed NET: Registered protocol family 16 store_evtchn = 2 XENBUS xs_read_watch: 0 PCI: Using configuration type 1 XENBUS xs_read_watch: 0 XENBUS xs_read_watch: 0 xen_mem: Initialising balloon driver. XENBUS xs_read_watch: 0 PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) Grant table initialized Initializing Cryptographic API PCI: PIIX3: Enabling Passive Release on 0000:00:01.0 Limiting direct PCI/PCI transfers. Activating ISA DMA hang workarounds. i8042.c: No controller found. io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize Xen virtual console successfully installed as tty1 Event-channel device installed. Blkif frontend is using grant tables. xen_blk: Initialising virtual block device driver xen_blk: Timeout connecting to device! Netdev frontend (TX) is using grant tables. Netdev frontend (RX) is using grant tables. xen_net: Initialising virtual ethernet driver. ??~?mice: PS/2 mouse device common for all mice NET: Registered protocol family 2 IP: routing cache hash table of 256 buckets, 4Kbytes TCP established hash table entries: 2048 (order: 3, 32768 bytes) TCP bind hash table entries: 2048 (order: 2, 24576 bytes) TCP: Hash tables configured (established 2048 bind 2048) NET: Registered protocol family 1 NET: Registered protocol family 17 Freeing unused kernel memory: 168k freed init started: BusyBox v1.00 (2005.02.09-15:13+0000) multi-call binary [...] # cd /sys/devices/ # find . ./pci0000:00 ./pci0000:00/0000:00:01.0 ./pci0000:00/0000:00:01.0/config ./pci0000:00/0000:00:01.0/bus ./pci0000:00/0000:00:01.0/modalias ./pci0000:00/0000:00:01.0/local_cpus ./pci0000:00/0000:00:01.0/irq ./pci0000:00/0000:00:01.0/class ./pci0000:00/0000:00:01.0/subsystem_device ./pci0000:00/0000:00:01.0/subsystem_vendor ./pci0000:00/0000:00:01.0/device ./pci0000:00/0000:00:01.0/vendor ./pci0000:00/0000:00:01.0/resource ./pci0000:00/0000:00:00.0 ./pci0000:00/0000:00:00.0/config ./pci0000:00/0000:00:00.0/bus ./pci0000:00/0000:00:00.0/modalias ./pci0000:00/0000:00:00.0/local_cpus ./pci0000:00/0000:00:00.0/irq ./pci0000:00/0000:00:00.0/class ./pci0000:00/0000:00:00.0/subsystem_device ./pci0000:00/0000:00:00.0/subsystem_vendor ./pci0000:00/0000:00:00.0/device ./pci0000:00/0000:00:00.0/vendor ./pci0000:00/0000:00:00.0/resource ./vif-0 ./vif-0/bus ./system ./system/ioapic ./system/timer ./system/timer/timer0 ./system/cpu ./system/cpu/cpu0 ./system/cpu/cpu0/online ./platform ------------------------- (**) dmesg of domain 0 kernel booting up as a user domain kernel Linux version 2.6.12-xen0 (root@jlfb-2.watson.ibm.com) (gcc version 4.0.1 20050727 ( Red Hat 4.0.1-5)) #6 Thu Aug 18 17:54:40 EDT 2005 BIOS-provided physical RAM map: Xen: 0000000000000000 - 0000000002000000 (usable) 0MB HIGHMEM available. 32MB LOWMEM available. On node 0 totalpages: 8192 DMA zone: 8192 pages, LIFO batch:3 Normal zone: 0 pages, LIFO batch:1 HighMem zone: 0 pages, LIFO batch:1 DMI not present. ACPI in unprivileged domain disabled IRQ lockup detection disabled Allocating PCI resources starting at 02000000 (gap: 02000000:fe000000) Built 1 zonelists Kernel command line: root=/dev/ram0 console=tty ro ima=1 Initializing CPU#0 PID hash table entries: 256 (order: 8, 4096 bytes) Xen reported: 2799.892 MHz processor. Dentry cache hash table entries: 8192 (order: 3, 32768 bytes) Inode-cache hash table entries: 4096 (order: 2, 16384 bytes) vmalloc area: c2800000-fb7fe000, maxmem 34000000 Memory: 15748k/32768k available (3438k kernel code, 16956k reserved, 1016k data, 352 k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 5596.77 BogoMIPS (lpj=27983872) Mount-cache hash table entries: 512 CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 0000 0000 00000000 CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000 000 00000000 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K CPU: After all inits, caps: bfebc3f1 00000000 00000000 00000080 00004400 00000000 00 000000 CPU: Intel(R) Xeon(TM) CPU 2.80GHz stepping 07 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking ''hlt'' instruction... disabled checking if image is initramfs... it is Freeing initrd memory: 11440k freed NET: Registered protocol family 16 store_evtchn = 2 XENBUS xs_read_watch: 0 PCI: Using configuration type 1 XENBUS xs_read_watch: 0 ACPI: Subsystem revision 20050309 ACPI: Interpreter disabled. xen_mem: Initialising balloon driver. XENBUS xs_read_watch: 0 SCSI subsystem initialized usbcore: registered new driver hub PCI: Using ACPI for IRQ routing PCI: If a device doesn''t work, try "pci=routeirq". If it helps, post a report Grant table initialized IA-32 Microcode Update Driver: v1.14-xen <tigran@veritas.com> Initializing Cryptographic API i8042.c: No controller found. io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered Floppy drive(s): fd0 is unknown type 15 (usb?), fd1 is unknown type 15 (usb?) Failed to obtain physical IRQ 6 floppy0: no floppy controllers found RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize loop: loaded (max 8 devices) HP CISS Driver (v 2.6.6) Intel(R) PRO/1000 Network Driver - version 6.0.54-k2 Copyright (c) 1999-2004 Intel Corporation. pcnet32.c:v1.30j 29.04.2005 tsbogend@alpha.franken.de e100: Intel(R) PRO/100 Network Driver, 3.4.8-k2-NAPI e100: Copyright(c) 1999-2005 Intel Corporation tun: Universal TUN/TAP device driver, 1.6 tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com> Xen virtual console successfully installed as tty1 Event-channel device installed. Blkif frontend is using grant tables. xen_blk: Initialising virtual block device driver xen_blk: Could not probe disks (0) xen_blk: failed to probe Netdev frontend (TX) is using grant tables. Netdev frontend (RX) is using grant tables. xen_net: Initialising virtual ethernet driver. Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx Probing IDE interface ide0... Probing IDE interface ide1... Probing IDE interface ide2... Probing IDE interface ide3... Probing IDE interface ide4... Probing IDE interface ide5... Red Hat/Adaptec aacraid driver (1.1.2-lk2 Aug 17 2005) 3ware Storage Controller device driver for Linux v1.26.02.001. libata version 1.11 loaded. Fusion MPT base driver 3.01.20 Copyright (c) 1999-2004 LSI Logic Corporation Fusion MPT SCSI Host driver 3.01.20 usbmon: debugs is not available ohci_hcd: 2004 Nov 08 USB 1.1 ''Open'' Host Controller (OHCI) Driver (PCI) USB Universal Host Controller Interface driver v2.2 usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.01:USB HID core driver mice: PS/2 mouse device common for all mice md: raid0 personality registered as nr 2 md: raid1 personality registered as nr 3 md: raid5 personality registered as nr 4 raid5: automatically using best checksumming function: pIII_sse pIII_sse : 1306.800 MB/sec raid5: using function: pIII_sse (1306.800 MB/sec) md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27 device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel@redhat.com NET: Registered protocol family 2 IP: routing cache hash table of 512 buckets, 4Kbytes TCP established hash table entries: 2048 (order: 2, 16384 bytes) TCP bind hash table entries: 2048 (order: 1, 8192 bytes) TCP: Hash tables configured (established 2048 bind 2048) NET: Registered protocol family 1 NET: Registered protocol family 17 Bridge firewalling registered Freeing unused kernel memory: 352k freed ---------------------------------- The patch: Signed-off-by: Stefan Berger <stefanb@us.ibm.com> Index: root/xen-unstable.hg/xen/arch/x86/qemu_pci.c ==================================================================--- /dev/null +++ root/xen-unstable.hg/xen/arch/x86/qemu_pci.c @@ -0,0 +1,807 @@ +/* + * QEMU PCI bus manager + * + * Copyright (c) 2004 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ +/* + * Adapted to fit XEN by Stefan Berger, stefanb@us.ibm.com + */ + +//#define DEBUG_PCI + +#define PCI_VENDOR_ID 0x00 /* 16 bits */ +#define PCI_DEVICE_ID 0x02 /* 16 bits */ +#define PCI_COMMAND 0x04 /* 16 bits */ +#define PCI_COMMAND_IO 0x1 /* Enable response in I/O space */ +#define PCI_COMMAND_MEMORY 0x2 /* Enable response in Memory space */ +#define PCI_CLASS_DEVICE 0x0a /* Device class */ +#define PCI_INTERRUPT_LINE 0x3c /* 8 bits */ +#define PCI_INTERRUPT_PIN 0x3d /* 8 bits */ +#define PCI_MIN_GNT 0x3e /* 8 bits */ +#define PCI_MAX_LAT 0x3f /* 8 bits */ + + +#include <xen/qemu_vl.h> +#include <xen/lib.h> +#include "qemu_makeup.h" + + +static PCIBase *create_pci_base(void) +{ + PCIBase *base = qemu_mallocz(sizeof(PCIBase)); + + if (base) { + init_ioports(base); + } + return base; +} + +static PCIBus *pci_register_bus(PCIBase *base) +{ + PCIBus *bus; + bus = qemu_mallocz(sizeof(PCIBus)); + if (bus) { + bus->base = base; + } + return bus; +} + +static void pci_bus_free(PCIBus *bus) +{ + xfree(bus); +} + +static void pci_device_free(PCIDevice *dev) +{ + xfree(dev); +} + +void pci_base_free(PCIBase *base) +{ + PCIBus *bus = base->bus; + if (bus) { + int devfn; + for(devfn = 0; devfn < 256; devfn++) { + PCIDevice *d; + d = bus->devices[devfn]; + if (d) + pci_device_free(d); + } + pci_bus_free(bus); + } + + xfree(base); +} + +/* -1 for devfn means auto assign */ +PCIDevice *pci_register_device(PCIBus *bus, const char *name, + int instance_size, int devfn, + PCIConfigReadFunc *config_read, + PCIConfigWriteFunc *config_write) +{ + PCIDevice *pci_dev; + + + if (devfn < 0) { + for(devfn = bus->devfn_min ; devfn < 256; devfn += 8) { + if (!bus->devices[devfn]) + goto found; + } + return NULL; + found: ; + } + pci_dev = qemu_mallocz(instance_size); + if (!pci_dev) + return NULL; + pci_dev->bus = bus; + pci_dev->devfn = devfn; + pstrcpy(pci_dev->name, sizeof(pci_dev->name), name); + + if (!config_read) + config_read = pci_default_read_config; + if (!config_write) + config_write = pci_default_write_config; + pci_dev->config_read = config_read; + pci_dev->config_write = config_write; + bus->devices[devfn] = pci_dev; + return pci_dev; +} + +void pci_register_io_region(PCIDevice *pci_dev, int region_num, + uint32_t size, int type, + PCIMapIORegionFunc *map_func) +{ + PCIIORegion *r; + + if ((unsigned int)region_num >= PCI_NUM_REGIONS) + return; + r = &pci_dev->io_regions[region_num]; + r->addr = -1; + r->size = size; + r->type = type; + r->map_func = map_func; +} + +static void pci_addr_writel(PCIBase *base, uint32_t addr, uint32_t val) +{ + PCIBus *s = base->bus; + s->config_reg = val; +} + +static uint32_t pci_addr_readl(PCIBase *base, uint32_t addr) +{ + PCIBus *s = base->bus; + return s->config_reg; +} + +static void pci_update_mappings(PCIDevice *d) +{ + PCIIORegion *r; + int cmd, i; + uint32_t last_addr, new_addr, config_ofs; + + cmd = le16_to_cpu(*(uint16_t *)(d->config + PCI_COMMAND)); + for(i = 0; i < PCI_NUM_REGIONS; i++) { + r = &d->io_regions[i]; + if (i == PCI_ROM_SLOT) { + config_ofs = 0x30; + } else { + config_ofs = 0x10 + i * 4; + } + if (r->size != 0) { + if (r->type & PCI_ADDRESS_SPACE_IO) { + if (cmd & PCI_COMMAND_IO) { + new_addr = le32_to_cpu(*(uint32_t *)(d->config + + config_ofs)); + new_addr = new_addr & ~(r->size - 1); + last_addr = new_addr + r->size - 1; + /* NOTE: we have only 64K ioports on PC */ + if (last_addr <= new_addr || new_addr == 0 || + last_addr >= 0x10000) { + new_addr = -1; + } + } else { + new_addr = -1; + } + } else { + if (cmd & PCI_COMMAND_MEMORY) { + new_addr = le32_to_cpu(*(uint32_t *)(d->config + + config_ofs)); + /* the ROM slot has a specific enable bit */ + if (i == PCI_ROM_SLOT && !(new_addr & 1)) + goto no_mem_map; + new_addr = new_addr & ~(r->size - 1); + last_addr = new_addr + r->size - 1; + /* NOTE: we do not support wrapping */ + /* XXX: as we cannot support really dynamic + mappings, we handle specific values as invalid + mappings. */ + if (last_addr <= new_addr || new_addr == 0 || + last_addr == -1) { + new_addr = -1; + } + } else { + no_mem_map: + new_addr = -1; + } + } + /* now do the real mapping */ + if (new_addr != r->addr) { + if (r->addr != -1) { + if (r->type & PCI_ADDRESS_SPACE_IO) { + int class; + /* NOTE: specific hack for IDE in PC case: + only one byte must be mapped. */ + class = d->config[0x0a] | (d->config[0x0b] << 8); + if (class == 0x0101 && r->size == 4) { + isa_unassign_ioport(d->bus->base, r->addr + 2, 1); + } else { + isa_unassign_ioport(d->bus->base, r->addr, r->size); + } + } else { +// !!! This should not have been disabled - maybe +#if 0 + cpu_register_physical_memory(r->addr + pci_mem_base, + r->size, + IO_MEM_UNASSIGNED); +#endif + } + } + r->addr = new_addr; + if (r->addr != -1) { + r->map_func(d, i, r->addr, r->size, r->type); + } + } + } + } +} + +uint32_t pci_default_read_config(PCIDevice *d, + uint32_t address, int len) +{ + uint32_t val; + switch(len) { + case 1: + val = d->config[address]; + break; + case 2: + val = le16_to_cpu(*(uint16_t *)(d->config + address)); + break; + default: + case 4: + val = le32_to_cpu(*(uint32_t *)(d->config + address)); + break; + } + return val; +} + +void pci_default_write_config(PCIDevice *d, + uint32_t address, uint32_t val, int len) +{ + int can_write, i; + uint32_t end, addr; + + if (len == 4 && ((address >= 0x10 && address < 0x10 + 4 * 6) || + (address >= 0x30 && address < 0x34))) { + PCIIORegion *r; + int reg; + + if ( address >= 0x30 ) { + reg = PCI_ROM_SLOT; + }else{ + reg = (address - 0x10) >> 2; + } + r = &d->io_regions[reg]; + if (r->size == 0) + goto default_config; + /* compute the stored value */ + if (reg == PCI_ROM_SLOT) { + /* keep ROM enable bit */ + val &= (~(r->size - 1)) | 1; + } else { + val &= ~(r->size - 1); + val |= r->type; + } + *(uint32_t *)(d->config + address) = cpu_to_le32(val); + pci_update_mappings(d); + return; + } + default_config: + /* not efficient, but simple */ + addr = address; + for(i = 0; i < len; i++) { + /* default read/write accesses */ + switch(d->config[0x0e]) { + case 0x00: + case 0x80: + switch(addr) { + case 0x00: + case 0x01: + case 0x02: + case 0x03: + case 0x08: + case 0x09: + case 0x0a: + case 0x0b: + case 0x0e: + case 0x10 ... 0x27: /* base */ + case 0x30 ... 0x33: /* rom */ + case 0x3d: + can_write = 0; + break; + default: + can_write = 1; + break; + } + break; + default: + case 0x01: + switch(addr) { + case 0x00: + case 0x01: + case 0x02: + case 0x03: + case 0x08: + case 0x09: + case 0x0a: + case 0x0b: + case 0x0e: + case 0x38 ... 0x3b: /* rom */ + case 0x3d: + can_write = 0; + break; + default: + can_write = 1; + break; + } + break; + } + if (can_write) { + d->config[addr] = val; + } + addr++; + val >>= 8; + } + + end = address + len; + if (end > PCI_COMMAND && address < (PCI_COMMAND + 2)) { + /* if the command register is modified, we must modify the mappings */ + pci_update_mappings(d); + } +} + +static void pci_data_write(PCIBus *opaque, uint32_t addr, + uint32_t val, int len) +{ + PCIBus *s = opaque; + PCIDevice *pci_dev; + int config_addr, bus_num; + +#if defined(DEBUG_PCI) && 0 + printf("pci_data_write: addr=%08x val=%08x len=%d\n", + s->config_reg, val, len); +#endif + if (!(s->config_reg & (1 << 31))) { + return; + } + if ((s->config_reg & 0x3) != 0) { + return; + } + bus_num = (s->config_reg >> 16) & 0xff; + if (bus_num != 0) + return; + pci_dev = s->devices[(s->config_reg >> 8) & 0xff]; + if (!pci_dev) + return; + config_addr = (s->config_reg & 0xfc) | (addr & 3); +#if defined(DEBUG_PCI) + printf("pci_config_write: %s: addr=%02x val=%08x len=%d\n", + pci_dev->name, config_addr, val, len); +#endif + pci_dev->config_write(pci_dev, config_addr, val, len); +} + +static uint32_t pci_data_read(PCIBus *opaque, uint32_t addr, + int len) +{ + PCIBus *s = opaque; + PCIDevice *pci_dev; + int config_addr, bus_num; + uint32_t val; + + if (!(s->config_reg & (1 << 31))) + goto fail; + if ((s->config_reg & 0x3) != 0) + goto fail; + bus_num = (s->config_reg >> 16) & 0xff; + if (bus_num != 0) + goto fail; + pci_dev = s->devices[(s->config_reg >> 8) & 0xff]; + if (!pci_dev) { + fail: + switch(len) { + case 1: + val = 0xff; + break; + case 2: + val = 0xffff; + break; + default: + case 4: + val = 0xffffffff; + break; + } + goto the_end; + } + config_addr = (s->config_reg & 0xfc) | (addr & 3); + val = pci_dev->config_read(pci_dev, config_addr, len); +#if defined(DEBUG_PCI) + printf("pci_config_read: %s: addr=%02x val=%08x len=%d\n", + pci_dev->name, config_addr, val, len); +#endif + the_end: +#if defined(DEBUG_PCI) && 0 + printf("pci_data_read: addr=%08x val=%08x len=%d\n", + s->config_reg, val, len); +#endif + return val; +} + +static void pci_data_writeb(PCIBase *base, uint32_t addr, uint32_t val) +{ + pci_data_write(base->bus, addr, val, 1); +} + +static void pci_data_writew(PCIBase *base, uint32_t addr, uint32_t val) +{ + pci_data_write(base->bus, addr, val, 2); +} + +static void pci_data_writel(PCIBase *base, uint32_t addr, uint32_t val) +{ + pci_data_write(base->bus, addr, val, 4); +} + +static uint32_t pci_data_readb(PCIBase *base, uint32_t addr) +{ + return pci_data_read(base->bus, addr, 1); +} + +static uint32_t pci_data_readw(PCIBase *base, uint32_t addr) +{ + return pci_data_read(base->bus, addr, 2); +} + +static uint32_t pci_data_readl(PCIBase *base, uint32_t addr) +{ + return pci_data_read(base->bus, addr, 4); +} + +/* i440FX PCI bridge */ + +PCIBase *i440fx_init() +{ + PCIBus *s; + PCIDevice *d; + PCIBase *base; + + base = create_pci_base(); + if (!base) return NULL; + + base->bus = s = pci_register_bus(base); + if (!s) { + xfree(base); + return NULL; + } + + register_ioport_write(0xcf8, 4, 4, pci_addr_writel, s, base); + register_ioport_read(0xcf8, 4, 4, pci_addr_readl, s, base); + + register_ioport_write(0xcfc, 4, 1, pci_data_writeb, s, base); + register_ioport_write(0xcfc, 4, 2, pci_data_writew, s, base); + register_ioport_write(0xcfc, 4, 4, pci_data_writel, s, base); + register_ioport_read(0xcfc, 4, 1, pci_data_readb, s, base); + register_ioport_read(0xcfc, 4, 2, pci_data_readw, s, base); + register_ioport_read(0xcfc, 4, 4, pci_data_readl, s, base); + + d = pci_register_device(s, "i440FX", sizeof(PCIDevice), 0, + NULL, NULL); + + d->config[0x00] = 0x86; // vendor_id + d->config[0x01] = 0x80; + d->config[0x02] = 0x37; // device_id + d->config[0x03] = 0x12; + d->config[0x08] = 0x02; // revision + d->config[0x0a] = 0x00; // class_sub = host2pci + d->config[0x0b] = 0x06; // class_base = PCI_bridge + d->config[0x0e] = 0x00; // header_type + return base; +} + +/* PIIX3 PCI to ISA bridge */ + +typedef struct PIIX3State { + PCIDevice dev; +} PIIX3State; + +PIIX3State *piix3_state; + +/* return the global irq number corresponding to a given device irq + pin. We could also use the bus number to have a more precise + mapping. */ +static inline int pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num) +{ + int slot_addend; + slot_addend = (pci_dev->devfn >> 3) - 1; + return (irq_num + slot_addend) & 3; +} + +static void piix3_reset(PIIX3State *d) +{ + uint8_t *pci_conf = d->dev.config; + + pci_conf[0x04] = 0x07; // master, memory and I/O + pci_conf[0x05] = 0x00; + pci_conf[0x06] = 0x00; + pci_conf[0x07] = 0x02; // PCI_status_devsel_medium + pci_conf[0x4c] = 0x4d; + pci_conf[0x4e] = 0x03; + pci_conf[0x4f] = 0x00; + pci_conf[0x60] = 0x80; + pci_conf[0x69] = 0x02; + pci_conf[0x70] = 0x80; + pci_conf[0x76] = 0x0c; + pci_conf[0x77] = 0x0c; + pci_conf[0x78] = 0x02; + pci_conf[0x79] = 0x00; + pci_conf[0x80] = 0x00; + pci_conf[0x82] = 0x00; + pci_conf[0xa0] = 0x08; + pci_conf[0xa0] = 0x08; + pci_conf[0xa2] = 0x00; + pci_conf[0xa3] = 0x00; + pci_conf[0xa4] = 0x00; + pci_conf[0xa5] = 0x00; + pci_conf[0xa6] = 0x00; + pci_conf[0xa7] = 0x00; + pci_conf[0xa8] = 0x0f; + pci_conf[0xaa] = 0x00; + pci_conf[0xab] = 0x00; + pci_conf[0xac] = 0x00; + pci_conf[0xae] = 0x00; +} + +void piix3_init(PCIBus *bus) +{ + PIIX3State *d; + uint8_t *pci_conf; + + d = (PIIX3State *)pci_register_device(bus, "PIIX3", sizeof(PIIX3State), + -1, NULL, NULL); +#if 0 + register_savevm("PIIX3", 0, 1, generic_pci_save, generic_pci_load, d); +#endif + + piix3_state = d; + pci_conf = d->dev.config; + + pci_conf[0x00] = 0x86; // Intel + pci_conf[0x01] = 0x80; + pci_conf[0x02] = 0x00; // 82371SB PIIX3 PCI-to-ISA bridge (Step A1) + pci_conf[0x03] = 0x70; + pci_conf[0x0a] = 0x01; // class_sub = PCI_ISA + pci_conf[0x0b] = 0x06; // class_base = PCI_bridge + pci_conf[0x0e] = 0x80; // header_type = PCI_multifunction, generic + + piix3_reset(d); +} + + + + +/***********************************************************/ +/* XXX: the following should be moved to the PC BIOS */ + +static __attribute__((unused)) uint32_t isa_inb(PCIBus *bus,uint32_t addr) +{ + return cpu_inb(bus->base, addr); +} + +static void isa_outb(PCIBus *bus, uint32_t val, uint32_t addr) +{ + cpu_outb(bus->base, addr, val); +} + +static __attribute__((unused)) uint32_t isa_inw(PCIBus *bus,uint32_t addr) +{ + return cpu_inw(bus->base, addr); +} + +static __attribute__((unused)) void isa_outw(PCIBus *bus,uint32_t val, uint32_t addr) +{ + cpu_outw(bus->base, addr, val); +} + +static __attribute__((unused)) uint32_t isa_inl(PCIBus *bus,uint32_t addr) +{ + return cpu_inl(bus->base, addr); +} + +static __attribute__((unused)) void isa_outl(PCIBus *bus,uint32_t val, uint32_t addr) +{ + cpu_outl(bus->base, addr, val); +} + +static void pci_config_writel(PCIDevice *d, uint32_t addr, uint32_t val) +{ + PCIBus *s = d->bus; + s->config_reg = 0x80000000 | (s->bus_num << 16) | + (d->devfn << 8) | addr; + pci_data_write(s, 0, val, 4); +} + +static void pci_config_writew(PCIDevice *d, uint32_t addr, uint32_t val) +{ + PCIBus *s = d->bus; + s->config_reg = 0x80000000 | (s->bus_num << 16) | + (d->devfn << 8) | (addr & ~3); + pci_data_write(s, addr & 3, val, 2); +} + + +static void pci_config_writeb(PCIDevice *d, uint32_t addr, uint32_t val) +{ + PCIBus *s = d->bus; + s->config_reg = 0x80000000 | (s->bus_num << 16) | + (d->devfn << 8) | (addr & ~3); + pci_data_write(s, addr & 3, val, 1); +} + +static __attribute__((unused)) uint32_t pci_config_readl(PCIDevice *d, uint32_t addr) +{ + PCIBus *s = d->bus; + s->config_reg = 0x80000000 | (s->bus_num << 16) | + (d->devfn << 8) | addr; + return pci_data_read(s, 0, 4); +} + +static uint32_t pci_config_readw(PCIDevice *d, uint32_t addr) +{ + PCIBus *s = d->bus; + s->config_reg = 0x80000000 | (s->bus_num << 16) | + (d->devfn << 8) | (addr & ~3); + return pci_data_read(s, addr & 3, 2); +} + +static uint32_t pci_config_readb(PCIDevice *d, uint32_t addr) +{ + PCIBus *s = d->bus; + s->config_reg = 0x80000000 | (s->bus_num << 16) | + (d->devfn << 8) | (addr & ~3); + return pci_data_read(s, addr & 3, 1); +} + +static uint32_t pci_bios_io_addr; +static uint32_t pci_bios_mem_addr; + +/* host irqs corresponding to PCI irqs A-D */ +static uint8_t pci_irqs[4] = { 11, 9, 11, 9 }; + +static void pci_set_io_region_addr(PCIDevice *d, int region_num, uint32_t addr) +{ + PCIIORegion *r; + uint16_t cmd; + uint32_t ofs; + + if ( region_num == PCI_ROM_SLOT ) { + ofs = 0x30; + }else{ + ofs = 0x10 + region_num * 4; + } + + pci_config_writel(d, ofs, addr); + r = &d->io_regions[region_num]; + + /* enable memory mappings */ + cmd = pci_config_readw(d, PCI_COMMAND); + if ( region_num == PCI_ROM_SLOT ) + cmd |= 2; + else if (r->type & PCI_ADDRESS_SPACE_IO) + cmd |= 1; + else + cmd |= 2; + pci_config_writew(d, PCI_COMMAND, cmd); +} + +static void pci_bios_init_device(PCIDevice *d) +{ + int class; + PCIIORegion *r; + uint32_t *paddr; + int i, pin, pic_irq, vendor_id, device_id; + + class = pci_config_readw(d, PCI_CLASS_DEVICE); + vendor_id = pci_config_readw(d, PCI_VENDOR_ID); + device_id = pci_config_readw(d, PCI_DEVICE_ID); + switch(class) { + case 0x0101: + if (vendor_id == 0x8086 && device_id == 0x7010) { + /* PIIX3 IDE */ + pci_config_writew(d, 0x40, 0x8000); // enable IDE0 + pci_config_writew(d, 0x42, 0x8000); // enable IDE1 + goto default_map; + } else { + /* IDE: we map it as in ISA mode */ + pci_set_io_region_addr(d, 0, 0x1f0); + pci_set_io_region_addr(d, 1, 0x3f4); + pci_set_io_region_addr(d, 2, 0x170); + pci_set_io_region_addr(d, 3, 0x374); + } + break; + case 0x0300: + if (vendor_id != 0x1234) + goto default_map; + /* VGA: map frame buffer to default Bochs VBE address */ + pci_set_io_region_addr(d, 0, 0xE0000000); + break; + case 0x0800: + /* PIC */ + vendor_id = pci_config_readw(d, PCI_VENDOR_ID); + device_id = pci_config_readw(d, PCI_DEVICE_ID); + if (vendor_id == 0x1014) { + /* IBM */ + if (device_id == 0x0046 || device_id == 0xFFFF) { + /* MPIC & MPIC2 */ + pci_set_io_region_addr(d, 0, 0x80800000 + 0x00040000); + } + } + break; + case 0xff00: + if (vendor_id == 0x0106b && + (device_id == 0x0017 || device_id == 0x0022)) { + /* macio bridge */ + pci_set_io_region_addr(d, 0, 0x80800000); + } + break; + default: + default_map: + /* default memory mappings */ + for(i = 0; i < PCI_NUM_REGIONS; i++) { + r = &d->io_regions[i]; + if (r->size) { + if (r->type & PCI_ADDRESS_SPACE_IO) + paddr = &pci_bios_io_addr; + else + paddr = &pci_bios_mem_addr; + *paddr = (*paddr + r->size - 1) & ~(r->size - 1); + pci_set_io_region_addr(d, i, *paddr); + *paddr += r->size; + } + } + break; + } + + /* map the interrupt */ + pin = pci_config_readb(d, PCI_INTERRUPT_PIN); + if (pin != 0) { + pin = pci_slot_get_pirq(d, pin - 1); + pic_irq = pci_irqs[pin]; + pci_config_writeb(d, PCI_INTERRUPT_LINE, pic_irq); + } + +} + +/* + * This function initializes the PCI devices as a normal PCI BIOS + * would do. It is provided just in case the BIOS has no support for + * PCI. + */ +void pci_bios_init(PCIBus *bus) +{ + int i, irq; + uint8_t elcr[2]; + + pci_bios_io_addr = 0xc000; + pci_bios_mem_addr = 0xf0000000; + + /* activate IRQ mappings */ + elcr[0] = 0x00; + elcr[1] = 0x00; + for(i = 0; i < 4; i++) { + irq = pci_irqs[i]; + /* set to trigger level */ + elcr[irq >> 3] |= (1 << (irq & 7)); + /* activate irq remapping in PIIX */ + pci_config_writeb((PCIDevice *)piix3_state, 0x60 + i, irq); + } + isa_outb(bus, elcr[0], 0x4d0); + isa_outb(bus, elcr[1], 0x4d1); + + if (bus) { + int devfn; + for(devfn = 0; devfn < 256; devfn++) { + PCIDevice *d; + d = bus->devices[devfn]; + if (d) + pci_bios_init_device(d); + } + } +} Index: root/xen-unstable.hg/xen/arch/x86/qemu_makeup.h ==================================================================--- /dev/null +++ root/xen-unstable.hg/xen/arch/x86/qemu_makeup.h @@ -0,0 +1,25 @@ +#ifndef QEMU_MAKEUP_H +#define QEMU_MAKEUP_H + +#include <xen/types.h> +#include <xen/cache.h> +#include <xen/xmalloc.h> + +static inline void *qemu_mallocz(unsigned int size) { + void *ptr = xmalloc_bytes(size); + if (ptr) { + memset(ptr, 0x0, size); + } + return ptr; +} +static inline uint16_t le16_to_cpu(uint16_t data) { + return data; +} +static inline uint32_t le32_to_cpu(uint32_t data) { + return data; +} +static inline uint32_t cpu_to_le32(uint32_t data) { + return data; +} + +#endif Index: root/xen-unstable.hg/xen/arch/x86/qemu_vl.c ==================================================================--- /dev/null +++ root/xen-unstable.hg/xen/arch/x86/qemu_vl.c @@ -0,0 +1,262 @@ +/* + * QEMU System Emulator + * + * Copyright (c) 2003-2005 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ +/* + * Adapted to fit XEN by Stefan Berger, stefanb@us.ibm.com + */ +#include <xen/qemu_vl.h> +#include <xen/lib.h> +#include "qemu_makeup.h" + + +/* size is the word size in byte */ +int register_ioport_read(int start, int length, int size, + IOPortReadFunc *func, PCIBus *bus, + PCIBase *base) +{ + int i, bsize; + + if (size == 1) { + bsize = 0; + } else if (size == 2) { + bsize = 1; + } else if (size == 4) { + bsize = 2; + } else { + printk("register_ioport_read: invalid size"); + return -1; + } + for(i = start; i < start + length && i < MAX_IOPORTS; i += size) { + base->ioport_read_table[bsize][i] = func; +#if 0 + if (base->ioport_opaque[i] != NULL && + base->ioport_opaque[i] != base) + printk("register_ioport_read: invalid opaque"); +#endif + base->ioport_opaque[i] = base; + } + return 0; +} + +/* size is the word size in byte */ +int register_ioport_write(int start, int length, int size, + IOPortWriteFunc *func, PCIBus *bus, + PCIBase *base) +{ + int i, bsize; + + if (size == 1) { + bsize = 0; + } else if (size == 2) { + bsize = 1; + } else if (size == 4) { + bsize = 2; + } else { + printk("register_ioport_write: invalid size"); + return -1; + } + for(i = start; i < start + length && i < MAX_IOPORTS ; i += size) { + base->ioport_write_table[bsize][i] = func; +#if 0 + if (base->ioport_opaque[i] != NULL && + base->ioport_opaque[i] != base) + printk("register_ioport_read: invalid opaque"); +#endif + base->ioport_opaque[i] = base; + } + return 0; +} + + +void cpu_outb(PCIBase *env, unsigned int addr, int val) +{ +#ifdef DEBUG_IOPORT + if (loglevel & CPU_LOG_IOPORT) + fprintf(logfile, "outb: %04x %02x\n", addr, val); +#endif + if (addr <= MAX_IOPORTS) + env->ioport_write_table[0][addr](env->ioport_opaque[addr], addr, val); +} + +void cpu_outw(PCIBase *env, unsigned int addr, int val) +{ +#ifdef DEBUG_IOPORT + if (loglevel & CPU_LOG_IOPORT) + fprintf(logfile, "outw: %04x %04x\n", addr, val); +#endif + if (addr <= MAX_IOPORTS) + env->ioport_write_table[1][addr](env->ioport_opaque[addr], addr, val); +} + +void cpu_outl(PCIBase *env, unsigned int addr, int val) +{ +#ifdef DEBUG_IOPORT + if (loglevel & CPU_LOG_IOPORT) + fprintf(logfile, "outl: %04x %08x\n", addr, val); +#endif + if (addr <= MAX_IOPORTS) + env->ioport_write_table[2][addr](env->ioport_opaque[addr], addr, val); +} + +int cpu_inb(PCIBase *env, unsigned int addr) +{ + int val; + if (addr > MAX_IOPORTS) { + return ~0; + } + val = env->ioport_read_table[0][addr](env->ioport_opaque[addr], addr); +#ifdef DEBUG_IOPORT + if (loglevel & CPU_LOG_IOPORT) + fprintf(logfile, "inb : %04x %02x\n", addr, val); +#endif + return val; +} + +int cpu_inw(PCIBase *env, unsigned int addr) +{ + int val; + if (addr > MAX_IOPORTS) { + return ~0; + } + val = env->ioport_read_table[1][addr](env->ioport_opaque[addr], addr); +#ifdef DEBUG_IOPORT + if (loglevel & CPU_LOG_IOPORT) + fprintf(logfile, "inw : %04x %04x\n", addr, val); +#endif + return val; +} + +int cpu_inl(PCIBase *env, unsigned int addr) +{ + int val; + if (addr > MAX_IOPORTS) { + return ~0; + } + val = env->ioport_read_table[2][addr](env->ioport_opaque[addr], addr); +#ifdef DEBUG_IOPORT + if (loglevel & CPU_LOG_IOPORT) + fprintf(logfile, "inl : %04x %08x\n", addr, val); +#endif + return val; +} + +uint32_t default_ioport_readb(PCIBase *base, uint32_t address) +{ +#ifdef DEBUG_UNUSED_IOPORT + fprintf(stderr, "inb: port=0x%04x\n", address); +#endif + return 0xff; +} + +void default_ioport_writeb(PCIBase *base, uint32_t address, uint32_t data) +{ +#ifdef DEBUG_UNUSED_IOPORT + fprintf(stderr, "outb: port=0x%04x data=0x%02x\n", address, data); +#endif +} + +/* default is to make two byte accesses */ +uint32_t default_ioport_readw(PCIBase *base, uint32_t address) +{ + uint32_t data = ~0; + if (address + 1 < MAX_IOPORTS) { + data = base->ioport_read_table[0][address](base->ioport_opaque[address], address); + address += 1; + data |= base->ioport_read_table[0][address](base->ioport_opaque[address], address) << 8; + } + return data; +} + +void default_ioport_writew(PCIBase *base, uint32_t address, uint32_t data) +{ + if (address + 1 < MAX_IOPORTS) { + base->ioport_write_table[0][address](base->ioport_opaque[address], address, data & 0xff); + address += 1; + base->ioport_write_table[0][address](base->ioport_opaque[address], address, (data >> 8) & 0xff); + } +} + +uint32_t default_ioport_readl(PCIBase *base, uint32_t address) +{ +#ifdef DEBUG_UNUSED_IOPORT + fprintf(stderr, "inl: port=0x%04x\n", address); +#endif + return 0xfffffffe; +} + +void default_ioport_writel(PCIBase *base, uint32_t address, uint32_t data) +{ +#ifdef DEBUG_UNUSED_IOPORT + fprintf(stderr, "outl: port=0x%04x data=0x%02x\n", address, data); +#endif +} + +void init_ioports(PCIBase *base) +{ + int i; + + for(i = 0; i < MAX_IOPORTS; i++) { + base->ioport_read_table[0][i] = default_ioport_readb; + base->ioport_write_table[0][i] = default_ioport_writeb; + base->ioport_read_table[1][i] = default_ioport_readw; + base->ioport_write_table[1][i] = default_ioport_writew; + base->ioport_read_table[2][i] = default_ioport_readl; + base->ioport_write_table[2][i] = default_ioport_writel; + } +} + + +void isa_unassign_ioport(PCIBase *base, int start, int length) +{ + int i; + + for(i = start; i < start + length && i < MAX_IOPORTS; i++) { + base->ioport_read_table[0][i] = default_ioport_readb; + base->ioport_read_table[1][i] = default_ioport_readw; + base->ioport_read_table[2][i] = default_ioport_readl; + + base->ioport_write_table[0][i] = default_ioport_writeb; + base->ioport_write_table[1][i] = default_ioport_writew; + base->ioport_write_table[2][i] = default_ioport_writel; + } +} + +/***********************************************************/ + + +void pstrcpy(char *buf, int buf_size, const char *str) +{ + int c; + char *q = buf; + + if (buf_size <= 0) + return; + + for(;;) { + c = *str++; + if (c == 0 || q >= buf + buf_size - 1) + break; + *q++ = c; + } + *q = ''\0''; +} Index: root/xen-unstable.hg/xen/include/xen/qemu_vl.h ==================================================================--- /dev/null +++ root/xen-unstable.hg/xen/include/xen/qemu_vl.h @@ -0,0 +1,149 @@ +/* + * QEMU System Emulator header + * + * Copyright (c) 2003 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ +#ifndef VL_H +#define VL_H + +#include <xen/types.h> + + +typedef struct PCIBus PCIBus; +typedef struct PCIDevice PCIDevice; +typedef struct PCIBase PCIBase; + +/* vl.c */ + +void pstrcpy(char *buf, int buf_size, const char *str); + + +typedef void SetIRQFunc(void *opaque, int irq_num, int level); + +/* ISA bus */ + +typedef void (IOPortWriteFunc)(PCIBase *base, uint32_t address, uint32_t data); +typedef uint32_t (IOPortReadFunc)(PCIBase *base, uint32_t address); + +int register_ioport_read(int start, int length, int size, + IOPortReadFunc *func, PCIBus *bus, + PCIBase *base); +int register_ioport_write(int start, int length, int size, + IOPortWriteFunc *func, PCIBus *bus, + PCIBase *base); + +/* PCI bus */ + + +typedef void PCIConfigWriteFunc(PCIDevice *pci_dev, + uint32_t address, uint32_t data, int len); +typedef uint32_t PCIConfigReadFunc(PCIDevice *pci_dev, + uint32_t address, int len); +typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, + uint32_t addr, uint32_t size, int type); + + +#define PCI_ADDRESS_SPACE_MEM 0x00 +#define PCI_ADDRESS_SPACE_IO 0x01 +#define PCI_ADDRESS_SPACE_MEM_PREFETCH 0x08 + + +typedef struct PCIIORegion { + uint32_t addr; /* current PCI mapping address. -1 means not mapped */ + uint32_t size; + uint8_t type; + PCIMapIORegionFunc *map_func; +} PCIIORegion; + +#define PCI_ROM_SLOT 6 +#define PCI_NUM_REGIONS 7 +struct PCIDevice { + /* PCI config space */ + uint8_t config[256]; + + /* the following fields are read only */ + PCIBus *bus; + int devfn; + char name[64]; + PCIIORegion io_regions[PCI_NUM_REGIONS]; + + /* do not access the following fields */ + PCIConfigReadFunc *config_read; + PCIConfigWriteFunc *config_write; +}; + +struct PCIBus { + int bus_num; + int devfn_min; + uint32_t config_reg; /* XXX: suppress */ + /* low level pic */ + SetIRQFunc *low_set_irq; + void *irq_opaque; + PCIDevice *devices[256]; + PCIBase *base; +}; + +#if 0 +#define MAX_IOPORTS 65536 +#else +#define MAX_IOPORTS 4096 +#endif + +struct PCIBase { + PCIBus *bus; + PCIBase *ioport_opaque[MAX_IOPORTS]; + IOPortReadFunc *ioport_read_table[3][MAX_IOPORTS]; + IOPortWriteFunc *ioport_write_table[3][MAX_IOPORTS]; +}; + +PCIDevice *pci_register_device(PCIBus *bus, const char *name, + int instance_size, int devfn, + PCIConfigReadFunc *config_read, + PCIConfigWriteFunc *config_write); + +void pci_register_io_region(PCIDevice *pci_dev, int region_num, + uint32_t size, int type, + PCIMapIORegionFunc *map_func); + +uint32_t pci_default_read_config(PCIDevice *d, + uint32_t address, int len); +void pci_default_write_config(PCIDevice *d, + uint32_t address, uint32_t val, int len); + +void cpu_outb(PCIBase *env, unsigned int addr, int val); +void cpu_outw(PCIBase *env, unsigned int addr, int val); +void cpu_outl(PCIBase *env, unsigned int addr, int val); +int cpu_inb(PCIBase *env, unsigned int addr); +int cpu_inw(PCIBase *env, unsigned int addr); +int cpu_inl(PCIBase *env, unsigned int addr); +void init_ioports(PCIBase *base); +void isa_unassign_ioport(PCIBase *base, int start, int length); + +extern struct PIIX3State *piix3_state; + +PCIBase *i440fx_init(void); +void piix3_init(PCIBus *bus); +void pci_bios_init(PCIBus *bus); + +void pci_base_free(PCIBase *base); + + +#endif /* VL_H */ Index: root/xen-unstable.hg/xen/arch/x86/traps.c ==================================================================--- root.orig/xen-unstable.hg/xen/arch/x86/traps.c +++ root/xen-unstable.hg/xen/arch/x86/traps.c @@ -553,16 +553,74 @@ static inline int admin_io_okay( return 0; } + +u32 static inline _cpu_inb(u32 dummy, int addr) +{ + struct vcpu *v = current; + struct domain *d = v->domain; + if (d->PCIBase) { + return cpu_inb(d->PCIBase, addr); + } + return ~0; +} + +u32 static inline _cpu_inw(u32 dummy, int addr) +{ + struct vcpu *v = current; + struct domain *d = v->domain; + if (d->PCIBase) { + return cpu_inw(d->PCIBase, addr); + } + return ~0; +} + +u32 static inline _cpu_inl(u32 dummy, int addr) +{ + struct vcpu *v = current; + struct domain *d = v->domain; + if (d->PCIBase) { + return cpu_inl(d->PCIBase, addr); + } + return ~0; +} + +void static inline _cpu_outb(u32 dummy, int addr, u8 val) +{ + struct vcpu *v = current; + struct domain *d = v->domain; + if (d->PCIBase) { + cpu_outb(d->PCIBase, addr, val); + } +} + +void static inline _cpu_outw(u32 dummy, int addr, u16 val) +{ + struct vcpu *v = current; + struct domain *d = v->domain; + if (d->PCIBase) { + cpu_outw(d->PCIBase, addr, val); + } +} + +void static inline _cpu_outl(u32 dummy, int addr, u32 val) +{ + struct vcpu *v = current; + struct domain *d = v->domain; + if (d->PCIBase) { + cpu_outl(d->PCIBase, addr, val); + } +} + /* Check admin limits. Silently fail the access if it is disallowed. */ -#define inb_user(_p, _d, _r) (admin_io_okay(_p, 1, _d, _r) ? inb(_p) : ~0) -#define inw_user(_p, _d, _r) (admin_io_okay(_p, 2, _d, _r) ? inw(_p) : ~0) -#define inl_user(_p, _d, _r) (admin_io_okay(_p, 4, _d, _r) ? inl(_p) : ~0) +#define inb_user(_p, _d, _r) (admin_io_okay(_p, 1, _d, _r) ? inb(_p) : _cpu_inb(0,_p)) +#define inw_user(_p, _d, _r) (admin_io_okay(_p, 2, _d, _r) ? inw(_p) : _cpu_inw(0,_p)) +#define inl_user(_p, _d, _r) (admin_io_okay(_p, 4, _d, _r) ? inl(_p) : _cpu_inl(0,_p)) #define outb_user(_v, _p, _d, _r) \ - (admin_io_okay(_p, 1, _d, _r) ? outb(_v, _p) : ((void)0)) + (admin_io_okay(_p, 1, _d, _r) ? outb(_v, _p) : _cpu_outb(0,_p,_v)/*((void)0)*/) #define outw_user(_v, _p, _d, _r) \ - (admin_io_okay(_p, 2, _d, _r) ? outw(_v, _p) : ((void)0)) + (admin_io_okay(_p, 2, _d, _r) ? outw(_v, _p) : _cpu_outw(0,_p,_v)/*((void)0)*/) #define outl_user(_v, _p, _d, _r) \ - (admin_io_okay(_p, 4, _d, _r) ? outl(_v, _p) : ((void)0)) + (admin_io_okay(_p, 4, _d, _r) ? outl(_v, _p) : _cpu_outl(0,_p,_v)/*((void)0)*/) /* Propagate a fault back to the guest kernel. */ #define USER_READ_FAULT 4 /* user mode, read fault */ Index: root/xen-unstable.hg/xen/common/domain.c ==================================================================--- root.orig/xen-unstable.hg/xen/common/domain.c +++ root/xen-unstable.hg/xen/common/domain.c @@ -58,6 +58,21 @@ struct domain *do_createdomain(domid_t d free_domain_struct(d); return NULL; } + + if (dom_id != 0 && dom_id != IDLE_DOMAIN_ID) { + d->PCIBase = i440fx_init(); + if (d->PCIBase) { + PCIBus *bus = d->PCIBase->bus; + piix3_init(bus); + pci_bios_init(bus); + printk("Fake PCI BUS initialized.\n"); + } else { + printk("Not enough memory for fake PCI bus?\n"); + evtchn_destroy(d); + free_domain_struct(d); + return NULL; + } + } arch_do_createdomain(v); @@ -77,6 +92,7 @@ struct domain *do_createdomain(domid_t d write_unlock(&domlist_lock); } + return d; } @@ -260,6 +276,10 @@ void domain_destruct(struct domain *d) *pd = d->next_in_hashbucket; write_unlock(&domlist_lock); + if (d->PCIBase) { + pci_base_free(d->PCIBase); + } + evtchn_destroy(d); grant_table_destroy(d); Index: root/xen-unstable.hg/xen/include/xen/sched.h ==================================================================--- root.orig/xen-unstable.hg/xen/include/xen/sched.h +++ root/xen-unstable.hg/xen/include/xen/sched.h @@ -12,6 +12,7 @@ #include <xen/ac_timer.h> #include <xen/grant_table.h> #include <asm/domain.h> +#include <xen/qemu_vl.h> extern unsigned long volatile jiffies; extern rwlock_t domlist_lock; @@ -138,6 +139,7 @@ struct domain struct arch_domain arch; void *ssid; /* sHype security subject identifier */ + PCIBase *PCIBase; }; struct domain_setup_info _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Pratt
2005-Aug-19 16:47 UTC
RE: [Xen-devel] Experimental PCI Emulation Layer for XEN x86(32)
> - The best: Running a domain 0 kernel as a user domain > kernel also works ! The bootup messages (dmesg) can be seen > under (**) below > - With some further exploration this could be leading to a > unified build process and kernel for domain 0 and user > domains which I think would be ''good''.? You can already use a xen0 kernel as a xenU kernel just fine. That''s what al distros do. Having two kernels is just for developer''s convenience. See the -xen default config which compiles everything as modules. Ian _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefan Berger
2005-Aug-19 18:30 UTC
RE: [Xen-devel] Experimental PCI Emulation Layer for XEN x86(32)
"Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk> wrote on 08/19/2005 11:47:07 AM:> > - The best: Running a domain 0 kernel as a user domain > > kernel also works ! The bootup messages (dmesg) can be seen > > under (**) below > > - With some further exploration this could be leading to a > > unified build process and kernel for domain 0 and user > > domains which I think would be ''good''. > > ? You can already use a xen0 kernel as a xenU kernel just fine. That''s > what al distros do. > > Having two kernels is just for developer''s convenience.I thought that the arch/xen/Kconfig.drivers file was really necessary to build domUs. I guess this is not true then.> > See the -xen default config which compiles everything as modules.I see. Stefan> > Ian_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel