anthony.perard@citrix.com
2010-Sep-17  11:14 UTC
[Xen-devel] [PATCH RFC V3 00/12] xen device model support
From: Anthony PERARD <anthony.perard@citrix.com>
Hi all,
this is the third version of the patch series that adds Xen device
model support to qemu.
These are the changes we made on top of the previous version:
- we finally removed the special target for Xen and now use the i386 target.
- we removed xenstore management; only one xenstore call remains, which
  reports the device model state ("running").
- we integrated MapCache into the RAMBlock infrastructure. This comes with a
  new function, qemu_ram_ptr_unlock, because MapCache needs to know when it
  can unmap a block.
- we removed the dynamic Xen checks in get_irq_slot and set_irq in piix_pci,
  and instead register the Xen functions with pci_bus_irqs.
- we converted the GPE of the Xen ACPI to VMSTATE.
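To illustrate why MapCache needs an unlock call, here is a minimal
sketch of a pinned mapping. It is not the code from xen_mapcache.c: the
MapEntry type and the map_entry_* helpers are hypothetical names made up
for this note, and a real cache would unmap the block where the sketch
only clears the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache entry; not the layout used by xen_mapcache.c. */
typedef struct MapEntry {
    void *vaddr;          /* host address of the mapped block, NULL if unmapped */
    unsigned lock_count;  /* number of users currently holding vaddr */
} MapEntry;

/* Hand out the pointer and pin the mapping while it is in use. */
static void *map_entry_lock(MapEntry *e)
{
    assert(e->vaddr != NULL);
    e->lock_count++;
    return e->vaddr;
}

/* Counterpart in the spirit of qemu_ram_ptr_unlock: drop one pin. */
static void map_entry_unlock(MapEntry *e)
{
    assert(e->lock_count > 0);
    e->lock_count--;
}

/* The cache may only evict an entry that nobody is using. */
static bool map_entry_try_evict(MapEntry *e)
{
    if (e->lock_count > 0) {
        return false;
    }
    e->vaddr = NULL;  /* a real cache would unmap the block here */
    return true;
}
```

Without the unlock call the cache could never tell a stale pointer from
a live one, so it would have to either never evict or risk unmapping
memory that is still being accessed.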
Anthony PERARD (12):
  xen: Support new libxc calls from xen unstable.
  xen: Add xen_machine_fv
  xen: Introduce --enable-xen command options.
  xen: Add the Xen platform pci device
  piix_pci: Introduces Xen specific call for irq.
  xen: add a 8259 Interrupt Controller
  xen: Introduce the Xen mapcache
  Introduce qemu_ram_ptr_unlock.
  vl.c: Introduce getter for shutdown_requested and reset_requested.
  xen: Initialize event channels and io rings
  xen: Set running state in xenstore.
  xen: Add a Xen specific ACPI Implementation to target-xen
 Makefile.target      |    8 +
 configure            |    5 +
 cpu-common.h         |    1 +
 exec.c               |   65 ++++++-
 hw/hw.h              |    3 +
 hw/pci_ids.h         |    2 +
 hw/piix_pci.c        |   10 +-
 hw/xen.h             |   20 ++
 hw/xen_acpi_piix4.c  |  405 ++++++++++++++++++++++++++++++++++++++
 hw/xen_backend.c     |   10 +-
 hw/xen_backend.h     |    2 +-
 hw/xen_common.h      |   29 +++-
 hw/xen_disk.c        |   12 +-
 hw/xen_domainbuild.c |    2 +-
 hw/xen_machine_fv.c  |  155 +++++++++++++++
 hw/xen_nic.c         |   16 +-
 hw/xen_platform.c    |  455 +++++++++++++++++++++++++++++++++++++++++++
 hw/xen_platform.h    |    8 +
 qemu-options.hx      |    9 +
 sysemu.h             |    2 +
 vl.c                 |   26 +++
 xen-all.c            |  529 ++++++++++++++++++++++++++++++++++++++++++++++++++
 xen-stub.c           |   34 ++++
 xen_mapcache.c       |  336 ++++++++++++++++++++++++++++++++
 xen_mapcache.h       |   27 +++
 25 files changed, 2141 insertions(+), 30 deletions(-)
 create mode 100644 hw/xen_acpi_piix4.c
 create mode 100644 hw/xen_machine_fv.c
 create mode 100644 hw/xen_platform.c
 create mode 100644 hw/xen_platform.h
 create mode 100644 xen-all.c
 create mode 100644 xen-stub.c
 create mode 100644 xen_mapcache.c
 create mode 100644 xen_mapcache.h
Regards,
-- 
Anthony PERARD
P.S.
Stefano is currently on vacation.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
anthony.perard@citrix.com
2010-Sep-17  11:14 UTC
[Xen-devel] [PATCH RFC V3 01/12] xen: Support new libxc calls from xen unstable.
From: Anthony PERARD <anthony.perard@citrix.com>
Update the libxenctrl calls in Qemu to use the new interface, otherwise
Qemu wouldn't be able to build against new versions of the library.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
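For readers unfamiliar with the approach taken in hw/xen_common.h, the
following is a small stand-alone sketch of the same compatibility
pattern: callers are always written against the new argument list, and
on old library versions a macro discards the extra arguments. All demo_*
names and DEMO_INTERFACE_VERSION are hypothetical stand-ins, not libxc
symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __XEN_LATEST_INTERFACE_VERSION__. */
#ifndef DEMO_INTERFACE_VERSION
#define DEMO_INTERFACE_VERSION 0x00030209
#endif

#if DEMO_INTERFACE_VERSION < 0x0003020a
/* Old library: the handle is an int and open takes no arguments. */
typedef int demo_handle;
#define DEMO_HANDLE_INVALID (-1)
static int demo_lib_open(void) { return 3; }
/* The wrapper accepts the new argument list and drops it. */
#define demo_open(logger, dombuild_logger, flags) demo_lib_open()
#else
/* New library: the handle is a pointer and open takes three arguments. */
typedef struct demo_iface { int fd; } *demo_handle;
#define DEMO_HANDLE_INVALID NULL
static struct demo_iface demo_iface_instance = { 3 };
static demo_handle demo_lib_open(void *logger, void *dombuild_logger,
                                 unsigned flags)
{
    (void)logger;
    (void)dombuild_logger;
    (void)flags;
    return &demo_iface_instance;
}
#define demo_open(logger, dombuild_logger, flags) \
    demo_lib_open(logger, dombuild_logger, flags)
#endif

/* Caller code is identical for both library versions. */
static int demo_init(void)
{
    demo_handle h = demo_open(NULL, NULL, 0);
    return h == DEMO_HANDLE_INVALID ? -1 : 0;
}
```

Comparing against a per-version "invalid" constant, rather than against
-1 or NULL directly, is what keeps the caller free of #ifdef blocks.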
 configure            |    5 +++++
 hw/xen_backend.c     |   10 +++++-----
 hw/xen_backend.h     |    2 +-
 hw/xen_common.h      |   23 ++++++++++++++++++++++-
 hw/xen_disk.c        |   12 ++++++------
 hw/xen_domainbuild.c |    2 +-
 hw/xen_nic.c         |   16 ++++++++--------
 7 files changed, 48 insertions(+), 22 deletions(-)
diff --git a/configure b/configure
index 4061cb7..0d745e0 100755
--- a/configure
+++ b/configure
@@ -1113,7 +1113,12 @@ if test "$xen" != "no" ; then
   cat > $TMPC <<EOF
 #include <xenctrl.h>
 #include <xs.h>
+#include <xen/xen-compat.h>
+#if __XEN_LATEST_INTERFACE_VERSION__ < 0x0003020a
 int main(void) { xs_daemon_open(); xc_interface_open(); return 0; }
+#else
+int main(void) { xs_daemon_open(); xc_interface_open(0, 0, 0); return 0; }
+#endif
 EOF
   if compile_prog "" "$xen_libs" ; then
     xen=yes
diff --git a/hw/xen_backend.c b/hw/xen_backend.c
index a2e408f..b23620f 100644
--- a/hw/xen_backend.c
+++ b/hw/xen_backend.c
@@ -43,7 +43,7 @@
 /* ------------------------------------------------------------- */
 
 /* public */
-int xen_xc;
+qemu_xc_interface xen_xc = XC_HANDLER_INITIAL_VALUE;
 struct xs_handle *xenstore = NULL;
 const char *xen_protocol;
 
@@ -216,7 +216,7 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
     fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
 
     if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) {
-	xendev->gnttabdev = xc_gnttab_open();
+	xendev->gnttabdev = xc_gnttab_open(xen_xc);
 	if (xendev->gnttabdev < 0) {
 	    xen_be_printf(NULL, 0, "can't open gnttab device\n");
 	    xc_evtchn_close(xendev->evtchndev);
@@ -269,7 +269,7 @@ static struct XenDevice *xen_be_del_xendev(int dom, int dev)
 	if (xendev->evtchndev >= 0)
 	    xc_evtchn_close(xendev->evtchndev);
 	if (xendev->gnttabdev >= 0)
-	    xc_gnttab_close(xendev->gnttabdev);
+	    xc_gnttab_close(xen_xc, xendev->gnttabdev);
 
 	QTAILQ_REMOVE(&xendevs, xendev, next);
 	qemu_free(xendev);
@@ -627,8 +627,8 @@ int xen_be_init(void)
     if (qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL) < 0)
 	goto err;
 
-    xen_xc = xc_interface_open();
-    if (xen_xc == -1) {
+    xen_xc = xc_interface_open(NULL, NULL, 0);
+    if (xen_xc == XC_HANDLER_INITIAL_VALUE) {
 	xen_be_printf(NULL, 0, "can't open xen interface\n");
 	goto err;
     }
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 292126d..1f23cde 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -55,7 +55,7 @@ struct XenDevice {
 /* ------------------------------------------------------------- */
 
 /* variables */
-extern int xen_xc;
+extern qemu_xc_interface xen_xc;
 extern struct xs_handle *xenstore;
 extern const char *xen_protocol;
 
diff --git a/hw/xen_common.h b/hw/xen_common.h
index 8a55b44..2cbc376 100644
--- a/hw/xen_common.h
+++ b/hw/xen_common.h
@@ -16,7 +16,8 @@
  * tweaks needed to build with different xen versions
  *  0x00030205 -> 3.1.0
  *  0x00030207 -> 3.2.0
- *  0x00030208 -> unstable
+ *  0x00030209 -> 3.3.0
+ *  0x0003020a -> unstable
  */
 #include <xen/xen-compat.h>
 #if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030205
@@ -31,4 +32,24 @@
 # define xen_wmb() wmb()
 #endif
 
+#if __XEN_LATEST_INTERFACE_VERSION__ < 0x0003020a
+typedef int qemu_xc_interface;
+# define XC_HANDLER_INITIAL_VALUE               -1
+# define xc_fd(xen_xc)                          xen_xc
+# define xc_interface_open(l, dl, f)            xc_interface_open()
+# define xc_gnttab_open(xc)                     xc_gnttab_open()
+# define xc_gnttab_map_grant_ref(xc, gnt, domid, ref, flags) \
+    xc_gnttab_map_grant_ref(gnt, domid, ref, flags)
+# define xc_gnttab_map_grant_refs(xc, gnt, count, domids, refs, flags) \
+    xc_gnttab_map_grant_refs(gnt, count, domids, refs, flags)
+# define xc_gnttab_munmap(xc, gnt, pages, niov) xc_gnttab_munmap(gnt, pages, niov)
+# define xc_gnttab_close(xc, dev)               xc_gnttab_close(dev)
+#else
+typedef xc_interface *qemu_xc_interface;
+# define XC_HANDLER_INITIAL_VALUE NULL
+/* FIXME The fd of xen_xc is now xen_xc->fd */
+/* fd is the first field, so this works */
+# define xc_fd(xen_xc)                          (*(int*)xen_xc)
+#endif
+
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 134ac33..e38e155 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -243,7 +243,7 @@ static void ioreq_unmap(struct ioreq *ioreq)
     if (batch_maps) {
 	if (!ioreq->pages)
 	    return;
-	if (xc_gnttab_munmap(gnt, ioreq->pages, ioreq->v.niov) != 0)
+	if (xc_gnttab_munmap(xen_xc, gnt, ioreq->pages, ioreq->v.niov) != 0)
 	    xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
 			  strerror(errno));
 	ioreq->blkdev->cnt_map -= ioreq->v.niov;
@@ -252,7 +252,7 @@ static void ioreq_unmap(struct ioreq *ioreq)
 	for (i = 0; i < ioreq->v.niov; i++) {
 	    if (!ioreq->page[i])
 		continue;
-	    if (xc_gnttab_munmap(gnt, ioreq->page[i], 1) != 0)
+	    if (xc_gnttab_munmap(xen_xc, gnt, ioreq->page[i], 1) != 0)
 		xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
 			      strerror(errno));
 	    ioreq->blkdev->cnt_map--;
@@ -270,7 +270,7 @@ static int ioreq_map(struct ioreq *ioreq)
         return 0;
     if (batch_maps) {
 	ioreq->pages = xc_gnttab_map_grant_refs
-	    (gnt, ioreq->v.niov, ioreq->domids, ioreq->refs, ioreq->prot);
+	    (xen_xc, gnt, ioreq->v.niov, ioreq->domids, ioreq->refs, ioreq->prot);
 	if (ioreq->pages == NULL) {
 	    xen_be_printf(&ioreq->blkdev->xendev, 0,
 			  "can't map %d grant refs (%s, %d maps)\n",
@@ -284,7 +284,7 @@ static int ioreq_map(struct ioreq *ioreq)
     } else  {
 	for (i = 0; i < ioreq->v.niov; i++) {
 	    ioreq->page[i] = xc_gnttab_map_grant_ref
-		(gnt, ioreq->domids[i], ioreq->refs[i], ioreq->prot);
+		(xen_xc, gnt, ioreq->domids[i], ioreq->refs[i], ioreq->prot);
 	    if (ioreq->page[i] == NULL) {
 		xen_be_printf(&ioreq->blkdev->xendev, 0,
 			      "can't map grant ref %d (%s, %d maps)\n",
@@ -684,7 +684,7 @@ static int blk_connect(struct XenDevice *xendev)
             blkdev->protocol = BLKIF_PROTOCOL_X86_64;
     }
 
-    blkdev->sring = xc_gnttab_map_grant_ref(blkdev->xendev.gnttabdev,
+    blkdev->sring = xc_gnttab_map_grant_ref(xen_xc, blkdev->xendev.gnttabdev,
 					    blkdev->xendev.dom,
 					    blkdev->ring_ref,
 					    PROT_READ | PROT_WRITE);
@@ -739,7 +739,7 @@ static void blk_disconnect(struct XenDevice *xendev)
     xen_be_unbind_evtchn(&blkdev->xendev);
 
     if (blkdev->sring) {
-	xc_gnttab_munmap(blkdev->xendev.gnttabdev, blkdev->sring, 1);
+	xc_gnttab_munmap(xen_xc, blkdev->xendev.gnttabdev, blkdev->sring, 1);
 	blkdev->cnt_map--;
 	blkdev->sring = NULL;
     }
diff --git a/hw/xen_domainbuild.c b/hw/xen_domainbuild.c
index 7f1fd66..232a456 100644
--- a/hw/xen_domainbuild.c
+++ b/hw/xen_domainbuild.c
@@ -176,7 +176,7 @@ static int xen_domain_watcher(void)
     for (i = 3; i < n; i++) {
         if (i == fd[0])
             continue;
-        if (i == xen_xc)
+        if (i == xc_fd(xen_xc))
             continue;
         close(i);
     }
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
index 08055b8..4f68850 100644
--- a/hw/xen_nic.c
+++ b/hw/xen_nic.c
@@ -166,7 +166,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
 			  (txreq.flags & NETTXF_more_data)      ? " more_data"      : "",
 			  (txreq.flags & NETTXF_extra_info)     ? " extra_info"     : "");
 
-	    page = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+	    page = xc_gnttab_map_grant_ref(xen_xc, netdev->xendev.gnttabdev,
 					   netdev->xendev.dom,
 					   txreq.gref, PROT_READ);
 	    if (page == NULL) {
@@ -185,7 +185,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
             } else {
                 qemu_send_packet(&netdev->nic->nc, page + txreq.offset, txreq.size);
             }
-	    xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1);
+	    xc_gnttab_munmap(xen_xc, netdev->xendev.gnttabdev, page, 1);
 	    net_tx_response(netdev, &txreq, NETIF_RSP_OKAY);
 	}
 	if (!netdev->tx_work)
@@ -272,7 +272,7 @@ static ssize_t net_rx_packet(VLANClientState *nc, const uint8_t *buf, size_t siz
     memcpy(&rxreq, RING_GET_REQUEST(&netdev->rx_ring, rc), sizeof(rxreq));
     netdev->rx_ring.req_cons = ++rc;
 
-    page = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+    page = xc_gnttab_map_grant_ref(xen_xc, netdev->xendev.gnttabdev,
 				   netdev->xendev.dom,
 				   rxreq.gref, PROT_WRITE);
     if (page == NULL) {
@@ -282,7 +282,7 @@ static ssize_t net_rx_packet(VLANClientState *nc, const uint8_t *buf, size_t siz
 	return -1;
     }
     memcpy(page + NET_IP_ALIGN, buf, size);
-    xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1);
+    xc_gnttab_munmap(xen_xc, netdev->xendev.gnttabdev, page, 1);
     net_rx_response(netdev, &rxreq, NETIF_RSP_OKAY, NET_IP_ALIGN, size, 0);
 
     return size;
@@ -350,11 +350,11 @@ static int net_connect(struct XenDevice *xendev)
 	return -1;
     }
 
-    netdev->txs = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+    netdev->txs = xc_gnttab_map_grant_ref(xen_xc, netdev->xendev.gnttabdev,
 					  netdev->xendev.dom,
 					  netdev->tx_ring_ref,
 					  PROT_READ | PROT_WRITE);
-    netdev->rxs = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+    netdev->rxs = xc_gnttab_map_grant_ref(xen_xc, netdev->xendev.gnttabdev,
 					  netdev->xendev.dom,
 					  netdev->rx_ring_ref,
 					  PROT_READ | PROT_WRITE);
@@ -381,11 +381,11 @@ static void net_disconnect(struct XenDevice *xendev)
     xen_be_unbind_evtchn(&netdev->xendev);
 
     if (netdev->txs) {
-	xc_gnttab_munmap(netdev->xendev.gnttabdev, netdev->txs, 1);
+	xc_gnttab_munmap(xen_xc, netdev->xendev.gnttabdev, netdev->txs, 1);
 	netdev->txs = NULL;
     }
     if (netdev->rxs) {
-	xc_gnttab_munmap(netdev->xendev.gnttabdev, netdev->rxs, 1);
+	xc_gnttab_munmap(xen_xc, netdev->xendev.gnttabdev, netdev->rxs, 1);
 	netdev->rxs = NULL;
     }
     if (netdev->nic) {
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:14 UTC
[Xen-devel] [PATCH RFC V3 02/12] xen: Add xen_machine_fv
From: Anthony PERARD <anthony.perard@citrix.com>
Add the Xen FV (Fully Virtualized) machine to Qemu;
this is groundwork to add Xen device model support in Qemu.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
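One detail of xen_init_fv that may be worth spelling out is how guest
RAM is split before pc_cmos_init is called: everything up to 0xe0000000
counts as "below 4G" and the remainder as "above 4G". The sketch below
restates that arithmetic on its own. DemoRamSplit and demo_split_ram are
hypothetical names, and uint64_t is used in place of ram_addr_t so the
example builds outside the QEMU tree.

```c
#include <assert.h>
#include <stdint.h>

/* Boundary used by xen_init_fv in this patch. */
#define DEMO_LOWMEM_LIMIT 0xe0000000ULL

/* Hypothetical result type for this note. */
typedef struct {
    uint64_t below_4g;
    uint64_t above_4g;
} DemoRamSplit;

/* Same rule as in xen_init_fv: cap low memory at the boundary and
 * report whatever is left as high memory. */
static DemoRamSplit demo_split_ram(uint64_t ram_size)
{
    DemoRamSplit s = { ram_size, 0 };

    if (ram_size >= DEMO_LOWMEM_LIMIT) {
        s.below_4g = DEMO_LOWMEM_LIMIT;
        s.above_4g = ram_size - DEMO_LOWMEM_LIMIT;
    }
    return s;
}
```

For a 4 GiB guest (0x100000000 bytes) this gives 0xe0000000 bytes below
the boundary and 0x20000000 bytes above it. The gap between 0xe0000000
and 4 GiB is presumably left for PCI MMIO, but that reading is mine and
is not stated in the patch.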
 Makefile.target     |    3 +
 hw/xen_machine_fv.c |  157 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 160 insertions(+), 0 deletions(-)
 create mode 100644 hw/xen_machine_fv.c
diff --git a/Makefile.target b/Makefile.target
index a4e80b1..f112e66 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -183,6 +183,9 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
 # xen backend driver support
 obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
 
+# xen full virtualized machine
+obj-$(CONFIG_XEN) += xen_machine_fv.o
+
 # USB layer
 obj-$(CONFIG_USB_OHCI) += usb-ohci.o
 
diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
new file mode 100644
index 0000000..03683c7
--- /dev/null
+++ b/hw/xen_machine_fv.c
@@ -0,0 +1,157 @@
+/*
+ * QEMU Xen FV Machine
+ *
+ * Copyright (c) 2003-2007 Fabrice Bellard
+ * Copyright (c) 2007 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "hw.h"
+#include "pc.h"
+#include "pci.h"
+#include "usb-uhci.h"
+#include "net.h"
+#include "boards.h"
+#include "ide.h"
+#include "sysemu.h"
+#include "blockdev.h"
+
+#include "xen/hvm/hvm_info_table.h"
+
+#define MAX_IDE_BUS 2
+
+static void xen_init_fv(ram_addr_t ram_size,
+                        const char *boot_device,
+                        const char *kernel_filename,
+                        const char *kernel_cmdline,
+                        const char *initrd_filename,
+                        const char *cpu_model)
+{
+    int i;
+    ram_addr_t below_4g_mem_size, above_4g_mem_size = 0;
+    PCIBus *pci_bus;
+    PCII440FXState *i440fx_state;
+    int piix3_devfn = -1;
+    qemu_irq *cpu_irq;
+    qemu_irq *isa_irq;
+    qemu_irq *i8259;
+    qemu_irq *cmos_s3;
+    qemu_irq *smi_irq;
+    IsaIrqState *isa_irq_state;
+    DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
+    FDCtrl *floppy_controller;
+    BusState *idebus[MAX_IDE_BUS];
+    ISADevice *rtc_state;
+
+    CPUState *env;
+
+    /* Initialize a dummy CPU */
+    if (cpu_model == NULL) {
+#ifdef TARGET_X86_64
+        cpu_model = "qemu64";
+#else
+        cpu_model = "qemu32";
+#endif
+    }
+    env = cpu_init(cpu_model);
+    env->halted = 1;
+
+    cpu_irq = pc_allocate_cpu_irq();
+    i8259 = i8259_init(cpu_irq[0]);
+    isa_irq_state = qemu_mallocz(sizeof(*isa_irq_state));
+    isa_irq_state->i8259 = i8259;
+
+    isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
+
+    pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
+    isa_bus_irqs(isa_irq);
+
+    pc_register_ferr_irq(isa_reserve_irq(13));
+
+    pc_vga_init(pci_bus);
+
+    /* init basic PC hardware */
+    pc_basic_device_init(isa_irq, &floppy_controller, &rtc_state);
+
+    for(i = 0; i < nb_nics; i++) {
+        NICInfo *nd = &nd_table[i];
+
+        if (nd->model && strcmp(nd->model, "ne2k_isa") == 0)
+            pc_init_ne2k_isa(nd);
+        else
+            pci_nic_init_nofail(nd, "e1000", NULL);
+    }
+
+    if (drive_get_max_bus(IF_IDE) >= MAX_IDE_BUS) {
+        fprintf(stderr, "qemu: too many IDE bus\n");
+        exit(1);
+    }
+
+    for(i = 0; i < MAX_IDE_BUS * MAX_IDE_DEVS; i++) {
+        hd[i] = drive_get(IF_IDE, i / MAX_IDE_DEVS, i % MAX_IDE_DEVS);
+    }
+
+    PCIDevice *dev = pci_piix3_ide_init(pci_bus, hd, piix3_devfn + 1);
+    idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
+    idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
+
+    pc_audio_init(pci_bus, isa_irq);
+
+    if (ram_size >= 0xe0000000 ) {
+        above_4g_mem_size = ram_size - 0xe0000000;
+        below_4g_mem_size = 0xe0000000;
+    } else {
+        below_4g_mem_size = ram_size;
+    }
+    pc_cmos_init(below_4g_mem_size, above_4g_mem_size, boot_device,
+            idebus[0], idebus[1], floppy_controller, rtc_state);
+
+    if (usb_enabled) {
+        usb_uhci_piix3_init(pci_bus, piix3_devfn + 2);
+    }
+
+    if (acpi_enabled) {
+        cmos_s3 = qemu_allocate_irqs(pc_cmos_set_s3_resume, rtc_state, 1);
+        smi_irq = qemu_allocate_irqs(pc_acpi_smi_interrupt, first_cpu, 1);
+        piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
+                isa_reserve_irq(9), *cmos_s3, *smi_irq,
+                0);
+    }
+
+    if (i440fx_state) {
+        i440fx_init_memory_mappings(i440fx_state);
+    }
+
+    pc_pci_device_init(pci_bus);
+}
+
+static QEMUMachine xenfv_machine = {
+    .name = "xenfv",
+    .desc = "Xen Fully-virtualized PC",
+    .init = xen_init_fv,
+    .max_cpus = HVM_MAX_VCPUS,
+};
+
+static void xenfv_machine_init(void)
+{
+    qemu_register_machine(&xenfv_machine);
+}
+
+machine_init(xenfv_machine_init);
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:14 UTC
[Xen-devel] [PATCH RFC V3 03/12] xen: Introduce --enable-xen command options.
From: Anthony PERARD <anthony.perard@citrix.com>
This option checks whether the target was built with Xen support.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
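The build-time switch between xen-all.c and xen-stub.c, together with
the start-up check added to vl.c, can be summarised in a stand-alone
sketch. Everything prefixed demo_ or DEMO_ is hypothetical. In
particular, vl.c calls xen_available(), which is not defined in this
excerpt, so the macro below is only a guess at its behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Define DEMO_CONFIG_XEN to emulate a build with Xen support. */
#ifdef DEMO_CONFIG_XEN
static int demo_xen_init(void) { return 0; }        /* like xen-all.c */
#define demo_xen_available() 1
#else
static int demo_xen_init(void) { return -ENOSYS; }  /* like xen-stub.c */
#define demo_xen_available() 0
#endif

/* Mirrors the check added to main(): returns 0 to continue start-up,
 * 1 if the process should exit, and sets *msg to the reason. */
static int demo_start(int xen_allowed, const char **msg)
{
    *msg = "ok";
    if (!xen_allowed) {
        return 0;
    }

    int ret = demo_xen_init();
    if (ret < 0) {
        *msg = demo_xen_available()
            ? strerror(-ret)
            : "Xen not supported for this target";
        return 1;
    }
    return 0;
}
```

Because the stub always links, callers need no #ifdef of their own: a
binary built without Xen simply fails at start-up when -enable-xen is
given.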
 Makefile.target |    3 +++
 hw/xen.h        |   10 ++++++++++
 qemu-options.hx |    9 +++++++++
 vl.c            |   16 ++++++++++++++++
 xen-all.c       |   25 +++++++++++++++++++++++++
 xen-stub.c      |   17 +++++++++++++++++
 6 files changed, 80 insertions(+), 0 deletions(-)
 create mode 100644 xen-all.c
 create mode 100644 xen-stub.c
diff --git a/Makefile.target b/Makefile.target
index f112e66..1984f58 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -2,6 +2,7 @@
 
 GENERATED_HEADERS = config-target.h
 CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y)
+CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
 
 include ../config-host.mak
 include config-devices.mak
@@ -182,6 +183,8 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
 
 # xen backend driver support
 obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
+obj-$(CONFIG_XEN) += xen-all.o
+obj-$(CONFIG_NO_XEN) += xen-stub.o
 
 # xen full virtualized machine
 obj-$(CONFIG_XEN) += xen_machine_fv.o
diff --git a/hw/xen.h b/hw/xen.h
index 780dcf7..14bbb6e 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -18,4 +18,14 @@ enum xen_mode {
 extern uint32_t xen_domid;
 extern enum xen_mode xen_mode;
 
+extern int xen_allowed;
+
+#if defined CONFIG_XEN
+#define xen_enabled() (xen_allowed)
+#else
+#define xen_enabled() (0)
+#endif
+
+int xen_init(int smp_cpus);
+
 #endif /* QEMU_HW_XEN_H */
diff --git a/qemu-options.hx b/qemu-options.hx
index a0b5ae9..457ca32 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1904,6 +1904,15 @@ Enable KVM full virtualization support. This option is only available
 if KVM support is enabled when compiling.
 ETEXI
 
+DEF("enable-xen", 0, QEMU_OPTION_enable_xen, \
+    "-enable-xen     enable Xen full virtualization support\n", QEMU_ARCH_ALL)
+STEXI
+@item -enable-xen
+@findex -enable-xen
+Enable Xen full virtualization support. This option is only available
+if Xen support is enabled when compiling.
+ETEXI
+
 DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid,
     "-xen-domid id   specify xen guest domain id\n", QEMU_ARCH_ALL)
 DEF("xen-create", 0, QEMU_OPTION_xen_create,
diff --git a/vl.c b/vl.c
index 3f45aa9..6948703 100644
--- a/vl.c
+++ b/vl.c
@@ -243,6 +243,7 @@ static NotifierList exit_notifiers =
     NOTIFIER_LIST_INITIALIZER(exit_notifiers);
 
 int kvm_allowed = 0;
+int xen_allowed = 0;
 uint32_t xen_domid;
 enum xen_mode xen_mode = XEN_EMULATE;
 
@@ -2448,6 +2449,9 @@ int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_enable_kvm:
                 kvm_allowed = 1;
                 break;
+            case QEMU_OPTION_enable_xen:
+                xen_allowed = 1;
+                break;
             case QEMU_OPTION_usb:
                 usb_enabled = 1;
                 break;
@@ -2756,6 +2760,18 @@ int main(int argc, char **argv, char **envp)
         }
     }
 
+    if (xen_allowed) {
+        int ret = xen_init(smp_cpus);
+        if (ret < 0) {
+            if (!xen_available()) {
+                printf("Xen not supported for this target\n");
+            } else {
+                fprintf(stderr, "failed to initialize Xen: %s\n", strerror(-ret));
+            }
+            exit(1);
+        }
+    }
+
     if (qemu_init_main_loop()) {
         fprintf(stderr, "qemu_init_main_loop failed\n");
         exit(1);
diff --git a/xen-all.c b/xen-all.c
new file mode 100644
index 0000000..f505563
--- /dev/null
+++ b/xen-all.c
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2010       Citrix Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "config.h"
+
+#include "hw/xen_common.h"
+#include "hw/xen_backend.h"
+
+/* Initialise Xen */
+
+int xen_init(int smp_cpus)
+{
+    xen_xc = xc_interface_open(NULL, NULL, 0);
+    if (xen_xc == NULL) {
+        xen_be_printf(NULL, 0, "can't open xen interface\n");
+        return -1;
+    }
+
+    return 0;
+}
diff --git a/xen-stub.c b/xen-stub.c
new file mode 100644
index 0000000..0fa9c51
--- /dev/null
+++ b/xen-stub.c
@@ -0,0 +1,17 @@
+/*
+ * Copyright (C) 2010       Citrix Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "config.h"
+
+#include "qemu-common.h"
+#include "hw/xen.h"
+
+int xen_init(int smp_cpus)
+{
+    return -ENOSYS;
+}
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:14 UTC
[Xen-devel] [PATCH RFC V3 04/12] xen: Add the Xen platform pci device
From: Anthony PERARD <anthony.perard@citrix.com>
Introduce a new emulated PCI device, specific to fully virtualized Xen
guests.  The device is necessary for PV on HVM drivers to work.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
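The guest log throttling in hw/xen_platform.c is described in the code
as a token bucket of 128KB that refills at 256B/s. As an aid to review,
here is a simplified stand-alone sketch of that scheme. It is not the
code from the patch: the DemoBucket type and demo_bucket_* helpers are
hypothetical, time is passed in as a parameter instead of being read
with clock_gettime, and the function reports the required wait instead
of sleeping.

```c
#include <assert.h>

#define DEMO_BUCKET_MAX (128 * 1024)  /* bytes, same figure as BUCKET_MAX_SIZE */
#define DEMO_FILL_RATE  256           /* bytes per second, same as BUCKET_FILL_RATE */

/* Hypothetical bucket state; the patch keeps this in static variables. */
typedef struct {
    double available;    /* tokens currently in the bucket */
    double last_refill;  /* time of the last refill, in seconds */
} DemoBucket;

/* The bucket starts full. */
static void demo_bucket_init(DemoBucket *b, double now)
{
    b->available = DEMO_BUCKET_MAX;
    b->last_refill = now;
}

/* Try to take count tokens at time now. Returns 0 and consumes the
 * tokens if enough are available; otherwise leaves the bucket untouched
 * apart from the refill and returns the number of seconds to wait. */
static double demo_bucket_take(DemoBucket *b, unsigned count, double now)
{
    assert(count <= DEMO_BUCKET_MAX);

    /* Refill according to elapsed time, capped at the bucket size. */
    b->available += (now - b->last_refill) * DEMO_FILL_RATE;
    if (b->available > DEMO_BUCKET_MAX) {
        b->available = DEMO_BUCKET_MAX;
    }
    b->last_refill = now;

    if (b->available >= count) {
        b->available -= count;
        return 0.0;
    }
    return (count - b->available) / DEMO_FILL_RATE;
}
```

With an empty bucket, a 512-byte message has to wait 512 / 256 = 2
seconds. Passing the clock in makes the behaviour easy to test, whereas
the patch blocks the caller with nanosleep until enough tokens are
available.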
 Makefile.target     |    1 +
 hw/hw.h             |    3 +
 hw/pci_ids.h        |    2 +
 hw/xen_machine_fv.c |    3 +
 hw/xen_platform.c   |  455 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/xen_platform.h   |    8 +
 6 files changed, 472 insertions(+), 0 deletions(-)
 create mode 100644 hw/xen_platform.c
 create mode 100644 hw/xen_platform.h
diff --git a/Makefile.target b/Makefile.target
index 1984f58..6b390e6 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -188,6 +188,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
 
 # xen full virtualized machine
 obj-$(CONFIG_XEN) += xen_machine_fv.o
+obj-$(CONFIG_XEN) += xen_platform.o
 
 # USB layer
 obj-$(CONFIG_USB_OHCI) += usb-ohci.o
diff --git a/hw/hw.h b/hw/hw.h
index 4405092..67f3369 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -653,6 +653,9 @@ extern const VMStateDescription vmstate_i2c_slave;
 #define VMSTATE_INT32_LE(_f, _s)                                   \
     VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t)
 
+#define VMSTATE_UINT8_TEST(_f, _s, _t)                               \
+    VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t)
+
 #define VMSTATE_UINT16_TEST(_f, _s, _t)                               \
     VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t)
 
diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index 39e9f1d..1f2e0dd 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -105,3 +105,5 @@
 #define PCI_DEVICE_ID_INTEL_82371AB      0x7111
 #define PCI_DEVICE_ID_INTEL_82371AB_2    0x7112
 #define PCI_DEVICE_ID_INTEL_82371AB_3    0x7113
+
+#define PCI_VENDOR_ID_XENSOURCE          0x5853
diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
index 03683c7..65fd44a 100644
--- a/hw/xen_machine_fv.c
+++ b/hw/xen_machine_fv.c
@@ -34,6 +34,7 @@
 #include "blockdev.h"
 
 #include "xen/hvm/hvm_info_table.h"
+#include "xen_platform.h"
 
 #define MAX_IDE_BUS 2
 
@@ -87,6 +88,8 @@ static void xen_init_fv(ram_addr_t ram_size,
 
     pc_vga_init(pci_bus);
 
+    pci_xen_platform_init(pci_bus);
+
     /* init basic PC hardware */
     pc_basic_device_init(isa_irq, &floppy_controller, &rtc_state);
 
diff --git a/hw/xen_platform.c b/hw/xen_platform.c
new file mode 100644
index 0000000..15b490a
--- /dev/null
+++ b/hw/xen_platform.c
@@ -0,0 +1,455 @@
+/*
+ * XEN platform pci device, formerly known as the event channel device
+ *
+ * Copyright (c) 2003-2004 Intel Corp.
+ * Copyright (c) 2006 XenSource
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "hw.h"
+#include "pc.h"
+#include "pci.h"
+#include "irq.h"
+#include "xen_common.h"
+#include "net.h"
+#include "xen_platform.h"
+#include "xen_backend.h"
+#include "qemu-log.h"
+
+#include <assert.h>
+#include <xenguest.h>
+
+//#define PLATFORM_DEBUG
+
+#ifdef PLATFORM_DEBUG
+#define DPRINTF(fmt, ...) do { \
+    fprintf(stderr, "xen_platform: " fmt, ## __VA_ARGS__); \
+} while (0)
+#else
+#define DPRINTF(fmt, ...) do { } while (0)
+#endif
+
+#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */
+
+typedef struct PCIXenPlatformState {
+    PCIDevice  pci_dev;
+    uint8_t flags; /* used only for version_id == 2 */
+    int drivers_blacklisted;
+    uint16_t driver_product_version;
+
+    /* Log from guest drivers */
+    int throttling_disabled;
+    char log_buffer[4096];
+    int log_buffer_off;
+} PCIXenPlatformState;
+
+#define XEN_PLATFORM_IOPORT 0x10
+
+/* We throttle access to dom0 syslog, to avoid DOS attacks.  This is
+   modelled as a token bucket, with one token for every byte of log.
+   The bucket size is 128KB (->1024 lines of 128 bytes each) and
+   refills at 256B/s.  It starts full.  The guest is blocked if no
+   tokens are available when it tries to generate a log message. */
+#define BUCKET_MAX_SIZE (128*1024)
+#define BUCKET_FILL_RATE 256
+
+static void throttle(PCIXenPlatformState *s, unsigned count)
+{
+    static unsigned available;
+    static struct timespec last_refil;
+    static int started;
+    static int warned;
+
+    struct timespec waiting_for, now;
+    double delay;
+    struct timespec ts;
+
+    if (s->throttling_disabled)
+        return;
+
+    if (!started) {
+        clock_gettime(CLOCK_MONOTONIC, &last_refil);
+        available = BUCKET_MAX_SIZE;
+        started = 1;
+    }
+
+    if (count > BUCKET_MAX_SIZE) {
+        DPRINTF("tried to get %d tokens, but bucket size is %d\n",
+                count, BUCKET_MAX_SIZE);
+        exit(1);
+    }
+
+    if (available < count) {
+        /* The bucket is empty.  Refil it */
+
+        /* When will it be full enough to handle this request? */
+        delay = (double)(count - available) / BUCKET_FILL_RATE;
+        waiting_for = last_refil;
+        waiting_for.tv_sec += delay;
+        waiting_for.tv_nsec += (delay - (int)delay) * 1e9;
+        if (waiting_for.tv_nsec >= 1000000000) {
+            waiting_for.tv_nsec -= 1000000000;
+            waiting_for.tv_sec++;
+        }
+
+        /* How long do we have to wait? (might be negative) */
+        clock_gettime(CLOCK_MONOTONIC, &now);
+        ts.tv_sec = waiting_for.tv_sec - now.tv_sec;
+        ts.tv_nsec = waiting_for.tv_nsec - now.tv_nsec;
+        if (ts.tv_nsec < 0) {
+            ts.tv_sec--;
+            ts.tv_nsec += 1000000000;
+        }
+
+        /* Wait for it. */
+        if (ts.tv_sec > 0 ||
+            (ts.tv_sec == 0 && ts.tv_nsec > 0)) {
+            if (!warned) {
+                DPRINTF("throttling guest access to syslog");
+                warned = 1;
+            }
+            while (nanosleep(&ts, &ts) < 0 && errno == EINTR)
+                ;
+        }
+
+        /* Refil */
+        clock_gettime(CLOCK_MONOTONIC, &now);
+        delay = (now.tv_sec - last_refil.tv_sec) +
+            (now.tv_nsec - last_refil.tv_nsec) * 1.0e-9;
+        available += BUCKET_FILL_RATE * delay;
+        if (available > BUCKET_MAX_SIZE)
+            available = BUCKET_MAX_SIZE;
+        last_refil = now;
+    }
+
+    assert(available >= count);
+
+    available -= count;
+}
+
+/* Xen Platform, Fixed IOPort */
+
+static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
+{
+    PCIXenPlatformState *s = opaque;
+
+    switch (addr - XEN_PLATFORM_IOPORT) {
+    case 0:
+        /* TODO: */
+        /* Unplug devices.  Value is a bitmask of which devices to
+           unplug, with bit 0 the IDE devices, bit 1 the network
+           devices, and bit 2 the non-primary-master IDE devices. */
+        break;
+    case 2:
+        switch (val) {
+        case 1:
+            DPRINTF("Citrix Windows PV drivers loaded in guest\n");
+            break;
+        case 0:
+            DPRINTF("Guest claimed to be running PV product 0?\n");
+            break;
+        default:
+            DPRINTF("Unknown PV product %d loaded in guest\n", val);
+            break;
+        }
+        s->driver_product_version = val;
+        break;
+    }
+}
+
+static void platform_fixed_ioport_writel(void *opaque, uint32_t addr,
+                                         uint32_t val)
+{
+    switch (addr - XEN_PLATFORM_IOPORT) {
+    case 0:
+        /* PV driver version */
+        break;
+    }
+}
+
+static void platform_fixed_ioport_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    PCIXenPlatformState *s = opaque;
+
+    switch (addr - XEN_PLATFORM_IOPORT) {
+    case 0: /* Platform flags */ {
+        hvmmem_type_t mem_type = (val & PFFLAG_ROM_LOCK) ?
+            HVMMEM_ram_ro : HVMMEM_ram_rw;
+        if (xc_hvm_set_mem_type(xen_xc, xen_domid, mem_type, 0xc0, 0x40))
+            DPRINTF("unable to change ro/rw state of ROM memory area!\n");
+        else {
+            s->flags = val & PFFLAG_ROM_LOCK;
+            DPRINTF("changed ro/rw state of ROM memory area. now is %s state.\n",
+                    (mem_type == HVMMEM_ram_ro ? "ro":"rw"));
+        }
+        break;
+    }
+    case 2:
+        /* Send bytes to syslog */
+        if (val == '\n' || s->log_buffer_off == sizeof(s->log_buffer) - 1) {
+            /* Flush buffer */
+            s->log_buffer[s->log_buffer_off] = 0;
+            throttle(s, s->log_buffer_off);
+            DPRINTF("%s\n", s->log_buffer);
+            s->log_buffer_off = 0;
+            break;
+        }
+        s->log_buffer[s->log_buffer_off++] = val;
+        break;
+    }
+}
+
+static uint32_t platform_fixed_ioport_readw(void *opaque, uint32_t addr)
+{
+    PCIXenPlatformState *s = opaque;
+
+    switch (addr - XEN_PLATFORM_IOPORT) {
+    case 0:
+        if (s->drivers_blacklisted) {
+            /* The drivers will recognise this magic number and refuse
+             * to do anything. */
+            return 0xd249;
+        } else {
+            /* Magic value so that you can identify the interface. */
+            return 0x49d2;
+        }
+    default:
+        return 0xffff;
+    }
+}
+
+static uint32_t platform_fixed_ioport_readb(void *opaque, uint32_t addr)
+{
+    PCIXenPlatformState *s = opaque;
+
+    switch (addr - XEN_PLATFORM_IOPORT) {
+    case 0:
+        /* Platform flags */
+        return s->flags;
+    case 2:
+        /* Version number */
+        return 1;
+    default:
+        return 0xff;
+    }
+}
+
+static void platform_fixed_ioport_reset(void *opaque)
+{
+    PCIXenPlatformState *s = opaque;
+
+    platform_fixed_ioport_writeb(s, XEN_PLATFORM_IOPORT, 0);
+}
+
+static void platform_fixed_ioport_init(PCIXenPlatformState* s)
+{
+    register_ioport_write(XEN_PLATFORM_IOPORT, 16, 4, platform_fixed_ioport_writel, s);
+    register_ioport_write(XEN_PLATFORM_IOPORT, 16, 2, platform_fixed_ioport_writew, s);
+    register_ioport_write(XEN_PLATFORM_IOPORT, 16, 1, platform_fixed_ioport_writeb, s);
+    register_ioport_read(XEN_PLATFORM_IOPORT, 16, 2, platform_fixed_ioport_readw, s);
+    register_ioport_read(XEN_PLATFORM_IOPORT, 16, 1, platform_fixed_ioport_readb, s);
+}
+
+/* Xen Platform PCI Device */
+
+static uint32_t xen_platform_ioport_readb(void *opaque, uint32_t addr)
+{
+    addr &= 0xff;
+
+    if (addr == 0)
+        return platform_fixed_ioport_readb(opaque, XEN_PLATFORM_IOPORT);
+    else
+        return ~0u;
+}
+
+static void xen_platform_ioport_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    PCIXenPlatformState *s = opaque;
+
+    addr &= 0xff;
+    val  &= 0xff;
+
+    switch (addr) {
+    case 0: /* Platform flags */
+        platform_fixed_ioport_writeb(opaque, XEN_PLATFORM_IOPORT, val);
+        break;
+    case 8:
+        {
+            if (val == '\n' || s->log_buffer_off == sizeof(s->log_buffer) - 1) {
+                /* Flush buffer */
+                s->log_buffer[s->log_buffer_off] = 0;
+                throttle(s, s->log_buffer_off);
+                DPRINTF("%s\n", s->log_buffer);
+                s->log_buffer_off = 0;
+                break;
+            }
+            s->log_buffer[s->log_buffer_off++] = val;
+        }
+        break;
+    default:
+        break;
+    }
+}
+
+static void platform_ioport_map(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type)
+{
+    PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, pci_dev);
+
+    register_ioport_write(addr, size, 1, xen_platform_ioport_writeb, d);
+    register_ioport_read(addr, size, 1, xen_platform_ioport_readb, d);
+}
+
+static uint32_t platform_mmio_read(void *opaque, target_phys_addr_t addr)
+{
+    static int warnings = 0;
+
+    if (warnings < 5) {
+        DPRINTF("Warning: attempted read from physical address "
+                "0x" TARGET_FMT_plx " in xen platform mmio space\n", addr);
+        warnings++;
+    }
+    return 0;
+}
+
+static void platform_mmio_write(void *opaque, target_phys_addr_t addr,
+                                uint32_t val)
+{
+    static int warnings = 0;
+
+    if (warnings < 5) {
+        DPRINTF("Warning: attempted write of 0x%x to physical "
+                "address 0x" TARGET_FMT_plx " in xen platform mmio space\n",
+                val, addr);
+        warnings++;
+    }
+}
+
+static CPUReadMemoryFunc * const platform_mmio_read_funcs[3] = {
+    platform_mmio_read,
+    platform_mmio_read,
+    platform_mmio_read,
+};
+
+static CPUWriteMemoryFunc * const platform_mmio_write_funcs[3] = {
+    platform_mmio_write,
+    platform_mmio_write,
+    platform_mmio_write,
+};
+
+static void platform_mmio_map(PCIDevice *d, int region_num,
+                              pcibus_t addr, pcibus_t size, int type)
+{
+    int mmio_io_addr;
+
+    mmio_io_addr = cpu_register_io_memory(platform_mmio_read_funcs,
+                                          platform_mmio_write_funcs, NULL);
+
+    cpu_register_physical_memory(addr, size, mmio_io_addr);
+}
+
+static int xen_platform_post_load(void *opaque, int version_id)
+{
+    PCIXenPlatformState *s = opaque;
+
+    platform_fixed_ioport_writeb(s, XEN_PLATFORM_IOPORT, s->flags);
+
+    return 0;
+}
+
+static const VMStateDescription vmstate_xen_platform = {
+    .name = "platform",
+    .version_id = 4,
+    .minimum_version_id = 4,
+    .minimum_version_id_old = 4,
+    .post_load = xen_platform_post_load,
+    .fields = (VMStateField []) {
+        VMSTATE_PCI_DEVICE(pci_dev, PCIXenPlatformState),
+        VMSTATE_UINT8(flags, PCIXenPlatformState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static int xen_platform_initfn(PCIDevice *dev)
+{
+    PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, dev);
+    uint8_t *pci_conf;
+
+    pci_conf = d->pci_dev.config;
+
+    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_XENSOURCE);
+    pci_config_set_device_id(pci_conf, 0x0001);
+    pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
+
+    pci_config_set_revision(pci_conf, 1);
+    pci_config_set_prog_interface(pci_conf, 0);
+
+    pci_config_set_class(pci_conf, PCI_CLASS_OTHERS << 8 | 0x80);
+
+    pci_conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL;
+    pci_conf[PCI_INTERRUPT_PIN] = 1;
+
+    /* Microsoft WHQL requires non-zero subsystem IDs. */
+    /* http://www.pcisig.com/reflector/msg02205.html.  */
+    pci_set_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID, PCI_VENDOR_ID_XENSOURCE);
+    pci_set_word(pci_conf + PCI_SUBSYSTEM_ID, 0x0001);
+
+    pci_register_bar(&d->pci_dev, 0, 0x100,
+            PCI_BASE_ADDRESS_SPACE_IO, platform_ioport_map);
+
+    /* Reserve 16MB of MMIO address space for shared memory. */
+    pci_register_bar(&d->pci_dev, 1, 0x1000000,
+            PCI_BASE_ADDRESS_MEM_PREFETCH, platform_mmio_map);
+
+    platform_fixed_ioport_init(d);
+
+    return 0;
+}
+
+static void platform_reset(DeviceState *dev)
+{
+    PCIXenPlatformState *s = DO_UPCAST(PCIXenPlatformState, pci_dev.qdev, dev);
+
+    platform_fixed_ioport_reset(s);
+}
+
+void pci_xen_platform_init(PCIBus *bus)
+{
+    PCIDevice *dev;
+
+    dev = pci_create(bus, -1, "xen-platform");
+
+    qdev_init_nofail(&dev->qdev);
+}
+
+static PCIDeviceInfo xen_platform_info = {
+    .init = xen_platform_initfn,
+    .qdev.name = "xen-platform",
+    .qdev.desc = "XEN platform pci device",
+    .qdev.size = sizeof(PCIXenPlatformState),
+    .qdev.vmsd = &vmstate_xen_platform,
+    .qdev.reset = platform_reset,
+};
+
+static void xen_platform_register(void)
+{
+    pci_qdev_register(&xen_platform_info);
+}
+
+device_init(xen_platform_register);
diff --git a/hw/xen_platform.h b/hw/xen_platform.h
new file mode 100644
index 0000000..574eecd
--- /dev/null
+++ b/hw/xen_platform.h
@@ -0,0 +1,8 @@
+#ifndef XEN_PLATFORM_H
+#define XEN_PLATFORM_H
+
+#include "hw/pci.h"
+
+void pci_xen_platform_init(PCIBus *bus);
+
+#endif
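A note on throttle() in hw/xen_platform.c above: it behaves as a token bucket that limits how fast the guest can push bytes to the log. The sketch below restates that logic with the current time passed in explicitly, so the arithmetic can be checked without sleeping. It is only an illustration under stated assumptions: the demo_* names, the capacity and the fill rate are hypothetical, and it reports the wait time instead of calling nanosleep().

```c
#include <assert.h>

#define DEMO_BUCKET_MAX 100.0  /* hypothetical capacity, in tokens */
#define DEMO_FILL_RATE   10.0  /* hypothetical refill rate, tokens per second */

struct demo_bucket {
    double available;    /* tokens currently in the bucket */
    double last_refill;  /* time of the last refill, in seconds */
};

/*
 * Take `count` tokens at time `now` (seconds). Returns how long the caller
 * would have had to sleep before the tokens became available.
 */
static double demo_take(struct demo_bucket *b, double now, double count)
{
    double wait = 0.0;

    assert(count <= DEMO_BUCKET_MAX);

    if (b->available < count) {
        /* When will the bucket hold enough tokens for this request? */
        double ready_at = b->last_refill +
                          (count - b->available) / DEMO_FILL_RATE;

        if (ready_at > now) {
            wait = ready_at - now;
            now = ready_at;          /* pretend we slept until then */
        }

        /* Refill for the time elapsed since the last refill, capped. */
        b->available += DEMO_FILL_RATE * (now - b->last_refill);
        if (b->available > DEMO_BUCKET_MAX) {
            b->available = DEMO_BUCKET_MAX;
        }
        b->last_refill = now;
    }

    b->available -= count;
    return wait;
}
```

With these assumed values, a full bucket serves a 60-token request at once; a second 60-token request one second later has to wait one more second, because only 40 tokens are left and 20 more take two seconds to accumulate from the last refill.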
-- 
1.6.5
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 05/12] piix_pci: Introduces Xen specific call for irq.
From: Anthony PERARD <anthony.perard@citrix.com>
This patch introduces Xen-specific calls in piix_pci.
The Xen-specific parts are in write_config, set_irq and get_pirq.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
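For readers following the arithmetic in this patch, here is a minimal standalone sketch of the three pieces named above. It assumes nothing beyond what the diff shows: get_pirq packs slot and INTx line into one number, set_irq splits that number again before the hypercall, and write_config watches the four link route registers at 0x60-0x63. All demo_* names are hypothetical, and the hypercalls are replaced by plain stores into an array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PCIDevice: only devfn matters here. */
struct demo_pci_dev {
    uint8_t devfn;   /* bits 7..3 = slot, bits 2..0 = function */
};

/* Same arithmetic as xen_pci_slot_get_pirq(): four INTx lines per slot. */
static int demo_get_pirq(const struct demo_pci_dev *dev, int intx)
{
    return intx + ((dev->devfn >> 3) << 2);
}

/* The split done in xen_piix3_set_irq() before calling into Xen. */
static void demo_split_pirq(int irq_num, int *slot, int *intx)
{
    *slot = irq_num >> 2;
    *intx = irq_num & 3;
}

/*
 * Mirrors the scan in xen_piix_pci_write_config_client(): every byte of a
 * config write that lands on 0x60-0x63 updates one link route. A value with
 * bit 7 set is stored as 0; otherwise the low four bits are kept. Returns
 * the number of routes updated.
 */
static int demo_scan_link_routes(uint32_t address, uint32_t val, int len,
                                 uint8_t routes[4])
{
    int i, updated = 0;

    for (i = 0; i < len; i++) {
        uint32_t reg = address + (uint32_t)i;
        uint8_t v = (val >> (8 * i)) & 0xff;

        if (v & 0x80) {
            v = 0;
        }
        v &= 0xf;
        if (reg >= 0x60 && reg <= 0x63) {
            routes[reg - 0x60] = v;   /* the patch calls xc_hvm_set_pci_link_route() here */
            updated++;
        }
    }
    return updated;
}
```

For a device in slot 3, INTx line 2 gives pirq 14, which splits back into slot 3 and line 2. A 4-byte write of 0x800b0a05 at 0x60 sets the four routes to 5, 10, 11 and 0.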
 hw/piix_pci.c |   10 +++++++++-
 hw/xen.h      |    6 ++++++
 xen-all.c     |   29 +++++++++++++++++++++++++++++
 xen-stub.c    |   13 +++++++++++++
 4 files changed, 57 insertions(+), 1 deletions(-)
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index f152a0f..41a342f 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -28,6 +28,7 @@
 #include "pci_host.h"
 #include "isa.h"
 #include "sysbus.h"
+#include "xen.h"
 
 /*
  * I440FX chipset data sheet.
@@ -142,6 +143,9 @@ static void i440fx_write_config(PCIDevice *dev,
 {
     PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
 
+    if (xen_enabled())
+        xen_piix_pci_write_config_client(address, val, len);
+
     /* XXX: implement SMRAM.D_LOCK */
     pci_default_write_config(dev, address, val, len);
     if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) ||
@@ -235,7 +239,11 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *
     piix3 = DO_UPCAST(PIIX3State, dev,
                       pci_create_simple_multifunction(b, -1, true, "PIIX3"));
     piix3->pic = pic;
-    pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
+    if (xen_enabled()) {
+        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, piix3, 4);
+    } else {
+        pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
+    }
     (*pi440fx_state)->piix3 = piix3;
 
     *piix3_devfn = piix3->dev.devfn;
diff --git a/hw/xen.h b/hw/xen.h
index 14bbb6e..c5189b1 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -8,6 +8,8 @@
  */
 #include <inttypes.h>
 
+#include "qemu-common.h"
+
 /* xen-machine.c */
 enum xen_mode {
     XEN_EMULATE = 0,  // xen emulation, using xenner (default)
@@ -26,6 +28,10 @@ extern int xen_allowed;
 #define xen_enabled() (0)
 #endif
 
+int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
+void xen_piix3_set_irq(void *opaque, int irq_num, int level);
+void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
+
 int xen_init(int smp_cpus);
 
 #endif /* QEMU_HW_XEN_H */
diff --git a/xen-all.c b/xen-all.c
index f505563..948e439 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -8,9 +8,38 @@
 
 #include "config.h"
 
+#include "hw/pci.h"
 #include "hw/xen_common.h"
 #include "hw/xen_backend.h"
 
+/* Xen specific function for piix pci */
+
+int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
+{
+    return irq_num + ((pci_dev->devfn >> 3) << 2);
+}
+
+void xen_piix3_set_irq(void *opaque, int irq_num, int level)
+{
+    xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
+                              irq_num & 3, level);
+}
+
+void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
+{
+    int i;
+
+    /* Scan for updates to PCI link routes (0x60-0x63). */
+    for (i = 0; i < len; i++) {
+        uint8_t v = (val >> (8*i)) & 0xff;
+        if (v & 0x80)
+            v = 0;
+        v &= 0xf;
+        if (((address+i) >= 0x60) && ((address+i) <= 0x63))
+            xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v);
+    }
+}
+
 /* Initialise Xen */
 
 int xen_init(int smp_cpus)
diff --git a/xen-stub.c b/xen-stub.c
index 0fa9c51..07e64bc 100644
--- a/xen-stub.c
+++ b/xen-stub.c
@@ -11,6 +11,19 @@
 #include "qemu-common.h"
 #include "hw/xen.h"
 
+int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
+{
+    return -1;
+}
+
+void xen_piix3_set_irq(void *opaque, int irq_num, int level)
+{
+}
+
+void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
+{
+}
+
 int xen_init(int smp_cpus)
 {
     return -ENOSYS;
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 06/12] xen: add a 8259 Interrupt Controller
From: Anthony PERARD <anthony.perard@citrix.com>
Introduce an 8259 Interrupt Controller for target-xen; every set_irq
call makes a Xen hypercall.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
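As a rough illustration of the idea in this patch, the sketch below models a bank of 16 ISA interrupt lines that share one handler, where raising or lowering any line forwards (irq, level) straight to a backend. It is a simplified sketch, not QEMU code: the demo_* names are hypothetical, demo_allocate_irqs only imitates the shape of qemu_allocate_irqs, and the hypercall is replaced by a recorder struct passed through the opaque pointer (the patch itself passes NULL and uses globals).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handler type, modelled on QEMU's irq handler callbacks. */
typedef void (*demo_irq_handler)(void *opaque, int n, int level);

/* One interrupt line: which handler to call and which line this is. */
struct demo_irq {
    demo_irq_handler handler;
    void *opaque;
    int n;
};

/* Allocate n lines that all share one handler. Caller frees the array. */
static struct demo_irq *demo_allocate_irqs(demo_irq_handler handler,
                                           void *opaque, int n)
{
    struct demo_irq *irqs = calloc((size_t)n, sizeof(*irqs));
    int i;

    if (irqs == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        irqs[i].handler = handler;
        irqs[i].opaque = opaque;
        irqs[i].n = i;
    }
    return irqs;
}

/* Change the level of one line; the shared handler does the real work. */
static void demo_set_irq(struct demo_irq *irq, int level)
{
    irq->handler(irq->opaque, irq->n, level);
}

/* Stand-in for the hypervisor: remembers the last request and counts calls. */
struct demo_hypervisor {
    int calls;
    int last_irq;
    int last_level;
};

/* Plays the role of i8259_set_irq(): one backend call per level change. */
static void demo_isa_irq_to_hypervisor(void *opaque, int irq, int level)
{
    struct demo_hypervisor *hv = opaque;

    hv->calls++;
    hv->last_irq = irq;
    hv->last_level = level;
}
```

Because no interrupt controller state is kept on the QEMU side in this model, each call to demo_set_irq results in exactly one backend call, which matches the "every set_irq call makes a Xen hypercall" description.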
 hw/xen_common.h     |    2 ++
 hw/xen_machine_fv.c |    5 ++---
 xen-all.c           |   12 ++++++++++++
 3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/hw/xen_common.h b/hw/xen_common.h
index 2cbc376..dd54063 100644
--- a/hw/xen_common.h
+++ b/hw/xen_common.h
@@ -52,4 +52,6 @@ typedef xc_interface *qemu_xc_interface;
 # define xc_fd(xen_xc)                          (*(int*)xen_xc)
 #endif
 
+qemu_irq *i8259_xen_init(void);
+
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
index 65fd44a..dfdff55 100644
--- a/hw/xen_machine_fv.c
+++ b/hw/xen_machine_fv.c
@@ -35,6 +35,7 @@
 
 #include "xen/hvm/hvm_info_table.h"
 #include "xen_platform.h"
+#include "xen_common.h"
 
 #define MAX_IDE_BUS 2
 
@@ -50,7 +51,6 @@ static void xen_init_fv(ram_addr_t ram_size,
     PCIBus *pci_bus;
     PCII440FXState *i440fx_state;
     int piix3_devfn = -1;
-    qemu_irq *cpu_irq;
     qemu_irq *isa_irq;
     qemu_irq *i8259;
     qemu_irq *cmos_s3;
@@ -74,8 +74,7 @@ static void xen_init_fv(ram_addr_t ram_size,
     env = cpu_init(cpu_model);
     env->halted = 1;
 
-    cpu_irq = pc_allocate_cpu_irq();
-    i8259 = i8259_init(cpu_irq[0]);
+    i8259 = i8259_xen_init();
     isa_irq_state = qemu_mallocz(sizeof(*isa_irq_state));
     isa_irq_state->i8259 = i8259;
 
diff --git a/xen-all.c b/xen-all.c
index 948e439..765f87a 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -40,6 +40,18 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
     }
 }
 
+/* i8259 */
+
+static void i8259_set_irq(void *opaque, int irq, int level)
+{
+    xc_hvm_set_isa_irq_level(xen_xc, xen_domid, irq, level);
+}
+
+qemu_irq *i8259_xen_init(void)
+{
+    return qemu_allocate_irqs(i8259_set_irq, NULL, 16);
+}
+
 /* Initialise Xen */
 
 int xen_init(int smp_cpus)
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 07/12] xen: Introduce the Xen mapcache
From: Anthony PERARD <anthony.perard@citrix.com>
The mapcache maps chunks of guest memory on demand and unmaps them when
they are no longer needed.
Each call to qemu_get_ram_ptr calls qemu_map_cache with the lock option
set, so the mapcache will not unmap these ram_ptr mappings.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
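To make the description above concrete, here is a deliberately small model of the mapcache idea: a guest physical address is split into a bucket index and an offset, the bucket is mapped on first use, and a locked bucket cannot be replaced by another one. This is a sketch under several assumptions, not the code in this patch: the demo_* names are hypothetical, calloc/free stand in for xc_map_foreign_bulk/munmap, a colliding request on a locked bucket simply fails (the patch chains a new entry instead), and unlocking takes the guest address where the patch looks up the host pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_BUCKET_SHIFT 16                      /* 64 KiB chunks, like the i386 setting */
#define DEMO_BUCKET_SIZE  (1UL << DEMO_BUCKET_SHIFT)
#define DEMO_NR_BUCKETS   8                       /* tiny table, for illustration only */

struct demo_entry {
    unsigned long paddr_index;  /* which guest chunk is currently mapped */
    uint8_t *vaddr_base;        /* host address of the chunk, NULL if unmapped */
    unsigned int lock;          /* number of outstanding locked users */
};

static struct demo_entry demo_table[DEMO_NR_BUCKETS];

/* Return a host pointer for phys_addr, mapping its chunk on demand. */
static uint8_t *demo_map_cache(uint64_t phys_addr, int lock)
{
    unsigned long index  = (unsigned long)(phys_addr >> DEMO_BUCKET_SHIFT);
    unsigned long offset = (unsigned long)(phys_addr & (DEMO_BUCKET_SIZE - 1));
    struct demo_entry *e = &demo_table[index % DEMO_NR_BUCKETS];

    if (e->vaddr_base == NULL || e->paddr_index != index) {
        if (e->lock > 0) {
            return NULL;                  /* locked chunks are never unmapped */
        }
        free(e->vaddr_base);              /* stands in for munmap() */
        e->vaddr_base = calloc(1, DEMO_BUCKET_SIZE);  /* stands in for the foreign mapping */
        if (e->vaddr_base == NULL) {
            return NULL;
        }
        e->paddr_index = index;
    }
    if (lock) {
        e->lock++;
    }
    return e->vaddr_base + offset;
}

/* Drop one lock previously taken with demo_map_cache(phys_addr, 1). */
static void demo_unlock(uint64_t phys_addr)
{
    unsigned long index = (unsigned long)(phys_addr >> DEMO_BUCKET_SHIFT);
    struct demo_entry *e = &demo_table[index % DEMO_NR_BUCKETS];

    if (e->vaddr_base != NULL && e->paddr_index == index && e->lock > 0) {
        e->lock--;
    }
}
```

In this model, address 0x12345 falls in chunk 1 at offset 0x2345. While that chunk is locked, a request for chunk 9 (which hashes to the same slot) is refused; once the lock is dropped the slot can be reused.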
 Makefile.target |    2 +-
 exec.c          |   36 ++++++-
 hw/xen.h        |    4 +
 xen-all.c       |   63 ++++++++++++
 xen-stub.c      |    4 +
 xen_mapcache.c  |  302 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 xen_mapcache.h  |   26 +++++
 7 files changed, 432 insertions(+), 5 deletions(-)
 create mode 100644 xen_mapcache.c
 create mode 100644 xen_mapcache.h
diff --git a/Makefile.target b/Makefile.target
index 6b390e6..ea14393 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -183,7 +183,7 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
 
 # xen backend driver support
 obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
-obj-$(CONFIG_XEN) += xen-all.o
+obj-$(CONFIG_XEN) += xen-all.o xen_mapcache.o
 obj-$(CONFIG_NO_XEN) += xen-stub.o
 
 # xen full virtualized machine
diff --git a/exec.c b/exec.c
index 380dab5..f5888eb 100644
--- a/exec.c
+++ b/exec.c
@@ -60,6 +60,9 @@
 #endif
 #endif
 
+#include "hw/xen.h"
+#include "xen_mapcache.h"
+
 //#define DEBUG_TB_INVALIDATE
 //#define DEBUG_FLUSH
 //#define DEBUG_TLB
@@ -2833,6 +2836,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
         }
     }
 
+    new_block->offset = find_ram_offset(size);
     if (host) {
         new_block->host = host;
     } else {
@@ -2856,15 +2860,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    PROT_EXEC|PROT_READ|PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
 #else
-            new_block->host = qemu_vmalloc(size);
+            if (xen_enabled()) {
+                xen_ram_alloc(new_block->offset, size);
+            } else {
+                new_block->host = qemu_vmalloc(size);
+            }
 #endif
 #ifdef MADV_MERGEABLE
             madvise(new_block->host, size, MADV_MERGEABLE);
 #endif
         }
     }
-
-    new_block->offset = find_ram_offset(size);
     new_block->length = size;
 
     QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
@@ -2905,7 +2911,11 @@ void qemu_ram_free(ram_addr_t addr)
 #if defined(TARGET_S390X) && defined(CONFIG_KVM)
                 munmap(block->host, block->length);
 #else
-                qemu_vfree(block->host);
+                if (xen_enabled()) {
+                    qemu_invalidate_entry(block->host);
+                } else {
+                    qemu_vfree(block->host);
+                }
 #endif
             }
             qemu_free(block);
@@ -2931,6 +2941,14 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
         if (addr - block->offset < block->length) {
             QLIST_REMOVE(block, next);
             QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+            if (xen_enabled()) {
+                /* We need to check if the requested address is in the RAM
+                 * because we don't want to map the entire memory in QEMU.
+                 */
+                if (block->offset == 0)
+                    return qemu_map_cache(addr, 0, 1);
+                block->host = qemu_map_cache(block->offset, block->length, 1);
+            }
             return block->host + (addr - block->offset);
         }
     }
@@ -2949,11 +2967,18 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
     uint8_t *host = ptr;
 
     QLIST_FOREACH(block, &ram_list.blocks, next) {
+        /* This case happens when the block is not mapped. */
+        if (block->host == NULL)
+            continue;
         if (host - block->host < block->length) {
             return block->offset + (host - block->host);
         }
     }
 
+    if (xen_enabled()) {
+        return qemu_ram_addr_from_mapcache(ptr);
+    }
+
     fprintf(stderr, "Bad ram pointer %p\n", ptr);
     abort();
 
@@ -3728,6 +3753,9 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
     if (is_write) {
         cpu_physical_memory_write(bounce.addr, bounce.buffer, access_len);
     }
+    if (xen_enabled()) {
+        qemu_invalidate_entry(buffer);
+    }
     qemu_vfree(bounce.buffer);
     bounce.buffer = NULL;
     cpu_notify_map_clients();
diff --git a/hw/xen.h b/hw/xen.h
index c5189b1..2b62ff5 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -34,4 +34,8 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
 
 int xen_init(int smp_cpus);
 
+#ifdef NEED_CPU_H
+void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size);
+#endif
+
 #endif /* QEMU_HW_XEN_H */
diff --git a/xen-all.c b/xen-all.c
index 765f87a..4e0b061 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -12,6 +12,8 @@
 #include "hw/xen_common.h"
 #include "hw/xen_backend.h"
 
+#include "xen_mapcache.h"
+
 /* Xen specific function for piix pci */
 
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
@@ -52,6 +54,63 @@ qemu_irq *i8259_xen_init(void)
     return qemu_allocate_irqs(i8259_set_irq, NULL, 16);
 }
 
+
+/* Memory Ops */
+
+static void xen_ram_init(ram_addr_t ram_size)
+{
+    RAMBlock *new_block;
+    ram_addr_t below_4g_mem_size, above_4g_mem_size = 0;
+
+    new_block = qemu_mallocz(sizeof (*new_block));
+    pstrcpy(new_block->idstr, sizeof (new_block->idstr), "xen.ram");
+    new_block->host = NULL;
+    new_block->offset = 0;
+    new_block->length = ram_size;
+
+    QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
+
+    ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
+                                       new_block->length >> TARGET_PAGE_BITS);
+    memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
+           0xff, new_block->length >> TARGET_PAGE_BITS);
+
+    if (ram_size >= 0xe0000000 ) {
+        above_4g_mem_size = ram_size - 0xe0000000;
+        below_4g_mem_size = 0xe0000000;
+    } else {
+        below_4g_mem_size = ram_size;
+    }
+
+    cpu_register_physical_memory(0, below_4g_mem_size, new_block->offset);
+#if TARGET_PHYS_ADDR_BITS > 32
+    if (above_4g_mem_size > 0) {
+        cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size,
+                                     new_block->offset + below_4g_mem_size);
+    }
+#endif
+}
+
+void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size)
+{
+    unsigned long nr_pfn;
+    xen_pfn_t *pfn_list;
+    int i;
+
+    nr_pfn = size >> TARGET_PAGE_BITS;
+    pfn_list = qemu_malloc(sizeof (*pfn_list) * nr_pfn);
+
+    for (i = 0; i < nr_pfn; i++)
+        pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
+
+    if (xc_domain_memory_populate_physmap(xen_xc, xen_domid, nr_pfn, 0, 0, pfn_list)) {
+        hw_error("xen: failed to populate ram at %lx", ram_addr);
+    }
+
+    qemu_free(pfn_list);
+}
+
+
 /* Initialise Xen */
 
 int xen_init(int smp_cpus)
@@ -62,5 +121,9 @@ int xen_init(int smp_cpus)
         return -1;
     }
 
+    /* Init RAM management */
+    qemu_map_cache_init();
+    xen_ram_init(ram_size);
+
     return 0;
 }
diff --git a/xen-stub.c b/xen-stub.c
index 07e64bc..c9f477d 100644
--- a/xen-stub.c
+++ b/xen-stub.c
@@ -24,6 +24,10 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
 {
 }
 
+void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size)
+{
+}
+
 int xen_init(int smp_cpus)
 {
     return -ENOSYS;
diff --git a/xen_mapcache.c b/xen_mapcache.c
new file mode 100644
index 0000000..8e3bf6c
--- /dev/null
+++ b/xen_mapcache.c
@@ -0,0 +1,302 @@
+#include "config.h"
+
+#include "hw/xen_backend.h"
+#include "blockdev.h"
+
+#include <xen/hvm/params.h>
+#include <sys/mman.h>
+
+#include "xen_mapcache.h"
+
+
+//#define MAPCACHE_DEBUG
+
+#ifdef MAPCACHE_DEBUG
+#define DPRINTF(fmt, ...) do { \
+    fprintf(stderr, "xen_mapcache: " fmt, ## __VA_ARGS__); \
+} while (0)
+#else
+#define DPRINTF(fmt, ...) do { } while (0)
+#endif
+
+#if defined(MAPCACHE)
+
+#define BITS_PER_LONG (sizeof(long)*8)
+#define BITS_TO_LONGS(bits) \
+    (((bits)+BITS_PER_LONG-1)/BITS_PER_LONG)
+#define DECLARE_BITMAP(name,bits) \
+    unsigned long name[BITS_TO_LONGS(bits)]
+#define test_bit(bit,map) \
+    (!!((map)[(bit)/BITS_PER_LONG] & (1UL << ((bit)%BITS_PER_LONG))))
+
+typedef struct MapCacheEntry {
+    unsigned long paddr_index;
+    uint8_t *vaddr_base;
+    DECLARE_BITMAP(valid_mapping, MCACHE_BUCKET_SIZE>>XC_PAGE_SHIFT);
+    uint8_t lock;
+    struct MapCacheEntry *next;
+} MapCacheEntry;
+
+typedef struct MapCacheRev {
+    uint8_t *vaddr_req;
+    unsigned long paddr_index;
+    QTAILQ_ENTRY(MapCacheRev) next;
+} MapCacheRev;
+
+typedef struct MapCache {
+    MapCacheEntry *entry;
+    unsigned long nr_buckets;
+    QTAILQ_HEAD(map_cache_head, MapCacheRev) locked_entries;
+
+    /* For most cases (>99.9%), the page address is the same. */
+    unsigned long last_address_index;
+    uint8_t      *last_address_vaddr;
+} MapCache;
+
+static MapCache *mapcache;
+
+
+int qemu_map_cache_init(void)
+{
+    unsigned long size;
+
+    mapcache = qemu_mallocz(sizeof (MapCache));
+
+    QTAILQ_INIT(&mapcache->locked_entries);
+    mapcache->last_address_index = ~0UL;
+
+    mapcache->nr_buckets = (((MAX_MCACHE_SIZE >> XC_PAGE_SHIFT) +
+                   (1UL << (MCACHE_BUCKET_SHIFT - XC_PAGE_SHIFT)) - 1) >>
+                  (MCACHE_BUCKET_SHIFT - XC_PAGE_SHIFT));
+
+    /*
+     * Use mmap() directly: lets us allocate a big hash table with no up-front
+     * cost in storage space. The OS will allocate memory only for the buckets
+     * that we actually use. All others will contain all zeroes.
+     */
+    size = mapcache->nr_buckets * sizeof(MapCacheEntry);
+    size = (size + XC_PAGE_SIZE - 1) & ~(XC_PAGE_SIZE - 1);
+    DPRINTF("qemu_map_cache_init, nr_buckets = %lx size %lu\n", mapcache->nr_buckets, size);
+    mapcache->entry = mmap(NULL, size, PROT_READ|PROT_WRITE,
+                          MAP_SHARED|MAP_ANON, -1, 0);
+    if (mapcache->entry == MAP_FAILED) {
+        errno = ENOMEM;
+        return -1;
+    }
+
+    return 0;
+}
+
+static void qemu_remap_bucket(MapCacheEntry *entry,
+                              target_phys_addr_t size,
+                              unsigned long address_index)
+{
+    uint8_t *vaddr_base;
+    xen_pfn_t *pfns;
+    int *err;
+    unsigned int i, j;
+
+    pfns = qemu_mallocz((size >> XC_PAGE_SHIFT) * sizeof (xen_pfn_t));
+    err = qemu_mallocz((size >> XC_PAGE_SHIFT) * sizeof (int));
+
+    if (entry->vaddr_base != NULL) {
+        errno = munmap(entry->vaddr_base, size);
+        if (errno) {
+            fprintf(stderr, "unmap fails %d\n", errno);
+            exit(-1);
+        }
+    }
+
+    for (i = 0; i < size >> XC_PAGE_SHIFT; i++) {
+        pfns[i] = (address_index << (MCACHE_BUCKET_SHIFT-XC_PAGE_SHIFT)) + i;
+    }
+
+    vaddr_base = xc_map_foreign_bulk(xen_xc, xen_domid, PROT_READ|PROT_WRITE,
+                                     pfns, err,
+                                     size >> XC_PAGE_SHIFT);
+    if (vaddr_base == NULL) {
+        fprintf(stderr, "xc_map_foreign_bulk error %d\n", errno);
+        exit(-1);
+    }
+
+    entry->vaddr_base  = vaddr_base;
+    entry->paddr_index = address_index;
+
+    for (i = 0; i < size >> XC_PAGE_SHIFT; i += BITS_PER_LONG) {
+        unsigned long word = 0;
+        j = ((i + BITS_PER_LONG) > (size >> XC_PAGE_SHIFT)) ?
+            (size >> XC_PAGE_SHIFT) % BITS_PER_LONG : BITS_PER_LONG;
+        while (j > 0) {
+            word = (word << 1) | !err[i + --j];
+        }
+        entry->valid_mapping[i / BITS_PER_LONG] = word;
+    }
+
+    qemu_free(pfns);
+    qemu_free(err);
+}
+
+uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock)
+{
+    MapCacheEntry *entry, *pentry = NULL;
+    unsigned long address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
+    unsigned long address_offset = phys_addr & (MCACHE_BUCKET_SIZE-1);
+
+    if (address_index == mapcache->last_address_index && !lock)
+        return mapcache->last_address_vaddr + address_offset;
+
+    entry = &mapcache->entry[address_index % mapcache->nr_buckets];
+
+    while (entry && entry->lock && entry->paddr_index != address_index && entry->vaddr_base) {
+        pentry = entry;
+        entry = entry->next;
+    }
+    if (!entry) {
+        entry = qemu_mallocz(sizeof(MapCacheEntry));
+        pentry->next = entry;
+        qemu_remap_bucket(entry, size ? : MCACHE_BUCKET_SIZE, address_index);
+    } else if (!entry->lock) {
+        if (!entry->vaddr_base || entry->paddr_index != address_index || !test_bit(address_offset>>XC_PAGE_SHIFT, entry->valid_mapping))
+            qemu_remap_bucket(entry, size ? : MCACHE_BUCKET_SIZE, address_index);
+    }
+
+    if (!test_bit(address_offset>>XC_PAGE_SHIFT, entry->valid_mapping)) {
+        mapcache->last_address_index = ~0UL;
+        return NULL;
+    }
+
+    mapcache->last_address_index = address_index;
+    mapcache->last_address_vaddr = entry->vaddr_base;
+    if (lock) {
+        MapCacheRev *reventry = qemu_mallocz(sizeof(MapCacheRev));
+        entry->lock++;
+        reventry->vaddr_req = mapcache->last_address_vaddr + address_offset;
+        reventry->paddr_index = mapcache->last_address_index;
+        QTAILQ_INSERT_TAIL(&mapcache->locked_entries, reventry, next);
+    }
+
+    return mapcache->last_address_vaddr + address_offset;
+}
+
+ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
+{
+    MapCacheRev *reventry;
+    unsigned long paddr_index;
+    int found = 0;
+
+    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+        if (reventry->vaddr_req == ptr) {
+            paddr_index = reventry->paddr_index;
+            found = 1;
+            break;
+        }
+    }
+    if (!found) {
+        fprintf(stderr, "qemu_ram_addr_from_mapcache, could not find %p\n", ptr);
+        QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+            DPRINTF("   %lx -> %p is present\n", reventry->paddr_index, reventry->vaddr_req);
+        }
+        abort();
+        return 0;
+    }
+
+    return paddr_index << MCACHE_BUCKET_SHIFT;
+}
+
+void qemu_invalidate_entry(uint8_t *buffer)
+{
+    MapCacheEntry *entry = NULL, *pentry = NULL;
+    MapCacheRev *reventry;
+    unsigned long paddr_index;
+    int found = 0;
+
+    if (mapcache->last_address_vaddr == buffer)
+        mapcache->last_address_index =  ~0UL;
+
+    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+        if (reventry->vaddr_req == buffer) {
+            paddr_index = reventry->paddr_index;
+            found = 1;
+            break;
+        }
+    }
+    if (!found) {
+        DPRINTF("qemu_invalidate_entry, could not find %p\n", buffer);
+        QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+            DPRINTF("   %lx -> %p is present\n", reventry->paddr_index, reventry->vaddr_req);
+        }
+        return;
+    }
+    QTAILQ_REMOVE(&mapcache->locked_entries, reventry, next);
+    qemu_free(reventry);
+
+    entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
+    while (entry && entry->paddr_index != paddr_index) {
+        pentry = entry;
+        entry = entry->next;
+    }
+    if (!entry) {
+        DPRINTF("Trying to unmap address %p that is not in the mapcache!\n", buffer);
+        return;
+    }
+    entry->lock--;
+    if (entry->lock > 0 || pentry == NULL)
+        return;
+
+    pentry->next = entry->next;
+    errno = munmap(entry->vaddr_base, MCACHE_BUCKET_SIZE);
+    if (errno) {
+        fprintf(stderr, "unmap fails %d\n", errno);
+        exit(-1);
+    }
+    qemu_free(entry);
+}
+
+void qemu_invalidate_map_cache(void)
+{
+    unsigned long i;
+    MapCacheRev *reventry;
+
+    qemu_aio_flush();
+
+    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+        DPRINTF("There should be no locked mappings at this time, but %lx -> %p is present\n", reventry->paddr_index, reventry->vaddr_req);
+    }
+
+    mapcache_lock();
+
+    for (i = 0; i < mapcache->nr_buckets; i++) {
+        MapCacheEntry *entry = &mapcache->entry[i];
+
+        if (entry->vaddr_base == NULL)
+            continue;
+
+        if (munmap(entry->vaddr_base, MCACHE_BUCKET_SIZE) != 0) {
+            /* munmap() returns -1 and reports the cause in errno */
+            fprintf(stderr, "unmap fails %d\n", errno);
+            exit(-1);
+        }
+
+        entry->paddr_index = 0;
+        entry->vaddr_base  = NULL;
+    }
+
+    mapcache->last_address_index =  ~0UL;
+    mapcache->last_address_vaddr = NULL;
+
+    mapcache_unlock();
+}
+#else
+uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock)
+{
+    return qemu_get_ram_ptr(phys_addr);
+}
+
+void qemu_invalidate_map_cache(void)
+{
+}
+
+void qemu_invalidate_entry(uint8_t *buffer)
+{
+}
+#endif /* !MAPCACHE */
diff --git a/xen_mapcache.h b/xen_mapcache.h
new file mode 100644
index 0000000..5a6730f
--- /dev/null
+++ b/xen_mapcache.h
@@ -0,0 +1,26 @@
+#ifndef XEN_MAPCACHE_H
+#define XEN_MAPCACHE_H
+
+#if (defined(__i386__) || defined(__x86_64__))
+#  define MAPCACHE
+#  if defined(__i386__)
+#    define MAX_MCACHE_SIZE    0x40000000 /* 1GB max for x86 */
+#    define MCACHE_BUCKET_SHIFT 16
+#  elif defined(__x86_64__)
+#    define MAX_MCACHE_SIZE    0x1000000000 /* 64GB max for x86_64 */
+#    define MCACHE_BUCKET_SHIFT 20
+#  endif
+#  define MCACHE_BUCKET_SIZE (1UL << MCACHE_BUCKET_SHIFT)
+#endif
+
+int      qemu_map_cache_init(void);
+uint8_t  *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock);
+ram_addr_t qemu_ram_addr_from_mapcache(void *ptr);
+void     qemu_invalidate_entry(uint8_t *buffer);
+void     qemu_invalidate_map_cache(void);
+
+#define mapcache_lock()   ((void)0)
+#define mapcache_unlock() ((void)0)
+
+
+#endif /* !XEN_MAPCACHE_H */
-- 
1.6.5
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 08/12] Introduce qemu_ram_ptr_unlock.
From: Anthony PERARD <anthony.perard@citrix.com>
This function unlocks a ram_ptr returned by qemu_get_ram_ptr. After a
call to qemu_ram_ptr_unlock, the pointer may be unmapped from QEMU when
used with Xen.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
 cpu-common.h   |    1 +
 exec.c         |   29 ++++++++++++++++++++++++++---
 xen_mapcache.c |   34 ++++++++++++++++++++++++++++++++++
 xen_mapcache.h |    1 +
 4 files changed, 62 insertions(+), 3 deletions(-)
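
To illustrate the get/unlock pairing that this patch adds around the
qemu_get_ram_ptr users, here is a minimal, self-contained sketch. It is
not the mapcache code: ToyEntry and all toy_* functions are hypothetical
stand-ins, and the real unlock path goes through qemu_map_cache_unlock.
It only shows why a mapping may be torn down once its lock count is back
to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one mapcache bucket; not the real MapCacheEntry. */
typedef struct ToyEntry {
    unsigned char backing[16]; /* pretend this is mapped guest RAM */
    unsigned lock;             /* number of outstanding users */
    int mapped;                /* 1 while the mapping is alive */
} ToyEntry;

/* Analogue of qemu_get_ram_ptr(): hand out a pointer and pin the entry. */
static void *toy_get_ptr(ToyEntry *e)
{
    e->mapped = 1;
    e->lock++;
    return e->backing;
}

/* Analogue of qemu_ram_ptr_unlock(): drop one pin, never underflow. */
static void toy_unlock(ToyEntry *e)
{
    if (e->lock > 0) {
        e->lock--;
    }
}

/* The cache may only unmap an entry nobody holds. Returns 1 if unmapped. */
static int toy_try_unmap(ToyEntry *e)
{
    if (e->lock > 0) {
        return 0;
    }
    e->mapped = 0;
    return 1;
}

/* Store one byte the way the patched accessors do: get, use, unlock.
 * Returns 1 if the entry is unpinned again afterwards. */
static int toy_store_byte(ToyEntry *e, size_t off, unsigned char val)
{
    unsigned char *p = (unsigned char *)toy_get_ptr(e);

    assert(off < sizeof(e->backing));
    p[off] = val;
    toy_unlock(e);
    return e->lock == 0;
}
```

A caller that forgets toy_unlock() leaves the entry pinned, so
toy_try_unmap() keeps returning 0; that is the situation the added
qemu_ram_ptr_unlock() calls are meant to avoid.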
diff --git a/cpu-common.h b/cpu-common.h
index 0426bc8..378eea8 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -46,6 +46,7 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size);
 void qemu_ram_free(ram_addr_t addr);
 /* This should only be used for ram local to a device.  */
 void *qemu_get_ram_ptr(ram_addr_t addr);
+void qemu_ram_ptr_unlock(void *addr);
 /* This should not be used by devices.  */
 ram_addr_t qemu_ram_addr_from_host(void *ptr);
 
diff --git a/exec.c b/exec.c
index f5888eb..659db50 100644
--- a/exec.c
+++ b/exec.c
@@ -2959,6 +2959,13 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
     return NULL;
 }
 
+void qemu_ram_ptr_unlock(void *addr)
+{
+    if (xen_enabled()) {
+        qemu_map_cache_unlock(addr);
+    }
+}
+
 /* Some of the softmmu routines need to translate from a host pointer
    (typically a TLB entry) back to a ram offset.  */
 ram_addr_t qemu_ram_addr_from_host(void *ptr)
@@ -3064,6 +3071,7 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
                                 uint32_t val)
 {
     int dirty_flags;
+    void *vaddr;
     dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
     if (!(dirty_flags & CODE_DIRTY_FLAG)) {
 #if !defined(CONFIG_USER_ONLY)
@@ -3071,19 +3079,21 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
         dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
 #endif
     }
-    stb_p(qemu_get_ram_ptr(ram_addr), val);
+    stb_p(vaddr = qemu_get_ram_ptr(ram_addr), val);
     dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
     cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
     /* we remove the notdirty callback only if the code has been
        flushed */
     if (dirty_flags == 0xff)
         tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
+    qemu_ram_ptr_unlock(vaddr);
 }
 
 static void notdirty_mem_writew(void *opaque, target_phys_addr_t ram_addr,
                                 uint32_t val)
 {
     int dirty_flags;
+    void *vaddr;
     dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
     if (!(dirty_flags & CODE_DIRTY_FLAG)) {
 #if !defined(CONFIG_USER_ONLY)
@@ -3091,19 +3101,21 @@ static void notdirty_mem_writew(void *opaque, target_phys_addr_t ram_addr,
         dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
 #endif
     }
-    stw_p(qemu_get_ram_ptr(ram_addr), val);
+    stw_p(vaddr = qemu_get_ram_ptr(ram_addr), val);
     dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
     cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
     /* we remove the notdirty callback only if the code has been
        flushed */
     if (dirty_flags == 0xff)
         tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
+    qemu_ram_ptr_unlock(vaddr);
 }
 
 static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr,
                                 uint32_t val)
 {
     int dirty_flags;
+    void *vaddr;
     dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
     if (!(dirty_flags & CODE_DIRTY_FLAG)) {
 #if !defined(CONFIG_USER_ONLY)
@@ -3111,13 +3123,14 @@ static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr,
         dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
 #endif
     }
-    stl_p(qemu_get_ram_ptr(ram_addr), val);
+    stl_p(vaddr = qemu_get_ram_ptr(ram_addr), val);
     dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
     cpu_physical_memory_set_dirty_flags(ram_addr, dirty_flags);
     /* we remove the notdirty callback only if the code has been
        flushed */
     if (dirty_flags == 0xff)
         tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
+    qemu_ram_ptr_unlock(vaddr);
 }
 
 static CPUReadMemoryFunc * const error_mem_read[3] = {
@@ -3537,6 +3550,7 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                     cpu_physical_memory_set_dirty_flags(
                         addr1, (0xff & ~CODE_DIRTY_FLAG));
                 }
+                qemu_ram_ptr_unlock(ptr);
             }
         } else {
             if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM &&
@@ -3567,6 +3581,7 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                 ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
                     (addr & ~TARGET_PAGE_MASK);
                 memcpy(buf, ptr, l);
+                qemu_ram_ptr_unlock(ptr);
             }
         }
         len -= l;
@@ -3607,6 +3622,7 @@ void cpu_physical_memory_write_rom(target_phys_addr_t addr,
             /* ROM/RAM case */
             ptr = qemu_get_ram_ptr(addr1);
             memcpy(ptr, buf, l);
+            qemu_ram_ptr_unlock(ptr);
         }
         len -= l;
         buf += l;
@@ -3789,6 +3805,7 @@ uint32_t ldl_phys(target_phys_addr_t addr)
         ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
             (addr & ~TARGET_PAGE_MASK);
         val = ldl_p(ptr);
+        qemu_ram_ptr_unlock(ptr);
     }
     return val;
 }
@@ -3827,6 +3844,7 @@ uint64_t ldq_phys(target_phys_addr_t addr)
         ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
             (addr & ~TARGET_PAGE_MASK);
         val = ldq_p(ptr);
+        qemu_ram_ptr_unlock(ptr);
     }
     return val;
 }
@@ -3867,6 +3885,7 @@ uint32_t lduw_phys(target_phys_addr_t addr)
         ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
             (addr & ~TARGET_PAGE_MASK);
         val = lduw_p(ptr);
+        qemu_ram_ptr_unlock(ptr);
     }
     return val;
 }
@@ -3897,6 +3916,7 @@ void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val)
         unsigned long addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
         ptr = qemu_get_ram_ptr(addr1);
         stl_p(ptr, val);
+        qemu_ram_ptr_unlock(ptr);
 
         if (unlikely(in_migration)) {
             if (!cpu_physical_memory_is_dirty(addr1)) {
@@ -3939,6 +3959,7 @@ void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val)
         ptr = qemu_get_ram_ptr(pd & TARGET_PAGE_MASK) +
             (addr & ~TARGET_PAGE_MASK);
         stq_p(ptr, val);
+        qemu_ram_ptr_unlock(ptr);
     }
 }
 
@@ -3968,6 +3989,7 @@ void stl_phys(target_phys_addr_t addr, uint32_t val)
         /* RAM case */
         ptr = qemu_get_ram_ptr(addr1);
         stl_p(ptr, val);
+        qemu_ram_ptr_unlock(ptr);
         if (!cpu_physical_memory_is_dirty(addr1)) {
             /* invalidate code */
             tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
@@ -4011,6 +4033,7 @@ void stw_phys(target_phys_addr_t addr, uint32_t val)
         /* RAM case */
         ptr = qemu_get_ram_ptr(addr1);
         stw_p(ptr, val);
+        qemu_ram_ptr_unlock(ptr);
         if (!cpu_physical_memory_is_dirty(addr1)) {
             /* invalidate code */
             tb_invalidate_phys_page_range(addr1, addr1 + 2, 0);
diff --git a/xen_mapcache.c b/xen_mapcache.c
index 8e3bf6c..afa8728 100644
--- a/xen_mapcache.c
+++ b/xen_mapcache.c
@@ -178,6 +178,40 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u
     return mapcache->last_address_vaddr + address_offset;
 }
 
+void qemu_map_cache_unlock(void *buffer)
+{
+    MapCacheEntry *entry = NULL, *pentry = NULL;
+    MapCacheRev *reventry;
+    unsigned long paddr_index;
+    int found = 0;
+
+    QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
+        if (reventry->vaddr_req == buffer) {
+            paddr_index = reventry->paddr_index;
+            found = 1;
+            break;
+        }
+    }
+    if (!found) {
+        return;
+    }
+    QTAILQ_REMOVE(&mapcache->locked_entries, reventry, next);
+    qemu_free(reventry);
+
+    entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
+    while (entry && entry->paddr_index != paddr_index) {
+        pentry = entry;
+        entry = entry->next;
+    }
+    if (!entry) {
+        return;
+    }
+    /* Drop a single lock reference, guarding against underflow. */
+    if (entry->lock > 0) {
+        entry->lock--;
+    }
+}
+
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr)
 {
     MapCacheRev *reventry;
diff --git a/xen_mapcache.h b/xen_mapcache.h
index 5a6730f..3b358b1 100644
--- a/xen_mapcache.h
+++ b/xen_mapcache.h
@@ -15,6 +15,7 @@
 
 int      qemu_map_cache_init(void);
 uint8_t  *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock);
+void     qemu_map_cache_unlock(void *buffer);
 ram_addr_t qemu_ram_addr_from_mapcache(void *ptr);
 void     qemu_invalidate_entry(uint8_t *buffer);
 void     qemu_invalidate_map_cache(void);
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 09/12] vl.c: Introduce getter for shutdown_requested and reset_requested.
From: Anthony PERARD <anthony.perard@citrix.com>
Introduce two functions, qemu_shutdown_requested_get and
qemu_reset_requested_get, that return the value of shutdown_requested
and reset_requested without resetting it.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 sysemu.h |    2 ++
 vl.c     |   10 ++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)
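
As a rough illustration of the difference between the existing
read-and-clear accessors and the new getters, consider the sketch below.
The toy_* names are hypothetical and the flag is a plain int, as in vl.c;
this is an assumption-laden model, not the QEMU code.

```c
#include <assert.h>

/* Hypothetical stand-in for the static shutdown_requested flag in vl.c. */
static int toy_shutdown_requested;

/* Analogue of qemu_system_shutdown_request(): raise the flag. */
void toy_request_shutdown(void)
{
    toy_shutdown_requested = 1;
}

/* Analogue of qemu_shutdown_requested_get(): peek, leave the flag alone. */
int toy_shutdown_requested_get(void)
{
    return toy_shutdown_requested;
}

/* Analogue of qemu_shutdown_requested(): return the flag and clear it. */
int toy_shutdown_requested_consume(void)
{
    int r = toy_shutdown_requested;

    toy_shutdown_requested = 0;
    return r;
}

/* Usage: a second observer can peek without stealing the request. */
int toy_flag_demo(void)
{
    toy_request_shutdown();
    assert(toy_shutdown_requested_get() == 1);     /* peek */
    assert(toy_shutdown_requested_get() == 1);     /* still pending */
    assert(toy_shutdown_requested_consume() == 1); /* main loop consumes */
    assert(toy_shutdown_requested_get() == 0);     /* now cleared */
    return 0;
}
```

The point of the getter is that code outside the main loop can observe a
pending request while the main loop still sees it afterwards.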
diff --git a/sysemu.h b/sysemu.h
index a1f6466..7facfae 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -51,6 +51,8 @@ void cpu_disable_ticks(void);
 void qemu_system_reset_request(void);
 void qemu_system_shutdown_request(void);
 void qemu_system_powerdown_request(void);
+int qemu_shutdown_requested_get(void);
+int qemu_reset_requested_get(void);
 int qemu_shutdown_requested(void);
 int qemu_reset_requested(void);
 int qemu_powerdown_requested(void);
diff --git a/vl.c b/vl.c
index 6948703..2abb7d0 100644
--- a/vl.c
+++ b/vl.c
@@ -1134,6 +1134,16 @@ static int powerdown_requested;
 int debug_requested;
 int vmstop_requested;
 
+int qemu_shutdown_requested_get(void)
+{
+    return shutdown_requested;
+}
+
+int qemu_reset_requested_get(void)
+{
+    return reset_requested;
+}
+
 int qemu_shutdown_requested(void)
 {
     int r = shutdown_requested;
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 10/12] xen: Initialize event channels and io rings
From: Anthony PERARD <anthony.perard@citrix.com>
Open and bind event channels; map ioreq and buffered ioreq rings.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 hw/xen_common.h |    1 +
 xen-all.c       |  381 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 382 insertions(+), 0 deletions(-)
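
For readers unfamiliar with the buffered ioreq ring, the sketch below
models how a consumer drains it: slots are indexed modulo the ring size,
and an 8-byte request occupies two consecutive slots, with the second
slot carrying the upper 32 bits. ToyRing, ToySlot and TOY_SLOT_NUM are
hypothetical simplifications of buffered_iopage_t, buf_ioreq_t and
IOREQ_BUFFER_SLOT_NUM. The sketch is single-threaded, so it omits the
memory barrier that the real code issues before advancing read_pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOT_NUM 8u /* hypothetical ring size */

typedef struct ToySlot {
    uint8_t  size_log2; /* access size is 1 << size_log2 bytes */
    uint32_t data;      /* 32 bits of payload per slot */
} ToySlot;

typedef struct ToyRing {
    uint32_t read_pointer;  /* advanced by the consumer */
    uint32_t write_pointer; /* advanced by the producer */
    ToySlot  slot[TOY_SLOT_NUM];
} ToyRing;

/* Pop one request. Returns 1 and stores the payload in *out,
 * or returns 0 if the ring is empty. */
static int toy_ring_pop(ToyRing *r, uint64_t *out)
{
    const ToySlot *s;
    int qw;

    assert(r != NULL && out != NULL);
    if (r->read_pointer == r->write_pointer) {
        return 0;
    }

    s = &r->slot[r->read_pointer % TOY_SLOT_NUM];
    *out = s->data;

    /* A quad-word request spills into the following slot. */
    qw = ((1u << s->size_log2) == 8);
    if (qw) {
        s = &r->slot[(r->read_pointer + 1) % TOY_SLOT_NUM];
        *out |= (uint64_t)s->data << 32;
    }

    r->read_pointer += qw ? 2 : 1;
    return 1;
}
```

Because the pointers are free-running counters and only the index is
reduced modulo the ring size, an empty ring is simply the case where
both pointers are equal.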
diff --git a/hw/xen_common.h b/hw/xen_common.h
index dd54063..96cfad7 100644
--- a/hw/xen_common.h
+++ b/hw/xen_common.h
@@ -53,5 +53,6 @@ typedef xc_interface *qemu_xc_interface;
 #endif
 
 qemu_irq *i8259_xen_init(void);
+void destroy_hvm_domain(void);
 
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/xen-all.c b/xen-all.c
index 4e0b061..13672f0 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -8,12 +8,38 @@
 
 #include "config.h"
 
+#include <sys/mman.h>
+
 #include "hw/pci.h"
 #include "hw/xen_common.h"
 #include "hw/xen_backend.h"
 
 #include "xen_mapcache.h"
 
+#include <xen/hvm/ioreq.h>
+
+//#define DEBUG_XEN
+
+#ifdef DEBUG_XEN
+#define DPRINTF(fmt, ...) \
+    do { fprintf(stderr, "xen: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+    do { } while (0)
+#endif
+
+shared_iopage_t *shared_page = NULL;
+#define BUFFER_IO_MAX_DELAY  100
+buffered_iopage_t *buffered_io_page = NULL;
+QEMUTimer *buffered_io_timer;
+/* the evtchn port for polling the notification, */
+evtchn_port_t *ioreq_local_port;
+/* the evtchn fd for polling */
+int xce_handle = -1;
+/* which vcpu we are serving */
+int send_vcpu = 0;
+long time_offset = 0;
+
 /* Xen specific function for piix pci */
 
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
@@ -111,19 +137,374 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size)
 }
 
 
+/* VCPU Operations, MMIO, IO ring ... */
+
+/* get the ioreq packets from share mem */
+static ioreq_t *cpu_get_ioreq_from_shared_memory(int vcpu)
+{
+    ioreq_t *req = &shared_page->vcpu_ioreq[vcpu];
+
+    if (req->state != STATE_IOREQ_READY) {
+        DPRINTF("I/O request not ready: "
+                "%x, ptr: %x, port: %"PRIx64", "
+                "data: %"PRIx64", count: %u, size: %u\n",
+                req->state, req->data_is_ptr, req->addr,
+                req->data, req->count, req->size);
+        return NULL;
+    }
+
+    xen_rmb(); /* see IOREQ_READY /then/ read contents of ioreq */
+
+    req->state = STATE_IOREQ_INPROCESS;
+    return req;
+}
+
+/* use poll to get the port notification */
+/* returns the pending ioreq of the notified vcpu, */
+/* or NULL on read error or when nothing is pending */
+static ioreq_t *cpu_get_ioreq(void)
+{
+    int i;
+    evtchn_port_t port;
+
+    port = xc_evtchn_pending(xce_handle);
+    if (port != -1) {
+        for ( i = 0; i < smp_cpus; i++ )
+            if ( ioreq_local_port[i] == port )
+                break;
+
+        if ( i == smp_cpus ) {
+            hw_error("Fatal error while trying to get io event!\n");
+        }
+
+        /* unmask the wanted port again */
+        xc_evtchn_unmask(xce_handle, port);
+
+        /* get the io packet from shared memory */
+        send_vcpu = i;
+        return cpu_get_ioreq_from_shared_memory(i);
+    }
+
+    /* read error or read nothing */
+    return NULL;
+}
+
+static uint32_t do_inp(CPUState *env, pio_addr_t addr, unsigned long size)
+{
+    switch(size) {
+        case 1:
+            return cpu_inb(addr);
+        case 2:
+            return cpu_inw(addr);
+        case 4:
+            return cpu_inl(addr);
+        default:
+            hw_error("inp: bad size: %04"FMT_pioaddr" %lx", addr, size);
+    }
+}
+
+static void do_outp(CPUState *env, pio_addr_t addr,
+        unsigned long size, uint32_t val)
+{
+    switch(size) {
+        case 1:
+            return cpu_outb(addr, val);
+        case 2:
+            return cpu_outw(addr, val);
+        case 4:
+            return cpu_outl(addr, val);
+        default:
+            hw_error("outp: bad size: %04"FMT_pioaddr" %lx", addr, size);
+    }
+}
+
+static void cpu_ioreq_pio(CPUState *env, ioreq_t *req)
+{
+    int i, sign;
+
+    sign = req->df ? -1 : 1;
+
+    if (req->dir == IOREQ_READ) {
+        if (!req->data_is_ptr) {
+            req->data = do_inp(env, req->addr, req->size);
+        } else {
+            uint32_t tmp;
+
+            for (i = 0; i < req->count; i++) {
+                tmp = do_inp(env, req->addr, req->size);
+                cpu_physical_memory_write(req->data + (sign * i * req->size),
+                        (uint8_t*) &tmp, req->size);
+            }
+        }
+    } else if (req->dir == IOREQ_WRITE) {
+        if (!req->data_is_ptr) {
+            do_outp(env, req->addr, req->size, req->data);
+        } else {
+            for (i = 0; i < req->count; i++) {
+                uint32_t tmp = 0;
+
+                cpu_physical_memory_read(req->data + (sign * i * req->size),
+                        (uint8_t*) &tmp, req->size);
+                do_outp(env, req->addr, req->size, tmp);
+            }
+        }
+    }
+}
+
+static void cpu_ioreq_move(CPUState *env, ioreq_t *req)
+{
+    int i, sign;
+
+    sign = req->df ? -1 : 1;
+
+    if (!req->data_is_ptr) {
+        if (req->dir == IOREQ_READ) {
+            for (i = 0; i < req->count; i++) {
+                cpu_physical_memory_read(req->addr + (sign * i * req->size),
+                        (uint8_t*) &req->data, req->size);
+            }
+        } else if (req->dir == IOREQ_WRITE) {
+            for (i = 0; i < req->count; i++) {
+                cpu_physical_memory_write(req->addr + (sign * i * req->size),
+                        (uint8_t*) &req->data, req->size);
+            }
+        }
+    } else {
+        target_ulong tmp;
+
+        if (req->dir == IOREQ_READ) {
+            for (i = 0; i < req->count; i++) {
+                cpu_physical_memory_read(req->addr + (sign * i * req->size),
+                        (uint8_t*) &tmp, req->size);
+                cpu_physical_memory_write(req->data + (sign * i * req->size),
+                        (uint8_t*) &tmp, req->size);
+            }
+        } else if (req->dir == IOREQ_WRITE) {
+            for (i = 0; i < req->count; i++) {
+                cpu_physical_memory_read(req->data + (sign * i * req->size),
+                        (uint8_t*) &tmp, req->size);
+                cpu_physical_memory_write(req->addr + (sign * i * req->size),
+                        (uint8_t*) &tmp, req->size);
+            }
+        }
+    }
+}
+
+static void cpu_ioreq_timeoffset(CPUState *env, ioreq_t *req)
+{
+    /* char b[64]; */
+
+    time_offset += (unsigned long)req->data;
+
+    //DPRINTF("Time offset set %ld, added offset %"PRId64"\n",
+            //time_offset, req->data);
+    /* snprintf(b, 64, "%ld", time_offset); */
+    /* xenstore_vm_write(xen_domid, "rtc/timeoffset", b); */
+}
+
+static void handle_ioreq(CPUState *env, ioreq_t *req)
+{
+    if (!req->data_is_ptr && (req->dir == IOREQ_WRITE) &&
+            (req->size < sizeof(target_ulong)))
+        req->data &= ((target_ulong)1 << (8 * req->size)) - 1;
+
+    switch (req->type) {
+        case IOREQ_TYPE_PIO:
+            cpu_ioreq_pio(env, req);
+            break;
+        case IOREQ_TYPE_COPY:
+            cpu_ioreq_move(env, req);
+            break;
+        case IOREQ_TYPE_TIMEOFFSET:
+            cpu_ioreq_timeoffset(env, req);
+            break;
+        case IOREQ_TYPE_INVALIDATE:
+            qemu_invalidate_map_cache();
+            break;
+        default:
+            hw_error("Invalid ioreq type 0x%x\n", req->type);
+    }
+}
+
+static void handle_buffered_iopage(CPUState *env)
+{
+    buf_ioreq_t *buf_req = NULL;
+    ioreq_t req;
+    int qw;
+
+    if (!buffered_io_page)
+        return;
+
+    while (buffered_io_page->read_pointer !=
+           buffered_io_page->write_pointer) {
+        buf_req = &buffered_io_page->buf_ioreq[
+            buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM];
+        req.size = 1UL << buf_req->size;
+        req.count = 1;
+        req.addr = buf_req->addr;
+        req.data = buf_req->data;
+        req.state = STATE_IOREQ_READY;
+        req.dir = buf_req->dir;
+        req.df = 1;
+        req.type = buf_req->type;
+        req.data_is_ptr = 0;
+        qw = (req.size == 8);
+        if (qw) {
+            buf_req = &buffered_io_page->buf_ioreq[
+                (buffered_io_page->read_pointer+1) % IOREQ_BUFFER_SLOT_NUM];
+            req.data |= ((uint64_t)buf_req->data) << 32;
+        }
+
+        handle_ioreq(env, &req);
+
+        xen_mb();
+        buffered_io_page->read_pointer += qw ? 2 : 1;
+    }
+}
+
+static void handle_buffered_io(void *opaque)
+{
+    CPUState *env = opaque;
+
+    handle_buffered_iopage(env);
+    qemu_mod_timer(buffered_io_timer, BUFFER_IO_MAX_DELAY +
+                   qemu_get_clock(rt_clock));
+}
+
+static void cpu_handle_ioreq(void *opaque)
+{
+    CPUState *env = opaque;
+    ioreq_t *req = cpu_get_ioreq();
+
+    handle_buffered_iopage(env);
+    if (req) {
+        handle_ioreq(env, req);
+
+        if (req->state != STATE_IOREQ_INPROCESS) {
+            fprintf(stderr, "Badness in I/O request ... not in service?!: "
+                    "%x, ptr: %x, port: %"PRIx64", "
+                    "data: %"PRIx64", count: %u, size: %u\n",
+                    req->state, req->data_is_ptr, req->addr,
+                    req->data, req->count, req->size);
+            destroy_hvm_domain();
+            return;
+        }
+
+        xen_wmb(); /* Update ioreq contents /then/ update state. */
+
+        /*
+         * We do this before we send the response so that the tools
+         * have the opportunity to pick up on the reset before the
+         * guest resumes and does a hlt with interrupts disabled which
+         * causes Xen to powerdown the domain.
+         */
+        if (vm_running) {
+            if (qemu_shutdown_requested_get()) {
+                destroy_hvm_domain();
+            }
+            if (qemu_reset_requested_get()) {
+                qemu_system_reset();
+            }
+        }
+
+        req->state = STATE_IORESP_READY;
+        xc_evtchn_notify(xce_handle, ioreq_local_port[send_vcpu]);
+    }
+}
+
+static void xen_main_loop_prepare(void)
+{
+    CPUState *env = cpu_single_env;
+
+    int evtchn_fd = xce_handle == -1 ? -1 : xc_evtchn_fd(xce_handle);
+
+    buffered_io_timer = qemu_new_timer(rt_clock, handle_buffered_io,
+                                       cpu_single_env);
+    qemu_mod_timer(buffered_io_timer, qemu_get_clock(rt_clock));
+
+    if (evtchn_fd != -1)
+        qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, env);
+}
+
+
 /* Initialise Xen */
 
+static void xen_vm_change_state_handler(void *opaque, int running, int reason)
+{
+    if (running)
+        xen_main_loop_prepare();
+}
+
 int xen_init(int smp_cpus)
 {
+    int i, rc;
+    unsigned long ioreq_pfn;
+
     xen_xc = xc_interface_open(NULL, NULL, 0);
     if (xen_xc == NULL) {
         xen_be_printf(NULL, 0, "can't open xen interface\n");
         return -1;
     }
 
+    xce_handle = xc_evtchn_open();
+    if (xce_handle == -1) {
+        perror("open");
+        return -errno;
+    }
+
+    xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_IOREQ_PFN, &ioreq_pfn);
+    DPRINTF("shared page at pfn %lx\n", ioreq_pfn);
+    shared_page = xc_map_foreign_range(xen_xc, xen_domid, XC_PAGE_SIZE,
+                                       PROT_READ|PROT_WRITE, ioreq_pfn);
+    if (shared_page == NULL) {
+        hw_error("map shared IO page returned error %d handle=%p", errno, xen_xc);
+    }
+
+    xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_PFN, &ioreq_pfn);
+    DPRINTF("buffered io page at pfn %lx\n", ioreq_pfn);
+    buffered_io_page = xc_map_foreign_range(xen_xc, xen_domid, XC_PAGE_SIZE,
+                                            PROT_READ|PROT_WRITE, ioreq_pfn);
+    if (buffered_io_page == NULL) {
+        hw_error("map buffered IO page returned error %d", errno);
+    }
+
+    ioreq_local_port = qemu_mallocz(smp_cpus * sizeof(evtchn_port_t));
+
+    /* FIXME: how about if we overflow the page here? */
+    for (i = 0; i < smp_cpus; i++) {
+        rc = xc_evtchn_bind_interdomain(xce_handle, xen_domid,
+                                        shared_page->vcpu_ioreq[i].vp_eport);
+        if (rc == -1) {
+            fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
+            return -1;
+        }
+        ioreq_local_port[i] = rc;
+    }
+
     /* Init RAM management */
     qemu_map_cache_init();
     xen_ram_init(ram_size);
 
+    qemu_add_vm_change_state_handler(xen_vm_change_state_handler, NULL);
+
     return 0;
 }
+
+void destroy_hvm_domain(void)
+{
+    xc_interface *xc_handle;
+    int sts;
+
+    xc_handle = xc_interface_open(NULL, NULL, 0);
+    if (!xc_handle)
+        fprintf(stderr, "Cannot acquire xenctrl handle\n");
+    else {
+        sts = xc_domain_shutdown(xc_handle, xen_domid, SHUTDOWN_poweroff);
+        if (sts != 0)
+            fprintf(stderr, "? xc_domain_shutdown failed to issue poweroff, "
+                    "sts %d, errno %d\n", sts, errno);
+        else
+            fprintf(stderr, "Issued domain %d poweroff\n", xen_domid);
+        xc_interface_close(xc_handle);
+    }
+}
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 11/12] xen: Set running state in xenstore.
From: Anthony PERARD <anthony.perard@citrix.com>
This tells the Xen management tools that the machine can start running.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
 xen-all.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)
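
The xenstore key written here is derived from the domain id. As a small
sketch of just that path construction (the xs_write call itself needs a
running xenstored and is left out), a bounded-buffer variant could look
like the code below. toy_dm_state_path is a hypothetical helper; the
patch itself uses asprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format the device-model state key for a domain into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
static int toy_dm_state_path(char *buf, size_t len, unsigned domid)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "/local/domain/0/device-model/%u/state", domid);
    if (n < 0 || (size_t)n >= len) {
        return -1; /* encoding error or truncated output */
    }
    return 0;
}
```

For domain 7 this produces /local/domain/0/device-model/7/state, which
is where the string "running" is recorded.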
diff --git a/xen-all.c b/xen-all.c
index 13672f0..6a62ecd 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -412,6 +412,22 @@ static void cpu_handle_ioreq(void *opaque)
     }
 }
 
+static void xenstore_record_dm_state(const char *state)
+{
+    char *path = NULL;
+    struct xs_handle *xenstore = xs_daemon_open();
+
+    if (asprintf(&path, "/local/domain/0/device-model/%u/state", xen_domid) == -1) {
+        fprintf(stderr, "out of memory recording dm state\n");
+        exit(1);
+    }
+    if (!xs_write(xenstore, XBT_NULL, path, state, strlen(state))) {
+        fprintf(stderr, "error recording dm state\n");
+        exit(1);
+    }
+    free(path);
+}
+
 static void xen_main_loop_prepare(void)
 {
     CPUState *env = cpu_single_env;
@@ -424,6 +440,9 @@ static void xen_main_loop_prepare(void)
 
     if (evtchn_fd != -1)
         qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, env);
+
+    /* record state running */
+    xenstore_record_dm_state("running");
 }
 
 
-- 
1.6.5
anthony.perard@citrix.com
2010-Sep-17  11:15 UTC
[Xen-devel] [PATCH RFC V3 12/12] xen: Add a Xen specific ACPI Implementation to target-xen
From: Anthony PERARD <anthony.perard@citrix.com>
Xen currently uses a different BIOS (hvmloader + rombios), so the
QEMU acpi_piix4 implementation wouldn't work correctly with Xen.
We plan to fix this properly, but for now we are just adding a
new Xen-specific acpi_piix4 implementation.
This patch is optional; without it the VM boots, but it cannot shut down
properly or go to S3.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 Makefile.target     |    1 +
 hw/xen_acpi_piix4.c |  405 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/xen_common.h     |    3 +
 hw/xen_machine_fv.c |    6 +-
 4 files changed, 410 insertions(+), 5 deletions(-)
 create mode 100644 hw/xen_acpi_piix4.c
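
The shutdown and S3 behaviour mentioned above hinges on decoding guest
writes to the high byte of the PM1 control register. The sketch below
models that decode in isolation. The SLP_TYP values are the ones used in
this patch; the positions of SLP_EN (bit 13) and SLP_TYP (bits 10-12)
are assumed from the ACPI PM1 control register layout, and every TOY_*
and toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed PM1 control layout: SLP_TYP in bits 10-12, SLP_EN in bit 13. */
#define TOY_SLP_EN        (1u << 13)
#define TOY_SLP_TYP_MASK  (7u << 10)
#define TOY_SLP_TYP_S3    (5u << 10) /* values as defined in the patch */
#define TOY_SLP_TYP_S4    (6u << 10)
#define TOY_SLP_TYP_S5    (7u << 10)

typedef enum { TOY_NONE, TOY_SUSPEND_S3, TOY_POWEROFF } ToyAction;

/* Model a byte write to PM1 control offset +1 and report what it triggers. */
static ToyAction toy_pm1_high_byte_write(uint16_t *pm1_control, uint8_t byte)
{
    uint32_t val = (uint32_t)byte << 8;

    assert(pm1_control != NULL);

    /* SLP_EN is write-only: it triggers the transition but is not latched. */
    *pm1_control = (uint16_t)(((*pm1_control & 0xffu) | val) & ~TOY_SLP_EN);

    if (!(val & TOY_SLP_EN)) {
        return TOY_NONE;
    }
    switch (val & TOY_SLP_TYP_MASK) {
    case TOY_SLP_TYP_S3:
        return TOY_SUSPEND_S3;
    case TOY_SLP_TYP_S4:
    case TOY_SLP_TYP_S5:
        return TOY_POWEROFF;
    default:
        return TOY_NONE;
    }
}
```

With these assumptions, a guest writing 0x3c to the high byte requests
S5 with SLP_EN set, which maps to a power-off, while 0x34 requests S3.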
diff --git a/Makefile.target b/Makefile.target
index ea14393..db7f96b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -189,6 +189,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
 # xen full virtualized machine
 obj-$(CONFIG_XEN) += xen_machine_fv.o
 obj-$(CONFIG_XEN) += xen_platform.o
+obj-$(CONFIG_XEN) += xen_acpi_piix4.o
 
 # USB layer
 obj-$(CONFIG_USB_OHCI) += usb-ohci.o
diff --git a/hw/xen_acpi_piix4.c b/hw/xen_acpi_piix4.c
new file mode 100644
index 0000000..f4792f2
--- /dev/null
+++ b/hw/xen_acpi_piix4.c
@@ -0,0 +1,405 @@
+ /*
+ * PIIX4 ACPI controller emulation
+ *
+ * Winston liwen Wang, winston.l.wang@intel.com
+ * Copyright (c) 2006 , Intel Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "hw.h"
+#include "pc.h"
+#include "pci.h"
+#include "sysemu.h"
+#include "acpi.h"
+
+#include "xen_backend.h"
+#include "xen_common.h"
+#include "qemu-log.h"
+
+#include <xen/hvm/ioreq.h>
+#include <xen/hvm/params.h>
+
+#define PIIX4ACPI_LOG_ERROR 0
+#define PIIX4ACPI_LOG_INFO 1
+#define PIIX4ACPI_LOG_DEBUG 2
+#define PIIX4ACPI_LOGLEVEL PIIX4ACPI_LOG_INFO
+#define PIIX4ACPI_LOG(level, fmt, ...) do { if (level <= PIIX4ACPI_LOGLEVEL) qemu_log(fmt, ## __VA_ARGS__); } while (0)
+
+/* Sleep state type codes as defined by the \_Sx objects in the DSDT. */
+/* These must be kept in sync with the DSDT (hvmloader/acpi/dsdt.asl) */
+#define SLP_TYP_S4        (6 << 10)
+#define SLP_TYP_S3        (5 << 10)
+#define SLP_TYP_S5        (7 << 10)
+
+#define ACPI_DBG_IO_ADDR  0xb044
+#define ACPI_PHP_IO_ADDR  0x10c0
+
+#define PHP_EVT_ADD     0x0
+#define PHP_EVT_REMOVE  0x3
+
+/* The bit in GPE0_STS/EN to notify the pci hotplug event */
+#define ACPI_PHP_GPE_BIT 3
+
+#define DEVFN_TO_PHP_SLOT_REG(devfn) (devfn >> 1)
+#define PHP_SLOT_REG_TO_DEVFN(reg, hilo) ((reg << 1) | hilo)
+
+/* ioport to monitor cpu add/remove status */
+#define PROC_BASE 0xaf00
+
+typedef struct PCIAcpiState {
+    PCIDevice dev;
+    uint16_t pm1_control; /* pm1a_ECNT_BLK */
+    qemu_irq irq;
+    qemu_irq cmos_s3;
+} PCIAcpiState;
+
+typedef struct GPEState {
+    /* GPE0 block */
+    uint8_t gpe0_sts[ACPI_GPE0_BLK_LEN / 2];
+    uint8_t gpe0_en[ACPI_GPE0_BLK_LEN / 2];
+
+    /* CPU bitmap */
+    uint8_t cpus_sts[32];
+
+    /* SCI IRQ level */
+    uint8_t sci_asserted;
+
+} GPEState;
+
+static GPEState gpe_state;
+
+static qemu_irq sci_irq;
+
+typedef struct AcpiDeviceState AcpiDeviceState;
+AcpiDeviceState *acpi_device_table;
+
+static const VMStateDescription vmstate_acpi = {
+    .name = "PIIX4 ACPI",
+    .version_id = 1,
+    .fields      = (VMStateField []) {
+        VMSTATE_PCI_DEVICE(dev, PCIAcpiState),
+        VMSTATE_UINT16(pm1_control, PCIAcpiState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void acpiPm1Control_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    PCIAcpiState *s = opaque;
+    s->pm1_control = (s->pm1_control & 0xff00) | (val & 0xff);
+}
+
+static uint32_t acpiPm1Control_readb(void *opaque, uint32_t addr)
+{
+    PCIAcpiState *s = opaque;
+    /* Mask out the write-only bits */
+    return (uint8_t)(s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE));
+}
+
+static void acpi_shutdown(PCIAcpiState *s, uint32_t val)
+{
+    if (!(val & ACPI_BITMASK_SLEEP_ENABLE))
+        return;
+
+    switch (val & ACPI_BITMASK_SLEEP_TYPE) {
+    case SLP_TYP_S3:
+        qemu_system_reset();
+        qemu_irq_raise(s->cmos_s3);
+        xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 3);
+        break;
+    case SLP_TYP_S4:
+    case SLP_TYP_S5:
+        qemu_system_shutdown_request();
+        break;
+    default:
+        break;
+    }
+}
+
+static void acpiPm1ControlP1_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    PCIAcpiState *s = opaque;
+
+    val <<= 8;
+    s->pm1_control = ((s->pm1_control & 0xff) | val) & ~ACPI_BITMASK_SLEEP_ENABLE;
+
+    acpi_shutdown(s, val);
+}
+
+static uint32_t acpiPm1ControlP1_readb(void *opaque, uint32_t addr)
+{
+    PCIAcpiState *s = opaque;
+    /* Mask out the write-only bits */
+    return (uint8_t)((s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE)) >> 8);
+}
+
+static void acpiPm1Control_writew(void *opaque, uint32_t addr, uint32_t val)
+{
+    PCIAcpiState *s = opaque;
+
+    s->pm1_control = val & ~ACPI_BITMASK_SLEEP_ENABLE;
+
+    acpi_shutdown(s, val);
+}
+
+static uint32_t acpiPm1Control_readw(void *opaque, uint32_t addr)
+{
+    PCIAcpiState *s = opaque;
+    /* Mask out the write-only bits */
+    return (s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE));
+}
+
+static void acpi_map(PCIDevice *pci_dev, int region_num,
+                     uint32_t addr, uint32_t size, int type)
+{
+    PCIAcpiState *d = (PCIAcpiState *)pci_dev;
+
+    /* Byte access */
+    register_ioport_write(addr + 4, 1, 1, acpiPm1Control_writeb, d);
+    register_ioport_read(addr + 4, 1, 1, acpiPm1Control_readb, d);
+    register_ioport_write(addr + 4 + 1, 1, 1, acpiPm1ControlP1_writeb, d);
+    register_ioport_read(addr + 4 +1, 1, 1, acpiPm1ControlP1_readb, d);
+
+    /* Word access */
+    register_ioport_write(addr + 4, 2, 2, acpiPm1Control_writew, d);
+    register_ioport_read(addr + 4, 2, 2, acpiPm1Control_readw, d);
+}
+
+static inline int test_bit(uint8_t *map, int bit)
+{
+    return ( map[bit / 8] & (1 << (bit % 8)) );
+}
+
+static inline void set_bit(uint8_t *map, int bit)
+{
+    map[bit / 8] |= (1 << (bit % 8));
+}
+
+static inline void clear_bit(uint8_t *map, int bit)
+{
+    map[bit / 8] &= ~(1 << (bit % 8));
+}
+
+static void acpi_dbg_writel(void *opaque, uint32_t addr, uint32_t val)
+{
+    PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "ACPI: DBG: 0x%08x\n", val);
+    PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "ACPI:debug: write addr=0x%x, val=0x%x.\n", addr, val);
+}
+
+/* GPEx_STS occupy 1st half of the block, while GPEx_EN 2nd half */
+static uint32_t gpe_sts_read(void *opaque, uint32_t addr)
+{
+    GPEState *s = opaque;
+
+    return s->gpe0_sts[addr - ACPI_GPE0_BLK_ADDRESS];
+}
+
+/* write 1 to clear specific GPE bits */
+static void gpe_sts_write(void *opaque, uint32_t addr, uint32_t val)
+{
+    GPEState *s = opaque;
+    int hotplugged = 0;
+
+    PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "gpe_sts_write: addr=0x%x, val=0x%x.\n", addr, val);
+
+    hotplugged = test_bit(&s->gpe0_sts[0], ACPI_PHP_GPE_BIT);
+    s->gpe0_sts[addr - ACPI_GPE0_BLK_ADDRESS] &= ~val;
+    if ( s->sci_asserted &&
+         hotplugged &&
+         !test_bit(&s->gpe0_sts[0], ACPI_PHP_GPE_BIT)) {
+        PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "Clear the GPE0_STS bit for ACPI hotplug & deassert the IRQ.\n");
+        qemu_irq_lower(sci_irq);
+    }
+
+}
+
+static uint32_t gpe_en_read(void *opaque, uint32_t addr)
+{
+    GPEState *s = opaque;
+
+    return s->gpe0_en[addr - (ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2)];
+}
+
+/* write 0 to clear en bit */
+static void gpe_en_write(void *opaque, uint32_t addr, uint32_t val)
+{
+    GPEState *s = opaque;
+    int reg_count;
+
+    PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "gpe_en_write: addr=0x%x, val=0x%x.\n", addr, val);
+    reg_count = addr - (ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2);
+    s->gpe0_en[reg_count] = val;
+    /* If disable GPE bit right after generating SCI on it,
+     * need deassert the intr to avoid redundant intrs
+     */
+    if ( s->sci_asserted &&
+         reg_count == (ACPI_PHP_GPE_BIT / 8) &&
+         !(val & (1 << (ACPI_PHP_GPE_BIT % 8))) ) {
+        PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "deassert due to disable GPE bit.\n");
+        s->sci_asserted = 0;
+        qemu_irq_lower(sci_irq);
+    }
+
+}
+
+static const VMStateDescription vmstate_gpe = {
+    .name = "gpe",
+    .version_id = 2,
+    .minimum_version_id = 2,
+    .minimum_version_id_old = 2,
+    .fields = (VMStateField []) {
+        VMSTATE_BUFFER(gpe0_sts, GPEState),
+        VMSTATE_BUFFER(gpe0_en, GPEState),
+        VMSTATE_UINT8(sci_asserted, GPEState),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static uint32_t gpe_cpus_readb(void *opaque, uint32_t addr)
+{
+    uint32_t val = 0;
+    GPEState *g = opaque;
+
+    switch (addr) {
+        case PROC_BASE ... PROC_BASE+31:
+            val = g->cpus_sts[addr - PROC_BASE];
+        default:
+            break;
+    }
+
+    return val;
+}
+
+static void gpe_cpus_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+    /* GPEState *g = opaque; */
+
+    switch (addr) {
+        case PROC_BASE ... PROC_BASE + 31:
+            /* don't allow to change cpus_sts from inside a guest */
+            break;
+        default:
+            break;
+    }
+}
+
+static void gpe_acpi_init(void)
+{
+    GPEState *s = &gpe_state;
+    memset(s, 0, sizeof(GPEState));
+
+    s->cpus_sts[0] = 1;
+
+    register_ioport_read(PROC_BASE, 32, 1,  gpe_cpus_readb, s);
+    register_ioport_write(PROC_BASE, 32, 1, gpe_cpus_writeb, s);
+
+    register_ioport_read(ACPI_GPE0_BLK_ADDRESS,
+                         ACPI_GPE0_BLK_LEN / 2,
+                         1,
+                         gpe_sts_read,
+                         s);
+    register_ioport_read(ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2,
+                         ACPI_GPE0_BLK_LEN / 2,
+                         1,
+                         gpe_en_read,
+                         s);
+
+    register_ioport_write(ACPI_GPE0_BLK_ADDRESS,
+                          ACPI_GPE0_BLK_LEN / 2,
+                          1,
+                          gpe_sts_write,
+                          s);
+    register_ioport_write(ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2,
+                          ACPI_GPE0_BLK_LEN / 2,
+                          1,
+                          gpe_en_write,
+                          s);
+
+    vmstate_register(NULL, 0, &vmstate_gpe, s);
+}
+
+static int piix4_pm_xen_initfn(PCIDevice *dev)
+{
+    PCIAcpiState *s = DO_UPCAST(PCIAcpiState, dev, dev);
+    uint8_t *pci_conf;
+
+    pci_conf = s->dev.config;
+    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
+    pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_INTEL_82371AB_3);
+    pci_conf[0x08] = 0x01;  /* B0 stepping */
+    pci_conf[0x09] = 0x00;  /* base class */
+    pci_config_set_class(pci_conf, PCI_CLASS_BRIDGE_OTHER);
+    pci_conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL; /* header_type */
+    pci_conf[0x3d] = 0x01;  /* Hardwired to PIRQA is used */
+
+    /* PMBA POWER MANAGEMENT BASE ADDRESS, hardcoded to 0x1f40
+     * to make shutdown work for IPF, due to IPF Guest Firmware
+     * will enumerate pci devices.
+     *
+     * TODO:  if Guest Firmware or Guest OS will change this PMBA,
+     * More logic will be added.
+     */
+    pci_conf[0x40] = 0x41; /* Special device-specific BAR at 0x40 */
+    pci_conf[0x41] = 0x1f;
+    pci_conf[0x42] = 0x00;
+    pci_conf[0x43] = 0x00;
+
+    s->pm1_control = ACPI_BITMASK_SCI_ENABLE;
+
+    acpi_map((PCIDevice *)s, 0, 0x1f40, 0x10, PCI_BASE_ADDRESS_SPACE_IO);
+
+    gpe_acpi_init();
+
+    register_ioport_write(ACPI_DBG_IO_ADDR, 4, 4, acpi_dbg_writel, s);
+
+    return 0;
+}
+
+void piix4_pm_xen_init(PCIBus *bus, int devfn, qemu_irq sci_irq_spec, qemu_irq cmos_s3)
+{
+    PCIDevice *dev;
+    PCIAcpiState *s;
+
+    sci_irq = sci_irq_spec;
+
+    dev = pci_create(bus, devfn, "PIIX4 ACPI");
+
+    s = DO_UPCAST(PCIAcpiState, dev, dev);
+
+    s->irq = sci_irq_spec;
+    s->cmos_s3 = cmos_s3;
+
+    qdev_init_nofail(&dev->qdev);
+}
+
+static PCIDeviceInfo piix4_pm_xen_info = {
+    .qdev.name    = "PIIX4 ACPI",
+    .qdev.desc    = "dm",
+    .qdev.size    = sizeof(PCIAcpiState),
+    .qdev.vmsd    = &vmstate_acpi,
+    .init         = piix4_pm_xen_initfn,
+};
+
+static void piix4_pm_xen_register(void)
+{
+    pci_qdev_register(&piix4_pm_xen_info);
+}
+
+device_init(piix4_pm_xen_register);
diff --git a/hw/xen_common.h b/hw/xen_common.h
index 96cfad7..9dfca8f 100644
--- a/hw/xen_common.h
+++ b/hw/xen_common.h
@@ -55,4 +55,7 @@ typedef xc_interface *qemu_xc_interface;
 qemu_irq *i8259_xen_init(void);
 void destroy_hvm_domain(void);
 
+/* hw/xen_acpi_piix4.c */
+void piix4_pm_xen_init(PCIBus *bus, int devfn, qemu_irq sci_irq_spec, qemu_irq cmos_s3);
+
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
index dfdff55..30a356f 100644
--- a/hw/xen_machine_fv.c
+++ b/hw/xen_machine_fv.c
@@ -54,7 +54,6 @@ static void xen_init_fv(ram_addr_t ram_size,
     qemu_irq *isa_irq;
     qemu_irq *i8259;
     qemu_irq *cmos_s3;
-    qemu_irq *smi_irq;
     IsaIrqState *isa_irq_state;
     DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
     FDCtrl *floppy_controller;
@@ -131,10 +130,7 @@ static void xen_init_fv(ram_addr_t ram_size,
 
     if (acpi_enabled) {
         cmos_s3 = qemu_allocate_irqs(pc_cmos_set_s3_resume, rtc_state, 1);
-        smi_irq = qemu_allocate_irqs(pc_acpi_smi_interrupt, first_cpu, 1);
-        piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
-                isa_reserve_irq(9), *cmos_s3, *smi_irq,
-                0);
+        piix4_pm_xen_init(pci_bus, piix3_devfn + 3, isa_reserve_irq(9), *cmos_s3);
     }
 
     if (i440fx_state) {
-- 
1.6.5
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
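Editorial note: the GPE0 status register in the patch above uses write-1-to-clear semantics, while the enable register is a plain read/write register. The following is a minimal standalone sketch of that behaviour, not the patch's code. The struct and function names are hypothetical, and unlike the patch it deasserts the SCI whenever no enabled status bit remains, rather than checking only the hotplug bit.

```c
#include <stdint.h>

/* Hypothetical stand-in for the GPE0 status/enable pair in the patch. */
struct gpe_regs {
    uint8_t sts;       /* status bits, write-1-to-clear */
    uint8_t en;        /* enable bits, plain read/write */
    int sci_asserted;  /* nonzero while the SCI line is raised */
};

/* Raise an event: latch the status bit, assert SCI if that bit is enabled. */
static void gpe_raise(struct gpe_regs *g, int bit)
{
    g->sts |= (uint8_t)(1u << bit);
    if (g->en & (1u << bit)) {
        g->sci_asserted = 1;
    }
}

/* Guest write to the status register: every 1 bit in val clears that bit. */
static void gpe_sts_guest_write(struct gpe_regs *g, uint8_t val)
{
    g->sts &= (uint8_t)~val;
    /* Assumption of this sketch: drop SCI once no enabled event is pending. */
    if (g->sci_asserted && !(g->sts & g->en)) {
        g->sci_asserted = 0;
    }
}
```

With bit 3 enabled (the value of ACPI_PHP_GPE_BIT in the patch), raising bit 3 sets sts to 0x08 and asserts the SCI; writing 0x01 changes nothing, and writing 0x08 clears the bit and drops the SCI.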
Alexander Graf
2010-Sep-17  11:41 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 03/12] xen: Introduce --enable-xen command options.
On 17.09.2010, at 13:14, Anthony.Perard@citrix.com wrote:

> From: Anthony PERARD <anthony.perard@citrix.com>
>
> This options will check if the target is build with Xen support.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  Makefile.target |    3 +++
>  hw/xen.h        |   10 ++++++++++
>  qemu-options.hx |    9 +++++++++
>  vl.c            |   16 ++++++++++++++++
>  xen-all.c       |   25 +++++++++++++++++++++++++
>  xen-stub.c      |   17 +++++++++++++++++
>  6 files changed, 80 insertions(+), 0 deletions(-)
>  create mode 100644 xen-all.c
>  create mode 100644 xen-stub.c
>
> diff --git a/Makefile.target b/Makefile.target
> index f112e66..1984f58 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -2,6 +2,7 @@
>
>  GENERATED_HEADERS = config-target.h
>  CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y)
> +CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
>
>  include ../config-host.mak
>  include config-devices.mak
> @@ -182,6 +183,8 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
>
>  # xen backend driver support
>  obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
> +obj-$(CONFIG_XEN) += xen-all.o
> +obj-$(CONFIG_NO_XEN) += xen-stub.o
>
>  # xen full virtualized machine
>  obj-$(CONFIG_XEN) += xen_machine_fv.o
> diff --git a/hw/xen.h b/hw/xen.h
> index 780dcf7..14bbb6e 100644
> --- a/hw/xen.h
> +++ b/hw/xen.h
> @@ -18,4 +18,14 @@ enum xen_mode {
>  extern uint32_t xen_domid;
>  extern enum xen_mode xen_mode;
>
> +extern int xen_allowed;
> +
> +#if defined CONFIG_XEN
> +#define xen_enabled() (xen_allowed)
> +#else
> +#define xen_enabled() (0)
> +#endif
> +
> +int xen_init(int smp_cpus);
> +
>  #endif /* QEMU_HW_XEN_H */
> diff --git a/qemu-options.hx b/qemu-options.hx
> index a0b5ae9..457ca32 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1904,6 +1904,15 @@ Enable KVM full virtualization support. This option is only available
>  if KVM support is enabled when compiling.
>  ETEXI
>
> +DEF("enable-xen", 0, QEMU_OPTION_enable_xen, \
> +    "-enable-xen enable Xen full virtualization support\n", QEMU_ARCH_ALL)

This is probably a good point in time to switch to something a bit more
sophisticated. I was thinking of

  qemu -accel xen,kvm,tcg

which would first try to enable xen support, then kvm support and fall back
to tcg if none is available. The default would be pretty much the line above.

That way we could finally get rid of all those -enable-kvm and
-enable-whatever switches. We would still need to keep backwards compat for
-enable-kvm by mapping it to "-accel kvm" internally. But in the long run an
-accel parameter just makes so much more sense.

Alex
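Editorial note: the ordered-fallback behaviour proposed for "-accel xen,kvm,tcg" can be sketched as a walk over a comma-separated list that stops at the first accelerator whose init succeeds. This is only an illustration under stated assumptions: struct accel, accel_select and the two demo init functions are hypothetical names, not QEMU code, and the thread does not specify how the option would actually be parsed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical init hook: returns 0 on success, negative if unavailable. */
typedef int (*accel_init_fn)(void);

struct accel {
    const char *name;
    accel_init_fn init;
};

/* Demo hooks standing in for real accelerator probes (assumptions). */
static int demo_init_unavailable(void) { return -1; }
static int demo_init_ok(void) { return 0; }

/*
 * Walk a comma-separated list such as "xen,kvm,tcg" in order and return
 * the name of the first accelerator whose init succeeds, or NULL if none
 * does. Unknown or empty names are skipped.
 */
static const char *accel_select(const char *list,
                                const struct accel *table, size_t n)
{
    const char *p = list;

    while (*p) {
        size_t len = strcspn(p, ",");

        for (size_t i = 0; i < n; i++) {
            if (strlen(table[i].name) == len &&
                strncmp(table[i].name, p, len) == 0 &&
                table[i].init() == 0) {
                return table[i].name;
            }
        }
        p += len;
        if (*p == ',') {
            p++;
        }
    }
    return NULL;
}
```

Under this scheme the backwards-compatible "-enable-kvm" switch would amount to calling accel_select with the one-element list "kvm".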
Blue Swirl
2010-Sep-17  18:06 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 04/12] xen: Add the Xen platform pci device
On Fri, Sep 17, 2010 at 11:14 AM, <anthony.perard@citrix.com> wrote:

> From: Anthony PERARD <anthony.perard@citrix.com>
>
> Introduce a new emulated PCI device, specific to fully virtualized Xen
> guests. The device is necessary for PV on HVM drivers to work.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  Makefile.target     |    1 +
>  hw/hw.h             |    3 +
>  hw/pci_ids.h        |    2 +
>  hw/xen_machine_fv.c |    3 +
>  hw/xen_platform.c   |  455 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/xen_platform.h   |    8 +
>  6 files changed, 472 insertions(+), 0 deletions(-)
>  create mode 100644 hw/xen_platform.c
>  create mode 100644 hw/xen_platform.h
>
> diff --git a/Makefile.target b/Makefile.target
> index 1984f58..6b390e6 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -188,6 +188,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
>
>  # xen full virtualized machine
>  obj-$(CONFIG_XEN) += xen_machine_fv.o
> +obj-$(CONFIG_XEN) += xen_platform.o
>
>  # USB layer
>  obj-$(CONFIG_USB_OHCI) += usb-ohci.o
> diff --git a/hw/hw.h b/hw/hw.h
> index 4405092..67f3369 100644
> --- a/hw/hw.h
> +++ b/hw/hw.h
> @@ -653,6 +653,9 @@ extern const VMStateDescription vmstate_i2c_slave;
>  #define VMSTATE_INT32_LE(_f, _s) \
>      VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t)
>
> +#define VMSTATE_UINT8_TEST(_f, _s, _t) \
> +    VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t)
> +
>  #define VMSTATE_UINT16_TEST(_f, _s, _t) \
>      VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t)
>
> diff --git a/hw/pci_ids.h b/hw/pci_ids.h
> index 39e9f1d..1f2e0dd 100644
> --- a/hw/pci_ids.h
> +++ b/hw/pci_ids.h
> @@ -105,3 +105,5 @@
>  #define PCI_DEVICE_ID_INTEL_82371AB 0x7111
>  #define PCI_DEVICE_ID_INTEL_82371AB_2 0x7112
>  #define PCI_DEVICE_ID_INTEL_82371AB_3 0x7113
> +
> +#define PCI_VENDOR_ID_XENSOURCE 0x5853
> diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
> index 03683c7..65fd44a 100644
> --- a/hw/xen_machine_fv.c
> +++ b/hw/xen_machine_fv.c
> @@ -34,6 +34,7 @@
>  #include "blockdev.h"
>
>  #include "xen/hvm/hvm_info_table.h"
> +#include "xen_platform.h"
>
>  #define MAX_IDE_BUS 2
>
> @@ -87,6 +88,8 @@ static void xen_init_fv(ram_addr_t ram_size,
>
>      pc_vga_init(pci_bus);
>
> +    pci_xen_platform_init(pci_bus);
> +
>      /* init basic PC hardware */
>      pc_basic_device_init(isa_irq, &floppy_controller, &rtc_state);
>
> diff --git a/hw/xen_platform.c b/hw/xen_platform.c
> new file mode 100644
> index 0000000..15b490a
> --- /dev/null
> +++ b/hw/xen_platform.c
> @@ -0,0 +1,455 @@
> +/*
> + * XEN platform pci device, formerly known as the event channel device
> + *
> + * Copyright (c) 2003-2004 Intel Corp.
> + * Copyright (c) 2006 XenSource
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "hw.h"
> +#include "pc.h"
> +#include "pci.h"
> +#include "irq.h"
> +#include "xen_common.h"
> +#include "net.h"
> +#include "xen_platform.h"
> +#include "xen_backend.h"
> +#include "qemu-log.h"
> +
> +#include <assert.h>
> +#include <xenguest.h>
> +
> +//#define PLATFORM_DEBUG
> +
> +#ifdef PLATFORM_DEBUG
> +#define DPRINTF(fmt, ...) do { \
> +    fprintf(stderr, "xen_platform: " fmt, ## __VA_ARGS__); \
> +} while (0)
> +#else
> +#define DPRINTF(fmt, ...) do { } while (0)
> +#endif
> +
> +#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */
> +
> +typedef struct PCIXenPlatformState {
> +    PCIDevice pci_dev;
> +    uint8_t flags; /* used only for version_id == 2 */
> +    int drivers_blacklisted;
> +    uint16_t driver_product_version;
> +
> +    /* Log from guest drivers */
> +    int throttling_disabled;
> +    char log_buffer[4096];
> +    int log_buffer_off;
> +} PCIXenPlatformState;
> +
> +#define XEN_PLATFORM_IOPORT 0x10
> +
> +/* We throttle access to dom0 syslog, to avoid DOS attacks. This is
> +   modelled as a token bucket, with one token for every byte of log.
> +   The bucket size is 128KB (->1024 lines of 128 bytes each) and
> +   refills at 256B/s. It starts full. The guest is blocked if no
> +   tokens are available when it tries to generate a log message. */
> +#define BUCKET_MAX_SIZE (128*1024)
> +#define BUCKET_FILL_RATE 256
> +
> +static void throttle(PCIXenPlatformState *s, unsigned count)
> +{
> +    static unsigned available;
> +    static struct timespec last_refil;

last_refill

> +    static int started;
> +    static int warned;
> +
> +    struct timespec waiting_for, now;
> +    double delay;
> +    struct timespec ts;
> +
> +    if (s->throttling_disabled)
> +        return;

Braces should be added here and other places.

> +
> +    if (!started) {
> +        clock_gettime(CLOCK_MONOTONIC, &last_refil);
> +        available = BUCKET_MAX_SIZE;
> +        started = 1;
> +    }
> +
> +    if (count > BUCKET_MAX_SIZE) {
> +        DPRINTF("tried to get %d tokens, but bucket size is %d\n",

count is unsigned, so %u.

> +                BUCKET_MAX_SIZE, count);
> +        exit(1);
> +    }
> +
> +    if (available < count) {
> +        /* The bucket is empty. Refil it */
> +
> +        /* When will it be full enough to handle this request? */
> +        delay = (double)(count - available) / BUCKET_FILL_RATE;
> +        waiting_for = last_refil;
> +        waiting_for.tv_sec += delay;
> +        waiting_for.tv_nsec += (delay - (int)delay) * 1e9;
> +        if (waiting_for.tv_nsec >= 1000000000) {
> +            waiting_for.tv_nsec -= 1000000000;
> +            waiting_for.tv_sec++;
> +        }
> +
> +        /* How long do we have to wait? (might be negative) */
> +        clock_gettime(CLOCK_MONOTONIC, &now);
> +        ts.tv_sec = waiting_for.tv_sec - now.tv_sec;
> +        ts.tv_nsec = waiting_for.tv_nsec - now.tv_nsec;
> +        if (ts.tv_nsec < 0) {
> +            ts.tv_sec--;
> +            ts.tv_nsec += 1000000000;
> +        }
> +
> +        /* Wait for it. */
> +        if (ts.tv_sec > 0 ||
> +            (ts.tv_sec == 0 && ts.tv_nsec > 0)) {
> +            if (!warned) {
> +                DPRINTF("throttling guest access to syslog");
> +                warned = 1;
> +            }
> +            while (nanosleep(&ts, &ts) < 0 && errno == EINTR)
> +                ;

braces

> +        }
> +
> +        /* Refil */

Refill

> +        clock_gettime(CLOCK_MONOTONIC, &now);
> +        delay = (now.tv_sec - last_refil.tv_sec) +
> +            (now.tv_nsec - last_refil.tv_nsec) * 1.0e-9;
> +        available += BUCKET_FILL_RATE * delay;

We have muldiv64(), perhaps it could be used here?

> +        if (available > BUCKET_MAX_SIZE)
> +            available = BUCKET_MAX_SIZE;
> +        last_refil = now;
> +    }
> +
> +    assert(available >= count);

Is it possible to trigger this from the guest?

> +
> +    available -= count;
> +}
> +
> +/* Xen Platform, Fixed IOPort */
> +
> +static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    PCIXenPlatformState *s = opaque;
> +
> +    switch (addr - XEN_PLATFORM_IOPORT) {
> +    case 0:
> +        /* TODO: */
> +        /* Unplug devices. Value is a bitmask of which devices to
> +           unplug, with bit 0 the IDE devices, bit 1 the network
> +           devices, and bit 2 the non-primary-master IDE devices. */
> +        break;
> +    case 2:
> +        switch (val) {
> +        case 1:
> +            DPRINTF("Citrix Windows PV drivers loaded in guest\n");
> +            break;
> +        case 0:
> +            DPRINTF("Guest claimed to be running PV product 0?\n");
> +            break;
> +        default:
> +            DPRINTF("Unknown PV product %d loaded in guest\n", val);
> +            break;
> +        }
> +        s->driver_product_version = val;
> +        break;
> +    }
> +}
> +
> +static void platform_fixed_ioport_writel(void *opaque, uint32_t addr,
> +                                         uint32_t val)
> +{
> +    switch (addr - XEN_PLATFORM_IOPORT) {
> +    case 0:
> +        /* PV driver version */
> +        break;
> +    }
> +}
> +
> +static void platform_fixed_ioport_writeb(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    PCIXenPlatformState *s = opaque;
> +
> +    switch (addr - XEN_PLATFORM_IOPORT) {
> +    case 0: /* Platform flags */ {
> +        hvmmem_type_t mem_type = (val & PFFLAG_ROM_LOCK) ?
> +            HVMMEM_ram_ro : HVMMEM_ram_rw;
> +        if (xc_hvm_set_mem_type(xen_xc, xen_domid, mem_type, 0xc0, 0x40))
> +            DPRINTF("unable to change ro/rw state of ROM memory area!\n");

braces

> +        else {
> +            s->flags = val & PFFLAG_ROM_LOCK;
> +            DPRINTF("changed ro/rw state of ROM memory area. now is %s state.\n",
> +                    (mem_type == HVMMEM_ram_ro ? "ro":"rw"));
> +        }
> +        break;
> +    }
> +    case 2:
> +        /* Send bytes to syslog */
> +        if (val == '\n' || s->log_buffer_off == sizeof(s->log_buffer) - 1) {
> +            /* Flush buffer */
> +            s->log_buffer[s->log_buffer_off] = 0;
> +            throttle(s, s->log_buffer_off);
> +            DPRINTF("%s\n", s->log_buffer);
> +            s->log_buffer_off = 0;
> +            break;
> +        }
> +        s->log_buffer[s->log_buffer_off++] = val;
> +        break;
> +    }
> +}
> +
> +static uint32_t platform_fixed_ioport_readw(void *opaque, uint32_t addr)
> +{
> +    PCIXenPlatformState *s = opaque;
> +
> +    switch (addr - XEN_PLATFORM_IOPORT) {
> +    case 0:
> +        if (s->drivers_blacklisted) {
> +            /* The drivers will recognise this magic number and refuse
> +             * to do anything. */
> +            return 0xd249;
> +        } else {
> +            /* Magic value so that you can identify the interface. */
> +            return 0x49d2;
> +        }
> +    default:
> +        return 0xffff;
> +    }
> +}
> +
> +static uint32_t platform_fixed_ioport_readb(void *opaque, uint32_t addr)
> +{
> +    PCIXenPlatformState *s = opaque;
> +
> +    switch (addr - XEN_PLATFORM_IOPORT) {
> +    case 0:
> +        /* Platform flags */
> +        return s->flags;
> +    case 2:
> +        /* Version number */
> +        return 1;
> +    default:
> +        return 0xff;
> +    }
> +}
> +
> +static void platform_fixed_ioport_reset(void *opaque)
> +{
> +    PCIXenPlatformState *s = opaque;
> +
> +    platform_fixed_ioport_writeb(s, XEN_PLATFORM_IOPORT, 0);
> +}
> +
> +static void platform_fixed_ioport_init(PCIXenPlatformState* s)
> +{
> +    register_ioport_write(XEN_PLATFORM_IOPORT, 16, 4, platform_fixed_ioport_writel, s);
> +    register_ioport_write(XEN_PLATFORM_IOPORT, 16, 2, platform_fixed_ioport_writew, s);
> +    register_ioport_write(XEN_PLATFORM_IOPORT, 16, 1, platform_fixed_ioport_writeb, s);
> +    register_ioport_read(XEN_PLATFORM_IOPORT, 16, 2, platform_fixed_ioport_readw, s);
> +    register_ioport_read(XEN_PLATFORM_IOPORT, 16, 1, platform_fixed_ioport_readb, s);
> +}
> +
> +/* Xen Platform PCI Device */
> +
> +static uint32_t xen_platform_ioport_readb(void *opaque, uint32_t addr)
> +{
> +    addr &= 0xff;
> +
> +    if (addr == 0)
> +        return platform_fixed_ioport_readb(opaque, XEN_PLATFORM_IOPORT);

braces

> +    else
> +        return ~0u;
> +}
> +
> +static void xen_platform_ioport_writeb(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    PCIXenPlatformState *s = opaque;
> +
> +    addr &= 0xff;
> +    val &= 0xff;
> +
> +    switch (addr) {
> +    case 0: /* Platform flags */
> +        platform_fixed_ioport_writeb(opaque, XEN_PLATFORM_IOPORT, val);
> +        break;
> +    case 8:
> +        {
> +            if (val == '\n' || s->log_buffer_off == sizeof(s->log_buffer) - 1) {
> +                /* Flush buffer */
> +                s->log_buffer[s->log_buffer_off] = 0;
> +                throttle(s, s->log_buffer_off);
> +                DPRINTF("%s\n", s->log_buffer);
> +                s->log_buffer_off = 0;
> +                break;
> +            }
> +            s->log_buffer[s->log_buffer_off++] = val;
> +        }
> +        break;
> +    default:
> +        break;
> +    }
> +}
> +
> +static void platform_ioport_map(PCIDevice *pci_dev, int region_num, pcibus_t addr, pcibus_t size, int type)
> +{
> +    PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, pci_dev);
> +
> +    register_ioport_write(addr, size, 1, xen_platform_ioport_writeb, d);
> +    register_ioport_read(addr, size, 1, xen_platform_ioport_readb, d);
> +}
> +
> +static uint32_t platform_mmio_read(void *opaque, target_phys_addr_t addr)
> +{
> +    static int warnings = 0;
> +
> +    if (warnings < 5) {
> +        DPRINTF("Warning: attempted read from physical address "
> +                "0x" TARGET_FMT_plx " in xen platform mmio space\n", addr);
> +        warnings++;

Since DPRINTF only works in a specially compiled version, I'd remove
these checks. There could also be additional debug flags besides
PLATFORM_DEBUG to enable these warnings if these are too noisy, like
DEBUG_MMIO.

I'd rename PLATFORM_DEBUG to DEBUG_PLATFORM.
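Editorial note: the review asks whether the floating-point refill in throttle() could use integer arithmetic instead (muldiv64() is mentioned). The following is a minimal standalone sketch of a token bucket using only 64-bit integer math. It is not the patch's code and does not call QEMU's muldiv64(); struct bucket, bucket_refill and bucket_take are hypothetical names, and the caller is assumed to pass a monotonic timestamp in nanoseconds. Instead of sleeping or asserting, the take function reports how long the caller would need to wait.

```c
#include <stdint.h>

#define BUCKET_MAX_SIZE  (128u * 1024u)  /* bytes, same value as the patch */
#define BUCKET_FILL_RATE 256u            /* bytes per second, as in the patch */
#define NSEC_PER_SEC     1000000000u

struct bucket {
    uint64_t available;  /* tokens currently in the bucket */
    uint64_t last_ns;    /* timestamp of the last refill, in nanoseconds */
};

/*
 * Credit the tokens earned since the last refill, capped at the bucket
 * size. Whole seconds and the sub-second remainder are handled separately
 * so the multiplication cannot overflow. Fractions of a token are dropped
 * (a simplification of this sketch).
 */
static void bucket_refill(struct bucket *b, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - b->last_ns;
    uint64_t earned = (elapsed / NSEC_PER_SEC) * BUCKET_FILL_RATE +
                      ((elapsed % NSEC_PER_SEC) * BUCKET_FILL_RATE) / NSEC_PER_SEC;

    b->available += earned;
    if (b->available > BUCKET_MAX_SIZE) {
        b->available = BUCKET_MAX_SIZE;
    }
    b->last_ns = now_ns;
}

/*
 * Try to take count tokens. Returns 0 if granted, otherwise the number of
 * nanoseconds (rounded up) until enough tokens will have accumulated.
 * A request larger than the bucket can never succeed, so UINT64_MAX is
 * returned rather than exiting or asserting.
 */
static uint64_t bucket_take(struct bucket *b, uint64_t count, uint64_t now_ns)
{
    if (count > BUCKET_MAX_SIZE) {
        return UINT64_MAX;
    }
    if (b->available < count) {
        bucket_refill(b, now_ns);
    }
    if (b->available >= count) {
        b->available -= count;
        return 0;
    }
    uint64_t missing = count - b->available;
    return (missing * NSEC_PER_SEC + BUCKET_FILL_RATE - 1) / BUCKET_FILL_RATE;
}
```

Returning a wait time leaves the blocking policy to the caller, which also sidesteps the question raised above of whether a guest could trigger the assert.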
Blue Swirl
2010-Sep-17  18:10 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 05/12] piix_pci: Introduces Xen specific call for irq.
On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:

> From: Anthony PERARD <anthony.perard@citrix.com>
>
> This patch introduces Xen specific call in piix_pci.
>
> The specific part for Xen is in write_config, set_irq and get_pirq.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  hw/piix_pci.c |   10 +++++++++-
>  hw/xen.h      |    6 ++++++
>  xen-all.c     |   29 +++++++++++++++++++++++++++++
>  xen-stub.c    |   13 +++++++++++++
>  4 files changed, 57 insertions(+), 1 deletions(-)
>
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index f152a0f..41a342f 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -28,6 +28,7 @@
>  #include "pci_host.h"
>  #include "isa.h"
>  #include "sysbus.h"
> +#include "xen.h"
>
>  /*
>   * I440FX chipset data sheet.
> @@ -142,6 +143,9 @@ static void i440fx_write_config(PCIDevice *dev,
>  {
>      PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
>
> +    if (xen_enabled())

braces

> +        xen_piix_pci_write_config_client(address, val, len);
> +
>      /* XXX: implement SMRAM.D_LOCK */
>      pci_default_write_config(dev, address, val, len);
>      if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) ||
> @@ -235,7 +239,11 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *
>      piix3 = DO_UPCAST(PIIX3State, dev,
>          pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>      piix3->pic = pic;
> -    pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
> +    if (xen_enabled()) {
> +        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, piix3, 4);
> +    } else {
> +        pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
> +    }
>      (*pi440fx_state)->piix3 = piix3;
>
>      *piix3_devfn = piix3->dev.devfn;
> diff --git a/hw/xen.h b/hw/xen.h
> index 14bbb6e..c5189b1 100644
> --- a/hw/xen.h
> +++ b/hw/xen.h
> @@ -8,6 +8,8 @@
>   */
>  #include <inttypes.h>
>
> +#include "qemu-common.h"
> +
>  /* xen-machine.c */
>  enum xen_mode {
>      XEN_EMULATE = 0, // xen emulation, using xenner (default)
> @@ -26,6 +28,10 @@ extern int xen_allowed;
>  #define xen_enabled() (0)
>  #endif
>
> +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
> +void xen_piix3_set_irq(void *opaque, int irq_num, int level);
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
> +
>  int xen_init(int smp_cpus);
>
>  #endif /* QEMU_HW_XEN_H */
> diff --git a/xen-all.c b/xen-all.c
> index f505563..948e439 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -8,9 +8,38 @@
>
>  #include "config.h"
>
> +#include "hw/pci.h"
>  #include "hw/xen_common.h"
>  #include "hw/xen_backend.h"
>
> +/* Xen specific function for piix pci */
> +
> +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> +{
> +    return irq_num + ((pci_dev->devfn >> 3) << 2);
> +}
> +
> +void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> +{
> +    xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
> +                              irq_num & 3, level);
> +}
> +
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)

address should be target_phys_addr_t.

> +{
> +    int i;
> +
> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> +    for (i = 0; i < len; i++) {
> +        uint8_t v = (val >> (8*i)) & 0xff;

Please add spaces around '*'.

> +        if (v & 0x80)

braces

> +            v = 0;
> +        v &= 0xf;
> +        if (((address+i) >= 0x60) && ((address+i) <= 0x63))

Braces and spaces around '+'.

> +            xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v);
> +    }
> +}
> +
>  /* Initialise Xen */
>
>  int xen_init(int smp_cpus)
> diff --git a/xen-stub.c b/xen-stub.c
> index 0fa9c51..07e64bc 100644
> --- a/xen-stub.c
> +++ b/xen-stub.c
> @@ -11,6 +11,19 @@
>  #include "qemu-common.h"
>  #include "hw/xen.h"
>
> +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> +{
> +    return -1;
> +}
> +
> +void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> +{
> +}
> +
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)

Also here the address should be target_phys_addr_t.
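Editorial note: the quoted xen_pci_slot_get_pirq() and xen_piix3_set_irq() are inverses of each other: the first packs the PCI slot and INTx pin into one number, the second unpacks it again. The following is a minimal standalone sketch of that arithmetic plus the link-route byte decoding from the write_config hook, written with the braces and operator spacing the review asks for. The function names are hypothetical and nothing here calls libxc.

```c
#include <stdint.h>

/*
 * devfn packs the PCI slot in the upper 5 bits and the function in the
 * lower 3. The combined number is slot * 4 + intx, as in the quoted
 * xen_pci_slot_get_pirq().
 */
static int pack_slot_intx(int devfn, int intx)
{
    return intx + ((devfn >> 3) << 2);
}

/* Recover the slot, as xen_piix3_set_irq() does with irq_num >> 2. */
static int unpack_slot(int irq_num)
{
    return irq_num >> 2;
}

/* Recover the INTx pin (0..3), as xen_piix3_set_irq() does with irq_num & 3. */
static int unpack_intx(int irq_num)
{
    return irq_num & 3;
}

/*
 * Decode one PIRQ route control byte (config offsets 0x60-0x63): bit 7
 * set means routing is disabled, otherwise the low nibble is the ISA IRQ.
 */
static uint8_t link_route_irq(uint8_t v)
{
    if (v & 0x80) {
        return 0;
    }
    return v & 0xf;
}
```

For example, slot 5 function 1 (devfn 0x29) with pin 2 packs to 22, which unpacks back to slot 5 and pin 2.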
Blue Swirl
2010-Sep-17  19:07 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 07/12] xen: Introduce the Xen mapcache
On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:

> From: Anthony PERARD <anthony.perard@citrix.com>
>
> The mapcache maps chucks of guest memory on demand, unmaps them when
> they are not needed anymore.
>
> Each call to qemu_get_ram_ptr makes a call to qemu_map_cache with the
> lock option, so mapcache will not unmap these ram_ptr.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> ---
>  Makefile.target |    2 +-
>  exec.c          |   36 ++++++-
>  hw/xen.h        |    4 +
>  xen-all.c       |   63 ++++++++++++
>  xen-stub.c      |    4 +
>  xen_mapcache.c  |  302 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  xen_mapcache.h  |   26 +++++
>  7 files changed, 432 insertions(+), 5 deletions(-)
>  create mode 100644 xen_mapcache.c
>  create mode 100644 xen_mapcache.h
>
> diff --git a/Makefile.target b/Makefile.target
> index 6b390e6..ea14393 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -183,7 +183,7 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
>
>  # xen backend driver support
>  obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
> -obj-$(CONFIG_XEN) += xen-all.o
> +obj-$(CONFIG_XEN) += xen-all.o xen_mapcache.o
>  obj-$(CONFIG_NO_XEN) += xen-stub.o
>
>  # xen full virtualized machine
> diff --git a/exec.c b/exec.c
> index 380dab5..f5888eb 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -60,6 +60,9 @@
>  #endif
>  #endif
>
> +#include "hw/xen.h"
> +#include "xen_mapcache.h"
> +
>  //#define DEBUG_TB_INVALIDATE
>  //#define DEBUG_FLUSH
>  //#define DEBUG_TLB
> @@ -2833,6 +2836,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
>          }
>      }
>
> +    new_block->offset = find_ram_offset(size);
>      if (host) {
>          new_block->host = host;
>      } else {
> @@ -2856,15 +2860,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
>                                     PROT_EXEC|PROT_READ|PROT_WRITE,
>                                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
>  #else
> -            new_block->host = qemu_vmalloc(size);
> +            if (xen_enabled()) {
> +                xen_ram_alloc(new_block->offset, size);
> +            } else {
> +                new_block->host = qemu_vmalloc(size);
> +            }
>  #endif
>  #ifdef MADV_MERGEABLE
>              madvise(new_block->host, size, MADV_MERGEABLE);
>  #endif
>          }
>      }
> -
> -    new_block->offset = find_ram_offset(size);
>      new_block->length = size;
>
>      QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
> @@ -2905,7 +2911,11 @@ void qemu_ram_free(ram_addr_t addr)
>  #if defined(TARGET_S390X) && defined(CONFIG_KVM)
>                  munmap(block->host, block->length);
>  #else
> -                qemu_vfree(block->host);
> +                if (xen_enabled()) {
> +                    qemu_invalidate_entry(block->host);
> +                } else {
> +                    qemu_vfree(block->host);
> +                }
>  #endif
>              }
>              qemu_free(block);
> @@ -2931,6 +2941,14 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>          if (addr - block->offset < block->length) {
>              QLIST_REMOVE(block, next);
>              QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +            if (xen_enabled()) {
> +                /* We need to check if the requested address is in the RAM
> +                 * because we don't want to map the entire memory in QEMU.
> +                 */
> +                if (block->offset == 0)

braces

> +                    return qemu_map_cache(addr, 0, 1);
> +                block->host = qemu_map_cache(block->offset, block->length, 1);
> +            }
>              return block->host + (addr - block->offset);
>          }
>      }
> @@ -2949,11 +2967,18 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
>      uint8_t *host = ptr;
>
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
> +        /* This case append when the block is not mapped. */
> +        if (block->host == NULL)

braces

> +            continue;
>          if (host - block->host < block->length) {
>              return block->offset + (host - block->host);
>          }
>      }
>
> +    if (xen_enabled()) {
> +        return qemu_ram_addr_from_mapcache(ptr);
> +    }
> +
>      fprintf(stderr, "Bad ram pointer %p\n", ptr);
>      abort();
>
> @@ -3728,6 +3753,9 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
>      if (is_write) {
>          cpu_physical_memory_write(bounce.addr, bounce.buffer, access_len);
>      }
> +    if (xen_enabled()) {
> +        qemu_invalidate_entry(buffer);
> +    }
>      qemu_vfree(bounce.buffer);
>      bounce.buffer = NULL;
>      cpu_notify_map_clients();
> diff --git a/hw/xen.h b/hw/xen.h
> index c5189b1..2b62ff5 100644
> --- a/hw/xen.h
> +++ b/hw/xen.h
> @@ -34,4 +34,8 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
>
>  int xen_init(int smp_cpus);
>
> +#ifdef NEED_CPU_H
> +void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size);
> +#endif
> +
>  #endif /* QEMU_HW_XEN_H */
> diff --git a/xen-all.c b/xen-all.c
> index 765f87a..4e0b061 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -12,6 +12,8 @@
>  #include "hw/xen_common.h"
>  #include "hw/xen_backend.h"
>
> +#include "xen_mapcache.h"
> +
>  /* Xen specific function for piix pci */
>
>  int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> @@ -52,6 +54,63 @@ qemu_irq *i8259_xen_init(void)
>      return qemu_allocate_irqs(i8259_set_irq, NULL, 16);
>  }
>
> +
> +/* Memory Ops */
> +
> +static void xen_ram_init(ram_addr_t ram_size)
> +{
> +    RAMBlock *new_block;
> +    ram_addr_t below_4g_mem_size, above_4g_mem_size = 0;
> +
> +    new_block = qemu_mallocz(sizeof (*new_block));
> +    pstrcpy(new_block->idstr, sizeof (new_block->idstr), "xen.ram");
> +    new_block->host = NULL;
> +    new_block->offset = 0;
> +    new_block->length = ram_size;
> +
> +    QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
> +
> +    ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty,
> +                                       new_block->length >> TARGET_PAGE_BITS);
> +    memset(ram_list.phys_dirty
+ (new_block->offset >> TARGET_PAGE_BITS), > + 0xff, new_block->length >> TARGET_PAGE_BITS); > + > + if (ram_size >= 0xe0000000 ) { > + above_4g_mem_size = ram_size - 0xe0000000; > + below_4g_mem_size = 0xe0000000; > + } else { > + below_4g_mem_size = ram_size; > + } > + > + cpu_register_physical_memory(0, below_4g_mem_size, new_block->offset); > +#if TARGET_PHYS_ADDR_BITS > 32 > + if (above_4g_mem_size > 0) { > + cpu_register_physical_memory(0x100000000ULL, above_4g_mem_size, > + new_block->offset + below_4g_mem_size); > + } > +#endif > +} > + > +void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size) > +{ > + unsigned long nr_pfn; > + xen_pfn_t *pfn_list; > + int i; > + > + nr_pfn = size >> TARGET_PAGE_BITS; > + pfn_list = qemu_malloc(sizeof (*pfn_list) * nr_pfn); > + > + for (i = 0; i < nr_pfn; i++)braces> + pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i; > + > + if (xc_domain_memory_populate_physmap(xen_xc, xen_domid, nr_pfn, 0, 0, pfn_list)) { > + hw_error("xen: failed to populate ram at %lx", ram_addr); > + } > + > + qemu_free(pfn_list); > +} > + > + > /* Initialise Xen */ > > int xen_init(int smp_cpus) > @@ -62,5 +121,9 @@ int xen_init(int smp_cpus) > return -1; > } > > + /* Init RAM management */ > + qemu_map_cache_init(); > + xen_ram_init(ram_size); > + > return 0; > } > diff --git a/xen-stub.c b/xen-stub.c > index 07e64bc..c9f477d 100644 > --- a/xen-stub.c > +++ b/xen-stub.c > @@ -24,6 +24,10 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len) > { > } > > +void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size) > +{ > +} > + > int xen_init(int smp_cpus) > { > return -ENOSYS; > diff --git a/xen_mapcache.c b/xen_mapcache.c > new file mode 100644 > index 0000000..8e3bf6c > --- /dev/null > +++ b/xen_mapcache.c > @@ -0,0 +1,302 @@ > +#include "config.h" > + > +#include "hw/xen_backend.h" > +#include "blockdev.h" > + > +#include <xen/hvm/params.h> > +#include <sys/mman.h> > + > +#include "xen_mapcache.h" > + > + > +//#define 
MAPCACHE_DEBUG > + > +#ifdef MAPCACHE_DEBUG > +#define DPRINTF(fmt, ...) do { \ > + fprintf(stderr, "xen_mapcache: " fmt, ## __VA_ARGS__); \ > +} while (0) > +#else > +#define DPRINTF(fmt, ...) do { } while (0) > +#endif > + > +#if defined(MAPCACHE) > + > +#define BITS_PER_LONG (sizeof(long)*8)Please add spaces around ''*'', also the below #defines need more spaces.> +#define BITS_TO_LONGS(bits) \ > + (((bits)+BITS_PER_LONG-1)/BITS_PER_LONG) > +#define DECLARE_BITMAP(name,bits) \ > + unsigned long name[BITS_TO_LONGS(bits)] > +#define test_bit(bit,map) \ > + (!!((map)[(bit)/BITS_PER_LONG] & (1UL << ((bit)%BITS_PER_LONG)))) > + > +typedef struct MapCacheEntry { > + unsigned long paddr_index; > + uint8_t *vaddr_base; > + DECLARE_BITMAP(valid_mapping, MCACHE_BUCKET_SIZE>>XC_PAGE_SHIFT); > + uint8_t lock; > + struct MapCacheEntry *next; > +} MapCacheEntry; > + > +typedef struct MapCacheRev { > + uint8_t *vaddr_req; > + unsigned long paddr_index; > + QTAILQ_ENTRY(MapCacheRev) next; > +} MapCacheRev; > + > +typedef struct MapCache { > + MapCacheEntry *entry; > + unsigned long nr_buckets; > + QTAILQ_HEAD(map_cache_head, MapCacheRev) locked_entries; > + > + /* For most cases (>99.9%), the page address is the same. */ > + unsigned long last_address_index; > + uint8_t *last_address_vaddr; > +} MapCache; > + > +static MapCache *mapcache; > + > + > +int qemu_map_cache_init(void) > +{ > + unsigned long size; > + > + mapcache = qemu_mallocz(sizeof (MapCache)); > + > + QTAILQ_INIT(&mapcache->locked_entries); > + mapcache->last_address_index = ~0UL; > + > + mapcache->nr_buckets = (((MAX_MCACHE_SIZE >> XC_PAGE_SHIFT) + > + (1UL << (MCACHE_BUCKET_SHIFT - XC_PAGE_SHIFT)) - 1) >> > + (MCACHE_BUCKET_SHIFT - XC_PAGE_SHIFT)); > + > + /* > + * Use mmap() directly: lets us allocate a big hash table with no up-front > + * cost in storage space. The OS will allocate memory only for the buckets > + * that we actually use. All others will contain all zeroes. 
> + */ > + size = mapcache->nr_buckets * sizeof(MapCacheEntry); > + size = (size + XC_PAGE_SIZE - 1) & ~(XC_PAGE_SIZE - 1); > + DPRINTF("qemu_map_cache_init, nr_buckets = %lx size %lu\n", mapcache->nr_buckets, size); > + mapcache->entry = mmap(NULL, size, PROT_READ|PROT_WRITE, > + MAP_SHARED|MAP_ANON, -1, 0); > + if (mapcache->entry == MAP_FAILED) { > + errno = ENOMEM;Is this needed, can''t we just use whatever was in errno?> + return -1; > + } > + > + return 0; > +} > + > +static void qemu_remap_bucket(MapCacheEntry *entry, > + target_phys_addr_t size, > + unsigned long address_index) > +{ > + uint8_t *vaddr_base; > + xen_pfn_t *pfns; > + int *err; > + unsigned int i, j;There are a lot of size >> XC_PAGE_SHIFT uses here. I think it would be clearer to add size >>= XC_PAGE_SHIFT or a new variable.> + > + pfns = qemu_mallocz((size >> XC_PAGE_SHIFT) * sizeof (xen_pfn_t)); > + err = qemu_mallocz((size >> XC_PAGE_SHIFT) * sizeof (int)); > + > + if (entry->vaddr_base != NULL) { > + errno = munmap(entry->vaddr_base, size); > + if (errno) { > + fprintf(stderr, "unmap fails %d\n", errno);munmap() returns -1 on error, so please don''t clobber errno and use perror().> + exit(-1); > + } > + } > + > + for (i = 0; i < size >> XC_PAGE_SHIFT; i++) { > + pfns[i] = (address_index << (MCACHE_BUCKET_SHIFT-XC_PAGE_SHIFT)) + i; > + } > + > + vaddr_base = xc_map_foreign_bulk(xen_xc, xen_domid, PROT_READ|PROT_WRITE, > + pfns, err, > + size >> XC_PAGE_SHIFT); > + if (vaddr_base == NULL) { > + fprintf(stderr, "xc_map_foreign_bulk error %d\n", errno);perror()?> + exit(-1); > + } > + > + entry->vaddr_base = vaddr_base; > + entry->paddr_index = address_index; > + > + for (i = 0; i < size >> XC_PAGE_SHIFT; i += BITS_PER_LONG) { > + unsigned long word = 0; > + j = ((i + BITS_PER_LONG) > (size >> XC_PAGE_SHIFT)) ? 
> + (size >> XC_PAGE_SHIFT) % BITS_PER_LONG : BITS_PER_LONG;Maybe this would be clearer with ''if''.> + while (j > 0) { > + word = (word << 1) | !err[i + --j];You are mixing bitwise OR with logical NOT, is this correct?> + } > + entry->valid_mapping[i / BITS_PER_LONG] = word; > + } > + > + qemu_free(pfns); > + qemu_free(err); > +} > + > +uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock) > +{ > + MapCacheEntry *entry, *pentry = NULL; > + unsigned long address_index = phys_addr >> MCACHE_BUCKET_SHIFT; > + unsigned long address_offset = phys_addr & (MCACHE_BUCKET_SIZE-1);unsigned long will not be long enough on 32 bit host (or 32 bit user space) for a 64 bit target. I can''t remember if this was a supported case for Xen anyway. How about address_offset >>= XC_PAGE_SHIFT?> + > + if (address_index == mapcache->last_address_index && !lock)braces> + return mapcache->last_address_vaddr + address_offset; > + > + entry = &mapcache->entry[address_index % mapcache->nr_buckets]; > + > + while (entry && entry->lock && entry->paddr_index != address_index && entry->vaddr_base) { > + pentry = entry; > + entry = entry->next; > + } > + if (!entry) { > + entry = qemu_mallocz(sizeof(MapCacheEntry)); > + pentry->next = entry; > + qemu_remap_bucket(entry, size ? : MCACHE_BUCKET_SIZE, address_index); > + } else if (!entry->lock) { > + if (!entry->vaddr_base || entry->paddr_index != address_index || !test_bit(address_offset>>XC_PAGE_SHIFT, entry->valid_mapping))I suspect this line is too long. Please also add braces.> + qemu_remap_bucket(entry, size ? 
: MCACHE_BUCKET_SIZE, address_index); > + } > + > + if (!test_bit(address_offset>>XC_PAGE_SHIFT, entry->valid_mapping)) { > + mapcache->last_address_index = ~0UL; > + return NULL; > + } > + > + mapcache->last_address_index = address_index; > + mapcache->last_address_vaddr = entry->vaddr_base; > + if (lock) { > + MapCacheRev *reventry = qemu_mallocz(sizeof(MapCacheRev)); > + entry->lock++; > + reventry->vaddr_req = mapcache->last_address_vaddr + address_offset; > + reventry->paddr_index = mapcache->last_address_index; > + QTAILQ_INSERT_TAIL(&mapcache->locked_entries, reventry, next); > + } > + > + return mapcache->last_address_vaddr + address_offset; > +} > + > +ram_addr_t qemu_ram_addr_from_mapcache(void *ptr) > +{ > + MapCacheRev *reventry; > + unsigned long paddr_index; > + int found = 0; > + > + QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) { > + if (reventry->vaddr_req == ptr) { > + paddr_index = reventry->paddr_index; > + found = 1; > + break; > + } > + } > + if (!found) { > + fprintf(stderr, "qemu_ram_addr_from_mapcache, could not find %p\n", ptr); > + QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) { > + DPRINTF(" %lx -> %p is present\n", reventry->paddr_index, reventry->vaddr_req); > + } > + abort(); > + return 0; > + } > + > + return paddr_index << MCACHE_BUCKET_SHIFT; > +} > + > +void qemu_invalidate_entry(uint8_t *buffer) > +{ > + MapCacheEntry *entry = NULL, *pentry = NULL; > + MapCacheRev *reventry; > + unsigned long paddr_index; > + int found = 0; > + > + if (mapcache->last_address_vaddr == buffer) > + mapcache->last_address_index = ~0UL; > + > + QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) { > + if (reventry->vaddr_req == buffer) { > + paddr_index = reventry->paddr_index; > + found = 1; > + break; > + } > + } > + if (!found) { > + DPRINTF("qemu_invalidate_entry, could not find %p\n", buffer); > + QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) { > + DPRINTF(" %lx -> %p is present\n", reventry->paddr_index, 
reventry->vaddr_req); > + } > + return; > + } > + QTAILQ_REMOVE(&mapcache->locked_entries, reventry, next); > + qemu_free(reventry); > + > + entry = &mapcache->entry[paddr_index % mapcache->nr_buckets]; > + while (entry && entry->paddr_index != paddr_index) { > + pentry = entry; > + entry = entry->next; > + } > + if (!entry) { > + DPRINTF("Trying to unmap address %p that is not in the mapcache!\n", buffer); > + return; > + } > + entry->lock--; > + if (entry->lock > 0 || pentry == NULL) > + return; > + > + pentry->next = entry->next; > + errno = munmap(entry->vaddr_base, MCACHE_BUCKET_SIZE); > + if (errno) { > + fprintf(stderr, "unmap fails %d\n", errno);Please see my previous munmap comments.> + exit(-1); > + } > + qemu_free(entry); > +} > + > +void qemu_invalidate_map_cache(void) > +{ > + unsigned long i; > + MapCacheRev *reventry; > + > + qemu_aio_flush(); > + > + QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) { > + DPRINTF("There should be no locked mappings at this time, but %lx -> %p is present\n", reventry->paddr_index, reventry->vaddr_req);Probably too long line.> + } > + > + mapcache_lock(); > + > + for (i = 0; i < mapcache->nr_buckets; i++) { > + MapCacheEntry *entry = &mapcache->entry[i]; > + > + if (entry->vaddr_base == NULL) > + continue; > + > + errno = munmap(entry->vaddr_base, MCACHE_BUCKET_SIZE); > + if (errno) { > + fprintf(stderr, "unmap fails %d\n", errno); > + exit(-1); > + } > + > + entry->paddr_index = 0; > + entry->vaddr_base = NULL; > + } > + > + mapcache->last_address_index = ~0UL; > + mapcache->last_address_vaddr = NULL; > + > + mapcache_unlock(); > +} > +#else > +uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, uint8_t lock) > +{ > + return qemu_get_ram_ptr(phys_addr); > +} > + > +void qemu_invalidate_map_cache(void) > +{ > +} > + > +void qemu_invalidate_entry(uint8_t *buffer) > +{ > +} > +#endif /* !MAPCACHE */ > diff --git a/xen_mapcache.h b/xen_mapcache.h > new file mode 100644 > index 0000000..5a6730f > --- /dev/null > 
+++ b/xen_mapcache.h
> @@ -0,0 +1,26 @@
> +#ifndef XEN_MAPCACHE_H
> +#define XEN_MAPCACHE_H
> +
> +#if (defined(__i386__) || defined(__x86_64__))
> +# define MAPCACHE

xen_mapcache.c could be split into two files, xen-mapcache-stub.c and
xen-mapcache.c. configure could perform the check for i386 or x86_64
host and define CONFIG_XEN_MAPCACHE=y appropriately. Then
Makefile.target would compile the correct file based on that.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
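The review question about mixing bitwise OR with logical NOT in qemu_remap_bucket() can be checked with a stand-alone sketch. In C, `!err[k]` evaluates to exactly 0 or 1, so OR-ing it into the low bit of `word` is well defined; the version here produces the same bitmap with a plain `if`, which is what the review asks for. This is only a sketch under assumptions: `build_valid_bitmap()` is a hypothetical helper that is not in the patch, and `err[]` stands in for the per-page error array that the patch passes to xc_map_foreign_bulk().

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(bits) (((bits) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * Hypothetical helper (not in the patch): set bit k of map[] if and only
 * if err[k] == 0, i.e. page k was mapped successfully.
 * map[] must hold at least BITS_TO_LONGS(nr_pages) words.
 */
static void build_valid_bitmap(unsigned long *map, const int *err,
                               size_t nr_pages)
{
    size_t k;

    /* Start from "nothing valid" so failed pages need no extra work. */
    memset(map, 0, BITS_TO_LONGS(nr_pages) * sizeof(unsigned long));

    for (k = 0; k < nr_pages; k++) {
        if (err[k] == 0) {
            map[k / BITS_PER_LONG] |= 1UL << (k % BITS_PER_LONG);
        }
    }
}

/* Same meaning as the test_bit() macro in the patch, as a function. */
static int test_bit(size_t bit, const unsigned long *map)
{
    return !!(map[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)));
}
```

With err = {0, -1, 0, 0, -22}, pages 0, 2 and 3 are marked valid, so the first word of the bitmap is 0xd.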
Blue Swirl
2010-Sep-17  19:27 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 10/12] xen: Initialize event channels and io rings
On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:> From: Anthony PERARD <anthony.perard@citrix.com> > > Open and bind event channels; map ioreq and buffered ioreq rings.In general, because of CPUState accesses and cpu_in/out use, this looks like CPU code, specifically x86. Could this belong to target-i386/xen.c instead, much like target-i386/kvm.c vs ./kvm-all.c? Do other CPU types use this stuff?> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > hw/xen_common.h | 1 + > xen-all.c | 381 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 382 insertions(+), 0 deletions(-) > > diff --git a/hw/xen_common.h b/hw/xen_common.h > index dd54063..96cfad7 100644 > --- a/hw/xen_common.h > +++ b/hw/xen_common.h > @@ -53,5 +53,6 @@ typedef xc_interface *qemu_xc_interface; > #endif > > qemu_irq *i8259_xen_init(void); > +void destroy_hvm_domain(void); > > #endif /* QEMU_HW_XEN_COMMON_H */ > diff --git a/xen-all.c b/xen-all.c > index 4e0b061..13672f0 100644 > --- a/xen-all.c > +++ b/xen-all.c > @@ -8,12 +8,38 @@ > > #include "config.h" > > +#include <sys/mman.h> > + > #include "hw/pci.h" > #include "hw/xen_common.h" > #include "hw/xen_backend.h" > > #include "xen_mapcache.h" > > +#include <xen/hvm/ioreq.h> > + > +//#define DEBUG_XEN > + > +#ifdef DEBUG_XEN > +#define DPRINTF(fmt, ...) \ > + do { fprintf(stderr, "xen: " fmt, ## __VA_ARGS__); } while (0) > +#else > +#define DPRINTF(fmt, ...) \ > + do { } while (0) > +#endif > + > +shared_iopage_t *shared_page = NULL; > +#define BUFFER_IO_MAX_DELAY 100 > +buffered_iopage_t *buffered_io_page = NULL; > +QEMUTimer *buffered_io_timer; > +/* the evtchn port for polling the notification, */ > +evtchn_port_t *ioreq_local_port; > +/* the evtchn fd for polling */ > +int xce_handle = -1; > +/* which vcpu we are serving */ > +int send_vcpu = 0; > +long time_offset = 0;Are all these global needed? 
Can some of them actually be ''static''? Could you wrap these into a struct and pass that around?> + > /* Xen specific function for piix pci */ > > int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num) > @@ -111,19 +137,374 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size) > } > > > +/* VCPU Operations, MMIO, IO ring ... */ > + > +/* get the ioreq packets from share mem */ > +static ioreq_t *cpu_get_ioreq_from_shared_memory(int vcpu) > +{ > + ioreq_t *req = &shared_page->vcpu_ioreq[vcpu]; > + > + if (req->state != STATE_IOREQ_READY) { > + DPRINTF("I/O request not ready: " > + "%x, ptr: %x, port: %"PRIx64", " > + "data: %"PRIx64", count: %u, size: %u\n", > + req->state, req->data_is_ptr, req->addr, > + req->data, req->count, req->size); > + return NULL; > + } > + > + xen_rmb(); /* see IOREQ_READY /then/ read contents of ioreq */ > + > + req->state = STATE_IOREQ_INPROCESS; > + return req; > +} > + > +/* use poll to get the port notification */ > +/* ioreq_vec--out,the */ > +/* retval--the number of ioreq packet */ > +static ioreq_t *cpu_get_ioreq(void) > +{ > + int i; > + evtchn_port_t port; > + > + port = xc_evtchn_pending(xce_handle); > + if (port != -1) { > + for ( i = 0; i < smp_cpus; i++ )Please add braces and remove extra spaces after ''('' and before '')'', also in other places.> + if ( ioreq_local_port[i] == port ) > + break; > + > + if ( i == smp_cpus ) { > + hw_error("Fatal error while trying to get io event!\n"); > + } > + > + /* unmask the wanted port again */ > + xc_evtchn_unmask(xce_handle, port); > + > + /* get the io packet from shared memory */ > + send_vcpu = i; > + return cpu_get_ioreq_from_shared_memory(i); > + } > + > + /* read error or read nothing */ > + return NULL; > +} > + > +static uint32_t do_inp(CPUState *env, pio_addr_t addr, unsigned long size) > +{ > + switch(size) { > + case 1: > + return cpu_inb(addr); > + case 2: > + return cpu_inw(addr); > + case 4: > + return cpu_inl(addr); > + default: > + hw_error("inp: bad size: 
%04"FMT_pioaddr" %lx", addr, size); > + } > +} > + > +static void do_outp(CPUState *env, pio_addr_t addr, > + unsigned long size, uint32_t val) > +{ > + switch(size) { > + case 1: > + return cpu_outb(addr, val); > + case 2: > + return cpu_outw(addr, val); > + case 4: > + return cpu_outl(addr, val); > + default: > + hw_error("outp: bad size: %04"FMT_pioaddr" %lx", addr, size); > + } > +} > + > +static void cpu_ioreq_pio(CPUState *env, ioreq_t *req) > +{ > + int i, sign; > + > + sign = req->df ? -1 : 1; > + > + if (req->dir == IOREQ_READ) { > + if (!req->data_is_ptr) { > + req->data = do_inp(env, req->addr, req->size); > + } else { > + uint32_t tmp; > + > + for (i = 0; i < req->count; i++) { > + tmp = do_inp(env, req->addr, req->size); > + cpu_physical_memory_write(req->data + (sign * i * req->size), > + (uint8_t*) &tmp, req->size); > + } > + } > + } else if (req->dir == IOREQ_WRITE) { > + if (!req->data_is_ptr) { > + do_outp(env, req->addr, req->size, req->data); > + } else { > + for (i = 0; i < req->count; i++) { > + uint32_t tmp = 0; > + > + cpu_physical_memory_read(req->data + (sign * i * req->size), > + (uint8_t*) &tmp, req->size); > + do_outp(env, req->addr, req->size, tmp); > + } > + } > + } > +} > + > +static void cpu_ioreq_move(CPUState *env, ioreq_t *req) > +{ > + int i, sign; > + > + sign = req->df ? 
-1 : 1; > + > + if (!req->data_is_ptr) { > + if (req->dir == IOREQ_READ) { > + for (i = 0; i < req->count; i++) { > + cpu_physical_memory_read(req->addr + (sign * i * req->size), > + (uint8_t*) &req->data, req->size); > + } > + } else if (req->dir == IOREQ_WRITE) { > + for (i = 0; i < req->count; i++) { > + cpu_physical_memory_write(req->addr + (sign * i * req->size), > + (uint8_t*) &req->data, req->size); > + } > + } > + } else { > + target_ulong tmp; > + > + if (req->dir == IOREQ_READ) { > + for (i = 0; i < req->count; i++) { > + cpu_physical_memory_read(req->addr + (sign * i * req->size), > + (uint8_t*) &tmp, req->size); > + cpu_physical_memory_write(req->data + (sign * i * req->size), > + (uint8_t*) &tmp, req->size); > + } > + } else if (req->dir == IOREQ_WRITE) { > + for (i = 0; i < req->count; i++) { > + cpu_physical_memory_read(req->data + (sign * i * req->size), > + (uint8_t*) &tmp, req->size); > + cpu_physical_memory_write(req->addr + (sign * i * req->size), > + (uint8_t*) &tmp, req->size); > + } > + } > + } > +} > + > +static void cpu_ioreq_timeoffset(CPUState *env, ioreq_t *req) > +{ > + /* char b[64]; */ > + > + time_offset += (unsigned long)req->data; > + > + //DPRINTF("Time offset set %ld, added offset %"PRId64"\n", > + //time_offset, req->data); > + /* snprintf(b, 64, "%ld", time_offset); */ > + /* xenstore_vm_write(xen_domid, "rtc/timeoffset", b); */The commented out stuff should probably go.> +} > + > +static void handle_ioreq(CPUState *env, ioreq_t *req) > +{ > + if (!req->data_is_ptr && (req->dir == IOREQ_WRITE) && > + (req->size < sizeof(target_ulong))) > + req->data &= ((target_ulong)1 << (8 * req->size)) - 1; > + > + switch (req->type) { > + case IOREQ_TYPE_PIO: > + cpu_ioreq_pio(env, req); > + break; > + case IOREQ_TYPE_COPY: > + cpu_ioreq_move(env, req); > + break; > + case IOREQ_TYPE_TIMEOFFSET: > + cpu_ioreq_timeoffset(env, req); > + break; > + case IOREQ_TYPE_INVALIDATE: > + qemu_invalidate_map_cache(); > + break; > + default: > + 
hw_error("Invalid ioreq type 0x%x\n", req->type); > + } > +} > + > +static void handle_buffered_iopage(CPUState *env) > +{ > + buf_ioreq_t *buf_req = NULL; > + ioreq_t req; > + int qw; > + > + if (!buffered_io_page) > + return; > + > + while (buffered_io_page->read_pointer !> + buffered_io_page->write_pointer) { > + buf_req = &buffered_io_page->buf_ioreq[ > + buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM]; > + req.size = 1UL << buf_req->size; > + req.count = 1; > + req.addr = buf_req->addr; > + req.data = buf_req->data; > + req.state = STATE_IOREQ_READY; > + req.dir = buf_req->dir; > + req.df = 1; > + req.type = buf_req->type; > + req.data_is_ptr = 0; > + qw = (req.size == 8); > + if (qw) { > + buf_req = &buffered_io_page->buf_ioreq[ > + (buffered_io_page->read_pointer+1) % IOREQ_BUFFER_SLOT_NUM]; > + req.data |= ((uint64_t)buf_req->data) << 32; > + } > + > + handle_ioreq(env, &req); > + > + xen_mb(); > + buffered_io_page->read_pointer += qw ? 2 : 1; > + } > +} > + > +static void handle_buffered_io(void *opaque) > +{ > + CPUState *env = opaque; > + > + handle_buffered_iopage(env); > + qemu_mod_timer(buffered_io_timer, BUFFER_IO_MAX_DELAY + > + qemu_get_clock(rt_clock)); > +} > + > +static void cpu_handle_ioreq(void *opaque) > +{ > + CPUState *env = opaque; > + ioreq_t *req = cpu_get_ioreq(); > + > + handle_buffered_iopage(env); > + if (req) { > + handle_ioreq(env, req); > + > + if (req->state != STATE_IOREQ_INPROCESS) { > + fprintf(stderr, "Badness in I/O request ... not in service?!: " > + "%x, ptr: %x, port: %"PRIx64", " > + "data: %"PRIx64", count: %u, size: %u\n", > + req->state, req->data_is_ptr, req->addr, > + req->data, req->count, req->size); > + destroy_hvm_domain(); > + return; > + } > + > + xen_wmb(); /* Update ioreq contents /then/ update state. 
*/ > + > + /* > + * We do this before we send the response so that the tools > + * have the opportunity to pick up on the reset before the > + * guest resumes and does a hlt with interrupts disabled which > + * causes Xen to powerdown the domain. > + */ > + if (vm_running) { > + if (qemu_shutdown_requested_get()) { > + destroy_hvm_domain(); > + } > + if (qemu_reset_requested_get()) { > + qemu_system_reset(); > + } > + } > + > + req->state = STATE_IORESP_READY; > + xc_evtchn_notify(xce_handle, ioreq_local_port[send_vcpu]); > + } > +} > + > +static void xen_main_loop_prepare(void) > +{ > + CPUState *env = cpu_single_env; > + > + int evtchn_fd = xce_handle == -1 ? -1 : xc_evtchn_fd(xce_handle); > + > + buffered_io_timer = qemu_new_timer(rt_clock, handle_buffered_io, > + cpu_single_env); > + qemu_mod_timer(buffered_io_timer, qemu_get_clock(rt_clock)); > + > + if (evtchn_fd != -1) > + qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, env);braces> +} > + > + > /* Initialise Xen */ > > +static void xen_vm_change_state_handler(void *opaque, int running, int reason) > +{ > + if (running)braces> + xen_main_loop_prepare(); > +} > + > int xen_init(int smp_cpus) > { > + int i, rc; > + unsigned long ioreq_pfn; > + > xen_xc = xc_interface_open(NULL, NULL, 0); > if (xen_xc == NULL) { > xen_be_printf(NULL, 0, "can''t open xen interface\n"); > return -1; > } > > + xce_handle = xc_evtchn_open(); > + if (xce_handle == -1) { > + perror("open"); > + return -errno; > + } > + > + xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_IOREQ_PFN, &ioreq_pfn); > + DPRINTF("shared page at pfn %lx\n", ioreq_pfn); > + shared_page = xc_map_foreign_range(xen_xc, xen_domid, XC_PAGE_SIZE, > + PROT_READ|PROT_WRITE, ioreq_pfn); > + if (shared_page == NULL) { > + hw_error("map shared IO page returned error %d handle=%p", errno, xen_xc); > + } > + > + xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_PFN, &ioreq_pfn); > + DPRINTF("buffered io page at pfn %lx\n", ioreq_pfn); > + buffered_io_page = 
xc_map_foreign_range(xen_xc, xen_domid, XC_PAGE_SIZE, > + PROT_READ|PROT_WRITE, ioreq_pfn); > + if (buffered_io_page == NULL) { > + hw_error("map buffered IO page returned error %d", errno); > + } > + > + ioreq_local_port = qemu_mallocz(smp_cpus * sizeof(evtchn_port_t)); > + > + /* FIXME: how about if we overflow the page here? */ > + for (i = 0; i < smp_cpus; i++) { > + rc = xc_evtchn_bind_interdomain(xce_handle, xen_domid, > + shared_page->vcpu_ioreq[i].vp_eport); > + if (rc == -1) { > + fprintf(stderr, "bind interdomain ioctl error %d\n", errno); > + return -1; > + } > + ioreq_local_port[i] = rc; > + } > + > /* Init RAM management */ > qemu_map_cache_init(); > xen_ram_init(ram_size); > > + qemu_add_vm_change_state_handler(xen_vm_change_state_handler, NULL); > + > return 0; > } > + > +void destroy_hvm_domain(void) > +{ > + xc_interface *xc_handle; > + int sts; > + > + xc_handle = xc_interface_open(NULL, NULL, 0); > + if (!xc_handle) > + fprintf(stderr, "Cannot acquire xenctrl handle\n"); > + else { > + sts = xc_domain_shutdown(xc_handle, xen_domid, SHUTDOWN_poweroff); > + if (sts != 0) > + fprintf(stderr, "? xc_domain_shutdown failed to issue poweroff, " > + "sts %d, errno %d\n", sts, errno);braces, perror() _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
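The suggestion to wrap the globals into a struct and pass that around could look roughly like the following. It is a minimal sketch, not a proposal for the actual layout: `XenIOState`, `xen_io_state_new()`, `xen_io_state_free()` and `xen_io_find_vcpu()` are hypothetical names, and the Xen types (shared_iopage_t, buffered_iopage_t, evtchn_port_t) are replaced by plain C stand-ins so that the fragment compiles without the Xen headers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for the state the patch keeps in globals. */
typedef struct XenIOState {
    void *shared_page;              /* stand-in for shared_iopage_t * */
    void *buffered_io_page;         /* stand-in for buffered_iopage_t * */
    unsigned int *ioreq_local_port; /* stand-in for evtchn_port_t *, one per vcpu */
    int xce_handle;                 /* event channel handle, -1 when closed */
    int send_vcpu;                  /* vcpu currently being served */
    int nr_vcpus;
    long time_offset;
} XenIOState;

/* Allocate zeroed state for nr_vcpus vcpus; returns NULL on failure. */
static XenIOState *xen_io_state_new(int nr_vcpus)
{
    XenIOState *s;

    if (nr_vcpus <= 0) {
        return NULL;
    }
    s = calloc(1, sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    s->ioreq_local_port = calloc((size_t)nr_vcpus,
                                 sizeof(*s->ioreq_local_port));
    if (s->ioreq_local_port == NULL) {
        free(s);
        return NULL;
    }
    s->xce_handle = -1;
    s->nr_vcpus = nr_vcpus;
    return s;
}

static void xen_io_state_free(XenIOState *s)
{
    if (s != NULL) {
        free(s->ioreq_local_port);
        free(s);
    }
}

/*
 * Map a pending event-channel port to its vcpu index, or -1 if no vcpu
 * owns it. This mirrors the search loop in cpu_get_ioreq(), but reads
 * the port table from the state object instead of a global.
 */
static int xen_io_find_vcpu(const XenIOState *s, unsigned int port)
{
    int i;

    for (i = 0; i < s->nr_vcpus; i++) {
        if (s->ioreq_local_port[i] == port) {
            return i;
        }
    }
    return -1;
}
```

The handlers already receive an opaque pointer, so a state object of this kind could be handed to qemu_set_fd_handler() and qemu_new_timer() in place of relying on file-scope variables.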
Blue Swirl
2010-Sep-17  19:42 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 08/12] Intruduce qemu_ram_ptr_unlock.
On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:
> From: Anthony PERARD <anthony.perard@citrix.com>
>
> This function allows to unlock a ram_ptr give by qemu_get_ram_ptr. After
> a call to qemu_ram_ptr_unlock, the pointer may be unmap from QEMU when
> used with Xen.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> ---
>  cpu-common.h   |    1 +
>  exec.c         |   29 ++++++++++++++++++++++++++---
>  xen_mapcache.c |   34 ++++++++++++++++++++++++++++++++++
>  xen_mapcache.h |    1 +
>  4 files changed, 62 insertions(+), 3 deletions(-)
>
> diff --git a/cpu-common.h b/cpu-common.h
> index 0426bc8..378eea8 100644
> --- a/cpu-common.h
> +++ b/cpu-common.h
> @@ -46,6 +46,7 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size);
>  void qemu_ram_free(ram_addr_t addr);
>  /* This should only be used for ram local to a device.  */
>  void *qemu_get_ram_ptr(ram_addr_t addr);
> +void qemu_ram_ptr_unlock(void *addr);
>  /* This should not be used by devices.  */
>  ram_addr_t qemu_ram_addr_from_host(void *ptr);
>
> diff --git a/exec.c b/exec.c
> index f5888eb..659db50 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2959,6 +2959,13 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>      return NULL;
>  }
>
> +void qemu_ram_ptr_unlock(void *addr)
> +{
> +    if (xen_enabled()) {
> +        qemu_map_cache_unlock(addr);

I think there may be linkage problems without CONFIG_XEN, so there
should be a stub for qemu_map_cache_unlock().

> +    }
> +}
> +
>  /* Some of the softmmu routines need to translate from a host pointer
>     (typically a TLB entry) back to a ram offset.  */
>  ram_addr_t qemu_ram_addr_from_host(void *ptr)
> @@ -3064,6 +3071,7 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
>                                  uint32_t val)
>  {
>      int dirty_flags;
> +    void *vaddr;
>      dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
>      if (!(dirty_flags & CODE_DIRTY_FLAG)) {
>  #if !defined(CONFIG_USER_ONLY)
> @@ -3071,19 +3079,21 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
>          dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
>  #endif
>      }
> -    stb_p(qemu_get_ram_ptr(ram_addr), val);
> +    stb_p(vaddr = qemu_get_ram_ptr(ram_addr), val);

Perhaps 'vaddr = ...' should be put on a separate line.
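Both remarks on this patch can be illustrated together. The following is a minimal sketch under stated assumptions, not the patch's code: `xen_enabled()`, `qemu_get_ram_ptr()` and `stb_p()` are replaced by trivial stand-ins so that the fragment compiles alone, `fake_ram` and `notdirty_write_byte()` are hypothetical, and the `void *` parameter of qemu_map_cache_unlock() is assumed from the call site, since the quoted part does not show its prototype. The unlock call after the store is likewise an assumption about the unquoted remainder of the hunk.

```c
#include <stdint.h>

/* Stand-ins for the QEMU pieces, only so the sketch compiles alone. */
static uint8_t fake_ram[16];

static int xen_enabled(void)
{
    return 0; /* models a build without Xen support */
}

static void *qemu_get_ram_ptr(unsigned long ram_addr)
{
    return &fake_ram[ram_addr];
}

static void stb_p(void *ptr, uint32_t val)
{
    *(uint8_t *)ptr = (uint8_t)val;
}

/*
 * The stub: same signature as the Xen version (assumed), empty body.
 * In the tree it would sit next to the other empty functions in
 * xen-stub.c, so the call in exec.c links when Xen is not built in.
 */
static void qemu_map_cache_unlock(void *addr)
{
    (void)addr;
}

static void qemu_ram_ptr_unlock(void *addr)
{
    if (xen_enabled()) {
        qemu_map_cache_unlock(addr);
    }
}

/* Hypothetical caller showing the assignment on its own line. */
static void notdirty_write_byte(unsigned long ram_addr, uint32_t val)
{
    void *vaddr;

    vaddr = qemu_get_ram_ptr(ram_addr);
    stb_p(vaddr, val);
    qemu_ram_ptr_unlock(vaddr);
}
```

Keeping the assignment out of the argument list makes the lifetime of `vaddr` easier to follow: it is obtained, used, then released, one statement each.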
Blue Swirl
2010-Sep-17  20:03 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 12/12] xen: Add a Xen specific ACPI Implementation to target-xen
On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:
> From: Anthony PERARD <anthony.perard@citrix.com>
>
> Xen currently uses a different BIOS (hvmloader + rombios) therefore the
> Qemu acpi_piix4 implementation wouldn't work correctly with Xen.
> We plan on fixing this properly but at the moment we are just adding a
> new Xen specific acpi_piix4 implementation.
> This patch is optional; without it the VM boots but it cannot shutdown
> properly or go to S3.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  Makefile.target     |    1 +
>  hw/xen_acpi_piix4.c |  405 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/xen_common.h     |    3 +
>  hw/xen_machine_fv.c |    6 +-
>  4 files changed, 410 insertions(+), 5 deletions(-)
>  create mode 100644 hw/xen_acpi_piix4.c
>
> diff --git a/Makefile.target b/Makefile.target
> index ea14393..db7f96b 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -189,6 +189,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
>  # xen full virtualized machine
>  obj-$(CONFIG_XEN) += xen_machine_fv.o
>  obj-$(CONFIG_XEN) += xen_platform.o
> +obj-$(CONFIG_XEN) += xen_acpi_piix4.o
>
>  # USB layer
>  obj-$(CONFIG_USB_OHCI) += usb-ohci.o
> diff --git a/hw/xen_acpi_piix4.c b/hw/xen_acpi_piix4.c
> new file mode 100644
> index 0000000..f4792f2
> --- /dev/null
> +++ b/hw/xen_acpi_piix4.c
> @@ -0,0 +1,405 @@
> + /*
> + * PIIX4 ACPI controller emulation
> + *
> + * Winston liwen Wang, winston.l.wang@intel.com
> + * Copyright (c) 2006 , Intel Corporation.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "hw.h"
> +#include "pc.h"
> +#include "pci.h"
> +#include "sysemu.h"
> +#include "acpi.h"
> +
> +#include "xen_backend.h"
> +#include "xen_common.h"
> +#include "qemu-log.h"
> +
> +#include <xen/hvm/ioreq.h>
> +#include <xen/hvm/params.h>
> +
> +#define PIIX4ACPI_LOG_ERROR 0
> +#define PIIX4ACPI_LOG_INFO 1
> +#define PIIX4ACPI_LOG_DEBUG 2
> +#define PIIX4ACPI_LOGLEVEL PIIX4ACPI_LOG_INFO
> +#define PIIX4ACPI_LOG(level, fmt, ...) do { if (level <= PIIX4ACPI_LOGLEVEL) qemu_log(fmt, ## __VA_ARGS__); } while (0)
> +
> +/* Sleep state type codes as defined by the \_Sx objects in the DSDT. */
> +/* These must be kept in sync with the DSDT (hvmloader/acpi/dsdt.asl) */
> +#define SLP_TYP_S4 (6 << 10)
> +#define SLP_TYP_S3 (5 << 10)
> +#define SLP_TYP_S5 (7 << 10)
> +
> +#define ACPI_DBG_IO_ADDR 0xb044
> +#define ACPI_PHP_IO_ADDR 0x10c0
> +
> +#define PHP_EVT_ADD 0x0
> +#define PHP_EVT_REMOVE 0x3
> +
> +/* The bit in GPE0_STS/EN to notify the pci hotplug event */
> +#define ACPI_PHP_GPE_BIT 3
> +
> +#define DEVFN_TO_PHP_SLOT_REG(devfn) (devfn >> 1)
> +#define PHP_SLOT_REG_TO_DEVFN(reg, hilo) ((reg << 1) | hilo)
> +
> +/* ioport to monitor cpu add/remove status */
> +#define PROC_BASE 0xaf00
> +
> +typedef struct PCIAcpiState {

PCIACPIState

> +    PCIDevice dev;
> +    uint16_t pm1_control; /* pm1a_ECNT_BLK */
> +    qemu_irq irq;
> +    qemu_irq cmos_s3;
> +} PCIAcpiState;
> +
> +typedef struct GPEState {
> +    /* GPE0 block */
> +    uint8_t gpe0_sts[ACPI_GPE0_BLK_LEN / 2];
> +    uint8_t gpe0_en[ACPI_GPE0_BLK_LEN / 2];
> +
> +    /* CPU bitmap */
> +    uint8_t cpus_sts[32];
> +
> +    /* SCI IRQ level */
> +    uint8_t sci_asserted;
> +
> +} GPEState;
> +
> +static GPEState gpe_state;
> +
> +static qemu_irq sci_irq;
> +
> +typedef struct AcpiDeviceState AcpiDeviceState;
> +AcpiDeviceState *acpi_device_table;

The above globals and static variable should be eliminated, see
ac4040955b1669f0aac5937f623d6587d5210679.

> +
> +static const VMStateDescription vmstate_acpi = {
> +    .name = "PIIX4 ACPI",
> +    .version_id = 1,
> +    .fields = (VMStateField []) {
> +        VMSTATE_PCI_DEVICE(dev, PCIAcpiState),
> +        VMSTATE_UINT16(pm1_control, PCIAcpiState),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static void acpiPm1Control_writeb(void *opaque, uint32_t addr, uint32_t val)

acpi_pm1_control_writeb().

> +{
> +    PCIAcpiState *s = opaque;
> +    s->pm1_control = (s->pm1_control & 0xff00) | (val & 0xff);
> +}
> +
> +static uint32_t acpiPm1Control_readb(void *opaque, uint32_t addr)

Likewise.

> +{
> +    PCIAcpiState *s = opaque;
> +    /* Mask out the write-only bits */
> +    return (uint8_t)(s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE));

Please add spaces around '|' and break the line. Adding
ACPI_BITMASK_WRITEONLY may help.

> +}
> +
> +static void acpi_shutdown(PCIAcpiState *s, uint32_t val)
> +{
> +    if (!(val & ACPI_BITMASK_SLEEP_ENABLE))
> +        return;

braces

> +
> +    switch (val & ACPI_BITMASK_SLEEP_TYPE) {
> +    case SLP_TYP_S3:
> +        qemu_system_reset();
> +        qemu_irq_raise(s->cmos_s3);
> +        xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 3);
> +        break;
> +    case SLP_TYP_S4:
> +    case SLP_TYP_S5:
> +        qemu_system_shutdown_request();
> +        break;
> +    default:
> +        break;
> +    }
> +}
> +
> +static void acpiPm1ControlP1_writeb(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    PCIAcpiState *s = opaque;
> +
> +    val <<= 8;
> +    s->pm1_control = ((s->pm1_control & 0xff) | val) & ~ACPI_BITMASK_SLEEP_ENABLE;
> +
> +    acpi_shutdown(s, val);
> +}
> +
> +static uint32_t acpiPm1ControlP1_readb(void *opaque, uint32_t addr)
> +{
> +    PCIAcpiState *s = opaque;
> +    /* Mask out the write-only bits */
> +    return (uint8_t)((s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE)) >> 8);
> +}
> +
> +static void acpiPm1Control_writew(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    PCIAcpiState *s = opaque;
> +
> +    s->pm1_control = val & ~ACPI_BITMASK_SLEEP_ENABLE;
> +
> +    acpi_shutdown(s, val);
> +}
> +
> +static uint32_t acpiPm1Control_readw(void *opaque, uint32_t addr)
> +{
> +    PCIAcpiState *s = opaque;
> +    /* Mask out the write-only bits */
> +    return (s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE));
> +}
> +
> +static void acpi_map(PCIDevice *pci_dev, int region_num,
> +                     uint32_t addr, uint32_t size, int type)
> +{
> +    PCIAcpiState *d = (PCIAcpiState *)pci_dev;
> +
> +    /* Byte access */
> +    register_ioport_write(addr + 4, 1, 1, acpiPm1Control_writeb, d);
> +    register_ioport_read(addr + 4, 1, 1, acpiPm1Control_readb, d);
> +    register_ioport_write(addr + 4 + 1, 1, 1, acpiPm1ControlP1_writeb, d);
> +    register_ioport_read(addr + 4 +1, 1, 1, acpiPm1ControlP1_readb, d);
> +
> +    /* Word access */
> +    register_ioport_write(addr + 4, 2, 2, acpiPm1Control_writew, d);
> +    register_ioport_read(addr + 4, 2, 2, acpiPm1Control_readw, d);
> +}
> +
> +static inline int test_bit(uint8_t *map, int bit)
> +{
> +    return ( map[bit / 8] & (1 << (bit % 8)) );
> +}
> +
> +static inline void set_bit(uint8_t *map, int bit)
> +{
> +    map[bit / 8] |= (1 << (bit % 8));
> +}
> +
> +static inline void clear_bit(uint8_t *map, int bit)
> +{
> +    map[bit / 8] &= ~(1 << (bit % 8));
> +}
> +
> +static void acpi_dbg_writel(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "ACPI: DBG: 0x%08x\n", val);
> +    PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "ACPI:debug: write addr=0x%x, val=0x%x.\n", addr, val);
> +}
> +
> +/* GPEx_STS occupy 1st half of the block, while GPEx_EN 2nd half */
> +static uint32_t gpe_sts_read(void *opaque, uint32_t addr)
> +{
> +    GPEState *s = opaque;
> +
> +    return s->gpe0_sts[addr - ACPI_GPE0_BLK_ADDRESS];
> +}
> +
> +/* write 1 to clear specific GPE bits */
> +static void gpe_sts_write(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    GPEState *s = opaque;
> +    int hotplugged = 0;
> +
> +    PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "gpe_sts_write: addr=0x%x, val=0x%x.\n", addr, val);
> +
> +    hotplugged = test_bit(&s->gpe0_sts[0], ACPI_PHP_GPE_BIT);
> +    s->gpe0_sts[addr - ACPI_GPE0_BLK_ADDRESS] &= ~val;
> +    if ( s->sci_asserted &&
> +         hotplugged &&
> +         !test_bit(&s->gpe0_sts[0], ACPI_PHP_GPE_BIT)) {
> +        PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "Clear the GPE0_STS bit for ACPI hotplug & deassert the IRQ.\n");
> +        qemu_irq_lower(sci_irq);
> +    }
> +
> +}
> +
> +static uint32_t gpe_en_read(void *opaque, uint32_t addr)
> +{
> +    GPEState *s = opaque;
> +
> +    return s->gpe0_en[addr - (ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2)];
> +}
> +
> +/* write 0 to clear en bit */
> +static void gpe_en_write(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    GPEState *s = opaque;
> +    int reg_count;
> +
> +    PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "gpe_en_write: addr=0x%x, val=0x%x.\n", addr, val);
> +    reg_count = addr - (ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2);
> +    s->gpe0_en[reg_count] = val;
> +    /* If disable GPE bit right after generating SCI on it,
> +     * need deassert the intr to avoid redundant intrs
> +     */
> +    if ( s->sci_asserted &&
> +         reg_count == (ACPI_PHP_GPE_BIT / 8) &&
> +         !(val & (1 << (ACPI_PHP_GPE_BIT % 8))) ) {
> +        PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "deassert due to disable GPE bit.\n");
> +        s->sci_asserted = 0;
> +        qemu_irq_lower(sci_irq);
> +    }
> +
> +}
> +
> +static const VMStateDescription vmstate_gpe = {
> +    .name = "gpe",
> +    .version_id = 2,
> +    .minimum_version_id = 2,
> +    .minimum_version_id_old = 2,
> +    .fields = (VMStateField []) {
> +        VMSTATE_BUFFER(gpe0_sts, GPEState),
> +        VMSTATE_BUFFER(gpe0_en, GPEState),
> +        VMSTATE_UINT8(sci_asserted, GPEState),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static uint32_t gpe_cpus_readb(void *opaque, uint32_t addr)
> +{
> +    uint32_t val = 0;
> +    GPEState *g = opaque;
> +
> +    switch (addr) {
> +    case PROC_BASE ... PROC_BASE+31:
> +        val = g->cpus_sts[addr - PROC_BASE];

break;

> +    default:
> +        break;
> +    }
> +
> +    return val;
> +}
> +
> +static void gpe_cpus_writeb(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    /* GPEState *g = opaque; */
> +
> +    switch (addr) {
> +    case PROC_BASE ... PROC_BASE + 31:
> +        /* don't allow to change cpus_sts from inside a guest */
> +        break;
> +    default:
> +        break;
> +    }
> +}
> +
> +static void gpe_acpi_init(void)
> +{
> +    GPEState *s = &gpe_state;
> +    memset(s, 0, sizeof(GPEState));
> +
> +    s->cpus_sts[0] = 1;
> +
> +    register_ioport_read(PROC_BASE, 32, 1, gpe_cpus_readb, s);
> +    register_ioport_write(PROC_BASE, 32, 1, gpe_cpus_writeb, s);
> +
> +    register_ioport_read(ACPI_GPE0_BLK_ADDRESS,
> +                         ACPI_GPE0_BLK_LEN / 2,
> +                         1,
> +                         gpe_sts_read,
> +                         s);
> +    register_ioport_read(ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2,
> +                         ACPI_GPE0_BLK_LEN / 2,
> +                         1,
> +                         gpe_en_read,
> +                         s);
> +
> +    register_ioport_write(ACPI_GPE0_BLK_ADDRESS,
> +                          ACPI_GPE0_BLK_LEN / 2,
> +                          1,
> +                          gpe_sts_write,
> +                          s);
> +    register_ioport_write(ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2,
> +                          ACPI_GPE0_BLK_LEN / 2,
> +                          1,
> +                          gpe_en_write,
> +                          s);
> +
> +    vmstate_register(NULL, 0, &vmstate_gpe, s);
> +}
> +
> +static int piix4_pm_xen_initfn(PCIDevice *dev)
> +{
> +    PCIAcpiState *s = DO_UPCAST(PCIAcpiState, dev, dev);
> +    uint8_t *pci_conf;
> +
> +    pci_conf = s->dev.config;
> +    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL);
> +    pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_INTEL_82371AB_3);
> +    pci_conf[0x08] = 0x01; /* B0 stepping */
> +    pci_conf[0x09] = 0x00; /* base class */
> +    pci_config_set_class(pci_conf, PCI_CLASS_BRIDGE_OTHER);
> +    pci_conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL; /* header_type */
> +    pci_conf[0x3d] = 0x01; /* Hardwired to PIRQA is used */
> +
> +    /* PMBA POWER MANAGEMENT BASE ADDRESS, hardcoded to 0x1f40
> +     * to make shutdown work for IPF, due to IPF Guest Firmware
> +     * will enumerate pci devices.
> +     *
> +     * TODO: if Guest Firmware or Guest OS will change this PMBA,
> +     * More logic will be added.
> +     */
> +    pci_conf[0x40] = 0x41; /* Special device-specific BAR at 0x40 */
> +    pci_conf[0x41] = 0x1f;
> +    pci_conf[0x42] = 0x00;
> +    pci_conf[0x43] = 0x00;
> +
> +    s->pm1_control = ACPI_BITMASK_SCI_ENABLE;

Please extract this line to a reset function.
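The review comments on the PM1 control handlers (a single ACPI_BITMASK_WRITEONLY mask, braces, a reset function) can be illustrated with a small standalone model. This is only a sketch, not the patch author's code: the PM1Sketch type and the pm1_* helpers are hypothetical, the mask values are assumed from the ACPI PM1 control register layout (SCI_EN bit 0, GBL_RLS bit 2, SLP_TYP bits 10-12, SLP_EN bit 13), and the QEMU and Xen calls are replaced by a return code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, following the ACPI PM1 control register layout. */
#define ACPI_BITMASK_SCI_ENABLE          0x0001u
#define ACPI_BITMASK_GLOBAL_LOCK_RELEASE 0x0004u
#define ACPI_BITMASK_SLEEP_TYPE          0x1C00u
#define ACPI_BITMASK_SLEEP_ENABLE        0x2000u

/* The combined mask suggested in the review. */
#define ACPI_BITMASK_WRITEONLY \
    (ACPI_BITMASK_GLOBAL_LOCK_RELEASE | ACPI_BITMASK_SLEEP_ENABLE)

/* Sleep type codes as quoted from the patch. */
#define SLP_TYP_S3 (5u << 10)
#define SLP_TYP_S4 (6u << 10)
#define SLP_TYP_S5 (7u << 10)

/* Hypothetical stand-in for the device state; only the register is modelled. */
typedef struct PM1Sketch {
    uint16_t pm1_control;
} PM1Sketch;

/* Reset value kept in one place, as the review asks. */
static void pm1_reset(PM1Sketch *s)
{
    s->pm1_control = ACPI_BITMASK_SCI_ENABLE;
}

/* Returns 3, 4 or 5 when the written value requests that sleep state, else 0. */
static int pm1_sleep_request(uint32_t val)
{
    if (!(val & ACPI_BITMASK_SLEEP_ENABLE)) {
        return 0;
    }
    switch (val & ACPI_BITMASK_SLEEP_TYPE) {
    case SLP_TYP_S3:
        return 3;
    case SLP_TYP_S4:
        return 4;
    case SLP_TYP_S5:
        return 5;
    default:
        return 0;
    }
}

/* Word write: SLP_EN is never stored, it only triggers the request. */
static int pm1_writew(PM1Sketch *s, uint32_t val)
{
    s->pm1_control = (uint16_t)(val & ~ACPI_BITMASK_SLEEP_ENABLE);
    return pm1_sleep_request(val);
}

/* Word read with the write-only bits masked out. */
static uint32_t pm1_readw(const PM1Sketch *s)
{
    return s->pm1_control & ~ACPI_BITMASK_WRITEONLY;
}

/* Byte read: byte 0 is the low half of the register, byte 1 the high half. */
static uint32_t pm1_readb(const PM1Sketch *s, int byte)
{
    assert(byte == 0 || byte == 1);
    return (pm1_readw(s) >> (8 * byte)) & 0xff;
}
```

In the real device the value returned by pm1_writew would be replaced by the shutdown or S3 handling shown in the quoted acpi_shutdown().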
Anthony PERARD
2010-Sep-20  16:10 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 04/12] xen: Add the Xen platform pci device
On Fri, 17 Sep 2010, Blue Swirl wrote:

> On Fri, Sep 17, 2010 at 11:14 AM, <anthony.perard@citrix.com> wrote:
> > From: Anthony PERARD <anthony.perard@citrix.com>
> >
> > Introduce a new emulated PCI device, specific to fully virtualized Xen
> > guests. The device is necessary for PV on HVM drivers to work.
> >
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  Makefile.target     |    1 +
> >  hw/hw.h             |    3 +
> >  hw/pci_ids.h        |    2 +
> >  hw/xen_machine_fv.c |    3 +
> >  hw/xen_platform.c   |  455 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/xen_platform.h   |    8 +
> >  6 files changed, 472 insertions(+), 0 deletions(-)
> >  create mode 100644 hw/xen_platform.c
> >  create mode 100644 hw/xen_platform.h

[...]

> > +/* We throttle access to dom0 syslog, to avoid DOS attacks. This is
> > +   modelled as a token bucket, with one token for every byte of log.
> > +   The bucket size is 128KB (->1024 lines of 128 bytes each) and
> > +   refills at 256B/s. It starts full. The guest is blocked if no
> > +   tokens are available when it tries to generate a log message. */
> > +#define BUCKET_MAX_SIZE (128*1024)
> > +#define BUCKET_FILL_RATE 256
> > +
> > +static void throttle(PCIXenPlatformState *s, unsigned count)
> > +{
> > +    static unsigned available;
> > +    static struct timespec last_refil;
>
> last_refill
>
> > +    static int started;
> > +    static int warned;
> > +
> > +    struct timespec waiting_for, now;
> > +    double delay;
> > +    struct timespec ts;
> > +
> > +    if (s->throttling_disabled)
> > +        return;
>
> Braces should be added here and other places.
>
> > +
> > +    if (!started) {
> > +        clock_gettime(CLOCK_MONOTONIC, &last_refil);
> > +        available = BUCKET_MAX_SIZE;
> > +        started = 1;
> > +    }
> > +
> > +    if (count > BUCKET_MAX_SIZE) {
> > +        DPRINTF("tried to get %d tokens, but bucket size is %d\n",
>
> count is unsigned, so %u.
>
> > +                BUCKET_MAX_SIZE, count);
> > +        exit(1);
> > +    }
> > +
> > +    if (available < count) {
> > +        /* The bucket is empty. Refil it */
> > +
> > +        /* When will it be full enough to handle this request? */
> > +        delay = (double)(count - available) / BUCKET_FILL_RATE;
> > +        waiting_for = last_refil;
> > +        waiting_for.tv_sec += delay;
> > +        waiting_for.tv_nsec += (delay - (int)delay) * 1e9;
> > +        if (waiting_for.tv_nsec >= 1000000000) {
> > +            waiting_for.tv_nsec -= 1000000000;
> > +            waiting_for.tv_sec++;
> > +        }
> > +
> > +        /* How long do we have to wait? (might be negative) */
> > +        clock_gettime(CLOCK_MONOTONIC, &now);
> > +        ts.tv_sec = waiting_for.tv_sec - now.tv_sec;
> > +        ts.tv_nsec = waiting_for.tv_nsec - now.tv_nsec;
> > +        if (ts.tv_nsec < 0) {
> > +            ts.tv_sec--;
> > +            ts.tv_nsec += 1000000000;
> > +        }
> > +
> > +        /* Wait for it. */
> > +        if (ts.tv_sec > 0 ||
> > +            (ts.tv_sec == 0 && ts.tv_nsec > 0)) {
> > +            if (!warned) {
> > +                DPRINTF("throttling guest access to syslog");
> > +                warned = 1;
> > +            }
> > +            while (nanosleep(&ts, &ts) < 0 && errno == EINTR)
> > +                ;
>
> braces
>
> > +        }
> > +
> > +        /* Refil */
>
> Refill
>
> > +        clock_gettime(CLOCK_MONOTONIC, &now);
> > +        delay = (now.tv_sec - last_refil.tv_sec) +
> > +            (now.tv_nsec - last_refil.tv_nsec) * 1.0e-9;
> > +        available += BUCKET_FILL_RATE * delay;
>
> We have muldiv64(), perhaps it could be used here?

Ok, I use it, and I also use qemu_get_clock_ns(rt_clock).

> > +        if (available > BUCKET_MAX_SIZE)
> > +            available = BUCKET_MAX_SIZE;
> > +        last_refil = now;
> > +    }
> > +
> > +    assert(available >= count);
>
> Is it possible to trigger this from the guest?

I don't think we can do that for every guest.

> > +
> > +    available -= count;
> > +}
> > +

[...]

> > +static uint32_t platform_mmio_read(void *opaque, target_phys_addr_t addr)
> > +{
> > +    static int warnings = 0;
> > +
> > +    if (warnings < 5) {
> > +        DPRINTF("Warning: attempted read from physical address "
> > +                "0x" TARGET_FMT_plx " in xen platform mmio space\n", addr);
> > +        warnings++;
>
> Since DPRINTF only works in a specially compiled version, I'd remove
> these checks. There could also be additional debug flags besides
> PLATFORM_DEBUG to enable these warnings if these are too noisy, like
> DEBUG_MMIO. I'd rename PLATFORM_DEBUG to DEBUG_PLATFORM.

I will fix the coding style issue.

Thanks,

-- 
Anthony PERARD
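The token bucket discussed above can be written with integer arithmetic only, which is roughly what the muldiv64() suggestion is aiming at. The following is a minimal sketch under stated assumptions, not the follow-up patch: TokenBucket and the bucket_* helpers are hypothetical names, time is passed in as a nanosecond count instead of being read from a QEMU clock, and a fixed nanoseconds-per-token constant replaces the floating-point delay (it divides evenly because the fill rate is 256 tokens per second).

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_MAX_SIZE  (128 * 1024)   /* tokens, one per byte of log */
#define BUCKET_FILL_RATE 256            /* tokens per second */
#define NS_PER_TOKEN     (1000000000ULL / BUCKET_FILL_RATE)

/* Hypothetical bucket state; the patch keeps this in static locals. */
typedef struct TokenBucket {
    uint64_t available;       /* tokens currently in the bucket */
    uint64_t last_refill_ns;  /* time up to which tokens were credited */
} TokenBucket;

/* The bucket starts full. */
static void bucket_init(TokenBucket *b, uint64_t now_ns)
{
    b->available = BUCKET_MAX_SIZE;
    b->last_refill_ns = now_ns;
}

/* Credit whole tokens earned since the last refill; the fractional
 * remainder stays accounted for because last_refill_ns only advances
 * by the time those whole tokens took to accrue. */
static void bucket_refill(TokenBucket *b, uint64_t now_ns)
{
    uint64_t earned = (now_ns - b->last_refill_ns) / NS_PER_TOKEN;

    if (earned >= BUCKET_MAX_SIZE - b->available) {
        b->available = BUCKET_MAX_SIZE;
        b->last_refill_ns = now_ns;
    } else {
        b->available += earned;
        b->last_refill_ns += earned * NS_PER_TOKEN;
    }
}

/* Try to take count tokens. Returns 0 and consumes them on success,
 * otherwise returns how many nanoseconds the caller should wait. */
static uint64_t bucket_take(TokenBucket *b, uint64_t now_ns, uint64_t count)
{
    assert(count <= BUCKET_MAX_SIZE);
    bucket_refill(b, now_ns);
    if (b->available >= count) {
        b->available -= count;
        return 0;
    }
    return (count - b->available) * NS_PER_TOKEN
           - (now_ns - b->last_refill_ns);
}
```

Returning the wait time rather than sleeping inside the helper keeps the arithmetic testable; a caller would sleep for the returned duration and retry.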
Anthony PERARD
2010-Sep-20  16:43 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 05/12] piix_pci: Introduces Xen specific call for irq.
On Fri, 17 Sep 2010, Blue Swirl wrote:

> On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:
> > From: Anthony PERARD <anthony.perard@citrix.com>
> >
> > This patch introduces Xen specific call in piix_pci.
> >
> > The specific part for Xen is in write_config, set_irq and get_pirq.
> >
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  hw/piix_pci.c |   10 +++++++++-
> >  hw/xen.h      |    6 ++++++
> >  xen-all.c     |   29 +++++++++++++++++++++++++++++
> >  xen-stub.c    |   13 +++++++++++++
> >  4 files changed, 57 insertions(+), 1 deletions(-)

[...]

> > diff --git a/xen-all.c b/xen-all.c
> > index f505563..948e439 100644
> > --- a/xen-all.c
> > +++ b/xen-all.c
> > @@ -8,9 +8,38 @@
> >
> >  #include "config.h"
> >
> > +#include "hw/pci.h"
> >  #include "hw/xen_common.h"
> >  #include "hw/xen_backend.h"
> >
> > +/* Xen specific function for piix pci */
> > +
> > +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> > +{
> > +    return irq_num + ((pci_dev->devfn >> 3) << 2);
> > +}
> > +
> > +void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> > +{
> > +    xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
> > +                              irq_num & 3, level);
> > +}
> > +
> > +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
>
> address should be target_phys_addr_t.

I use the same type as for PCIConfigWriteFunc, and address is uint32_t.
But I can change if it's necessary.

> > +{
> > +    int i;
> > +
> > +    /* Scan for updates to PCI link routes (0x60-0x63). */
> > +    for (i = 0; i < len; i++) {
> > +        uint8_t v = (val >> (8*i)) & 0xff;
>
> Please add spaces around '*'.
>
> > +        if (v & 0x80)
>
> braces
>
> > +            v = 0;
> > +        v &= 0xf;
> > +        if (((address+i) >= 0x60) && ((address+i) <= 0x63))
>
> Braces and spaces around '+'.
>
> > +            xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v);
> > +    }
> > +}
> > +
> >  /* Initialise Xen */
> >
> >  int xen_init(int smp_cpus)
> > diff --git a/xen-stub.c b/xen-stub.c
> > index 0fa9c51..07e64bc 100644
> > --- a/xen-stub.c
> > +++ b/xen-stub.c
> > @@ -11,6 +11,19 @@
> >  #include "qemu-common.h"
> >  #include "hw/xen.h"
> >
> > +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> > +{
> > +    return -1;
> > +}
> > +
> > +void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> > +{
> > +}
> > +
> > +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
>
> Also here the address should be target_phys_addr_t.

-- 
Anthony PERARD
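The interrupt arithmetic in the quoted xen-all.c hunk packs the PCI device number and the INTx pin into one index, and the config-write hook scans each written byte for link route registers. A standalone sketch of just that arithmetic follows, with the review's brace and spacing comments applied. The helper names are hypothetical and the libxc calls are replaced by writes into a plain array; treating bit 7 as "routing disabled" follows the quoted code, which zeroes the route when that bit is set.

```c
#include <assert.h>
#include <stdint.h>

/* Pack device number (devfn >> 3) and INTx pin (0..3) into one index. */
static int slot_get_pirq(uint8_t devfn, int intx)
{
    assert(intx >= 0 && intx <= 3);
    return intx + ((devfn >> 3) << 2);
}

/* Inverse mapping, as used when the level change is forwarded. */
static int pirq_to_device(int pirq)
{
    return pirq >> 2;
}

static int pirq_to_intx(int pirq)
{
    return pirq & 3;
}

/* Scan a config-space write of len bytes starting at address for the
 * four link route registers (0x60-0x63). routes[] stands in for the
 * hypervisor call; the return value is the number of links updated. */
static int decode_link_routes(uint32_t address, uint32_t val, int len,
                              uint8_t routes[4])
{
    int i;
    int updated = 0;

    for (i = 0; i < len; i++) {
        uint8_t v = (val >> (8 * i)) & 0xff;

        if (v & 0x80) {
            v = 0;              /* bit 7 set: treat the link as unrouted */
        }
        v &= 0xf;
        if (address + i >= 0x60 && address + i <= 0x63) {
            routes[address + i - 0x60] = v;
            updated++;
        }
    }
    return updated;
}
```

For example, slot 3, function 0 has devfn 0x18, so its INTB (pin 1) maps to index 13, which decodes back to device 3, pin 1.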
Anthony PERARD
2010-Sep-21  10:42 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 07/12] xen: Introduce the Xen mapcache
On Fri, 17 Sep 2010, Blue Swirl wrote:

> On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:
> > From: Anthony PERARD <anthony.perard@citrix.com>
> >
> > The mapcache maps chucks of guest memory on demand, unmaps them when
> > they are not needed anymore.
> >
> > Each call to qemu_get_ram_ptr makes a call to qemu_map_cache with the
> > lock option, so mapcache will not unmap these ram_ptr.
> >
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> > ---
> >  Makefile.target |    2 +-
> >  exec.c          |   36 ++++++-
> >  hw/xen.h        |    4 +
> >  xen-all.c       |   63 ++++++++++++
> >  xen-stub.c      |    4 +
> >  xen_mapcache.c  |  302 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  xen_mapcache.h  |   26 +++++
> >  7 files changed, 432 insertions(+), 5 deletions(-)
> >  create mode 100644 xen_mapcache.c
> >  create mode 100644 xen_mapcache.h
> >

[...]

> > +        while (j > 0) {
> > +            word = (word << 1) | !err[i + --j];
>
> You are mixing bitwise OR with logical NOT, is this correct?

Yes, this is correct.

> > +        }
> > +        entry->valid_mapping[i / BITS_PER_LONG] = word;
> > +    }
> > +
> > +    qemu_free(pfns);
> > +    qemu_free(err);
> > +}
> > +
> > +uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, uint8_t lock)
> > +{
> > +    MapCacheEntry *entry, *pentry = NULL;
> > +    unsigned long address_index = phys_addr >> MCACHE_BUCKET_SHIFT;
> > +    unsigned long address_offset = phys_addr & (MCACHE_BUCKET_SIZE-1);
>
> unsigned long will not be long enough on 32 bit host (or 32 bit user
> space) for a 64 bit target. I can't remember if this was a supported
> case for Xen anyway.

Xen can do that, so I change unsigned long to target_phys_addr_t.

[...]

> > diff --git a/xen_mapcache.h b/xen_mapcache.h
> > new file mode 100644
> > index 0000000..5a6730f
> > --- /dev/null
> > +++ b/xen_mapcache.h
> > @@ -0,0 +1,26 @@
> > +#ifndef XEN_MAPCACHE_H
> > +#define XEN_MAPCACHE_H
> > +
> > +#if (defined(__i386__) || defined(__x86_64__))
> > +# define MAPCACHE
>
> xen_mapcache.c could be split into two files, xen-mapcache-stub.c and
> xen-mapcache.c. configure could perform the check for i386 or x86_64
> host and define CONFIG_XEN_MAPCACHE=y appropriately. Then
> Makefile.target would compile the correct file based on that.

Ok, I will do that.

-- 
Anthony PERARD
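Two points from this exchange can be checked in isolation: why `(word << 1) | !err[...]` is a valid way to build a validity bitmap, and why the bucket index and offset need a 64-bit type. The sketch below is an illustration under assumptions, not the mapcache code itself: build_valid_bitmap and the bucket_* helpers are hypothetical, only the inner while loop is taken from the quoted patch, and the MCACHE_BUCKET_SHIFT value of 20 is an assumption because the real value is not shown in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Assumed bucket size for illustration only. */
#define MCACHE_BUCKET_SHIFT 20
#define MCACHE_BUCKET_SIZE  (1ULL << MCACHE_BUCKET_SHIFT)

/* Build a bitmap with bit k of each word set when page i + k mapped
 * without error. Logical NOT yields exactly 0 or 1, so OR-ing it into
 * the shifted word appends one bit per page. */
static void build_valid_bitmap(const int *err, size_t nb_pfn,
                               unsigned long *valid)
{
    size_t i, j;

    for (i = 0; i < nb_pfn; i += BITS_PER_LONG) {
        unsigned long word = 0;

        j = (nb_pfn - i < BITS_PER_LONG) ? nb_pfn - i : BITS_PER_LONG;
        while (j > 0) {
            word = (word << 1) | (unsigned long)!err[i + --j];
        }
        valid[i / BITS_PER_LONG] = word;
    }
}

/* Index and offset computed in 64 bits, so guest-physical addresses
 * above 4 GiB survive even where unsigned long is 32 bits wide. */
static uint64_t bucket_index(uint64_t phys_addr)
{
    return phys_addr >> MCACHE_BUCKET_SHIFT;
}

static uint64_t bucket_offset(uint64_t phys_addr)
{
    return phys_addr & (MCACHE_BUCKET_SIZE - 1);
}
```

Because the loop walks j downward, the page with the lowest index ends up in bit 0 of the word.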
Anthony PERARD
2010-Sep-21  11:18 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 03/12] xen: Introduce --enable-xen command options.
On Fri, 17 Sep 2010, Alexander Graf wrote:

> On 17.09.2010, at 13:14, Anthony.Perard@citrix.com wrote:
>
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index a0b5ae9..457ca32 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -1904,6 +1904,15 @@ Enable KVM full virtualization support. This option is only available
> >  if KVM support is enabled when compiling.
> >  ETEXI
> >
> > +DEF("enable-xen", 0, QEMU_OPTION_enable_xen, \
> > +    "-enable-xen enable Xen full virtualization support\n", QEMU_ARCH_ALL)
>
> This is probably a good point in time to switch to something a bit more
> sophisticated. I was thinking of
>
>   qemu -accel xen,kvm,tcg
>
> which would first try to enable xen support, then kvm support and fall
> back to tcg if none is available. The default would be pretty much the
> line above.
>
> That way we could finally get rid of all those -enable-kvm and
> -enable-whatever switches. We would still need to keep backwards compat
> for -enable-kvm by mapping it to "-accel kvm" internally. But in the
> long run an -accel parameter just makes so much more sense.

Ok, I will add this option, but the default one will be -accel tcg
because qemu with xen can't run without a tool stack at this moment and
the default behavior of qemu with kvm is to run with tcg.

Thanks,

-- 
Anthony PERARD
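The proposed "-accel xen,kvm,tcg" semantics (try each name in order, use the first accelerator that is available) could be sketched as below. This is a hypothetical illustration of the fallback logic only; the Accel table, its availability flag and pick_accel are invented for the example and do not correspond to any existing QEMU interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: a name plus whether it can be used here. */
typedef struct Accel {
    const char *name;
    int available;
} Accel;

/* Walk a comma-separated preference list and return the name of the
 * first entry that is both known and available, or NULL if none is. */
static const char *pick_accel(const char *list, const Accel *table, size_t n)
{
    const char *p = list;

    assert(list != NULL);
    while (*p) {
        size_t len = strcspn(p, ",");   /* length of the current item */
        size_t k;

        for (k = 0; k < n; k++) {
            if (strlen(table[k].name) == len &&
                strncmp(p, table[k].name, len) == 0 &&
                table[k].available) {
                return table[k].name;
            }
        }
        p += len;
        if (*p == ',') {
            p++;                        /* skip the separator */
        }
    }
    return NULL;
}
```

With this shape, the default discussed in the reply amounts to passing "tcg" as the list, and a backwards-compatible -enable-kvm would pass "kvm".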
Anthony PERARD
2010-Sep-21  11:41 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 08/12] Intruduce qemu_ram_ptr_unlock.
On Fri, 17 Sep 2010, Blue Swirl wrote:

> On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:
> > From: Anthony PERARD <anthony.perard@citrix.com>
> >
> > This function allows to unlock a ram_ptr give by qemu_get_ram_ptr. After
> > a call to qemu_ram_ptr_unlock, the pointer may be unmap from QEMU when
> > used with Xen.
> >
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> > ---
> >  cpu-common.h   |    1 +
> >  exec.c         |   29 ++++++++++++++++++++++++++---
> >  xen_mapcache.c |   34 ++++++++++++++++++++++++++++++++++
> >  xen_mapcache.h |    1 +
> >  4 files changed, 62 insertions(+), 3 deletions(-)
> >
> > diff --git a/cpu-common.h b/cpu-common.h
> > index 0426bc8..378eea8 100644
> > --- a/cpu-common.h
> > +++ b/cpu-common.h
> > @@ -46,6 +46,7 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size);
> >  void qemu_ram_free(ram_addr_t addr);
> >  /* This should only be used for ram local to a device. */
> >  void *qemu_get_ram_ptr(ram_addr_t addr);
> > +void qemu_ram_ptr_unlock(void *addr);
> >  /* This should not be used by devices. */
> >  ram_addr_t qemu_ram_addr_from_host(void *ptr);
> >
> > diff --git a/exec.c b/exec.c
> > index f5888eb..659db50 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -2959,6 +2959,13 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
> >      return NULL;
> >  }
> >
> > +void qemu_ram_ptr_unlock(void *addr)
> > +{
> > +    if (xen_enabled()) {
> > +        qemu_map_cache_unlock(addr);
>
> I think there may be linkage problems without CONFIG_XEN, so there
> should be a stub for qemu_map_cache_unlock().
>
> > +    }
> > +}
> > +
> >  /* Some of the softmmu routines need to translate from a host pointer
> >     (typically a TLB entry) back to a ram offset. */
> >  ram_addr_t qemu_ram_addr_from_host(void *ptr)
> > @@ -3064,6 +3071,7 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
> >                                  uint32_t val)
> >  {
> >      int dirty_flags;
> > +    void *vaddr;
> >      dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
> >      if (!(dirty_flags & CODE_DIRTY_FLAG)) {
> >  #if !defined(CONFIG_USER_ONLY)
> > @@ -3071,19 +3079,21 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
> >          dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
> >  #endif
> >      }
> > -    stb_p(qemu_get_ram_ptr(ram_addr), val);
> > +    stb_p(vaddr = qemu_get_ram_ptr(ram_addr), val);
>
> Perhaps 'vaddr = ...' should be put on a separate line.

Ok, will do.

Thanks,

-- 
Anthony PERARD
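The lock and unlock pairing behind qemu_ram_ptr_unlock can be pictured with a lock count: taking a pointer pins the mapping, unlocking releases it, and the mapping may only be dropped once nothing holds it. The sketch below is an assumption about one way to model this, not the mapcache implementation; the Mapping type and the mapping_* helpers are hypothetical. It also shows the assignment on its own line, as requested in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapped block with a count of outstanding pointers. */
typedef struct Mapping {
    uint8_t data[16];
    unsigned lock_count;
} Mapping;

/* Hand out a pointer into the block and pin the mapping. */
static void *mapping_get_ptr(Mapping *m, size_t offset)
{
    assert(offset < sizeof(m->data));
    m->lock_count++;
    return &m->data[offset];
}

/* Release a pointer previously returned by mapping_get_ptr(). */
static void mapping_unlock(Mapping *m, void *ptr)
{
    uint8_t *p = ptr;

    assert(p >= m->data && p < m->data + sizeof(m->data));
    assert(m->lock_count > 0);
    m->lock_count--;
}

/* The cache may only drop the mapping when nobody holds a pointer. */
static int mapping_can_unmap(const Mapping *m)
{
    return m->lock_count == 0;
}

/* Store one byte with the get/use/unlock sequence on separate lines. */
static void store_byte(Mapping *m, size_t offset, uint8_t val)
{
    uint8_t *vaddr;

    vaddr = mapping_get_ptr(m, offset);
    *vaddr = val;
    mapping_unlock(m, vaddr);
}
```

In a build without Xen support, the unlock helper would reduce to an empty stub so that callers still link, which is the concern raised about qemu_map_cache_unlock().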
Anthony PERARD
2010-Sep-21  12:19 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 12/12] xen: Add a Xen specific ACPI Implementation to target-xen
On Fri, 17 Sep 2010, Blue Swirl wrote:> On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote: > > From: Anthony PERARD <anthony.perard@citrix.com> > > > > Xen currently uses a different BIOS (hvmloader + rombios) therefore the > > Qemu acpi_piix4 implementation wouldn''t work correctly with Xen. > > We plan on fixing this properly but at the moment we are just adding a > > new Xen specific acpi_piix4 implementation. > > This patch is optional; without it the VM boots but it cannot shutdown > > properly or go to S3. > > > > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > Makefile.target | 1 + > > hw/xen_acpi_piix4.c | 405 +++++++++++++++++++++++++++++++++++++++++++++++++++ > > hw/xen_common.h | 3 + > > hw/xen_machine_fv.c | 6 +- > > 4 files changed, 410 insertions(+), 5 deletions(-) > > create mode 100644 hw/xen_acpi_piix4.c > > > > diff --git a/Makefile.target b/Makefile.target > > index ea14393..db7f96b 100644 > > --- a/Makefile.target > > +++ b/Makefile.target > > @@ -189,6 +189,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o > > # xen full virtualized machine > > obj-$(CONFIG_XEN) += xen_machine_fv.o > > obj-$(CONFIG_XEN) += xen_platform.o > > +obj-$(CONFIG_XEN) += xen_acpi_piix4.o > > > > # USB layer > > obj-$(CONFIG_USB_OHCI) += usb-ohci.o > > diff --git a/hw/xen_acpi_piix4.c b/hw/xen_acpi_piix4.c > > new file mode 100644 > > index 0000000..f4792f2 > > --- /dev/null > > +++ b/hw/xen_acpi_piix4.c > > @@ -0,0 +1,405 @@ > > + /* > > + * PIIX4 ACPI controller emulation > > + * > > + * Winston liwen Wang, winston.l.wang@intel.com > > + * Copyright (c) 2006 , Intel Corporation. 
> > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a copy > > + * of this software and associated documentation files (the "Software"), to deal > > + * in the Software without restriction, including without limitation the rights > > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > > + * copies of the Software, and to permit persons to whom the Software is > > + * furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice shall be included in > > + * all copies or substantial portions of the Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN > > + * THE SOFTWARE. > > + */ > > + > > +#include "hw.h" > > +#include "pc.h" > > +#include "pci.h" > > +#include "sysemu.h" > > +#include "acpi.h" > > + > > +#include "xen_backend.h" > > +#include "xen_common.h" > > +#include "qemu-log.h" > > + > > +#include <xen/hvm/ioreq.h> > > +#include <xen/hvm/params.h> > > + > > +#define PIIX4ACPI_LOG_ERROR 0 > > +#define PIIX4ACPI_LOG_INFO 1 > > +#define PIIX4ACPI_LOG_DEBUG 2 > > +#define PIIX4ACPI_LOGLEVEL PIIX4ACPI_LOG_INFO > > +#define PIIX4ACPI_LOG(level, fmt, ...) do { if (level <= PIIX4ACPI_LOGLEVEL) qemu_log(fmt, ## __VA_ARGS__); } while (0) > > + > > +/* Sleep state type codes as defined by the \_Sx objects in the DSDT. 
*/ > > +/* These must be kept in sync with the DSDT (hvmloader/acpi/dsdt.asl) */ > > +#define SLP_TYP_S4 (6 << 10) > > +#define SLP_TYP_S3 (5 << 10) > > +#define SLP_TYP_S5 (7 << 10) > > + > > +#define ACPI_DBG_IO_ADDR 0xb044 > > +#define ACPI_PHP_IO_ADDR 0x10c0 > > + > > +#define PHP_EVT_ADD 0x0 > > +#define PHP_EVT_REMOVE 0x3 > > + > > +/* The bit in GPE0_STS/EN to notify the pci hotplug event */ > > +#define ACPI_PHP_GPE_BIT 3 > > + > > +#define DEVFN_TO_PHP_SLOT_REG(devfn) (devfn >> 1) > > +#define PHP_SLOT_REG_TO_DEVFN(reg, hilo) ((reg << 1) | hilo) > > + > > +/* ioport to monitor cpu add/remove status */ > > +#define PROC_BASE 0xaf00 > > + > > +typedef struct PCIAcpiState { > > PCIACPIState > > > + PCIDevice dev; > > + uint16_t pm1_control; /* pm1a_ECNT_BLK */ > > + qemu_irq irq; > > + qemu_irq cmos_s3; > > +} PCIAcpiState; > > + > > +typedef struct GPEState { > > + /* GPE0 block */ > > + uint8_t gpe0_sts[ACPI_GPE0_BLK_LEN / 2]; > > + uint8_t gpe0_en[ACPI_GPE0_BLK_LEN / 2]; > > + > > + /* CPU bitmap */ > > + uint8_t cpus_sts[32]; > > + > > + /* SCI IRQ level */ > > + uint8_t sci_asserted; > > + > > +} GPEState; > > + > > +static GPEState gpe_state; > > + > > +static qemu_irq sci_irq; > > + > > +typedef struct AcpiDeviceState AcpiDeviceState; > > +AcpiDeviceState *acpi_device_table; > > The above globals and static variable should be eliminated, see > ac4040955b1669f0aac5937f623d6587d5210679. > > > + > > +static const VMStateDescription vmstate_acpi = { > > + .name = "PIIX4 ACPI", > > + .version_id = 1, > > + .fields = (VMStateField []) { > > + VMSTATE_PCI_DEVICE(dev, PCIAcpiState), > > + VMSTATE_UINT16(pm1_control, PCIAcpiState), > > + VMSTATE_END_OF_LIST() > > + } > > +}; > > + > > +static void acpiPm1Control_writeb(void *opaque, uint32_t addr, uint32_t val) > > acpi_pm1_control_writeb(). 
> > > +{ > > + PCIAcpiState *s = opaque; > > + s->pm1_control = (s->pm1_control & 0xff00) | (val & 0xff); > > +} > > + > > +static uint32_t acpiPm1Control_readb(void *opaque, uint32_t addr) > > Likewise. > > > +{ > > + PCIAcpiState *s = opaque; > > + /* Mask out the write-only bits */ > > + return (uint8_t)(s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE)); > > Please add spaces around ''|'' and break the line. Adding > ACPI_BITMASK_WRITEONLY may help. > > > +} > > + > > +static void acpi_shutdown(PCIAcpiState *s, uint32_t val) > > +{ > > + if (!(val & ACPI_BITMASK_SLEEP_ENABLE)) > > + return; > > braces > > > + > > + switch (val & ACPI_BITMASK_SLEEP_TYPE) { > > + case SLP_TYP_S3: > > + qemu_system_reset(); > > + qemu_irq_raise(s->cmos_s3); > > + xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 3); > > + break; > > + case SLP_TYP_S4: > > + case SLP_TYP_S5: > > + qemu_system_shutdown_request(); > > + break; > > + default: > > + break; > > + } > > +} > > + > > +static void acpiPm1ControlP1_writeb(void *opaque, uint32_t addr, uint32_t val) > > +{ > > + PCIAcpiState *s = opaque; > > + > > + val <<= 8; > > + s->pm1_control = ((s->pm1_control & 0xff) | val) & ~ACPI_BITMASK_SLEEP_ENABLE; > > + > > + acpi_shutdown(s, val); > > +} > > + > > +static uint32_t acpiPm1ControlP1_readb(void *opaque, uint32_t addr) > > +{ > > + PCIAcpiState *s = opaque; > > + /* Mask out the write-only bits */ > > + return (uint8_t)((s->pm1_control & ~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE)) >> 8); > > +} > > + > > +static void acpiPm1Control_writew(void *opaque, uint32_t addr, uint32_t val) > > +{ > > + PCIAcpiState *s = opaque; > > + > > + s->pm1_control = val & ~ACPI_BITMASK_SLEEP_ENABLE; > > + > > + acpi_shutdown(s, val); > > +} > > + > > +static uint32_t acpiPm1Control_readw(void *opaque, uint32_t addr) > > +{ > > + PCIAcpiState *s = opaque; > > + /* Mask out the write-only bits */ > > + return (s->pm1_control & 
~(ACPI_BITMASK_GLOBAL_LOCK_RELEASE|ACPI_BITMASK_SLEEP_ENABLE)); > > +} > > + > > +static void acpi_map(PCIDevice *pci_dev, int region_num, > > + uint32_t addr, uint32_t size, int type) > > +{ > > + PCIAcpiState *d = (PCIAcpiState *)pci_dev; > > + > > + /* Byte access */ > > + register_ioport_write(addr + 4, 1, 1, acpiPm1Control_writeb, d); > > + register_ioport_read(addr + 4, 1, 1, acpiPm1Control_readb, d); > > + register_ioport_write(addr + 4 + 1, 1, 1, acpiPm1ControlP1_writeb, d); > > + register_ioport_read(addr + 4 +1, 1, 1, acpiPm1ControlP1_readb, d); > > + > > + /* Word access */ > > + register_ioport_write(addr + 4, 2, 2, acpiPm1Control_writew, d); > > + register_ioport_read(addr + 4, 2, 2, acpiPm1Control_readw, d); > > +} > > + > > +static inline int test_bit(uint8_t *map, int bit) > > +{ > > + return ( map[bit / 8] & (1 << (bit % 8)) ); > > +} > > + > > +static inline void set_bit(uint8_t *map, int bit) > > +{ > > + map[bit / 8] |= (1 << (bit % 8)); > > +} > > + > > +static inline void clear_bit(uint8_t *map, int bit) > > +{ > > + map[bit / 8] &= ~(1 << (bit % 8)); > > +} > > + > > +static void acpi_dbg_writel(void *opaque, uint32_t addr, uint32_t val) > > +{ > > + PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "ACPI: DBG: 0x%08x\n", val); > > + PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "ACPI:debug: write addr=0x%x, val=0x%x.\n", addr, val); > > +} > > + > > +/* GPEx_STS occupy 1st half of the block, while GPEx_EN 2nd half */ > > +static uint32_t gpe_sts_read(void *opaque, uint32_t addr) > > +{ > > + GPEState *s = opaque; > > + > > + return s->gpe0_sts[addr - ACPI_GPE0_BLK_ADDRESS]; > > +} > > + > > +/* write 1 to clear specific GPE bits */ > > +static void gpe_sts_write(void *opaque, uint32_t addr, uint32_t val) > > +{ > > + GPEState *s = opaque; > > + int hotplugged = 0; > > + > > + PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "gpe_sts_write: addr=0x%x, val=0x%x.\n", addr, val); > > + > > + hotplugged = test_bit(&s->gpe0_sts[0], ACPI_PHP_GPE_BIT); > > + s->gpe0_sts[addr - 
ACPI_GPE0_BLK_ADDRESS] &= ~val; > > + if ( s->sci_asserted && > > + hotplugged && > > + !test_bit(&s->gpe0_sts[0], ACPI_PHP_GPE_BIT)) { > > + PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "Clear the GPE0_STS bit for ACPI hotplug & deassert the IRQ.\n"); > > + qemu_irq_lower(sci_irq); > > + } > > + > > +} > > + > > +static uint32_t gpe_en_read(void *opaque, uint32_t addr) > > +{ > > + GPEState *s = opaque; > > + > > + return s->gpe0_en[addr - (ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2)]; > > +} > > + > > +/* write 0 to clear en bit */ > > +static void gpe_en_write(void *opaque, uint32_t addr, uint32_t val) > > +{ > > + GPEState *s = opaque; > > + int reg_count; > > + > > + PIIX4ACPI_LOG(PIIX4ACPI_LOG_DEBUG, "gpe_en_write: addr=0x%x, val=0x%x.\n", addr, val); > > + reg_count = addr - (ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2); > > + s->gpe0_en[reg_count] = val; > > + /* If disable GPE bit right after generating SCI on it, > > + * need deassert the intr to avoid redundant intrs > > + */ > > + if ( s->sci_asserted && > > + reg_count == (ACPI_PHP_GPE_BIT / 8) && > > + !(val & (1 << (ACPI_PHP_GPE_BIT % 8))) ) { > > + PIIX4ACPI_LOG(PIIX4ACPI_LOG_INFO, "deassert due to disable GPE bit.\n"); > > + s->sci_asserted = 0; > > + qemu_irq_lower(sci_irq); > > + } > > + > > +} > > + > > +static const VMStateDescription vmstate_gpe = { > > + .name = "gpe", > > + .version_id = 2, > > + .minimum_version_id = 2, > > + .minimum_version_id_old = 2, > > + .fields = (VMStateField []) { > > + VMSTATE_BUFFER(gpe0_sts, GPEState), > > + VMSTATE_BUFFER(gpe0_en, GPEState), > > + VMSTATE_UINT8(sci_asserted, GPEState), > > + VMSTATE_END_OF_LIST() > > + } > > +}; > > + > > +static uint32_t gpe_cpus_readb(void *opaque, uint32_t addr) > > +{ > > + uint32_t val = 0; > > + GPEState *g = opaque; > > + > > + switch (addr) { > > + case PROC_BASE ... 
PROC_BASE+31: > > + val = g->cpus_sts[addr - PROC_BASE]; > > break; > > > + default: > > + break; > > + } > > + > > + return val; > > +} > > + > > +static void gpe_cpus_writeb(void *opaque, uint32_t addr, uint32_t val) > > +{ > > + /* GPEState *g = opaque; */ > > + > > + switch (addr) { > > + case PROC_BASE ... PROC_BASE + 31: > > + /* don''t allow to change cpus_sts from inside a guest */ > > + break; > > + default: > > + break; > > + } > > +} > > + > > +static void gpe_acpi_init(void) > > +{ > > + GPEState *s = &gpe_state; > > + memset(s, 0, sizeof(GPEState)); > > + > > + s->cpus_sts[0] = 1; > > + > > + register_ioport_read(PROC_BASE, 32, 1, gpe_cpus_readb, s); > > + register_ioport_write(PROC_BASE, 32, 1, gpe_cpus_writeb, s); > > + > > + register_ioport_read(ACPI_GPE0_BLK_ADDRESS, > > + ACPI_GPE0_BLK_LEN / 2, > > + 1, > > + gpe_sts_read, > > + s); > > + register_ioport_read(ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2, > > + ACPI_GPE0_BLK_LEN / 2, > > + 1, > > + gpe_en_read, > > + s); > > + > > + register_ioport_write(ACPI_GPE0_BLK_ADDRESS, > > + ACPI_GPE0_BLK_LEN / 2, > > + 1, > > + gpe_sts_write, > > + s); > > + register_ioport_write(ACPI_GPE0_BLK_ADDRESS + ACPI_GPE0_BLK_LEN / 2, > > + ACPI_GPE0_BLK_LEN / 2, > > + 1, > > + gpe_en_write, > > + s); > > + > > + vmstate_register(NULL, 0, &vmstate_gpe, s); > > +} > > + > > +static int piix4_pm_xen_initfn(PCIDevice *dev) > > +{ > > + PCIAcpiState *s = DO_UPCAST(PCIAcpiState, dev, dev); > > + uint8_t *pci_conf; > > + > > + pci_conf = s->dev.config; > > + pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL); > > + pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_INTEL_82371AB_3); > > + pci_conf[0x08] = 0x01; /* B0 stepping */ > > + pci_conf[0x09] = 0x00; /* base class */ > > + pci_config_set_class(pci_conf, PCI_CLASS_BRIDGE_OTHER); > > + pci_conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL; /* header_type */ > > + pci_conf[0x3d] = 0x01; /* Hardwired to PIRQA is used */ > > + > > + /* PMBA POWER MANAGEMENT BASE 
ADDRESS, hardcoded to 0x1f40
> > +     * to make shutdown work for IPF, due to IPF Guest Firmware
> > +     * will enumerate pci devices.
> > +     *
> > +     * TODO: if Guest Firmware or Guest OS will change this PMBA,
> > +     * More logic will be added.
> > +     */
> > +    pci_conf[0x40] = 0x41; /* Special device-specific BAR at 0x40 */
> > +    pci_conf[0x41] = 0x1f;
> > +    pci_conf[0x42] = 0x00;
> > +    pci_conf[0x43] = 0x00;
> > +
> > +    s->pm1_control = ACPI_BITMASK_SCI_ENABLE;
>
> Please extract this line to a reset function.
>

Ok, I will fix all of it.

--
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
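The PM1 control handling reviewed above comes down to two pieces of bit arithmetic: hiding the write-only bits on reads, and decoding SLP_TYP once SLP_EN is set. A minimal standalone sketch of both follows. The SLP_TYP_* values are the ones from the quoted patch; the ACPI_BITMASK_* values are assumptions based on the ACPI PM1 control register layout (SCI_EN bit 0, GBL_RLS bit 2, SLP_TYP bits 10-12, SLP_EN bit 13). ACPI_BITMASK_WRITEONLY is the helper macro suggested in the review, and the two function names are hypothetical.

```c
#include <stdint.h>

/* Assumed values, following the ACPI PM1 control register layout. */
#define ACPI_BITMASK_SCI_ENABLE          0x0001
#define ACPI_BITMASK_GLOBAL_LOCK_RELEASE 0x0004
#define ACPI_BITMASK_SLEEP_TYPE          0x1c00
#define ACPI_BITMASK_SLEEP_ENABLE        0x2000

/* Helper suggested in the review: one name for all write-only bits. */
#define ACPI_BITMASK_WRITEONLY \
    (ACPI_BITMASK_GLOBAL_LOCK_RELEASE | ACPI_BITMASK_SLEEP_ENABLE)

/* Sleep type encodings from the quoted patch (kept in sync with the DSDT). */
#define SLP_TYP_S3 (5 << 10)
#define SLP_TYP_S4 (6 << 10)
#define SLP_TYP_S5 (7 << 10)

/* Hypothetical helper: the value a guest sees on a 16-bit read. */
static uint16_t pm1_control_read(uint16_t pm1_control)
{
    return pm1_control & ~ACPI_BITMASK_WRITEONLY;
}

/*
 * Hypothetical helper: returns 3, 4 or 5 for the sleep state requested
 * by a 16-bit write, or 0 when SLP_EN is clear or SLP_TYP is unknown.
 */
static int pm1_sleep_request(uint16_t val)
{
    if (!(val & ACPI_BITMASK_SLEEP_ENABLE)) {
        return 0;
    }

    switch (val & ACPI_BITMASK_SLEEP_TYPE) {
    case SLP_TYP_S3:
        return 3;
    case SLP_TYP_S4:
        return 4;
    case SLP_TYP_S5:
        return 5;
    default:
        return 0;
    }
}
```

In the patch itself the S3 case resets the machine and raises cmos_s3, while S4 and S5 request a shutdown; the sketch only returns the decoded state so that it stays free of QEMU and libxc dependencies.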
Stefano Stabellini
2010-Sep-22  10:28 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 10/12] xen: Initialize event channels and io rings
On Fri, 17 Sep 2010, Blue Swirl wrote:
> On Fri, Sep 17, 2010 at 11:15 AM, <anthony.perard@citrix.com> wrote:
> > From: Anthony PERARD <anthony.perard@citrix.com>
> >
> > Open and bind event channels; map ioreq and buffered ioreq rings.
>
> In general, because of CPUState accesses and cpu_in/out use, this
> looks like CPU code, specifically x86. Could this belong to
> target-i386/xen.c instead, much like target-i386/kvm.c vs ./kvm-all.c?
> Do other CPU types use this stuff?
>

Even though it might look like CPU code, this code only deals with IO
events from xen on behalf of the guest.
In fact it runs as it is on ia64 AFAIK.
Isaku Yamahata
2010-Sep-24  05:10 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 04/12] xen: Add the Xen platform pci device
On Fri, Sep 17, 2010 at 12:14:59PM +0100, anthony.perard@citrix.com wrote:
> +static int xen_platform_initfn(PCIDevice *dev)
> +{
> +    PCIXenPlatformState *d = DO_UPCAST(PCIXenPlatformState, pci_dev, dev);
> +    uint8_t *pci_conf;
> +
> +    pci_conf = d->pci_dev.config;
> +
> +    pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_XENSOURCE);
> +    pci_config_set_device_id(pci_conf, 0x0001);
> +    pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
> +
> +    pci_config_set_revision(pci_conf, 1);
> +    pci_config_set_prog_interface(pci_conf, 0);
> +
> +    pci_config_set_class(pci_conf, PCI_CLASS_OTHERS << 8 | 0x80);
> +
> +    pci_conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL;

Eliminate this line. Don't overwrite multifunction bit.
Please refer to
498238687fd3a2bf3efb32694732f88ceac72e99
6eab3de16d36c48a983366b09d0a0029a5260bc3

> +    pci_conf[PCI_INTERRUPT_PIN] = 1;
> +
> +    /* Microsoft WHQL requires non-zero subsystem IDs. */
> +    /* http://www.pcisig.com/reflector/msg02205.html. */
> +    pci_set_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID, pci_conf[PCI_VENDOR_ID]);
> +    pci_set_word(pci_conf + PCI_SUBSYSTEM_ID, 0x0001);
> +
> +    pci_register_bar(&d->pci_dev, 0, 0x100,
> +                     PCI_BASE_ADDRESS_SPACE_IO, platform_ioport_map);
> +
> +    /* reserve 16MB mmio address for share memory*/
> +    pci_register_bar(&d->pci_dev, 1, 0x1000000,
> +                     PCI_BASE_ADDRESS_MEM_PREFETCH, platform_mmio_map);
> +
> +    platform_fixed_ioport_init(d);
> +
> +    return 0;
> +}

--
yamahata
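To illustrate the review comment about the multifunction bit: bit 7 of the PCI header type byte marks a multi-function device, so a plain assignment of PCI_HEADER_TYPE_NORMAL clears it if it was already set. The sketch below shows the effect on a bare config-space array. The register offset and bit values follow the PCI specification; both function names are hypothetical, and the fix the reviewer actually asks for is simply to delete the assignment.

```c
#include <stdint.h>

#define PCI_HEADER_TYPE                0x0e /* header type register offset */
#define PCI_HEADER_TYPE_NORMAL         0x00 /* layout field: type 0 header */
#define PCI_HEADER_TYPE_MULTI_FUNCTION 0x80 /* bit 7: multi-function device */

/* Plain assignment, as in the quoted initfn: bit 7 is lost. */
static void header_type_overwrite(uint8_t *conf)
{
    conf[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL;
}

/* Hypothetical alternative: set the layout field only and keep bit 7. */
static void header_type_preserve(uint8_t *conf)
{
    conf[PCI_HEADER_TYPE] =
        (conf[PCI_HEADER_TYPE] & PCI_HEADER_TYPE_MULTI_FUNCTION) |
        PCI_HEADER_TYPE_NORMAL;
}
```

If the multi-function bit was set before the device's init function runs, header_type_overwrite() silently turns the device back into a single-function one, which is the problem the review points at.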
Isaku Yamahata
2010-Sep-24  05:17 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 05/12] piix_pci: Introduces Xen specific call for irq.
On Fri, Sep 17, 2010 at 12:15:00PM +0100, anthony.perard@citrix.com wrote:
> From: Anthony PERARD <anthony.perard@citrix.com>
>
> This patch introduces Xen specific call in piix_pci.
>
> The specific part for Xen is in write_config, set_irq and get_pirq.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  hw/piix_pci.c |   10 +++++++++-
>  hw/xen.h      |    6 ++++++
>  xen-all.c     |   29 +++++++++++++++++++++++++++++
>  xen-stub.c    |   13 +++++++++++++
>  4 files changed, 57 insertions(+), 1 deletions(-)
>
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index f152a0f..41a342f 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -28,6 +28,7 @@
>  #include "pci_host.h"
>  #include "isa.h"
>  #include "sysbus.h"
> +#include "xen.h"
>
>  /*
>   * I440FX chipset data sheet.
> @@ -142,6 +143,9 @@ static void i440fx_write_config(PCIDevice *dev,
>  {
>      PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
>
> +    if (xen_enabled())
> +        xen_piix_pci_write_config_client(address, val, len);
> +
>      /* XXX: implement SMRAM.D_LOCK */
>      pci_default_write_config(dev, address, val, len);
>      if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) ||

Maybe I wasn't clear enough. This dynamic check can also be eliminated.
Something like the following pseudo code.

i440fx_init()
    ...
    if (xen_enabled) {
        d = pci_create_simple(b, 0, "i440FX-xen");
    } else {
        d = pci_create_simple(b, 0, "i440FX");
    }

static PCIDeviceInfo i440fx_info[] = {
    {
        .qdev.name = "i440FX",
        ...
        .config_write = i440fx_write_config,
    },{
        .qdev.name = "i440FX-xen",
        ...
        .config_write = i440fx_write_config_xen,
    },{

i440fx_write_config_xen()
{
    xen_piix_pci_write_config_client();
    i440fx_write_config()
}

> @@ -235,7 +239,11 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *
>      piix3 = DO_UPCAST(PIIX3State, dev,
>                        pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>      piix3->pic = pic;
> -    pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
> +    if (xen_enabled()) {
> +        pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq, piix3, 4);
> +    } else {
> +        pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3, 4);
> +    }
>      (*pi440fx_state)->piix3 = piix3;
>
>      *piix3_devfn = piix3->dev.devfn;
> diff --git a/hw/xen.h b/hw/xen.h
> index 14bbb6e..c5189b1 100644
> --- a/hw/xen.h
> +++ b/hw/xen.h
> @@ -8,6 +8,8 @@
>   */
>  #include <inttypes.h>
>
> +#include "qemu-common.h"
> +
>  /* xen-machine.c */
>  enum xen_mode {
>      XEN_EMULATE = 0,  // xen emulation, using xenner (default)
> @@ -26,6 +28,10 @@ extern int xen_allowed;
>  #define xen_enabled() (0)
>  #endif
>
> +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
> +void xen_piix3_set_irq(void *opaque, int irq_num, int level);
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
> +
>  int xen_init(int smp_cpus);
>
>  #endif /* QEMU_HW_XEN_H */
> diff --git a/xen-all.c b/xen-all.c
> index f505563..948e439 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -8,9 +8,38 @@
>
>  #include "config.h"
>
> +#include "hw/pci.h"
>  #include "hw/xen_common.h"
>  #include "hw/xen_backend.h"
>
> +/* Xen specific function for piix pci */
> +
> +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> +{
> +    return irq_num + ((pci_dev->devfn >> 3) << 2);
> +}
> +
> +void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> +{
> +    xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
> +                              irq_num & 3, level);
> +}
> +
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
> +{
> +    int i;
> +
> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> +    for (i = 0; i < len; i++) {
> +        uint8_t v = (val >> (8*i)) & 0xff;
> +        if (v & 0x80)
> +            v = 0;
> +        v &= 0xf;
> +        if (((address+i) >= 0x60) && ((address+i) <= 0x63))
> +            xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v);
> +    }
> +}
> +
>  /* Initialise Xen */
>
>  int xen_init(int smp_cpus)
> diff --git a/xen-stub.c b/xen-stub.c
> index 0fa9c51..07e64bc 100644
> --- a/xen-stub.c
> +++ b/xen-stub.c
> @@ -11,6 +11,19 @@
>  #include "qemu-common.h"
>  #include "hw/xen.h"
>
> +int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> +{
> +    return -1;
> +}
> +
> +void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> +{
> +}
> +
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
> +{
> +}
> +
>  int xen_init(int smp_cpus)
>  {
>      return -ENOSYS;
> --
> 1.6.5
>
>

--
yamahata
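The xen-all.c hunk above packs the PCI slot and the INTx pin into a single IRQ number and unpacks them again before calling libxc, and it reduces each written config byte to a link-route value. The standalone sketch below reproduces only that arithmetic so the round trip can be checked by hand. The function names are hypothetical and no Xen or QEMU call is made.

```c
#include <stdint.h>

/*
 * Same arithmetic as xen_pci_slot_get_pirq(): devfn is (slot << 3) | function,
 * so the result is slot * 4 + INTx pin index (0..3).
 */
static int xen_irq_encode(uint8_t devfn, int intx)
{
    return intx + ((devfn >> 3) << 2);
}

/* Inverse used by xen_piix3_set_irq() when it calls into libxc. */
static int xen_irq_to_slot(int irq_num)
{
    return irq_num >> 2;
}

static int xen_irq_to_intx(int irq_num)
{
    return irq_num & 3;
}

/*
 * Per-byte transform from xen_piix_pci_write_config_client(): a byte with
 * bit 7 set yields route 0, otherwise the low nibble is the route value.
 */
static uint8_t link_route_value(uint8_t v)
{
    if (v & 0x80) {
        return 0;
    }
    return v & 0xf;
}
```

For example, slot 3 with pin index 1 encodes to 13, which decodes back to slot 3 and pin 1; the function number in the low three bits of devfn does not take part in the encoding.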
Isaku Yamahata
2010-Sep-24  05:52 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 04/12] xen: Add the Xen platform pci device
On Fri, Sep 17, 2010 at 12:14:59PM +0100, anthony.perard@citrix.com wrote:
> +static uint32_t platform_mmio_read(void *opaque, target_phys_addr_t addr)
> +{
> +    static int warnings = 0;
> +
> +    if (warnings < 5) {
> +        DPRINTF("Warning: attempted read from physical address "
> +                "0x" TARGET_FMT_plx " in xen platform mmio space\n", addr);
> +        warnings++;
> +    }
> +    return 0;
> +}
> +
> +static void platform_mmio_write(void *opaque, target_phys_addr_t addr,
> +                                uint32_t val)
> +{
> +    static int warnings = 0;
> +
> +    if (warnings < 5) {
> +        DPRINTF("Warning: attempted write of 0x%x to physical "
> +                "address 0x" TARGET_FMT_plx " in xen platform mmio space\n",
> +                val, addr);
> +        warnings++;
> +    }
> +}
> +
> +static CPUReadMemoryFunc * const platform_mmio_read_funcs[3] = {
> +    platform_mmio_read,
> +    platform_mmio_read,
> +    platform_mmio_read,
> +};
> +
> +static CPUWriteMemoryFunc * const platform_mmio_write_funcs[3] = {
> +    platform_mmio_write,
> +    platform_mmio_write,
> +    platform_mmio_write,
> +};
> +
> +static void platform_mmio_map(PCIDevice *d, int region_num,
> +                              pcibus_t addr, pcibus_t size, int type)
> +{
> +    int mmio_io_addr;
> +
> +    mmio_io_addr = cpu_register_io_memory(platform_mmio_read_funcs,
> +                                          platform_mmio_write_funcs, NULL);
> +
> +    cpu_register_physical_memory(addr, size, mmio_io_addr);
> +}

Please use cpu_register_io_memory_simple().

--
yamahata
Anthony PERARD
2010-Sep-24  13:42 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH RFC V3 05/12] piix_pci: Introduces Xen specific call for irq.
On Fri, 24 Sep 2010, Isaku Yamahata wrote:
> On Fri, Sep 17, 2010 at 12:15:00PM +0100, anthony.perard@citrix.com wrote:
> > From: Anthony PERARD <anthony.perard@citrix.com>
> >
> > This patch introduces Xen specific call in piix_pci.
> >
> > The specific part for Xen is in write_config, set_irq and get_pirq.
> >
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  hw/piix_pci.c |   10 +++++++++-
> >  hw/xen.h      |    6 ++++++
> >  xen-all.c     |   29 +++++++++++++++++++++++++++++
> >  xen-stub.c    |   13 +++++++++++++
> >  4 files changed, 57 insertions(+), 1 deletions(-)
> >
> > diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> > index f152a0f..41a342f 100644
> > --- a/hw/piix_pci.c
> > +++ b/hw/piix_pci.c
> > @@ -142,6 +143,9 @@ static void i440fx_write_config(PCIDevice *dev,
> >  {
> >      PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
> >
> > +    if (xen_enabled())
> > +        xen_piix_pci_write_config_client(address, val, len);
> > +
> >      /* XXX: implement SMRAM.D_LOCK */
> >      pci_default_write_config(dev, address, val, len);
> >      if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) ||
>
> Maybe I wasn't clear enough. This dynamic check can also be eliminated.
> Something like the following pseudo code.
>
> i440fx_init()
>     ...
>     if (xen_enabled) {
>         d = pci_create_simple(b, 0, "i440FX-xen");
>     } else {
>         d = pci_create_simple(b, 0, "i440FX");
>     }
>
>
> static PCIDeviceInfo i440fx_info[] = {
>     {
>         .qdev.name = "i440FX",
>         ...
>         .config_write = i440fx_write_config,
>     },{
>         .qdev.name = "i440FX-xen",
>         ...
>         .config_write = i440fx_write_config_xen,
>     },{
>
>
> i440fx_write_config_xen()
> {
>     xen_piix_pci_write_config_client();
>     i440fx_write_config()
> }

Ok, I will do that. Thanks for this sample of code.

--
Anthony PERARD
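The pattern agreed on here is to register two device variants and pick one when the device is created, so that the config-write path no longer tests xen_enabled() on every access. A self-contained sketch of that pattern follows. It does not use the QEMU qdev API: DeviceInfo, lookup_device(), create_host_bridge() and the two call counters are hypothetical stand-ins, and only the two variant names are taken from the pseudo code.

```c
#include <stddef.h>
#include <string.h>

typedef void (*config_write_fn)(unsigned address, unsigned val, int len);

/* Hypothetical stand-in for a device registration record. */
typedef struct DeviceInfo {
    const char *name;
    config_write_fn config_write;
} DeviceInfo;

static int default_calls; /* counts calls into the common write path */
static int xen_calls;     /* counts calls into the Xen-specific hook */

/* Stand-in for i440fx_write_config(). */
static void write_config(unsigned address, unsigned val, int len)
{
    (void)address; (void)val; (void)len;
    default_calls++;
}

/* Stand-in for xen_piix_pci_write_config_client(). */
static void xen_write_config_client(unsigned address, unsigned val, int len)
{
    (void)address; (void)val; (void)len;
    xen_calls++;
}

/* Xen variant: run the Xen hook, then fall through to the common path. */
static void write_config_xen(unsigned address, unsigned val, int len)
{
    xen_write_config_client(address, val, len);
    write_config(address, val, len);
}

static const DeviceInfo device_table[] = {
    { "i440FX",     write_config },
    { "i440FX-xen", write_config_xen },
};

static const DeviceInfo *lookup_device(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(device_table) / sizeof(device_table[0]); i++) {
        if (strcmp(device_table[i].name, name) == 0) {
            return &device_table[i];
        }
    }
    return NULL;
}

/* The Xen check happens once, at creation time, not on every write. */
static const DeviceInfo *create_host_bridge(int xen_enabled)
{
    return lookup_device(xen_enabled ? "i440FX-xen" : "i440FX");
}
```

With this layout the common write function contains no Xen-specific branch, and a build without Xen support only needs the second table entry to be absent or stubbed out.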