Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 00/13] merge some xen bits into qemu
Hi folks, This is the fourth version of the xen support for qemu patch series, addressing review comments. Also fixed some bugs and made the backend drivers restartable (for boots via pvgrub). New in this submission is the domain building support. The console and framebuffer backend drivers are largely based on the xen code, the other bits are rewritten from scratch. Nevertheless the code should be functionally identical. Overview (individual patches have longer descriptions): #1 -- Handle terminating signals. #2 -- add DisplayState->idle #3 -- add container_of() macro to osdep.h #4 -- move GUI_REFRESH_INTERVAL define from vl.c to console.h #5 -- groundwork: build system, cmd line options, ... #6 -- xen backend driver infrastructrure #7 -- xen console backend driver #8 -- xen framebuffer backend driver #9 -- xen block backend driver #10 -- xen nic backend driver #11 -- allow xen disks and nics being configured via qemu command line options. #12 -- set vnc password from xenstore. #13 -- xen: pv domain builder. The first four patches are preparing patches. They put some stuff in place the other xen patches depend on. With the next four patches in place upstream qemu can replace xen''s qemu-dm for paravirtual domains. The last five patches add full userspace implementations of block and nic backend drivers using the grant table device (gntdev), command line support config support for setting up these backends and domain building support. xen support is implemented using another machine type. xen''s qemu-dm already uses the machine type to switch between paravirtualized and fully virtualized machines, so this was the natural choice. qemu has gets a new "xenpv" machine type additionally to the "pc" and "isapc" ones. The patches are also available here: http://kraxel.fedorapeople.org/patches/qemu-upstream/ There is also a patch queue for qemu-dm, so the patches can get a test-drive there: http://kraxel.fedorapeople.org/patches/qemu-xen/ There also a ''-b'' (no whitespace changes) version for easier review: http://kraxel.fedorapeople.org/patches/qemu-xen/no-ws/ The qemu-dm patch series has a slightly different ordering to make the patch queue more bisect-friendly. Also some preparing patches are not needed because they already there. Some extra patches (backports) are there due to the age of the qemu-dm tree. The last two patches are not present (yet). Comments? cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
This patch makes qemu handle signals better. It sets the request_shutdown flag, making the main_loop exit and qemu taking the usual exit route, with atexit handlers being called and so on, instead of qemu just being killed by the signal. To avoid calling vm_start() from the signal handler main_loop() got an additional check so qemu_system_shutdown_request() works even when the vm is in stopped state. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- curses.c | 2 -- sdl.c | 9 +-------- vl.c | 27 +++++++++++++++++++++++++++ 3 files changed, 28 insertions(+), 10 deletions(-) diff --git a/curses.c b/curses.c index d09eff2..96b6e49 100644 --- a/curses.c +++ b/curses.c @@ -350,8 +350,6 @@ void curses_display_init(DisplayState *ds, int full_screen) atexit(curses_atexit); #ifndef _WIN32 - signal(SIGINT, SIG_DFL); - signal(SIGQUIT, SIG_DFL); #if defined(SIGWINCH) && defined(KEY_RESIZE) /* some curses implementations provide a handler, but we * want to be sure this is handled regardless of the library */ diff --git a/sdl.c b/sdl.c index 0edc4a0..84a9d6d 100644 --- a/sdl.c +++ b/sdl.c @@ -476,10 +476,8 @@ static void sdl_refresh(DisplayState *ds) sdl_process_key(&ev->key); break; case SDL_QUIT: - if (!no_quit) { + if (!no_quit) qemu_system_shutdown_request(); - vm_start(); /* In case we''re paused */ - } break; case SDL_MOUSEMOTION: if (gui_grab || kbd_mouse_is_absolute() || @@ -636,11 +634,6 @@ void sdl_display_init(DisplayState *ds, int full_screen, int no_frame) fprintf(stderr, "Could not initialize SDL - exiting\n"); exit(1); } -#ifndef _WIN32 - /* NOTE: we still want Ctrl-C to work, so we undo the SDL redirections */ - signal(SIGINT, SIG_DFL); - signal(SIGQUIT, SIG_DFL); -#endif ds->dpy_update = sdl_update; ds->dpy_resize = sdl_resize; diff --git a/vl.c b/vl.c index fc5019f..071ee0c 100644 --- a/vl.c +++ b/vl.c @@ -7619,6 +7619,8 @@ static int main_loop(void) timeout = 0; } } else { + if (shutdown_requested) + break; timeout = 10; } #ifdef CONFIG_PROFILER @@ -8183,6 +8185,26 @@ static BOOL WINAPI qemu_ctrl_handler(DWORD type) #define MAX_NET_CLIENTS 32 +#ifndef _WIN32 + +static void termsig_handler(int signal) +{ + qemu_system_shutdown_request(); +} + +void termsig_setup(void) +{ + struct sigaction act; + + memset(&act, 0, sizeof(act)); + act.sa_handler = termsig_handler; + sigaction(SIGINT, &act, NULL); + sigaction(SIGHUP, &act, NULL); + sigaction(SIGTERM, &act, NULL); +} + +#endif + int main(int argc, char **argv) { #ifdef CONFIG_GDBSTUB @@ -9071,6 +9093,11 @@ int main(int argc, char **argv) #endif } +#ifndef _WIN32 + /* must be after terminal init, SDL library changes signal handlers */ + termsig_setup(); +#endif + /* Maintain compatibility with multiple stdio monitors */ if (!strcmp(monitor_device,"stdio")) { for (i = 0; i < MAX_SERIAL_PORTS; i++) { -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
From: Samuel Thibault <samuel.thibault@eu.citrix.com> From: Samuel Thibault <samuel.thibault@eu.citrix.com> Add idle field to DisplayState struct, so drivers can figure the display is idle and take advantage of that. The xen framebuffer driver will use this to communicate the idle state to the guest, so it knows it can stop doing updates to a virtual display which is invisible anyway. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- console.h | 1 + sdl.c | 2 ++ vl.c | 2 ++ vnc.c | 3 +++ 4 files changed, 8 insertions(+), 0 deletions(-) diff --git a/console.h b/console.h index 561ef51..233e7ef 100644 --- a/console.h +++ b/console.h @@ -80,6 +80,7 @@ struct DisplayState { void *opaque; struct QEMUTimer *gui_timer; uint64_t gui_timer_interval; + int idle; void (*dpy_update)(struct DisplayState *s, int x, int y, int w, int h); void (*dpy_resize)(struct DisplayState *s, int w, int h); diff --git a/sdl.c b/sdl.c index 84a9d6d..15427c5 100644 --- a/sdl.c +++ b/sdl.c @@ -524,9 +524,11 @@ static void sdl_refresh(DisplayState *ds) if (ev->active.gain) { /* Back to default interval */ ds->gui_timer_interval = 0; + ds->idle = 0; } else { /* Sleeping interval */ ds->gui_timer_interval = 500; + ds->idle = 1; } } break; diff --git a/vl.c b/vl.c index 071ee0c..ee89107 100644 --- a/vl.c +++ b/vl.c @@ -5974,6 +5974,8 @@ static void dumb_display_init(DisplayState *ds) ds->dpy_update = dumb_update; ds->dpy_resize = dumb_resize; ds->dpy_refresh = dumb_refresh; + ds->gui_timer_interval = 500; + ds->idle = 1; } /***********************************************************/ diff --git a/vnc.c b/vnc.c index 2d17044..0d55a78 100644 --- a/vnc.c +++ b/vnc.c @@ -660,6 +660,7 @@ static int vnc_client_io_error(VncState *vs, int ret, int last_errno) qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL); closesocket(vs->csock); vs->csock = -1; + vs->ds->idle = 1; buffer_reset(&vs->input); buffer_reset(&vs->output); vs->need_update = 0; @@ -1920,6 +1921,7 @@ static int protocol_version(VncState *vs, uint8_t *version, size_t len) static void vnc_connect(VncState *vs) { VNC_DEBUG("New client on socket %d\n", vs->csock); + vs->ds->idle = 0; socket_set_nonblock(vs->csock); qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read, NULL, vs); vnc_write(vs, "RFB 003.008\n", 12); @@ -1959,6 +1961,7 @@ void vnc_display_init(DisplayState *ds) exit(1); ds->opaque = vs; + ds->idle = 1; vnc_state = vs; vs->display = NULL; vs->password = NULL; -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 03/13] add container_of() macro to osdep.h
>From linux kernel sources, xen bits will use it, put itinto a place where others can see and use it too ;) Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- osdep.h | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/osdep.h b/osdep.h index 6312e7a..09ebace 100644 --- a/osdep.h +++ b/osdep.h @@ -23,6 +23,10 @@ #define unlikely(x) __builtin_expect(!!(x), 0) #endif +#define container_of(ptr, type, member) ({ \ + const typeof( ((type *)0)->member ) *__mptr = (ptr); \ + (type *)( (char *)__mptr - offsetof(type,member) );}) + #ifndef MIN #define MIN(a, b) (((a) < (b)) ? (a) : (b)) #endif -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 04/13] move GUI_REFRESH_INTERVAL define from vl.c to console.h
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- console.h | 3 +++ vl.c | 2 -- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/console.h b/console.h index 233e7ef..069e73d 100644 --- a/console.h +++ b/console.h @@ -9,6 +9,9 @@ #define MOUSE_EVENT_RBUTTON 0x02 #define MOUSE_EVENT_MBUTTON 0x04 +/* in ms */ +#define GUI_REFRESH_INTERVAL 30 + typedef void QEMUPutKBDEvent(void *opaque, int keycode); typedef void QEMUPutMouseEvent(void *opaque, int dx, int dy, int dz, int buttons_state); diff --git a/vl.c b/vl.c index ee89107..3211507 100644 --- a/vl.c +++ b/vl.c @@ -154,8 +154,6 @@ int inet_aton(const char *cp, struct in_addr *ia); #else #define DEFAULT_RAM_SIZE 128 #endif -/* in ms */ -#define GUI_REFRESH_INTERVAL 30 /* Max number of USB devices that can be specified on the commandline. */ #define MAX_USB_CMDLINE 8 -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 05/13] xen: groundwork for xen support
- configure script and build system changes. - wind up new machine type. - add -domid command line option. - allow xenpv machines run without disk and kernel specified by adding a nodisk_ok field to QEMUMachine. - detect xen being present. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 8 +++ configure | 27 +++++++++ hw/boards.h | 3 + hw/xen.h | 18 ++++++ hw/xen_machine_pv.c | 146 +++++++++++++++++++++++++++++++++++++++++++++++++ target-i386/machine.c | 3 + vl.c | 15 +++++ 7 files changed, 220 insertions(+), 0 deletions(-) create mode 100644 hw/xen.h create mode 100644 hw/xen_machine_pv.c diff --git a/Makefile.target b/Makefile.target index 42162c3..13ee854 100644 --- a/Makefile.target +++ b/Makefile.target @@ -515,6 +515,14 @@ CPPFLAGS += $(CONFIG_VNC_TLS_CFLAGS) LIBS += $(CONFIG_VNC_TLS_LIBS) endif +# xen backend driver support +XEN_OBJS := xen_machine_pv.o +ifeq ($(CONFIG_XEN), yes) + OBJS += $(XEN_OBJS) + LIBS += $(XEN_LIBS) + $(XEN_OBJS) : CFLAGS += -Wall -Wmissing-prototypes -Wstrict-prototypes +endif + # SCSI layer OBJS+= lsi53c895a.o esp.o diff --git a/configure b/configure index 74ce443..14689d2 100755 --- a/configure +++ b/configure @@ -109,6 +109,7 @@ curses="yes" aio="yes" nptl="yes" mixemu="no" +xen="no" # OS specific targetos=`uname -s` @@ -204,6 +205,7 @@ linux="yes" linux_user="yes" if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then kqemu="yes" + xen="yes" audio_possible_drivers="$audio_possible_drivers fmod" fi ;; @@ -287,6 +289,8 @@ for opt do ;; --disable-kqemu) kqemu="no" ;; + --disable-xen) xen="no" + ;; --disable-brlapi) brlapi="no" ;; --enable-profiler) profiler="yes" @@ -425,6 +429,7 @@ echo " Available drivers: $audio_possible_drivers" echo " --audio-card-list=LIST set list of additional emulated audio cards" echo " Available cards: ac97 adlib cs4231a gus" echo " --enable-mixemu enable mixer emulation" +echo " --disable-xen disable xen backend driver support" echo " --disable-brlapi disable BrlAPI" echo " --disable-vnc-tls disable TLS encryption for VNC server" echo " --disable-curses disable curses output" @@ -686,6 +691,22 @@ else fi ########################################## +# xen probe + +if test "$xen" = "yes" ; then +cat > $TMPC <<EOF +#include <xenctrl.h> +#include <xs.h> +int main(void) { xs_daemon_open; xc_interface_open; } +EOF + if $cc $ARCH_CFLAGS -c -o $TMPO $TMPC -lxenstore -lxenctrl 2> /dev/null ; then + : + else + xen="no" + fi +fi + +########################################## # SDL probe sdl_too_old=no @@ -946,6 +967,7 @@ if test -n "$sparc_cpu"; then echo "Target Sparc Arch $sparc_cpu" fi echo "kqemu support $kqemu" +echo "xen support $xen" echo "brlapi support $brlapi" echo "Documentation $build_docs" [ ! -z "$uname_release" ] && \ @@ -1203,6 +1225,11 @@ if test "$brlapi" = "yes" ; then echo "#define CONFIG_BRLAPI 1" >> $config_h echo "BRLAPI_LIBS=-lbrlapi" >> $config_mak fi +if test "$xen" = "yes" ; then + echo "CONFIG_XEN=yes" >> $config_mak + echo "#define CONFIG_XEN 1" >> $config_h + echo "XEN_LIBS=-lxenstore -lxenctrl" >> $config_mak +fi if test "$aio" = "yes" ; then echo "#define CONFIG_AIO 1" >> $config_h fi diff --git a/hw/boards.h b/hw/boards.h index e6dd198..9af9939 100644 --- a/hw/boards.h +++ b/hw/boards.h @@ -30,6 +30,9 @@ extern QEMUMachine bareetraxfs_machine; extern QEMUMachine pc_machine; extern QEMUMachine isapc_machine; +/* xen_machine.c */ +extern QEMUMachine xenpv_machine; + /* ppc.c */ extern QEMUMachine prep_machine; extern QEMUMachine core99_machine; diff --git a/hw/xen.h b/hw/xen.h new file mode 100644 index 0000000..a772aad --- /dev/null +++ b/hw/xen.h @@ -0,0 +1,18 @@ +#ifndef QEMU_XEN_H +#define QEMU_XEN_H 1 +/* + * public xen header + * stuff needed outside xen-*.c, i.e. interfaces to qemu. + * must not depend on any xen headers being present in + * /usr/include/xen, so it can be included unconditionally. + */ + +/* xen-machine.c */ +extern int xen_domid; +extern int xen_present; +extern int xen_emulate; +extern int xen_domainbuild; + +int xen_detect(void); + +#endif /* QEMU_XEN_H */ diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c new file mode 100644 index 0000000..05f0e67 --- /dev/null +++ b/hw/xen_machine_pv.c @@ -0,0 +1,146 @@ +/* + * QEMU Xen PV Machine + * + * Copyright (c) 2007,08 Red Hat + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "hw.h" +#include "boards.h" + +#include <xenctrl.h> + +/* -------------------------------------------------------------------- */ +/* variables */ + +int xen_domid; +int xen_present = -1; +int xen_emulate; +int xen_domainbuild; + +/* -------------------------------------------------------------------- */ +/* initialization code */ + +/* + * Figure the environment we are running in. + * Returns true when xen is present, false otherwise. + * Also checks whenever the domain specified via -domid + * exists (so we can attach) or whenever it must be created. + */ +int xen_detect(void) +{ + struct xc_dominfo info; + int xc, rc; + + if (xen_present != -1) + goto out; + + /* running on xen? (priviledged domain) */ + xen_present = 0; + xc = xc_interface_open(); + if (-1 == xc) + goto out; + xen_present = 1; + + /* does the domain exist? Or should we create one? */ + rc = xc_domain_getinfo(xc, xen_domid, 1, &info); + if ((1 != rc) || (info.domid != xen_domid)) + xen_domainbuild = 1; + close(xc); + +out: + return xen_present; +} + +static int xen_init(void) +{ + if (!xen_domid) { + fprintf(stderr, "%s: no domid specified\n", __FUNCTION__); + return -1; + } + + if (!xen_detect()) { + fprintf(stderr, "%s: emulating Xen\n", __FUNCTION__); + xen_emulate = 1; + } + + return 0; +} + +static int xen_init_pv(DisplayState *ds) +{ + int rc; + + rc = xen_init(); + if (rc < 0) + return rc; + + return 0; +} + +/* -------------------------------------------------------------------- */ +/* paravirtualized xen machine */ + +static void xenpv_init(ram_addr_t ram_size, int vga_ram_size, + const char *boot_device, DisplayState *ds, + const char *kernel_filename, + const char *kernel_cmdline, + const char *initrd_filename, + const char *cpu_model) +{ + CPUState *env; + int rc; + + rc = xen_init_pv(ds); + if (-1 == rc) + goto err; + + if (xen_emulate) { + fprintf(stderr, "xen pv emulation not implemented yet\n"); + goto err; + } + if (xen_domainbuild) { + fprintf(stderr, "xen pv domain creation not implemented yet\n"); + goto err; + } + + /* create dummy cpu, halted */ + if (cpu_model == NULL) { +#ifdef TARGET_X86_64 + cpu_model = "qemu64"; +#else + cpu_model = "qemu32"; +#endif + } + env = cpu_init(cpu_model); + env->halted = 1; + + return; + +err: + exit(1); +} + +QEMUMachine xenpv_machine = { + .name = "xenpv", + .desc = "paravirtualized Xen machine", + .init = xenpv_init, + .nodisk_ok = 1, +}; diff --git a/target-i386/machine.c b/target-i386/machine.c index 91dbd55..98ece17 100644 --- a/target-i386/machine.c +++ b/target-i386/machine.c @@ -9,6 +9,9 @@ void register_machines(void) { qemu_register_machine(&pc_machine); qemu_register_machine(&isapc_machine); +#ifdef CONFIG_XEN + qemu_register_machine(&xenpv_machine); +#endif } static void cpu_put_seg(QEMUFile *f, SegmentCache *dt) diff --git a/vl.c b/vl.c index 3211507..ff94c6f 100644 --- a/vl.c +++ b/vl.c @@ -29,6 +29,7 @@ #include "hw/audiodev.h" #include "hw/isa.h" #include "hw/baum.h" +#include "hw/xen.h" #include "net.h" #include "console.h" #include "sysemu.h" @@ -7780,6 +7781,9 @@ static void help(int exitcode) "-startdate select initial date of the clock\n" "-icount [N|auto]\n" " Enable virtual instruction counter with 2^N clock ticks per instruction\n" +#ifdef CONFIG_XEN + "-domid specify xen guest domain id\n" +#endif "\n" "During emulation, the following keys are useful:\n" "ctrl-alt-f toggle full screen\n" @@ -7885,6 +7889,9 @@ enum { QEMU_OPTION_startdate, QEMU_OPTION_tb_size, QEMU_OPTION_icount, +#ifdef CONFIG_XEN + QEMU_OPTION_domid, +#endif }; typedef struct QEMUOption { @@ -7973,6 +7980,9 @@ const QEMUOption qemu_options[] = { #ifdef CONFIG_CURSES { "curses", 0, QEMU_OPTION_curses }, #endif +#ifdef CONFIG_XEN + { "domid", HAS_ARG, QEMU_OPTION_domid }, +#endif /* temporary options */ { "usb", 0, QEMU_OPTION_usb }, @@ -8847,6 +8857,11 @@ int main(int argc, char **argv) icount_time_shift = strtol(optarg, NULL, 0); } break; +#ifdef CONFIG_XEN + case QEMU_OPTION_domid: + xen_domid = atoi(optarg); + break; +#endif } } } -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
This patch adds infrastructure for xen backend drivers living in qemu, so drivers don''t need to implement common stuff on their own. It''s mostly xenbus management stuff: some functions to access xentore, setting up xenstore watches, callbacks on device discovery and state changes, handle event channel, ... Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 2 +- hw/xen_backend.c | 691 +++++++++++++++++++++++++++++++++++++++++++++++++++ hw/xen_backend.h | 86 +++++++ hw/xen_common.h | 34 +++ hw/xen_machine_pv.c | 7 +- 5 files changed, 818 insertions(+), 2 deletions(-) create mode 100644 hw/xen_backend.c create mode 100644 hw/xen_backend.h create mode 100644 hw/xen_common.h diff --git a/Makefile.target b/Makefile.target index 13ee854..78a6402 100644 --- a/Makefile.target +++ b/Makefile.target @@ -516,7 +516,7 @@ LIBS += $(CONFIG_VNC_TLS_LIBS) endif # xen backend driver support -XEN_OBJS := xen_machine_pv.o +XEN_OBJS := xen_machine_pv.o xen_backend.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) LIBS += $(XEN_LIBS) diff --git a/hw/xen_backend.c b/hw/xen_backend.c new file mode 100644 index 0000000..e926a01 --- /dev/null +++ b/hw/xen_backend.c @@ -0,0 +1,691 @@ +/* + * xen backend driver infrastructure + * (c) 2008 Gerd Hoffmann <kraxel@redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/* + * TODO: add some xenbus / xenstore concepts overview here. + */ + +#include <stdio.h> +#include <stdlib.h> +#include <stdarg.h> +#include <string.h> +#include <unistd.h> +#include <fcntl.h> +#include <inttypes.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <sys/signal.h> + +#include <xs.h> +#include <xenctrl.h> +#include <xen/grant_table.h> + +#include "hw.h" +#include "qemu-char.h" +#include "xen_backend.h" + +/* ------------------------------------------------------------- */ + +/* public */ +int xen_xc; +struct xs_handle *xenstore = NULL; + +/* private */ +static TAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = TAILQ_HEAD_INITIALIZER(xendevs); +static int debug = 0; + +/* ------------------------------------------------------------- */ + +int xenstore_write_str(const char *base, const char *node, const char *val) +{ + char abspath[XEN_BUFSIZE]; + + snprintf(abspath, sizeof(abspath), "%s/%s", base, node); + if (!xs_write(xenstore, 0, abspath, val, strlen(val))) + return -1; + return 0; +} + +char *xenstore_read_str(const char *base, const char *node) +{ + char abspath[XEN_BUFSIZE]; + unsigned int len; + + snprintf(abspath, sizeof(abspath), "%s/%s", base, node); + return xs_read(xenstore, 0, abspath, &len); +} + +int xenstore_write_int(const char *base, const char *node, int ival) +{ + char val[32]; + + snprintf(val, sizeof(val), "%d", ival); + return xenstore_write_str(base, node, val); +} + +int xenstore_read_int(const char *base, const char *node, int *ival) +{ + char *val; + int rc = -1; + + val = xenstore_read_str(base, node); + if (val && 1 == sscanf(val, "%d", ival)) + rc = 0; + qemu_free(val); + return rc; +} + +int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val) +{ + return xenstore_write_str(xendev->be, node, val); +} + +int xenstore_write_be_int(struct XenDevice *xendev, const char *node, int ival) +{ + return xenstore_write_int(xendev->be, node, ival); +} + +char *xenstore_read_be_str(struct XenDevice *xendev, const char *node) +{ + return xenstore_read_str(xendev->be, node); +} + +int xenstore_read_be_int(struct XenDevice *xendev, const char *node, int *ival) +{ + return xenstore_read_int(xendev->be, node, ival); +} + +char *xenstore_read_fe_str(struct XenDevice *xendev, const char *node) +{ + return xenstore_read_str(xendev->fe, node); +} + +int xenstore_read_fe_int(struct XenDevice *xendev, const char *node, int *ival) +{ + return xenstore_read_int(xendev->fe, node, ival); +} + +/* ------------------------------------------------------------- */ + +const char *xenbus_strstate(enum xenbus_state state) +{ + static const char *const name[] = { + [ XenbusStateUnknown ] = "Unknown", + [ XenbusStateInitialising ] = "Initialising", + [ XenbusStateInitWait ] = "InitWait", + [ XenbusStateInitialised ] = "Initialised", + [ XenbusStateConnected ] = "Connected", + [ XenbusStateClosing ] = "Closing", + [ XenbusStateClosed ] = "Closed", + }; + return (state < ARRAY_SIZE(name)) ? name[state] : "INVALID"; +} + +int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state) +{ + int rc; + + rc = xenstore_write_be_int(xendev, "state", state); + if (rc < 0) + return rc; + xen_be_printf(xendev, 1, "backend state: %s -> %s\n", + xenbus_strstate(xendev->be_state), xenbus_strstate(state)); + xendev->be_state = state; + return 0; +} + +/* ------------------------------------------------------------- */ + +struct XenDevice *xen_be_find_xendev(char *type, int dom, int dev) +{ + struct XenDevice *xendev; + + TAILQ_FOREACH(xendev, &xendevs, next) { + if (xendev->dom != dom) + continue; + if (xendev->dev != dev) + continue; + if (0 != strcmp(xendev->type, type)) + continue; + return xendev; + } + return NULL; +} + +/* + * get xen backend device, allocate a new one if it doesn''t exist. + */ +static struct XenDevice *xen_be_get_xendev(char *type, int dom, int dev, + struct XenDevOps *ops) +{ + struct XenDevice *xendev; + char *dom0; + + xendev = xen_be_find_xendev(type, dom, dev); + if (xendev) + return xendev; + + /* init new xendev */ + xendev = qemu_mallocz(ops->size); + if (!xendev) + return NULL; + xendev->type = type; + xendev->dom = dom; + xendev->dev = dev; + xendev->ops = ops; + + dom0 = xs_get_domain_path(xenstore, 0); + snprintf(xendev->be, sizeof(xendev->be), "%s/backend/%s/%d/%d", + dom0, xendev->type, xendev->dom, xendev->dev); + snprintf(xendev->name, sizeof(xendev->name), "%s-%d", + xendev->type, xendev->dev); + free(dom0); + + xendev->debug = debug; + xendev->local_port = -1; + + xendev->evtchndev = xc_evtchn_open(); + if (xendev->evtchndev < 0) { + fprintf(stderr, "can''t open evtchn device\n"); + qemu_free(xendev); + return NULL; + } + fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC); + + if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) { + xendev->gnttabdev = xc_gnttab_open(); + if (xendev->gnttabdev < 0) { + fprintf(stderr, "can''t open gnttab device\n"); + xc_evtchn_close(xendev->evtchndev); + qemu_free(xendev); + return NULL; + } + } else { + xendev->gnttabdev = -1; + } + + TAILQ_INSERT_TAIL(&xendevs, xendev, next); + + if (xendev->ops->alloc) + xendev->ops->alloc(xendev); + + return xendev; +} + +/* + * release xen backend device. + */ +static struct XenDevice *xen_be_del_xendev(int dom, int dev) +{ + struct XenDevice *xendev, *xnext; + + /* + * This is pretty much like TAILQ_FOREACH(xendev, &xendevs, next) but + * we save the next pointer in xnext because we might free xendev. + */ + xnext = xendevs.tqh_first; + while (xnext) { + xendev = xnext; + xnext = xendev->next.tqe_next; + + if (xendev->dom != dom) + continue; + if (xendev->dev != dev && dev != -1) + continue; + + if (xendev->ops->free) + xendev->ops->free(xendev); + + if (xendev->fe) { + char token[XEN_BUFSIZE]; + snprintf(token, sizeof(token), "fe:%p", xendev); + xs_unwatch(xenstore, xendev->fe, token); + qemu_free(xendev->fe); + } + + if (xendev->evtchndev >= 0) + xc_evtchn_close(xendev->evtchndev); + if (xendev->gnttabdev >= 0) + xc_gnttab_close(xendev->gnttabdev); + + TAILQ_REMOVE(&xendevs, xendev, next); + qemu_free(xendev); + } + return NULL; +} + +/* + * Sync internal data structures on xenstore updates. + * Node specifies the changed field. node = NULL means + * update all fields (used for initialization). + */ +static void xen_be_backend_changed(struct XenDevice *xendev, const char *node) +{ + if (NULL == node || 0 == strcmp(node, "online")) { + if (-1 == xenstore_read_be_int(xendev, "online", &xendev->online)) + xendev->online = 0; + } + + if (node) { + xen_be_printf(xendev, 2, "backend update: %s\n", node); + if (xendev->ops->backend_changed) + xendev->ops->backend_changed(xendev, node); + } +} + +static void xen_be_frontend_changed(struct XenDevice *xendev, const char *node) +{ + int fe_state; + + if (NULL == node || 0 == strcmp(node, "state")) { + if (-1 == xenstore_read_fe_int(xendev, "state", &fe_state)) + fe_state = XenbusStateUnknown; + if (xendev->fe_state != fe_state) + xen_be_printf(xendev, 1, "frontend state: %s -> %s\n", + xenbus_strstate(xendev->fe_state), + xenbus_strstate(fe_state)); + xendev->fe_state = fe_state; + } + if (NULL == node || 0 == strcmp(node, "protocol")) { + qemu_free(xendev->protocol); + xendev->protocol = xenstore_read_fe_str(xendev, "protocol"); + if (xendev->protocol) + xen_be_printf(xendev, 1, "frontend protocol: %s\n", xendev->protocol); + } + + if (node) { + xen_be_printf(xendev, 2, "frontend update: %s\n", node); + if (xendev->ops->frontend_changed) + xendev->ops->frontend_changed(xendev, node); + } +} + +/* ------------------------------------------------------------- */ +/* Check for possible state transitions and perform them. */ + +/* + * Initial xendev setup. Read frontend path, register watch for it. + * Should succeed once xend finished setting up the backend device. + * + * Also sets initial state (-> Initializing) when done. Which + * only affects the xendev->be_state variable as xenbus should + * already be put into that state by xend. + */ +static int xen_be_try_setup(struct XenDevice *xendev) +{ + char token[XEN_BUFSIZE]; + int be_state; + + if (-1 == xenstore_read_be_int(xendev, "state", &be_state)) { + xen_be_printf(xendev, 0, "reading backend state failed\n"); + return -1; + } + + if (be_state != XenbusStateInitialising) { + xen_be_printf(xendev, 0, "initial backend state is wrong (%s)\n", + xenbus_strstate(be_state)); + return -1; + } + + xendev->fe = xenstore_read_be_str(xendev, "frontend"); + if (NULL == xendev->fe) { + xen_be_printf(xendev, 0, "reading frontend path failed\n"); + return -1; + } + + /* setup frontend watch */ + snprintf(token, sizeof(token), "fe:%p", xendev); + if (!xs_watch(xenstore, xendev->fe, token)) { + xen_be_printf(xendev, 0, "watching frontend path (%s) failed\n", + xendev->fe); + return -1; + } + xen_be_set_state(xendev, XenbusStateInitialising); + + xen_be_backend_changed(xendev, NULL); + xen_be_frontend_changed(xendev, NULL); + return 0; +} + +/* + * Try initialize xendev. Prepare everything the backend can do + * without synchronizing with the frontend. Fakes hotplug-status. No + * hotplug involved here because this is about userspace drivers, thus + * there are kernel backend devices which could invoke hotplug. + * + * Goes to InitWait on success. + */ +static int xen_be_try_init(struct XenDevice *xendev) +{ + int rc = 0; + + if (!xendev->online) { + xen_be_printf(xendev, 1, "not online\n"); + return -1; + } + + if (xendev->ops->init) + rc = xendev->ops->init(xendev); + if (0 != rc) { + xen_be_printf(xendev, 1, "init() failed\n"); + return rc; + } + + xenstore_write_be_str(xendev, "hotplug-status", "connected"); + xen_be_set_state(xendev, XenbusStateInitWait); + return 0; +} + +/* + * Try to connect xendev. Depends on the frontend being ready + * for it (shared ring and evtchn info in xenstore, state being + * Initialised or Connected). + * + * Goes to Connected on success. + */ +static int xen_be_try_connect(struct XenDevice *xendev) +{ + int rc = 0; + + if (xendev->fe_state != XenbusStateInitialised && + xendev->fe_state != XenbusStateConnected) { + if (xendev->ops->flags & DEVOPS_FLAG_IGNORE_STATE) { + xen_be_printf(xendev, 2, "frontend not ready, ignoring\n"); + } else { + xen_be_printf(xendev, 2, "frontend not ready (yet)\n"); + return -1; + } + } + + if (xendev->ops->connect) + rc = xendev->ops->connect(xendev); + if (0 != rc) { + xen_be_printf(xendev, 0, "connect() failed\n"); + return rc; + } + + xen_be_set_state(xendev, XenbusStateConnected); + return 0; +} + +/* + * Teardown connection. + * + * Goes to Closed when done. + */ +static void xen_be_disconnect(struct XenDevice *xendev, enum xenbus_state state) +{ + if (xendev->be_state != XenbusStateClosing && + xendev->be_state != XenbusStateClosed && + xendev->ops->disconnect) + xendev->ops->disconnect(xendev); + if (xendev->be_state != state) + xen_be_set_state(xendev, state); +} + +/* + * Try to reset xendev, for reconnection by another frontend instance. + */ +static int xen_be_try_reset(struct XenDevice *xendev) +{ + if (xendev->fe_state != XenbusStateInitialising) + return -1; + + xen_be_printf(xendev, 1, "device reset (for re-connect)\n"); + xen_be_set_state(xendev, XenbusStateInitialising); + return 0; +} + +/* + * state change dispatcher function + */ +void xen_be_check_state(struct XenDevice *xendev) +{ + int rc = 0; + + /* frontend may request shutdown from almost anywhere */ + if (xendev->fe_state == XenbusStateClosing || + xendev->fe_state == XenbusStateClosed) { + xen_be_disconnect(xendev, xendev->fe_state); + return; + } + + /* check for possible backend state transitions */ + for (;;) { + switch (xendev->be_state) { + case XenbusStateUnknown: + rc = xen_be_try_setup(xendev); + break; + case XenbusStateInitialising: + rc = xen_be_try_init(xendev); + break; + case XenbusStateInitWait: + rc = xen_be_try_connect(xendev); + break; + case XenbusStateClosed: + rc = xen_be_try_reset(xendev); + break; + default: + rc = -1; + } + if (0 != rc) + break; + } +} + +/* ------------------------------------------------------------- */ + +static int xenstore_scan(char *type, int dom, struct XenDevOps *ops) +{ + struct XenDevice *xendev; + char path[XEN_BUFSIZE], token[XEN_BUFSIZE]; + char **dev = NULL, *dom0; + unsigned int cdev, j; + + /* setup watch */ + dom0 = xs_get_domain_path(xenstore, 0); + snprintf(token, sizeof(token), "be:%p:%d:%p", type, dom, ops); + snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom); + free(dom0); + if (!xs_watch(xenstore, path, token)) { + fprintf(stderr, "xen be: watching backend path (%s) failed\n", path); + return -1; + } + + /* look for backends */ + dev = xs_directory(xenstore, 0, path, &cdev); + if (!dev) + return 0; + for (j = 0; j < cdev; j++) { + xendev = xen_be_get_xendev(type, dom, atoi(dev[j]), ops); + if (NULL == xendev) + continue; + xen_be_check_state(xendev); + } + qemu_free(dev); + return 0; +} + +static void xenstore_update_be(char *watch, char *type, int dom, + struct XenDevOps *ops) +{ + struct XenDevice *xendev; + char path[XEN_BUFSIZE], *dom0; + unsigned int len, dev; + + dom0 = xs_get_domain_path(xenstore, 0); + len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom); + free(dom0); + if (0 != strncmp(path, watch, len)) + return; + if (2 != sscanf(watch+len, "/%u/%255s", &dev, path)) { + strcpy(path, ""); + if (1 != sscanf(watch+len, "/%u", &dev)) + dev = -1; + } + if (-1 == dev) + return; + + if (0) { + /* FIXME: detect devices being deleted from xenstore ... */ + xen_be_del_xendev(dom, dev); + } + + xendev = xen_be_get_xendev(type, dom, dev, ops); + if (NULL != xendev) { + xen_be_backend_changed(xendev, path); + xen_be_check_state(xendev); + } +} + +static void xenstore_update_fe(char *watch, struct XenDevice *xendev) +{ + char *node; + unsigned int len; + + len = strlen(xendev->fe); + if (0 != strncmp(xendev->fe, watch, len)) + return; + if (watch[len] != ''/'') + return; + node = watch + len + 1; + + xen_be_frontend_changed(xendev, node); + xen_be_check_state(xendev); +} + +static void xenstore_update(void *unused) +{ + char **vec = NULL; + intptr_t type, ops, ptr; + unsigned int dom, count; + + vec = xs_read_watch(xenstore, &count); + if (NULL == vec) + goto cleanup; + + if (3 == sscanf(vec[XS_WATCH_TOKEN], "be:%" PRIxPTR ":%d:%" PRIxPTR, + &type, &dom, &ops)) + xenstore_update_be(vec[XS_WATCH_PATH], (void*)type, dom, (void*)ops); + if (1 == sscanf(vec[XS_WATCH_TOKEN], "fe:%" PRIxPTR, &ptr)) + xenstore_update_fe(vec[XS_WATCH_PATH], (void*)ptr); + +cleanup: + qemu_free(vec); +} + +static void xen_be_evtchn_event(void *opaque) +{ + struct XenDevice *xendev = opaque; + evtchn_port_t port; + + port = xc_evtchn_pending(xendev->evtchndev); + if (port != xendev->local_port) { + xen_be_printf(xendev, 0, "xc_evtchn_pending returned %d (expected %d)\n", + port, xendev->local_port); + return; + } + xc_evtchn_unmask(xendev->evtchndev, port); + + if (xendev->ops->event) + xendev->ops->event(xendev); +} + +/* -------------------------------------------------------------------- */ + +int xen_be_init(void) +{ + xenstore = xs_daemon_open(); + if (!xenstore) { + fprintf(stderr, "can''t connect to xenstored\n"); + return -1; + } + + if (qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL) < 0) + goto err; + + xen_xc = xc_interface_open(); + if (-1 == xen_xc) { + fprintf(stderr, "can''t open xen interface\n"); + goto err; + } + return 0; + +err: + qemu_set_fd_handler(xs_fileno(xenstore), NULL, NULL, NULL); + xs_daemon_close(xenstore); + xenstore = NULL; + + return -1; +} + +int xen_be_register(char *type, struct XenDevOps *ops) +{ + return xenstore_scan(type, xen_domid, ops); +} + +int xen_be_bind_evtchn(struct XenDevice *xendev) +{ + if (xendev->local_port != -1) + return 0; + xendev->local_port = xc_evtchn_bind_interdomain + (xendev->evtchndev, xendev->dom, xendev->remote_port); + if (-1 == xendev->local_port) { + xen_be_printf(xendev, 0, "xc_evtchn_bind_interdomain failed\n"); + return -1; + } + xen_be_printf(xendev, 2, "bind evtchn port %d\n", xendev->local_port); + qemu_set_fd_handler(xc_evtchn_fd(xendev->evtchndev), + xen_be_evtchn_event, NULL, xendev); + return 0; +} + +void xen_be_unbind_evtchn(struct XenDevice *xendev) +{ + if (xendev->local_port == -1) + return; + qemu_set_fd_handler(xc_evtchn_fd(xendev->evtchndev), NULL, NULL, NULL); + xc_evtchn_unbind(xendev->evtchndev, xendev->local_port); + xen_be_printf(xendev, 2, "unbind evtchn port %d\n", xendev->local_port); + xendev->local_port = -1; +} + +int xen_be_send_notify(struct XenDevice *xendev) +{ + return xc_evtchn_notify(xendev->evtchndev, xendev->local_port); +} + +/* + * msg_level: + * 0 == errors. + * 1 == informative debug messages. + * 2 == noisy debug messages. + * 3 == will flood your log. + */ +void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...) +{ + va_list args; + + if (msg_level > xendev->debug) + return; + fprintf(stderr, "xen be: %s: ", xendev->name); + va_start(args, fmt); + vfprintf(stderr, fmt, args); + va_end(args); +} diff --git a/hw/xen_backend.h b/hw/xen_backend.h new file mode 100644 index 0000000..3a79ad0 --- /dev/null +++ b/hw/xen_backend.h @@ -0,0 +1,86 @@ +#ifndef QEMU_XEN_BACKEND_H +#define QEMU_XEN_BACKEND_H 1 + +#include "xen_common.h" + +/* ------------------------------------------------------------- */ + +#define XEN_BUFSIZE 1024 + +struct XenDevice; + +/* driver uses grant tables -> open gntdev device (xendev->gnttabdev) */ +#define DEVOPS_FLAG_NEED_GNTDEV 1 +/* don''t expect frontend doing correct state transitions (aka console quirk) */ +#define DEVOPS_FLAG_IGNORE_STATE 2 + +struct XenDevOps { + size_t size; + uint32_t flags; + void (*alloc)(struct XenDevice *xendev); + int (*init)(struct XenDevice *xendev); + int (*connect)(struct XenDevice *xendev); + void (*event)(struct XenDevice *xendev); + void (*disconnect)(struct XenDevice *xendev); + int (*free)(struct XenDevice *xendev); + void (*backend_changed)(struct XenDevice *xendev, const char *node); + void (*frontend_changed)(struct XenDevice *xendev, const char *node); +}; + +struct XenDevice { + char *type; + int dom; + int dev; + char name[64]; + int debug; + + enum xenbus_state be_state; + enum xenbus_state fe_state; + int online; + char be[XEN_BUFSIZE]; + char *fe; + char *protocol; + int remote_port; + int local_port; + + int evtchndev; + int gnttabdev; + + struct XenDevOps *ops; + TAILQ_ENTRY(XenDevice) next; +}; + +/* ------------------------------------------------------------- */ + +/* variables */ +extern int xen_xc; +extern struct xs_handle *xenstore; + +/* xenstore helper functions */ +int xenstore_write_str(const char *base, const char *node, const char *val); +int xenstore_write_int(const char *base, const char *node, int ival); +char *xenstore_read_str(const char *base, const char *node); +int xenstore_read_int(const char *base, const char *node, int *ival); + +int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val); +int xenstore_write_be_int(struct XenDevice *xendev, const char *node, int ival); +char *xenstore_read_be_str(struct XenDevice *xendev, const char *node); +int xenstore_read_be_int(struct XenDevice *xendev, const char *node, int *ival); +char *xenstore_read_fe_str(struct XenDevice *xendev, const char *node); +int xenstore_read_fe_int(struct XenDevice *xendev, const char *node, int *ival); + +const char *xenbus_strstate(enum xenbus_state state); +struct XenDevice *xen_be_find_xendev(char *type, int dom, int dev); +void xen_be_check_state(struct XenDevice *xendev); + +/* xen backend driver bits */ +int xen_be_init(void); +int xen_be_register(char *type, struct XenDevOps *ops); +int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state); +int xen_be_bind_evtchn(struct XenDevice *xendev); +void xen_be_unbind_evtchn(struct XenDevice *xendev); +int xen_be_send_notify(struct XenDevice *xendev); +void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...) + __attribute__ ((format(printf, 3, 4))); + +#endif /* QEMU_XEN_BACKEND_H */ diff --git a/hw/xen_common.h b/hw/xen_common.h new file mode 100644 index 0000000..a9e1d26 --- /dev/null +++ b/hw/xen_common.h @@ -0,0 +1,34 @@ +#ifndef QEMU_XEN_COMMON_H +#define QEMU_XEN_COMMON_H 1 + +#include <stddef.h> +#include <inttypes.h> + +#include <xenctrl.h> +#include <xs.h> +#include <xen/io/xenbus.h> + +#include "hw.h" +#include "xen.h" +#include "sys-queue.h" /* BSD list implementation */ + +/* + * tweaks needed to build with different xen versions + * 0x00030205 -> 3.1.0 + * 0x00030207 -> 3.2.0 + * 0x00030208 -> unstable + */ +#include <xen/xen-compat.h> +#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030205 +# define evtchn_port_or_error_t int +#endif +#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030207 +# define xc_map_foreign_pages xc_map_foreign_batch +#endif +#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030208 +# define xen_mb() mb() +# define xen_rmb() rmb() +# define xen_wmb() wmb() +#endif + +#endif /* QEMU_XEN_COMMON_H */ diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 05f0e67..2eceb8e 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -25,7 +25,7 @@ #include "hw.h" #include "boards.h" -#include <xenctrl.h> +#include "xen_backend.h" /* -------------------------------------------------------------------- */ /* variables */ @@ -81,6 +81,11 @@ static int xen_init(void) xen_emulate = 1; } + if (-1 == xen_be_init()) { + fprintf(stderr, "%s: xen backend core setup failed\n", __FUNCTION__); + return -1; + } + return 0; } -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 07/13] xen: add console backend driver.
This patch adds a xenconsole backend driver. It it based on current xen-unstable code. It has been changed to make use of the common backend driver code. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 1 + hw/xen_backend.h | 3 + hw/xen_console.c | 271 +++++++++++++++++++++++++++++++++++++++++++++++++++ hw/xen_machine_pv.c | 3 + 4 files changed, 278 insertions(+), 0 deletions(-) create mode 100644 hw/xen_console.c diff --git a/Makefile.target b/Makefile.target index 78a6402..92ad109 100644 --- a/Makefile.target +++ b/Makefile.target @@ -517,6 +517,7 @@ endif # xen backend driver support XEN_OBJS := xen_machine_pv.o xen_backend.o +XEN_OBJS += xen_console.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) LIBS += $(XEN_LIBS) diff --git a/hw/xen_backend.h b/hw/xen_backend.h index 3a79ad0..10bd1a6 100644 --- a/hw/xen_backend.h +++ b/hw/xen_backend.h @@ -83,4 +83,7 @@ int xen_be_send_notify(struct XenDevice *xendev); void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...) __attribute__ ((format(printf, 3, 4))); +/* actual backend drivers */ +struct XenDevOps xen_console_ops; /* xen_console.c */ + #endif /* QEMU_XEN_BACKEND_H */ diff --git a/hw/xen_console.c b/hw/xen_console.c new file mode 100644 index 0000000..2ea88f6 --- /dev/null +++ b/hw/xen_console.c @@ -0,0 +1,271 @@ +/* + * Copyright (C) International Business Machines Corp., 2005 + * Author(s): Anthony Liguori <aliguori@us.ibm.com> + * + * Copyright (C) Red Hat 2007 + * + * Xen Console + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <stdlib.h> +#include <errno.h> +#include <string.h> +#include <sys/select.h> +#include <fcntl.h> +#include <unistd.h> +#include <termios.h> +#include <stdarg.h> +#include <sys/mman.h> +#include <xs.h> +#include <xen/io/console.h> +#include <xenctrl.h> + +#include "hw.h" +#include "sysemu.h" +#include "qemu-char.h" +#include "xen_backend.h" + +#define dolog(val, fmt, ...) fprintf(stderr, fmt "\n", ## __VA_ARGS__) + +struct buffer { + uint8_t *data; + size_t consumed; + size_t size; + size_t capacity; + size_t max_capacity; +}; + +struct XenConsole { + struct XenDevice xendev; /* must be first */ + struct buffer buffer; + char console[XEN_BUFSIZE]; + int ring_ref; + void *sring; + CharDriverState *chr; + int backlog; +}; + +static void buffer_append(struct XenConsole *con) +{ + struct buffer *buffer = &con->buffer; + XENCONS_RING_IDX cons, prod, size; + struct xencons_interface *intf = con->sring; + + cons = intf->out_cons; + prod = intf->out_prod; + xen_mb(); + + size = prod - cons; + if ((size == 0) || (size > sizeof(intf->out))) + return; + + if ((buffer->capacity - buffer->size) < size) { + buffer->capacity += (size + 1024); + buffer->data = qemu_realloc(buffer->data, buffer->capacity); + if (buffer->data == NULL) { + dolog(LOG_ERR, "Memory allocation failed"); + exit(ENOMEM); + } + } + + while (cons != prod) + buffer->data[buffer->size++] = intf->out[ + MASK_XENCONS_IDX(cons++, intf->out)]; + + xen_mb(); + intf->out_cons = cons; + xen_be_send_notify(&con->xendev); + + if (buffer->max_capacity && + buffer->size > buffer->max_capacity) { + /* Discard the middle of the data. */ + + size_t over = buffer->size - buffer->max_capacity; + uint8_t *maxpos = buffer->data + buffer->max_capacity; + + memmove(maxpos - over, maxpos, over); + buffer->data = qemu_realloc(buffer->data, buffer->max_capacity); + buffer->size = buffer->capacity = buffer->max_capacity; + + if (buffer->consumed > buffer->max_capacity - over) + buffer->consumed = buffer->max_capacity - over; + } +} + +static void buffer_advance(struct buffer *buffer, size_t len) +{ + buffer->consumed += len; + if (buffer->consumed == buffer->size) { + buffer->consumed = 0; + buffer->size = 0; + } +} + +static int ring_free_bytes(struct XenConsole *con) +{ + struct xencons_interface *intf = con->sring; + XENCONS_RING_IDX cons, prod, space; + + cons = intf->in_cons; + prod = intf->in_prod; + xen_mb(); + + space = prod - cons; + if (space > sizeof(intf->in)) + return 0; /* ring is screwed: ignore it */ + + return (sizeof(intf->in) - space); +} + +static int xencons_can_receive(void *opaque) +{ + struct XenConsole *con = opaque; + return ring_free_bytes(con); +} + +static void xencons_receive(void *opaque, const uint8_t *buf, int len) +{ + struct XenConsole *con = opaque; + struct xencons_interface *intf = con->sring; + XENCONS_RING_IDX prod; + int i, max; + + max = ring_free_bytes(con); + /* The can_receive() func limits this, but check again anyway */ + if (max < len) + len = max; + + prod = intf->in_prod; + for (i = 0; i < len; i++) { + intf->in[MASK_XENCONS_IDX(prod++, intf->in)] + buf[i]; + } + xen_wmb(); + intf->in_prod = prod; + xen_be_send_notify(&con->xendev); +} + +static void xencons_send(struct XenConsole *con) +{ + ssize_t len, size; + + size = con->buffer.size - con->buffer.consumed; + len = qemu_chr_write(con->chr, con->buffer.data + con->buffer.consumed, + size); + if (len < 1) { + if (!con->backlog) { + con->backlog = 1; + xen_be_printf(&con->xendev, 1, "backlog piling up, nobody listening?\n"); + } + } else { + buffer_advance(&con->buffer, len); + if (con->backlog && len == size) { + con->backlog = 0; + xen_be_printf(&con->xendev, 1, "backlog is gone\n"); + } + } +} + +/* -------------------------------------------------------------------- */ + +static int con_init(struct XenDevice *xendev) +{ + struct XenConsole *con = container_of(xendev, struct XenConsole, xendev); + char *type, *dom; + + if (!serial_hds[con->xendev.dev]) { + xen_be_printf(xendev, 1, "serial line %d not configured\n", con->xendev.dev); + return -1; + } + + /* setup */ + dom = xs_get_domain_path(xenstore, con->xendev.dom); + snprintf(con->console, sizeof(con->console), "%s/console", dom); + free(dom); + con->chr = serial_hds[con->xendev.dev]; + + type = xenstore_read_str(con->console, "type"); + if (!type || 0 != strcmp(type, "ioemu")) { + xen_be_printf(xendev, 1, "not for me (type=%s)\n", type); + return -1; + } + + return 0; +} + +static int con_connect(struct XenDevice *xendev) +{ + struct XenConsole *con = container_of(xendev, struct XenConsole, xendev); + int limit; + + if (-1 == xenstore_read_int(con->console, "ring-ref", &con->ring_ref)) + return -1; + if (-1 == xenstore_read_int(con->console, "port", &con->xendev.remote_port)) + return -1; + if (0 == xenstore_read_int(con->console, "limit", &limit)) + con->buffer.max_capacity = limit; + + con->sring = xc_map_foreign_range(xen_xc, con->xendev.dom, + XC_PAGE_SIZE, + PROT_READ|PROT_WRITE, + con->ring_ref); + if (!con->sring) + return -1; + + xen_be_bind_evtchn(&con->xendev); + qemu_chr_add_handlers(con->chr, xencons_can_receive, xencons_receive, + NULL, con); + + xen_be_printf(xendev, 1, "ring mfn %d, remote port %d, local port %d, limit %zd\n", + con->ring_ref, + con->xendev.remote_port, + con->xendev.local_port, + con->buffer.max_capacity); + return 0; +} + +static void con_disconnect(struct XenDevice *xendev) +{ + struct XenConsole *con = container_of(xendev, struct XenConsole, xendev); + + qemu_chr_add_handlers(con->chr, NULL, NULL, NULL, NULL); + xen_be_unbind_evtchn(&con->xendev); + + if (con->sring) { + munmap(con->sring, XC_PAGE_SIZE); + con->sring = NULL; + } +} + +static void con_event(struct XenDevice *xendev) +{ + struct XenConsole *con = container_of(xendev, struct XenConsole, xendev); + + buffer_append(con); + if (con->buffer.size - con->buffer.consumed) + xencons_send(con); +} + +/* -------------------------------------------------------------------- */ + +struct XenDevOps xen_console_ops = { + .size = sizeof(struct XenConsole), + .flags = DEVOPS_FLAG_IGNORE_STATE, + .init = con_init, + .connect = con_connect, + .event = con_event, + .disconnect = con_disconnect, +}; diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 2eceb8e..93f66d0 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -97,6 +97,9 @@ static int xen_init_pv(DisplayState *ds) if (rc < 0) return rc; + /* xenbus backend drivers */ + xen_be_register("console", &xen_console_ops); + return 0; } -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 08/13] xen: add framebuffer backend driver
This patch adds a frsamebuffer (and kbd+mouse) backend driver. It it based on current xen-unstable code. It has been changed to make use of the common backend driver code. It also has been changed to compile with xen headers older than unstable (aka son-to-be 4.0). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 2 +- hw/xen_backend.h | 4 + hw/xen_framebuffer.c | 975 ++++++++++++++++++++++++++++++++++++++++++++++++++ hw/xen_machine_pv.c | 5 + 4 files changed, 985 insertions(+), 1 deletions(-) create mode 100644 hw/xen_framebuffer.c diff --git a/Makefile.target b/Makefile.target index 92ad109..bcf4d93 100644 --- a/Makefile.target +++ b/Makefile.target @@ -517,7 +517,7 @@ endif # xen backend driver support XEN_OBJS := xen_machine_pv.o xen_backend.o -XEN_OBJS += xen_console.o +XEN_OBJS += xen_console.o xen_framebuffer.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) LIBS += $(XEN_LIBS) diff --git a/hw/xen_backend.h b/hw/xen_backend.h index 10bd1a6..d8bc23a 100644 --- a/hw/xen_backend.h +++ b/hw/xen_backend.h @@ -85,5 +85,9 @@ void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ... /* actual backend drivers */ struct XenDevOps xen_console_ops; /* xen_console.c */ +struct XenDevOps xen_kbdmouse_ops; /* xen_framebuffer.c */ +struct XenDevOps xen_framebuffer_ops; /* xen_framebuffer.c */ + +void xen_set_display(int domid, DisplayState *ds); #endif /* QEMU_XEN_BACKEND_H */ diff --git a/hw/xen_framebuffer.c b/hw/xen_framebuffer.c new file mode 100644 index 0000000..a36d364 --- /dev/null +++ b/hw/xen_framebuffer.c @@ -0,0 +1,975 @@ +/* + * xen paravirt framebuffer backend + * + * Copyright IBM, Corp. 2005-2006 + * Copyright Red Hat, Inc. 2006-2008 + * + * Authors: + * Anthony Liguori <aliguori@us.ibm.com>, + * Markus Armbruster <armbru@redhat.com>, + * Daniel P. Berrange <berrange@redhat.com>, + * Pat Campbell <plc@novell.com>, + * Gerd Hoffmann <kraxel@redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <stdarg.h> +#include <stdlib.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> +#include <stdbool.h> +#include <sys/mman.h> +#include <errno.h> +#include <stdio.h> +#include <string.h> +#include <time.h> + +#include <xs.h> +#include <xenctrl.h> +#include <xen/event_channel.h> +#include <xen/io/xenbus.h> +#include <xen/io/fbif.h> +#include <xen/io/kbdif.h> +#include <xen/io/protocols.h> + +#include "hw.h" +#include "sysemu.h" +#include "console.h" +#include "qemu-char.h" +#include "xen_backend.h" + +#ifndef BTN_LEFT +#define BTN_LEFT 0x110 /* from <linux/input.h> */ +#endif + +/* -------------------------------------------------------------------- */ + +struct common { + struct XenDevice xendev; /* must be first */ + void *page; + DisplayState *ds; +}; + +struct XenInput { + struct common c; + int abs_pointer_wanted; /* Whether guest supports absolute pointer */ + int button_state; /* Last seen pointer button state */ + int extended; + QEMUPutMouseEntry *qmouse; +}; + +#define UP_QUEUE 8 + +struct XenFB { + struct common c; + size_t fb_len; + int row_stride; + int depth; + int width; + int height; + int offset; + void *pixels; + int fbpages; + int feature_update; + int refresh_period; + int bug_trigger; + int have_console; + + struct { + int x,y,w,h; + } up_rects[UP_QUEUE]; + int up_count; + int up_fullscreen; +}; + +/* -------------------------------------------------------------------- */ + +static int common_bind(struct common *c) +{ + int mfn; + + if (-1 == xenstore_read_fe_int(&c->xendev, "page-ref", &mfn)) + return -1; + if (-1 == xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port)) + return -1; + + xen_be_bind_evtchn(&c->xendev); + xen_be_printf(&c->xendev, 1, "ring mfn %d, remote-port %d, local-port %d\n", + mfn, c->xendev.remote_port, c->xendev.local_port); + + c->page = xc_map_foreign_range(xen_xc, c->xendev.dom, + XC_PAGE_SIZE, + PROT_READ | PROT_WRITE, mfn); + if (NULL == c->page) + return -1; + return 0; +} + +static void common_unbind(struct common *c) +{ + if (c->page) { + munmap(c->page, XC_PAGE_SIZE); + c->page = NULL; + } + xen_be_unbind_evtchn(&c->xendev); +} + +/* -------------------------------------------------------------------- */ + +#if 0 +/* + * These two tables are not needed any more, but left in here + * intentionally as documentation, to show how scancode2linux[] + * was generated. + * + * Tables to map from scancode to Linux input layer keycode. + * Scancodes are hardware-specific. These maps assumes a + * standard AT or PS/2 keyboard which is what QEMU feeds us. + */ +static const unsigned char atkbd_set2_keycode[512] = { + + 0, 67, 65, 63, 61, 59, 60, 88, 0, 68, 66, 64, 62, 15, 41,117, + 0, 56, 42, 93, 29, 16, 2, 0, 0, 0, 44, 31, 30, 17, 3, 0, + 0, 46, 45, 32, 18, 5, 4, 95, 0, 57, 47, 33, 20, 19, 6,183, + 0, 49, 48, 35, 34, 21, 7,184, 0, 0, 50, 36, 22, 8, 9,185, + 0, 51, 37, 23, 24, 11, 10, 0, 0, 52, 53, 38, 39, 25, 12, 0, + 0, 89, 40, 0, 26, 13, 0, 0, 58, 54, 28, 27, 0, 43, 0, 85, + 0, 86, 91, 90, 92, 0, 14, 94, 0, 79,124, 75, 71,121, 0, 0, + 82, 83, 80, 76, 77, 72, 1, 69, 87, 78, 81, 74, 55, 73, 70, 99, + + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 217,100,255, 0, 97,165, 0, 0,156, 0, 0, 0, 0, 0, 0,125, + 173,114, 0,113, 0, 0, 0,126,128, 0, 0,140, 0, 0, 0,127, + 159, 0,115, 0,164, 0, 0,116,158, 0,150,166, 0, 0, 0,142, + 157, 0, 0, 0, 0, 0, 0, 0,155, 0, 98, 0, 0,163, 0, 0, + 226, 0, 0, 0, 0, 0, 0, 0, 0,255, 96, 0, 0, 0,143, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0,107, 0,105,102, 0, 0,112, + 110,111,108,112,106,103, 0,119, 0,118,109, 0, 99,104,119, 0, + +}; + +static const unsigned char atkbd_unxlate_table[128] = { + + 0,118, 22, 30, 38, 37, 46, 54, 61, 62, 70, 69, 78, 85,102, 13, + 21, 29, 36, 45, 44, 53, 60, 67, 68, 77, 84, 91, 90, 20, 28, 27, + 35, 43, 52, 51, 59, 66, 75, 76, 82, 14, 18, 93, 26, 34, 33, 42, + 50, 49, 58, 65, 73, 74, 89,124, 17, 41, 88, 5, 6, 4, 12, 3, + 11, 2, 10, 1, 9,119,126,108,117,125,123,107,115,116,121,105, + 114,122,112,113,127, 96, 97,120, 7, 15, 23, 31, 39, 47, 55, 63, + 71, 79, 86, 94, 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 87,111, + 19, 25, 57, 81, 83, 92, 95, 98, 99,100,101,103,104,106,109,110 + +}; +#endif + +/* + * for (i = 0; i < 128; i++) { + * scancode2linux[i] = atkbd_set2_keycode[atkbd_unxlate_table[i]]; + * scancode2linux[i | 0x80] = atkbd_set2_keycode[atkbd_unxlate_table[i] | 0x80]; + * } + */ +static const unsigned char scancode2linux[512] = { + 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, + 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, + 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, + 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, + 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, + 80, 81, 82, 83, 99, 0, 86, 87, 88,117, 0, 0, 95,183,184,185, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 93, 0, 0, 89, 0, 0, 85, 91, 90, 92, 0, 94, 0,124,121, 0, + + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 165, 0, 0, 0, 0, 0, 0, 0, 0,163, 0, 0, 96, 97, 0, 0, + 113,140,164, 0,166, 0, 0, 0, 0, 0,255, 0, 0, 0,114, 0, + 115, 0,150, 0, 0, 98,255, 99,100, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0,119,119,102,103,104, 0,105,112,106,118,107, + 108,109,110,111, 0, 0, 0, 0, 0, 0, 0,125,126,127,116,142, + 0, 0, 0,143, 0,217,156,173,128,159,158,157,155,226, 0,112, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, +}; + +/* Send an event to the keyboard frontend driver */ +static int xenfb_kbd_event(struct XenInput *xenfb, + union xenkbd_in_event *event) +{ + struct xenkbd_page *page = xenfb->c.page; + uint32_t prod; + + if (xenfb->c.xendev.be_state != XenbusStateConnected) + return 0; + if (!page) + return 0; + + prod = page->in_prod; + if (prod - page->in_cons == XENKBD_IN_RING_LEN) { + errno = EAGAIN; + return -1; + } + + xen_mb(); /* ensure ring space available */ + XENKBD_IN_RING_REF(page, prod) = *event; + xen_wmb(); /* ensure ring contents visible */ + page->in_prod = prod + 1; + return xen_be_send_notify(&xenfb->c.xendev); +} + +/* Send a keyboard (or mouse button) event */ +static int xenfb_send_key(struct XenInput *xenfb, bool down, int keycode) +{ + union xenkbd_in_event event; + + memset(&event, 0, XENKBD_IN_EVENT_SIZE); + event.type = XENKBD_TYPE_KEY; + event.key.pressed = down ? 1 : 0; + event.key.keycode = keycode; + + return xenfb_kbd_event(xenfb, &event); +} + +/* Send a relative mouse movement event */ +static int xenfb_send_motion(struct XenInput *xenfb, + int rel_x, int rel_y, int rel_z) +{ + union xenkbd_in_event event; + + memset(&event, 0, XENKBD_IN_EVENT_SIZE); + event.type = XENKBD_TYPE_MOTION; + event.motion.rel_x = rel_x; + event.motion.rel_y = rel_y; +#if __XEN_LATEST_INTERFACE_VERSION__ >= 0x00030207 + event.motion.rel_z = rel_z; +#endif + + return xenfb_kbd_event(xenfb, &event); +} + +/* Send an absolute mouse movement event */ +static int xenfb_send_position(struct XenInput *xenfb, + int abs_x, int abs_y, int z) +{ + union xenkbd_in_event event; + + memset(&event, 0, XENKBD_IN_EVENT_SIZE); + event.type = XENKBD_TYPE_POS; + event.pos.abs_x = abs_x; + event.pos.abs_y = abs_y; +#if __XEN_LATEST_INTERFACE_VERSION__ == 0x00030207 + event.pos.abs_z = z; +#endif +#if __XEN_LATEST_INTERFACE_VERSION__ >= 0x00030208 + event.pos.rel_z = z; +#endif + + return xenfb_kbd_event(xenfb, &event); +} + +/* + * Send a key event from the client to the guest OS + * QEMU gives us a raw scancode from an AT / PS/2 style keyboard. + * We have to turn this into a Linux Input layer keycode. + * + * Extra complexity from the fact that with extended scancodes + * (like those produced by arrow keys) this method gets called + * twice, but we only want to send a single event. So we have to + * track the ''0xe0'' scancode state & collapse the extended keys + * as needed. + * + * Wish we could just send scancodes straight to the guest which + * already has code for dealing with this... + */ +static void xenfb_key_event(void *opaque, int scancode) +{ + struct XenInput *xenfb = opaque; + int down = 1; + + if (scancode == 0xe0) { + xenfb->extended = 1; + return; + } else if (scancode & 0x80) { + scancode &= 0x7f; + down = 0; + } + if (xenfb->extended) { + scancode |= 0x80; + xenfb->extended = 0; + } + xenfb_send_key(xenfb, down, scancode2linux[scancode]); +} + +/* + * Send a mouse event from the client to the guest OS + * + * The QEMU mouse can be in either relative, or absolute mode. + * Movement is sent separately from button state, which has to + * be encoded as virtual key events. We also don''t actually get + * given any button up/down events, so have to track changes in + * the button state. + */ +static void xenfb_mouse_event(void *opaque, + int dx, int dy, int dz, int button_state) +{ + struct XenInput *xenfb = opaque; + int i; + + if (xenfb->abs_pointer_wanted) + xenfb_send_position(xenfb, + dx * (xenfb->c.ds->width - 1) / 0x7fff, + dy * (xenfb->c.ds->height - 1) / 0x7fff, + dz); + else + xenfb_send_motion(xenfb, dx, dy, dz); + + for (i = 0 ; i < 8 ; i++) { + int lastDown = xenfb->button_state & (1 << i); + int down = button_state & (1 << i); + if (down == lastDown) + continue; + + if (xenfb_send_key(xenfb, down, BTN_LEFT+i) < 0) + return; + } + xenfb->button_state = button_state; +} + +static int input_init(struct XenDevice *xendev) +{ + struct XenInput *in = container_of(xendev, struct XenInput, c.xendev); + + if (!in->c.ds) { + /* xen_set_display() below will set that and trigger us again */ + xen_be_printf(xendev, 1, "ds not set (yet)\n"); + return -1; + } + + xenstore_write_be_int(xendev, "feature-abs-pointer", 1); + return 0; +} + +static int input_connect(struct XenDevice *xendev) +{ + struct XenInput *in = container_of(xendev, struct XenInput, c.xendev); + int rc; + + if (-1 == xenstore_read_fe_int(xendev, "request-abs-pointer", &in->abs_pointer_wanted)) + in->abs_pointer_wanted = 0; + + rc = common_bind(&in->c); + if (0 != rc) + return rc; + + qemu_add_kbd_event_handler(xenfb_key_event, in); + in->qmouse = qemu_add_mouse_event_handler(xenfb_mouse_event, in, + in->abs_pointer_wanted, + "Xen PVFB Mouse"); + return 0; +} + +static void input_disconnect(struct XenDevice *xendev) +{ + struct XenInput *in = container_of(xendev, struct XenInput, c.xendev); + + if (in->qmouse) { + qemu_remove_mouse_event_handler(in->qmouse); + in->qmouse = NULL; + } + qemu_add_kbd_event_handler(NULL, NULL); + common_unbind(&in->c); +} + +static void input_event(struct XenDevice *xendev) +{ + struct XenInput *xenfb = container_of(xendev, struct XenInput, c.xendev); + struct xenkbd_page *page = xenfb->c.page; + + /* We don''t understand any keyboard events, so just ignore them. */ + if (page->out_prod == page->out_cons) + return; + page->out_cons = page->out_prod; + xen_be_send_notify(&xenfb->c.xendev); +} + +/* -------------------------------------------------------------------- */ + +static void xenfb_copy_mfns(int mode, int count, unsigned long *dst, void *src) +{ + uint32_t *src32 = src; + uint64_t *src64 = src; + int i; + + for (i = 0; i < count; i++) + dst[i] = (mode == 32) ? src32[i] : src64[i]; +} + +static int xenfb_map_fb(struct XenFB *xenfb) +{ + struct xenfb_page *page = xenfb->c.page; + char *protocol = xenfb->c.xendev.protocol; + int n_fbdirs; + unsigned long *pgmfns = NULL; + unsigned long *fbmfns = NULL; + void *map, *pd; + int mode, ret = -1; + + /* default to native */ + pd = page->pd; + mode = sizeof(unsigned long) * 8; + + if (!protocol) { + /* + * Undefined protocol, some guesswork needed. + * + * Old frontends which don''t set the protocol use + * one page directory only, thus pd[1] must be zero. + * pd[1] of the 32bit struct layout and the lower + * 32 bits of pd[0] of the 64bit struct layout have + * the same location, so we can check that ... + */ + uint32_t *ptr32 = NULL; + uint32_t *ptr64 = NULL; +#if defined(__i386__) + ptr32 = (void*)page->pd; + ptr64 = ((void*)page->pd) + 4; +#elif defined(__x86_64__) + ptr32 = ((void*)page->pd) - 4; + ptr64 = (void*)page->pd; +#endif + if (ptr32) { + if (0 == ptr32[1]) { + mode = 32; + pd = ptr32; + } else { + mode = 64; + pd = ptr64; + } + } +#if defined(__x86_64__) + } else if (0 == strcmp(protocol, XEN_IO_PROTO_ABI_X86_32)) { + /* 64bit dom0, 32bit domU */ + mode = 32; + pd = ((void*)page->pd) - 4; +#elif defined(__i386__) + } else if (0 == strcmp(protocol, XEN_IO_PROTO_ABI_X86_64)) { + /* 32bit dom0, 64bit domU */ + mode = 64; + pd = ((void*)page->pd) + 4; +#endif + } + + if (xenfb->pixels) { + munmap(xenfb->pixels, xenfb->fbpages * XC_PAGE_SIZE); + xenfb->pixels = NULL; + } + + xenfb->fbpages = (xenfb->fb_len + (XC_PAGE_SIZE - 1)) / XC_PAGE_SIZE; + n_fbdirs = xenfb->fbpages * mode / 8; + n_fbdirs = (n_fbdirs + (XC_PAGE_SIZE - 1)) / XC_PAGE_SIZE; + + pgmfns = qemu_mallocz(sizeof(unsigned long) * n_fbdirs); + fbmfns = qemu_mallocz(sizeof(unsigned long) * xenfb->fbpages); + if (!pgmfns || !fbmfns) + goto out; + + xenfb_copy_mfns(mode, n_fbdirs, pgmfns, pd); + map = xc_map_foreign_pages(xen_xc, xenfb->c.xendev.dom, + PROT_READ, pgmfns, n_fbdirs); + if (map == NULL) + goto out; + xenfb_copy_mfns(mode, xenfb->fbpages, fbmfns, map); + munmap(map, n_fbdirs * XC_PAGE_SIZE); + + xenfb->pixels = xc_map_foreign_pages(xen_xc, xenfb->c.xendev.dom, + PROT_READ | PROT_WRITE, fbmfns, xenfb->fbpages); + if (xenfb->pixels == NULL) + goto out; + + ret = 0; /* all is fine */ + +out: + qemu_free(pgmfns); + qemu_free(fbmfns); + return ret; +} + +static int xenfb_configure_fb(struct XenFB *xenfb, size_t fb_len_lim, + int width, int height, int depth, + size_t fb_len, int offset, int row_stride) +{ + size_t mfn_sz = sizeof(*((struct xenfb_page *)0)->pd); + size_t pd_len = sizeof(((struct xenfb_page *)0)->pd) / mfn_sz; + size_t fb_pages = pd_len * XC_PAGE_SIZE / mfn_sz; + size_t fb_len_max = fb_pages * XC_PAGE_SIZE; + int max_width, max_height; + + if (fb_len_lim > fb_len_max) { + xen_be_printf(&xenfb->c.xendev, 0, "fb size limit %zu exceeds %zu, corrected\n", + fb_len_lim, fb_len_max); + fb_len_lim = fb_len_max; + } + if (fb_len_lim && fb_len > fb_len_lim) { + xen_be_printf(&xenfb->c.xendev, 0, "frontend fb size %zu limited to %zu\n", + fb_len, fb_len_lim); + fb_len = fb_len_lim; + } + if (depth != 8 && depth != 16 && depth != 24 && depth != 32) { + xen_be_printf(&xenfb->c.xendev, 0, "can''t handle frontend fb depth %d\n", + depth); + return -1; + } + if (row_stride < 0 || row_stride > fb_len) { + xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend stride %d\n", row_stride); + return -1; + } + max_width = row_stride / (depth / 8); + if (width < 0 || width > max_width) { + xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend width %d limited to %d\n", + width, max_width); + width = max_width; + } + if (offset < 0 || offset >= fb_len) { + xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend offset %d (max %zu)\n", + offset, fb_len - 1); + return -1; + } + max_height = (fb_len - offset) / row_stride; + if (height < 0 || height > max_height) { + xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend height %d limited to %d\n", + height, max_height); + height = max_height; + } + xenfb->fb_len = fb_len; + xenfb->row_stride = row_stride; + xenfb->depth = depth; + xenfb->width = width; + xenfb->height = height; + xenfb->offset = offset; + xen_be_printf(&xenfb->c.xendev, 1, "framebuffer %dx%dx%d offset %d stride %d\n", + width, height, depth, offset, row_stride); + return 0; +} + +/* A convenient function for munging pixels between different depths */ +#define BLT(SRC_T,DST_T,RSB,GSB,BSB,RDB,GDB,BDB) \ + for (line = y ; line < (y+h) ; line++) { \ + SRC_T *src = (SRC_T *)(xenfb->pixels \ + + xenfb->offset \ + + (line * xenfb->row_stride) \ + + (x * xenfb->depth / 8)); \ + DST_T *dst = (DST_T *)(xenfb->c.ds->data \ + + (line * xenfb->c.ds->linesize) \ + + (x * xenfb->c.ds->depth / 8)); \ + int col; \ + const int RSS = 32 - (RSB + GSB + BSB); \ + const int GSS = 32 - (GSB + BSB); \ + const int BSS = 32 - (BSB); \ + const uint32_t RSM = (~0U) << (32 - RSB); \ + const uint32_t GSM = (~0U) << (32 - GSB); \ + const uint32_t BSM = (~0U) << (32 - BSB); \ + const int RDS = 32 - (RDB + GDB + BDB); \ + const int GDS = 32 - (GDB + BDB); \ + const int BDS = 32 - (BDB); \ + const uint32_t RDM = (~0U) << (32 - RDB); \ + const uint32_t GDM = (~0U) << (32 - GDB); \ + const uint32_t BDM = (~0U) << (32 - BDB); \ + for (col = x ; col < (x+w) ; col++) { \ + uint32_t spix = *src; \ + *dst = (((spix << RSS) & RSM & RDM) >> RDS) | \ + (((spix << GSS) & GSM & GDM) >> GDS) | \ + (((spix << BSS) & BSM & BDM) >> BDS); \ + src = (SRC_T *) ((unsigned long) src + xenfb->depth / 8); \ + dst = (DST_T *) ((unsigned long) dst + xenfb->c.ds->depth / 8); \ + } \ + } + + +/* This copies data from the guest framebuffer region, into QEMU''s copy + * NB. QEMU''s copy is stored in the pixel format of a) the local X + * server (SDL case) or b) the current VNC client pixel format. + * When shifting between colour depths we preserve the MSB. + */ +static void xenfb_guest_copy(struct XenFB *xenfb, int x, int y, int w, int h) +{ + int line; + + /* + * TODO: xen''s qemu-dm seems to have some patches to + * make the qemu display code avoid unneeded + * work. + * - Port them over. + * - Put ds->shared_buf back into use then. + */ + if (1 /* !xenfb->c.ds->shared_buf */) { + if (xenfb->depth == xenfb->c.ds->depth) { /* Perfect match can use fast path */ + for (line = y ; line < (y+h) ; line++) { + memcpy(xenfb->c.ds->data + (line * xenfb->c.ds->linesize) + (x * xenfb->c.ds->depth / 8), + xenfb->pixels + xenfb->offset + (line * xenfb->row_stride) + (x * xenfb->depth / 8), + w * xenfb->depth / 8); + } + } else { /* Mismatch requires slow pixel munging */ + /* 8 bit == r:3 g:3 b:2 */ + /* 16 bit == r:5 g:6 b:5 */ + /* 24 bit == r:8 g:8 b:8 */ + /* 32 bit == r:8 g:8 b:8 (padding:8) */ + if (xenfb->depth == 8) { + if (xenfb->c.ds->depth == 16) { + BLT(uint8_t, uint16_t, 3, 3, 2, 5, 6, 5); + } else if (xenfb->c.ds->depth == 32) { + BLT(uint8_t, uint32_t, 3, 3, 2, 8, 8, 8); + } + } else if (xenfb->depth == 16) { + if (xenfb->c.ds->depth == 8) { + BLT(uint16_t, uint8_t, 5, 6, 5, 3, 3, 2); + } else if (xenfb->c.ds->depth == 32) { + BLT(uint16_t, uint32_t, 5, 6, 5, 8, 8, 8); + } + } else if (xenfb->depth == 24 || xenfb->depth == 32) { + if (xenfb->c.ds->depth == 8) { + BLT(uint32_t, uint8_t, 8, 8, 8, 3, 3, 2); + } else if (xenfb->c.ds->depth == 16) { + BLT(uint32_t, uint16_t, 8, 8, 8, 5, 6, 5); + } else if (xenfb->c.ds->depth == 32) { + BLT(uint32_t, uint32_t, 8, 8, 8, 8, 8, 8); + } + } + } + } + dpy_update(xenfb->c.ds, x, y, w, h); +} + +#ifdef XENFB_TYPE_REFRESH_PERIOD +static int xenfb_queue_full(struct XenFB *xenfb) +{ + struct xenfb_page *page = xenfb->c.page; + uint32_t cons, prod; + + if (!page) + return 1; + + prod = page->in_prod; + cons = page->in_cons; + return prod - cons == XENFB_IN_RING_LEN; +} + +static void xenfb_send_event(struct XenFB *xenfb, union xenfb_in_event *event) +{ + uint32_t prod; + struct xenfb_page *page = xenfb->c.page; + + prod = page->in_prod; + /* caller ensures !xenfb_queue_full() */ + xen_mb(); /* ensure ring space available */ + XENFB_IN_RING_REF(page, prod) = *event; + xen_wmb(); /* ensure ring contents visible */ + page->in_prod = prod + 1; + + xen_be_send_notify(&xenfb->c.xendev); +} + +static void xenfb_send_refresh_period(struct XenFB *xenfb, int period) +{ + union xenfb_in_event event; + + memset(&event, 0, sizeof(event)); + event.type = XENFB_TYPE_REFRESH_PERIOD; + event.refresh_period.period = period; + xenfb_send_event(xenfb, &event); +} +#endif + +/* + * Periodic update of display. + * Also transmit the refresh interval to the frontend. + * + * Never ever do any qemu display operations + * (resize, screen update) outside this function. + * Our screen might be inactive. When asked for + * an update we know it is active. + */ +static void xenfb_update(void *opaque) +{ + struct XenFB *xenfb = opaque; + int i; + + if (xenfb->feature_update) { +#ifdef XENFB_TYPE_REFRESH_PERIOD + int period; + + if (xenfb_queue_full(xenfb)) + return; + + if (xenfb->c.ds->idle) + period = XENFB_NO_REFRESH; + else { + period = xenfb->c.ds->gui_timer_interval; + if (!period) + period = GUI_REFRESH_INTERVAL; + } + + if (xenfb->refresh_period != period) { + xenfb_send_refresh_period(xenfb, period); + xenfb->refresh_period = period; + } +#else + ; /* nothing */ +#endif + } else { + /* we don''t get update notifications, thus use the + * sledge hammer approach ... */ + xenfb->up_fullscreen = 1; + } + + /* resize if needed */ + if (xenfb->width != xenfb->c.ds->width || xenfb->height != xenfb->c.ds->height) { + xen_be_printf(&xenfb->c.xendev, 1, "update: resizing\n"); + dpy_resize(xenfb->c.ds, xenfb->width, xenfb->height); + xenfb->up_fullscreen = 1; + } + + /* run queued updates */ + if (xenfb->up_fullscreen) { + xen_be_printf(&xenfb->c.xendev, 3, "update: fullscreen\n"); + xenfb_guest_copy(xenfb, 0, 0, xenfb->width, xenfb->height); + } else if (xenfb->up_count) { + xen_be_printf(&xenfb->c.xendev, 3, "update: %d rects\n", xenfb->up_count); + for (i = 0; i < xenfb->up_count; i++) + xenfb_guest_copy(xenfb, + xenfb->up_rects[i].x, + xenfb->up_rects[i].y, + xenfb->up_rects[i].w, + xenfb->up_rects[i].h); + } else { + xen_be_printf(&xenfb->c.xendev, 3, "update: nothing\n"); + } + xenfb->up_count = 0; + xenfb->up_fullscreen = 0; +} + +/* QEMU display state changed, so refresh the framebuffer copy */ +static void xenfb_invalidate(void *opaque) +{ + struct XenFB *xenfb = opaque; + xenfb->up_fullscreen = 1; +} + +static void xenfb_handle_events(struct XenFB *xenfb) +{ + uint32_t prod, cons; + struct xenfb_page *page = xenfb->c.page; + + prod = page->out_prod; + if (prod == page->out_cons) + return; + xen_rmb(); /* ensure we see ring contents up to prod */ + for (cons = page->out_cons; cons != prod; cons++) { + union xenfb_out_event *event = &XENFB_OUT_RING_REF(page, cons); + int x, y, w, h; + + switch (event->type) { + case XENFB_TYPE_UPDATE: + if (xenfb->up_count == UP_QUEUE) + xenfb->up_fullscreen = 1; + if (xenfb->up_fullscreen) + break; + x = MAX(event->update.x, 0); + y = MAX(event->update.y, 0); + w = MIN(event->update.width, xenfb->width - x); + h = MIN(event->update.height, xenfb->height - y); + if (w < 0 || h < 0) { + fprintf(stderr, "xen be: %s: bogus update ignored\n", + xenfb->c.xendev.name); + break; + } + if (x != event->update.x || y != event->update.y + || w != event->update.width + || h != event->update.height) { + fprintf(stderr, "xen be: %s: bogus update clipped\n", + xenfb->c.xendev.name); + } + if (w == xenfb->width && h > xenfb->height / 2) { + /* scroll detector: updated more than 50% of the lines, + * don''t bother keeping track of the rectangles then */ + xenfb->up_fullscreen = 1; + } else { + xenfb->up_rects[xenfb->up_count].x = x; + xenfb->up_rects[xenfb->up_count].y = y; + xenfb->up_rects[xenfb->up_count].w = w; + xenfb->up_rects[xenfb->up_count].h = h; + xenfb->up_count++; + } + break; +#ifdef XENFB_TYPE_RESIZE + case XENFB_TYPE_RESIZE: + if (xenfb_configure_fb(xenfb, xenfb->fb_len, + event->resize.width, + event->resize.height, + event->resize.depth, + xenfb->fb_len, + event->resize.offset, + event->resize.stride) < 0) + break; + xenfb_invalidate(xenfb); + break; +#endif + } + } + xen_mb(); /* ensure we''re done with ring contents */ + page->out_cons = cons; +} + +static int fb_init(struct XenDevice *xendev) +{ + struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev); + + if (!fb->c.ds) { + /* xen_set_display() below will set that and trigger us again */ + xen_be_printf(xendev, 1, "ds not set (yet)\n"); + return -1; + } + + fb->refresh_period = -1; + +#ifdef XENFB_TYPE_RESIZE + xenstore_write_be_int(xendev, "feature-resize", 1); +#endif + return 0; +} + +static int fb_connect(struct XenDevice *xendev) +{ + struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev); + struct xenfb_page *fb_page; + int videoram; + int rc; + + if (-1 == xenstore_read_fe_int(xendev, "feature-update", &fb->feature_update)) + fb->feature_update = 0; + if (fb->feature_update) + xenstore_write_be_int(xendev, "request-update", 1); + + if (-1 == xenstore_read_fe_int(xendev, "videoram", &videoram)) + videoram = 0; + + rc = common_bind(&fb->c); + if (0 != rc) + return rc; + + fb_page = fb->c.page; + rc = xenfb_configure_fb(fb, videoram * 1024 * 1024U, + fb_page->width, fb_page->height, fb_page->depth, + fb_page->mem_length, 0, fb_page->line_length); + if (0 != rc) + return rc; + + rc = xenfb_map_fb(fb); + if (0 != rc) + return rc; + + if (!fb->have_console) { + graphic_console_init(fb->c.ds, + xenfb_update, + xenfb_invalidate, + NULL, + NULL, + fb); + fb->have_console = 1; + } + + xen_be_printf(xendev, 1, "feature-update=%d, videoram=%d\n", + fb->feature_update, videoram); + + return 0; +} + +static void fb_disconnect(struct XenDevice *xendev) +{ + struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev); + + /* + * FIXME: qemu can''t un-init gfx display (yet?). + * Replacing the framebuffer with anonymous shared memory + * instead. This releases the guest pages and keeps qemu happy. + */ + fb->pixels = mmap(fb->pixels, fb->fbpages * XC_PAGE_SIZE, + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, + -1, 0); + common_unbind(&fb->c); + fb->feature_update = 0; + fb->bug_trigger = 0; +} + +static void fb_frontend_changed(struct XenDevice *xendev, const char *node) +{ + struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev); + + /* + * Set state to Connected *again* once the frontend switched + * to connected. We must trigger the watch a second time to + * workaround a frontend bug. + */ + if (0 == fb->bug_trigger && 0 == strcmp(node, "state") && + xendev->fe_state == XenbusStateConnected && + xendev->be_state == XenbusStateConnected) { + xen_be_set_state(xendev, XenbusStateConnected); + fb->bug_trigger = 1; /* only once */ + } +} + +static void fb_event(struct XenDevice *xendev) +{ + struct XenFB *xenfb = container_of(xendev, struct XenFB, c.xendev); + + xenfb_handle_events(xenfb); + xen_be_send_notify(&xenfb->c.xendev); +} + +/* -------------------------------------------------------------------- */ + +struct XenDevOps xen_kbdmouse_ops = { + .size = sizeof(struct XenInput), + .init = input_init, + .connect = input_connect, + .disconnect = input_disconnect, + .event = input_event, +}; + +struct XenDevOps xen_framebuffer_ops = { + .size = sizeof(struct XenFB), + .init = fb_init, + .connect = fb_connect, + .disconnect = fb_disconnect, + .event = fb_event, + .frontend_changed = fb_frontend_changed, +}; + +static void xen_set_display_type(int domid, char *type, DisplayState *ds) +{ + struct XenDevice *xendev; + struct common *c; + + xendev = xen_be_find_xendev(type, domid, 0); + if (!xendev) + return; + c = container_of(xendev, struct common, xendev); + c->ds = ds; + xen_be_printf(xendev, 1, "ds is %p\n", ds); + /* retry ->init() */ + xen_be_check_state(xendev); +} + +void xen_set_display(int domid, DisplayState *ds) +{ + xen_set_display_type(domid, "vkbd", ds); + xen_set_display_type(domid, "vfb", ds); +} diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 93f66d0..d1d0a93 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -99,6 +99,11 @@ static int xen_init_pv(DisplayState *ds) /* xenbus backend drivers */ xen_be_register("console", &xen_console_ops); + xen_be_register("vkbd", &xen_kbdmouse_ops); + xen_be_register("vfb", &xen_framebuffer_ops); + + /* setup framebuffer */ + xen_set_display(xen_domid, ds); return 0; } -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 09/13] xen: add block device backend driver.
This patch adds a block device backend driver to qemu. It is a pure userspace implemention using the gntdev interface. It uses "qdisk" as backend name in xenstore so it doesn''t interfere with the other existing backends (blkback aka "vbd" and tapdisk aka "tap"). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 2 +- hw/xen_backend.h | 2 + hw/xen_blkif.h | 103 ++++++++ hw/xen_disk.c | 693 +++++++++++++++++++++++++++++++++++++++++++++++++++ hw/xen_machine_pv.c | 1 + sysemu.h | 2 +- vl.c | 4 + 7 files changed, 805 insertions(+), 2 deletions(-) create mode 100644 hw/xen_blkif.h create mode 100644 hw/xen_disk.c diff --git a/Makefile.target b/Makefile.target index bcf4d93..a60ca16 100644 --- a/Makefile.target +++ b/Makefile.target @@ -517,7 +517,7 @@ endif # xen backend driver support XEN_OBJS := xen_machine_pv.o xen_backend.o -XEN_OBJS += xen_console.o xen_framebuffer.o +XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) LIBS += $(XEN_LIBS) diff --git a/hw/xen_backend.h b/hw/xen_backend.h index d8bc23a..9dbcb19 100644 --- a/hw/xen_backend.h +++ b/hw/xen_backend.h @@ -2,6 +2,7 @@ #define QEMU_XEN_BACKEND_H 1 #include "xen_common.h" +#include "sysemu.h" /* ------------------------------------------------------------- */ @@ -87,6 +88,7 @@ void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ... struct XenDevOps xen_console_ops; /* xen_console.c */ struct XenDevOps xen_kbdmouse_ops; /* xen_framebuffer.c */ struct XenDevOps xen_framebuffer_ops; /* xen_framebuffer.c */ +struct XenDevOps xen_blkdev_ops; /* xen_disk.c */ void xen_set_display(int domid, DisplayState *ds); diff --git a/hw/xen_blkif.h b/hw/xen_blkif.h new file mode 100644 index 0000000..254a5fd --- /dev/null +++ b/hw/xen_blkif.h @@ -0,0 +1,103 @@ +#ifndef __XEN_BLKIF_H__ +#define __XEN_BLKIF_H__ + +#include <xen/io/ring.h> +#include <xen/io/blkif.h> +#include <xen/io/protocols.h> + +/* Not a real protocol. Used to generate ring structs which contain + * the elements common to all protocols only. This way we get a + * compiler-checkable way to use common struct elements, so we can + * avoid using switch(protocol) in a number of places. */ +struct blkif_common_request { + char dummy; +}; +struct blkif_common_response { + char dummy; +}; + +/* i386 protocol version */ +#pragma pack(push, 4) +struct blkif_x86_32_request { + uint8_t operation; /* BLKIF_OP_??? */ + uint8_t nr_segments; /* number of segments */ + blkif_vdev_t handle; /* only for read/write requests */ + uint64_t id; /* private guest value, echoed in resp */ + blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ + struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; +}; +struct blkif_x86_32_response { + uint64_t id; /* copied from request */ + uint8_t operation; /* copied from request */ + int16_t status; /* BLKIF_RSP_??? */ +}; +typedef struct blkif_x86_32_request blkif_x86_32_request_t; +typedef struct blkif_x86_32_response blkif_x86_32_response_t; +#pragma pack(pop) + +/* x86_64 protocol version */ +struct blkif_x86_64_request { + uint8_t operation; /* BLKIF_OP_??? */ + uint8_t nr_segments; /* number of segments */ + blkif_vdev_t handle; /* only for read/write requests */ + uint64_t __attribute__((__aligned__(8))) id; + blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ + struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; +}; +struct blkif_x86_64_response { + uint64_t __attribute__((__aligned__(8))) id; + uint8_t operation; /* copied from request */ + int16_t status; /* BLKIF_RSP_??? */ +}; +typedef struct blkif_x86_64_request blkif_x86_64_request_t; +typedef struct blkif_x86_64_response blkif_x86_64_response_t; + +DEFINE_RING_TYPES(blkif_common, struct blkif_common_request, struct blkif_common_response); +DEFINE_RING_TYPES(blkif_x86_32, struct blkif_x86_32_request, struct blkif_x86_32_response); +DEFINE_RING_TYPES(blkif_x86_64, struct blkif_x86_64_request, struct blkif_x86_64_response); + +union blkif_back_rings { + blkif_back_ring_t native; + blkif_common_back_ring_t common; + blkif_x86_32_back_ring_t x86_32; + blkif_x86_64_back_ring_t x86_64; +}; +typedef union blkif_back_rings blkif_back_rings_t; + +enum blkif_protocol { + BLKIF_PROTOCOL_NATIVE = 1, + BLKIF_PROTOCOL_X86_32 = 2, + BLKIF_PROTOCOL_X86_64 = 3, +}; + +static void inline blkif_get_x86_32_req(blkif_request_t *dst, blkif_x86_32_request_t *src) +{ + int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST; + + dst->operation = src->operation; + dst->nr_segments = src->nr_segments; + dst->handle = src->handle; + dst->id = src->id; + dst->sector_number = src->sector_number; + if (n > src->nr_segments) + n = src->nr_segments; + for (i = 0; i < n; i++) + dst->seg[i] = src->seg[i]; +} + +static void inline blkif_get_x86_64_req(blkif_request_t *dst, blkif_x86_64_request_t *src) +{ + int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST; + + dst->operation = src->operation; + dst->nr_segments = src->nr_segments; + dst->handle = src->handle; + dst->id = src->id; + dst->sector_number = src->sector_number; + if (n > src->nr_segments) + n = src->nr_segments; + for (i = 0; i < n; i++) + dst->seg[i] = src->seg[i]; +} + +#endif /* __XEN_BLKIF_H__ */ diff --git a/hw/xen_disk.c b/hw/xen_disk.c new file mode 100644 index 0000000..5c1cbdf --- /dev/null +++ b/hw/xen_disk.c @@ -0,0 +1,693 @@ +/* + * xen paravirt block device backend + * + * (c) Gerd Hoffmann <kraxel@redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/* + * FIXME: the code is designed to handle multiple outstanding + * requests (using aio or using threads), which isn''t used right + * now due to limitations of the qemu block driver interface. + */ + +#include <stdio.h> +#include <stdlib.h> +#include <stdarg.h> +#include <string.h> +#include <unistd.h> +#include <signal.h> +#include <inttypes.h> +#include <time.h> +#include <fcntl.h> +#include <errno.h> +#include <sys/ioctl.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <sys/uio.h> + +#include <xs.h> +#include <xenctrl.h> +#include <xen/io/xenbus.h> + +#include "hw.h" +#include "block_int.h" +#include "qemu-char.h" +#include "xen_blkif.h" +#include "xen_backend.h" + +/* ------------------------------------------------------------- */ + +#define BLOCK_SIZE 512 +#define IOCB_COUNT (BLKIF_MAX_SEGMENTS_PER_REQUEST + 2) + +struct ioreq { + blkif_request_t req; + int16_t status; + + /* parsed request */ + off_t start, end; + struct iovec vec[BLKIF_MAX_SEGMENTS_PER_REQUEST]; + int vecs; + int presync; + int postsync; + + /* grant mapping */ + uint32_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST]; + uint32_t refs[BLKIF_MAX_SEGMENTS_PER_REQUEST]; + int prot; + void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST]; + void *pages; + + struct XenBlkDev *blkdev; + LIST_ENTRY(ioreq) list; +}; + +struct XenBlkDev { + struct XenDevice xendev; /* must be first */ + char *params; + char *mode; + char *type; + char *dev; + char *devtype; + char *fileproto; + char *filename; + int ring_ref; + void *sring; + int64_t file_blk; + int64_t file_size; + int protocol; + blkif_back_rings_t rings; + int more_work; + int cnt_map; + + /* request lists */ + LIST_HEAD(inflight_head, ioreq) inflight; + LIST_HEAD(finished_head, ioreq) finished; + LIST_HEAD(freelist_head, ioreq) freelist; + int requests; + + /* qemu block driver */ + int index; + BlockDriverState *bs; +}; + +static int syncwrite = 0; +static int batch_maps = 0; +static int max_requests = 32; + +/* ------------------------------------------------------------- */ + +static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) +{ + struct ioreq *ioreq = NULL; + + if (LIST_EMPTY(&blkdev->freelist)) { + if (blkdev->requests >= max_requests) + goto out; + /* allocate new struct */ + ioreq = qemu_mallocz(sizeof(*ioreq)); + if (NULL == ioreq) + goto out; + ioreq->blkdev = blkdev; + blkdev->requests++; + } else { + /* get one from freelist */ + ioreq = LIST_FIRST(&blkdev->freelist); + LIST_REMOVE(ioreq, list); + } + LIST_INSERT_HEAD(&blkdev->inflight, ioreq, list); + +out: + return ioreq; +} + +static void ioreq_finish(struct ioreq *ioreq) +{ + struct XenBlkDev *blkdev = ioreq->blkdev; + + LIST_REMOVE(ioreq, list); + LIST_INSERT_HEAD(&blkdev->finished, ioreq, list); +} + +static void ioreq_release(struct ioreq *ioreq) +{ + struct XenBlkDev *blkdev = ioreq->blkdev; + + LIST_REMOVE(ioreq, list); + memset(ioreq, 0, sizeof(*ioreq)); + ioreq->blkdev = blkdev; + LIST_INSERT_HEAD(&blkdev->freelist, ioreq, list); +} + +/* + * translate request into iovec + start offset + end offset + * do sanity checks along the way + */ +static int ioreq_parse(struct ioreq *ioreq) +{ + struct XenBlkDev *blkdev = ioreq->blkdev; + uintptr_t mem; + size_t len; + int i; + + xen_be_printf(&blkdev->xendev, 3, + "op %d, nr %d, handle %d, id %" PRId64 ", sector %" PRId64 "\n", + ioreq->req.operation, ioreq->req.nr_segments, + ioreq->req.handle, ioreq->req.id, ioreq->req.sector_number); + switch (ioreq->req.operation) { + case BLKIF_OP_READ: + ioreq->prot = PROT_WRITE; /* to memory */ + if (BLKIF_OP_READ != ioreq->req.operation && blkdev->mode[0] != ''w'') { + xen_be_printf(&blkdev->xendev, 0, "error: write req for ro device\n"); + goto err; + } + break; + case BLKIF_OP_WRITE_BARRIER: + if (!syncwrite) + ioreq->presync = ioreq->postsync = 1; + /* fall through */ + case BLKIF_OP_WRITE: + ioreq->prot = PROT_READ; /* from memory */ + if (syncwrite) + ioreq->postsync = 1; + break; + default: + xen_be_printf(&blkdev->xendev, 0, "error: unknown operation (%d)\n", + ioreq->req.operation); + goto err; + }; + + ioreq->start = ioreq->end = ioreq->req.sector_number * blkdev->file_blk; + for (i = 0; i < ioreq->req.nr_segments; i++) { + if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) { + xen_be_printf(&blkdev->xendev, 0, "error: nr_segments too big\n"); + goto err; + } + if (ioreq->req.seg[i].first_sect > ioreq->req.seg[i].last_sect) { + xen_be_printf(&blkdev->xendev, 0, "error: first > last sector\n"); + goto err; + } + if (ioreq->req.seg[i].last_sect * BLOCK_SIZE >= XC_PAGE_SIZE) { + xen_be_printf(&blkdev->xendev, 0, "error: page crossing\n"); + goto err; + } + len = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) * blkdev->file_blk; + + ioreq->domids[i] = blkdev->xendev.dom; + ioreq->refs[i] = ioreq->req.seg[i].gref; + mem = ioreq->req.seg[i].first_sect * blkdev->file_blk; + + ioreq->vec[i].iov_base = (void*)mem; + ioreq->vec[i].iov_len = len; + ioreq->end += len; + } + if (ioreq->end > blkdev->file_size) { + xen_be_printf(&blkdev->xendev, 0, "error: access beyond end of file\n"); + goto err; + } + ioreq->vecs = i; + return 0; + +err: + ioreq->status = BLKIF_RSP_ERROR; + return -1; +} + +static void ioreq_unmap(struct ioreq *ioreq) +{ + int gnt = ioreq->blkdev->xendev.gnttabdev; + int i; + + if (batch_maps) { + if (!ioreq->pages) + return; + if (0 != xc_gnttab_munmap(gnt, ioreq->pages, ioreq->vecs)) + xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n", + strerror(errno)); + ioreq->blkdev->cnt_map -= ioreq->vecs; + ioreq->pages = NULL; + } else { + for (i = 0; i < ioreq->vecs; i++) { + if (!ioreq->page[i]) + continue; + if (0 != xc_gnttab_munmap(gnt, ioreq->page[i], 1)) + xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n", + strerror(errno)); + ioreq->blkdev->cnt_map--; + ioreq->page[i] = NULL; + } + } +} + +static int ioreq_map(struct ioreq *ioreq) +{ + int gnt = ioreq->blkdev->xendev.gnttabdev; + int i; + + if (batch_maps) { + ioreq->pages = xc_gnttab_map_grant_refs + (gnt, ioreq->vecs, ioreq->domids, ioreq->refs, ioreq->prot); + if (NULL == ioreq->pages) { + xen_be_printf(&ioreq->blkdev->xendev, 0, + "can''t map %d grant refs (%s, %d maps)\n", + ioreq->vecs, strerror(errno), ioreq->blkdev->cnt_map); + return -1; + } + for (i = 0; i < ioreq->vecs; i++) + ioreq->vec[i].iov_base = ioreq->pages + i * XC_PAGE_SIZE + + (uintptr_t)ioreq->vec[i].iov_base; + ioreq->blkdev->cnt_map += ioreq->vecs; + } else { + for (i = 0; i < ioreq->vecs; i++) { + ioreq->page[i] = xc_gnttab_map_grant_ref + (gnt, ioreq->domids[i], ioreq->refs[i], ioreq->prot); + if (NULL == ioreq->page[i]) { + xen_be_printf(&ioreq->blkdev->xendev, 0, + "can''t map grant ref %d (%s, %d maps)\n", + ioreq->refs[i], strerror(errno), ioreq->blkdev->cnt_map); + ioreq_unmap(ioreq); + return -1; + } + ioreq->vec[i].iov_base = ioreq->page[i] + (uintptr_t)ioreq->vec[i].iov_base; + ioreq->blkdev->cnt_map++; + } + } + return 0; +} + +static int ioreq_runio_qemu(struct ioreq *ioreq) +{ + struct XenBlkDev *blkdev = ioreq->blkdev; + int i, rc, len = 0; + off_t pos; + + if (-1 == ioreq_map(ioreq)) + goto err; + if (ioreq->presync) + bdrv_flush(blkdev->bs); + + switch (ioreq->req.operation) { + case BLKIF_OP_READ: + pos = ioreq->start; + for (i = 0; i < ioreq->vecs; i++) { + rc = bdrv_read(blkdev->bs, pos / BLOCK_SIZE, + ioreq->vec[i].iov_base, + ioreq->vec[i].iov_len / BLOCK_SIZE); + if (rc != 0) { + xen_be_printf(&blkdev->xendev, 0, "rd I/O error (%p, len %zd)\n", + ioreq->vec[i].iov_base, + ioreq->vec[i].iov_len); + goto err; + } + len += ioreq->vec[i].iov_len; + pos += ioreq->vec[i].iov_len; + } + break; + case BLKIF_OP_WRITE: + case BLKIF_OP_WRITE_BARRIER: + pos = ioreq->start; + for (i = 0; i < ioreq->vecs; i++) { + rc = bdrv_write(blkdev->bs, pos / BLOCK_SIZE, + ioreq->vec[i].iov_base, + ioreq->vec[i].iov_len / BLOCK_SIZE); + if (rc != 0) { + xen_be_printf(&blkdev->xendev, 0, "wr I/O error (%p, len %zd)\n", + ioreq->vec[i].iov_base, + ioreq->vec[i].iov_len); + goto err; + } + len += ioreq->vec[i].iov_len; + pos += ioreq->vec[i].iov_len; + } + break; + default: + /* unknown operation (shouldn''t happen -- parse catches this) */ + goto err; + } + + if (ioreq->postsync) + bdrv_flush(blkdev->bs); + ioreq->status = BLKIF_RSP_OKAY; + + ioreq_unmap(ioreq); + return 0; + +err: + ioreq->status = BLKIF_RSP_ERROR; + return -1; +} + +static int blk_send_response_one(struct ioreq *ioreq) +{ + struct XenBlkDev *blkdev = ioreq->blkdev; + int send_notify = 0; + int have_requests = 0; + blkif_response_t resp; + void *dst; + + resp.id = ioreq->req.id; + resp.operation = ioreq->req.operation; + resp.status = ioreq->status; + + /* Place on the response ring for the relevant domain. */ + switch (blkdev->protocol) { + case BLKIF_PROTOCOL_NATIVE: + dst = RING_GET_RESPONSE(&blkdev->rings.native, blkdev->rings.native.rsp_prod_pvt); + break; + case BLKIF_PROTOCOL_X86_32: + dst = RING_GET_RESPONSE(&blkdev->rings.x86_32, blkdev->rings.x86_32.rsp_prod_pvt); + break; + case BLKIF_PROTOCOL_X86_64: + dst = RING_GET_RESPONSE(&blkdev->rings.x86_64, blkdev->rings.x86_64.rsp_prod_pvt); + break; + default: + dst = NULL; + } + memcpy(dst, &resp, sizeof(resp)); + blkdev->rings.common.rsp_prod_pvt++; + + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blkdev->rings.common, send_notify); + if (blkdev->rings.common.rsp_prod_pvt == blkdev->rings.common.req_cons) { + /* + * Tail check for pending requests. Allows frontend to avoid + * notifications if requests are already in flight (lower + * overheads and promotes batching). + */ + RING_FINAL_CHECK_FOR_REQUESTS(&blkdev->rings.common, have_requests); + } else if (RING_HAS_UNCONSUMED_REQUESTS(&blkdev->rings.common)) { + have_requests = 1; + } + + if (have_requests) + blkdev->more_work++; + return send_notify; +} + +/* walk finished list, send outstanding responses, free requests */ +static void blk_send_response_all(struct XenBlkDev *blkdev) +{ + struct ioreq *ioreq; + int send_notify = 0; + + while (!LIST_EMPTY(&blkdev->finished)) { + ioreq = LIST_FIRST(&blkdev->finished); + send_notify += blk_send_response_one(ioreq); + ioreq_release(ioreq); + } + if (send_notify) + xen_be_send_notify(&blkdev->xendev); +} + +static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq, RING_IDX rc) +{ + switch (blkdev->protocol) { + case BLKIF_PROTOCOL_NATIVE: + memcpy(&ioreq->req, RING_GET_REQUEST(&blkdev->rings.native, rc), + sizeof(ioreq->req)); + break; + case BLKIF_PROTOCOL_X86_32: + blkif_get_x86_32_req(&ioreq->req, RING_GET_REQUEST(&blkdev->rings.x86_32, rc)); + break; + case BLKIF_PROTOCOL_X86_64: + blkif_get_x86_64_req(&ioreq->req, RING_GET_REQUEST(&blkdev->rings.x86_64, rc)); + break; + } + return 0; +} + +static void blk_handle_requests(struct XenBlkDev *blkdev) +{ + RING_IDX rc, rp; + struct ioreq *ioreq; + + do { + blkdev->more_work = 0; + + rc = blkdev->rings.common.req_cons; + rp = blkdev->rings.common.sring->req_prod; + xen_rmb(); /* Ensure we see queued requests up to ''rp''. */ + + /* Limit #of requests we queue up for I/O so we ack requests + * faster if busy. Improves backend/frontend parallelism and + * reduces evchn signaling. */ + if (rp > rc + (max_requests >> 2)) { + rp = rc + (max_requests >> 2); + blkdev->more_work++; + } + + while ((rc != rp)) { + /* pull request from ring */ + if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc)) + break; + ioreq = ioreq_start(blkdev); + if (NULL == ioreq) { + blkdev->more_work++; + break; + } + blk_get_request(blkdev, ioreq, rc); + blkdev->rings.common.req_cons = ++rc; + + /* parse them */ + if (0 != ioreq_parse(ioreq)) { + if (blk_send_response_one(ioreq)) + xen_be_send_notify(&blkdev->xendev); + ioreq_release(ioreq); + continue; + } + + /* run i/o in qemu mode */ + ioreq_runio_qemu(ioreq); + ioreq_finish(ioreq); + } + blk_send_response_all(blkdev); + + } while (blkdev->more_work); +} + +/* ------------------------------------------------------------- */ + +static void blk_alloc(struct XenDevice *xendev) +{ + struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + + LIST_INIT(&blkdev->inflight); + LIST_INIT(&blkdev->finished); + LIST_INIT(&blkdev->freelist); +} + +static int blk_init(struct XenDevice *xendev) +{ + struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + int mode, qflags, have_barriers, info = 0; + char *h; + + /* read xenstore entries */ + if (NULL == blkdev->params) { + blkdev->params = xenstore_read_be_str(&blkdev->xendev, "params"); + if (NULL != (h = strchr(blkdev->params, '':''))) { + blkdev->fileproto = blkdev->params; + blkdev->filename = h+1; + *h = 0; + } else { + blkdev->fileproto = "<unset>"; + blkdev->filename = blkdev->params; + } + } + if (NULL == blkdev->mode) + blkdev->mode = xenstore_read_be_str(&blkdev->xendev, "mode"); + if (NULL == blkdev->type) + blkdev->type = xenstore_read_be_str(&blkdev->xendev, "type"); + if (NULL == blkdev->dev) + blkdev->dev = xenstore_read_be_str(&blkdev->xendev, "dev"); + if (NULL == blkdev->devtype) + blkdev->devtype = xenstore_read_be_str(&blkdev->xendev, "device-type"); + + /* do we have all we need? */ + if (NULL == blkdev->params || + NULL == blkdev->mode || + NULL == blkdev->type || + NULL == blkdev->dev) + return -1; + + /* read-only ? */ + if (0 == strcmp(blkdev->mode, "w")) { + mode = O_RDWR; + qflags = BDRV_O_RDWR; + } else { + mode = O_RDONLY; + qflags = BDRV_O_RDONLY; + info |= VDISK_READONLY; + } + + /* cdrom ? */ + if (blkdev->devtype && !strcmp(blkdev->devtype, "cdrom")) + info |= VDISK_CDROM; + + /* init qemu block driver */ + blkdev->index = (blkdev->xendev.dev - 202 * 256) / 16; + blkdev->index = drive_get_index(IF_XEN, 0, blkdev->index); + if (blkdev->index == -1) { + /* setup via xenbus -> create new block driver instance */ + xen_be_printf(&blkdev->xendev, 2, "create new bdrv (xenbus setup)\n"); + blkdev->bs = bdrv_new(blkdev->dev); + if (blkdev->bs) { + if (0 != bdrv_open2(blkdev->bs, blkdev->filename, qflags, + bdrv_find_format(blkdev->fileproto))) { + bdrv_delete(blkdev->bs); + blkdev->bs = NULL; + } + } + if (!blkdev->bs) + return -1; + } else { + /* setup via qemu cmdline -> already setup for us */ + xen_be_printf(&blkdev->xendev, 2, "get configured bdrv (cmdline setup)\n"); + blkdev->bs = drives_table[blkdev->index].bdrv; + } + blkdev->file_blk = BLOCK_SIZE; + blkdev->file_size = bdrv_getlength(blkdev->bs); + if (blkdev->file_size < 0) { + xen_be_printf(&blkdev->xendev, 1, "bdrv_getlength: %d (%s) | drv %s\n", + (int)blkdev->file_size, strerror(-blkdev->file_size), + blkdev->bs->drv ? blkdev->bs->drv->format_name : "-"); + blkdev->file_size = 0; + } + have_barriers = blkdev->bs->drv && blkdev->bs->drv->bdrv_flush ? 1 : 0; + + xen_be_printf(xendev, 1, "type \"%s\", fileproto \"%s\", filename \"%s\"," + " size %" PRId64 " (%" PRId64 " MB)\n", + blkdev->type, blkdev->fileproto, blkdev->filename, + blkdev->file_size, blkdev->file_size >> 20); + + /* fill info */ + xenstore_write_be_int(&blkdev->xendev, "feature-barrier", have_barriers); + xenstore_write_be_int(&blkdev->xendev, "info", info); + xenstore_write_be_int(&blkdev->xendev, "sector-size", blkdev->file_blk); + xenstore_write_be_int(&blkdev->xendev, "sectors", + blkdev->file_size / blkdev->file_blk); + return 0; +} + +static int blk_connect(struct XenDevice *xendev) +{ + struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + + if (-1 == xenstore_read_fe_int(&blkdev->xendev, "ring-ref", &blkdev->ring_ref)) + return -1; + if (-1 == xenstore_read_fe_int(&blkdev->xendev, "event-channel", + &blkdev->xendev.remote_port)) + return -1; + + blkdev->protocol = BLKIF_PROTOCOL_NATIVE; + if (blkdev->xendev.protocol) { + if (0 == strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_32)) + blkdev->protocol = BLKIF_PROTOCOL_X86_32; + if (0 == strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_64)) + blkdev->protocol = BLKIF_PROTOCOL_X86_64; + } + + blkdev->sring = xc_gnttab_map_grant_ref(blkdev->xendev.gnttabdev, + blkdev->xendev.dom, + blkdev->ring_ref, + PROT_READ | PROT_WRITE); + if (!blkdev->sring) + return -1; + blkdev->cnt_map++; + + switch (blkdev->protocol) { + case BLKIF_PROTOCOL_NATIVE: + { + blkif_sring_t *sring_native = blkdev->sring; + BACK_RING_INIT(&blkdev->rings.native, sring_native, XC_PAGE_SIZE); + break; + } + case BLKIF_PROTOCOL_X86_32: + { + blkif_x86_32_sring_t *sring_x86_32 = blkdev->sring; + BACK_RING_INIT(&blkdev->rings.x86_32, sring_x86_32, XC_PAGE_SIZE); + break; + } + case BLKIF_PROTOCOL_X86_64: + { + blkif_x86_64_sring_t *sring_x86_64 = blkdev->sring; + BACK_RING_INIT(&blkdev->rings.x86_64, sring_x86_64, XC_PAGE_SIZE); + break; + } + } + + xen_be_bind_evtchn(&blkdev->xendev); + + xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, " + "remote port %d, local port %d\n", + blkdev->xendev.protocol, blkdev->ring_ref, + blkdev->xendev.remote_port, blkdev->xendev.local_port); + return 0; +} + +static void blk_disconnect(struct XenDevice *xendev) +{ + struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + + if (blkdev->bs) { + if (blkdev->index == -1) { + /* close/delete only if we created it ourself */ + bdrv_close(blkdev->bs); + bdrv_delete(blkdev->bs); + } + blkdev->bs = NULL; + } + xen_be_unbind_evtchn(&blkdev->xendev); + + if (blkdev->sring) { + xc_gnttab_munmap(blkdev->xendev.gnttabdev, blkdev->sring, 1); + blkdev->cnt_map--; + blkdev->sring = NULL; + } +} + +static int blk_free(struct XenDevice *xendev) +{ + struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + struct ioreq *ioreq; + + while (!LIST_EMPTY(&blkdev->freelist)) { + ioreq = LIST_FIRST(&blkdev->freelist); + LIST_REMOVE(ioreq, list); + qemu_free(ioreq); + } + + qemu_free(blkdev->params); + qemu_free(blkdev->mode); + return 0; +} + +static void blk_event(struct XenDevice *xendev) +{ + struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); + blk_handle_requests(blkdev); +} + +struct XenDevOps xen_blkdev_ops = { + .size = sizeof(struct XenBlkDev), + .flags = DEVOPS_FLAG_NEED_GNTDEV, + .alloc = blk_alloc, + .init = blk_init, + .connect = blk_connect, + .disconnect = blk_disconnect, + .event = blk_event, + .free = blk_free, +}; diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index d1d0a93..27b85e7 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -101,6 +101,7 @@ static int xen_init_pv(DisplayState *ds) xen_be_register("console", &xen_console_ops); xen_be_register("vkbd", &xen_kbdmouse_ops); xen_be_register("vfb", &xen_framebuffer_ops); + xen_be_register("qdisk", &xen_blkdev_ops); /* setup framebuffer */ xen_set_display(xen_domid, ds); diff --git a/sysemu.h b/sysemu.h index b12fae0..49e75b1 100644 --- a/sysemu.h +++ b/sysemu.h @@ -113,7 +113,7 @@ extern unsigned int nb_prom_envs; #endif typedef enum { - IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD + IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_XEN } BlockInterfaceType; typedef struct DriveInfo { diff --git a/vl.c b/vl.c index ff94c6f..5e023e6 100644 --- a/vl.c +++ b/vl.c @@ -5516,6 +5516,9 @@ static int drive_init(struct drive_opt *arg, int snapshot, } else if (!strcmp(buf, "sd")) { type = IF_SD; max_devs = 0; + } else if (!strcmp(buf, "xen")) { + type = IF_XEN; + max_devs = 0; } else { fprintf(stderr, "qemu: ''%s'' unsupported bus type ''%s''\n", str, buf); return -1; @@ -5701,6 +5704,7 @@ static int drive_init(struct drive_opt *arg, int snapshot, switch(type) { case IF_IDE: case IF_SCSI: + case IF_XEN: switch(media) { case MEDIA_DISK: if (cyls != 0) { -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 10/13] xen: add net backend driver.
This patch adds a network interface backend driver to qemu. It is a pure userspace implemention using the gntdev interface. It uses "qnet" as backend name in xenstore so it doesn''t interfere with the netback backend (aka "vnif"). The network backend is hooked into the corrosponding qemu vlan, i.e. vif 0 is hooked into vlan 0. To make the packages actually arrive somewhere you additionally have to link the vlan to the outside world using the usual qemu command line options such as "-net tap,...". Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 2 +- hw/xen_backend.h | 1 + hw/xen_machine_pv.c | 1 + hw/xen_nic.c | 382 +++++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 385 insertions(+), 1 deletions(-) create mode 100644 hw/xen_nic.c diff --git a/Makefile.target b/Makefile.target index a60ca16..94b5342 100644 --- a/Makefile.target +++ b/Makefile.target @@ -517,7 +517,7 @@ endif # xen backend driver support XEN_OBJS := xen_machine_pv.o xen_backend.o -XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o +XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o xen_nic.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) LIBS += $(XEN_LIBS) diff --git a/hw/xen_backend.h b/hw/xen_backend.h index 9dbcb19..9ea4421 100644 --- a/hw/xen_backend.h +++ b/hw/xen_backend.h @@ -89,6 +89,7 @@ struct XenDevOps xen_console_ops; /* xen_console.c */ struct XenDevOps xen_kbdmouse_ops; /* xen_framebuffer.c */ struct XenDevOps xen_framebuffer_ops; /* xen_framebuffer.c */ struct XenDevOps xen_blkdev_ops; /* xen_disk.c */ +struct XenDevOps xen_netdev_ops; /* xen_nic.c */ void xen_set_display(int domid, DisplayState *ds); diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 27b85e7..e6d2bd6 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -102,6 +102,7 @@ static int xen_init_pv(DisplayState *ds) xen_be_register("vkbd", &xen_kbdmouse_ops); xen_be_register("vfb", &xen_framebuffer_ops); xen_be_register("qdisk", &xen_blkdev_ops); + xen_be_register("qnic", &xen_netdev_ops); /* setup framebuffer */ xen_set_display(xen_domid, ds); diff --git a/hw/xen_nic.c b/hw/xen_nic.c new file mode 100644 index 0000000..e2e6a85 --- /dev/null +++ b/hw/xen_nic.c @@ -0,0 +1,382 @@ +/* + * xen paravirt network card backend + * + * (c) Gerd Hoffmann <kraxel@redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <stdio.h> +#include <stdlib.h> +#include <stdarg.h> +#include <string.h> +#include <unistd.h> +#include <signal.h> +#include <inttypes.h> +#include <fcntl.h> +#include <errno.h> +#include <pthread.h> +#include <sys/socket.h> +#include <sys/ioctl.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <sys/wait.h> +#include <linux/if.h> +#include <linux/if_tun.h> + +#include <xs.h> +#include <xenctrl.h> +#include <xen/io/xenbus.h> +#include <xen/io/netif.h> + +#include "hw.h" +#include "net.h" +#include "qemu-char.h" +#include "xen_backend.h" + +/* ------------------------------------------------------------- */ + +struct XenNetDev { + struct XenDevice xendev; /* must be first */ + char *mac; + int tx_work; + int tx_ring_ref; + int rx_ring_ref; + struct netif_tx_sring *txs; + struct netif_rx_sring *rxs; + netif_tx_back_ring_t tx_ring; + netif_rx_back_ring_t rx_ring; + VLANClientState *vs; +}; + +/* ------------------------------------------------------------- */ + +static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, int8_t st) +{ + RING_IDX i = netdev->tx_ring.rsp_prod_pvt; + netif_tx_response_t *resp; + int notify; + + resp = RING_GET_RESPONSE(&netdev->tx_ring, i); + resp->id = txp->id; + resp->status = st; + +#if 0 + if (txp->flags & NETTXF_extra_info) + RING_GET_RESPONSE(&netdev->tx_ring, ++i)->status = NETIF_RSP_NULL; +#endif + + netdev->tx_ring.rsp_prod_pvt = ++i; + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->tx_ring, notify); + if (notify) + xen_be_send_notify(&netdev->xendev); + + if (i == netdev->tx_ring.req_cons) { + int more_to_do; + RING_FINAL_CHECK_FOR_REQUESTS(&netdev->tx_ring, more_to_do); + if (more_to_do) + netdev->tx_work++; + } +} + +static void net_tx_error(struct XenNetDev *netdev, netif_tx_request_t *txp, RING_IDX end) +{ +#if 0 + /* + * Hmm, why netback fails everything in the ring? + * Should we do that even when not supporting SG and TSO? + */ + RING_IDX cons = netdev->tx_ring.req_cons; + + do { + make_tx_response(netif, txp, NETIF_RSP_ERROR); + if (cons >= end) + break; + txp = RING_GET_REQUEST(&netdev->tx_ring, cons++); + } while (1); + netdev->tx_ring.req_cons = cons; + netif_schedule_work(netif); + netif_put(netif); +#else + net_tx_response(netdev, txp, NETIF_RSP_ERROR); +#endif +} + +static void net_tx_packets(struct XenNetDev *netdev) +{ + netif_tx_request_t txreq; + RING_IDX rc, rp; + void *page; + + for (;;) { + rc = netdev->tx_ring.req_cons; + rp = netdev->tx_ring.sring->req_prod; + xen_rmb(); /* Ensure we see queued requests up to ''rp''. */ + + while ((rc != rp)) { + if (RING_REQUEST_CONS_OVERFLOW(&netdev->tx_ring, rc)) + break; + memcpy(&txreq, RING_GET_REQUEST(&netdev->tx_ring, rc), sizeof(txreq)); + netdev->tx_ring.req_cons = ++rc; + +#if 1 + /* should not happen in theory, we don''t announce the * + * feature-{sg,gso,whatelse} flags in xenstore (yet?) */ + if (txreq.flags & NETTXF_extra_info) { + xen_be_printf(&netdev->xendev, 0, "FIXME: extra info flag\n"); + net_tx_error(netdev, &txreq, rc); + continue; + } + if (txreq.flags & NETTXF_more_data) { + xen_be_printf(&netdev->xendev, 0, "FIXME: more data flag\n"); + net_tx_error(netdev, &txreq, rc); + continue; + } +#endif + + if (txreq.size < 14) { + xen_be_printf(&netdev->xendev, 0, "bad packet size: %d\n", txreq.size); + net_tx_error(netdev, &txreq, rc); + continue; + } + + if ((txreq.offset + txreq.size) > XC_PAGE_SIZE) { + xen_be_printf(&netdev->xendev, 0, "error: page crossing\n"); + net_tx_error(netdev, &txreq, rc); + continue; + } + + xen_be_printf(&netdev->xendev, 3, "tx packet ref %d, off %d, len %d, flags 0x%x%s%s%s%s\n", + txreq.gref, txreq.offset, txreq.size, txreq.flags, + (txreq.flags & NETTXF_csum_blank) ? " csum_blank" : "", + (txreq.flags & NETTXF_data_validated) ? " data_validated" : "", + (txreq.flags & NETTXF_more_data) ? " more_data" : "", + (txreq.flags & NETTXF_extra_info) ? " extra_info" : ""); + + page = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev, + netdev->xendev.dom, + txreq.gref, PROT_READ); + if (NULL == page) { + xen_be_printf(&netdev->xendev, 0, "error: tx gref dereference failed (%d)\n", + txreq.gref); + net_tx_error(netdev, &txreq, rc); + continue; + } + if (txreq.flags & NETTXF_csum_blank) + net_checksum_calculate(page + txreq.offset, txreq.size); + qemu_send_packet(netdev->vs, page + txreq.offset, txreq.size); + xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1); + net_tx_response(netdev, &txreq, NETIF_RSP_OKAY); + } + if (!netdev->tx_work) + break; + netdev->tx_work = 0; + } +} + +/* ------------------------------------------------------------- */ + +static void net_rx_response(struct XenNetDev *netdev, + netif_rx_request_t *req, int8_t st, + uint16_t offset, uint16_t size, + uint16_t flags) +{ + RING_IDX i = netdev->rx_ring.rsp_prod_pvt; + netif_rx_response_t *resp; + int notify; + + resp = RING_GET_RESPONSE(&netdev->rx_ring, i); + resp->offset = offset; + resp->flags = flags; + resp->id = req->id; + resp->status = (int16_t)size; + if (st < 0) + resp->status = (int16_t)st; + + xen_be_printf(&netdev->xendev, 3, "rx response: idx %d, status %d, flags 0x%x\n", + i, resp->status, resp->flags); + + netdev->rx_ring.rsp_prod_pvt = ++i; + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->rx_ring, notify); + if (notify) + xen_be_send_notify(&netdev->xendev); +} + +#define NET_IP_ALIGN 2 + +static int net_rx_ok(void *opaque) +{ + struct XenNetDev *netdev = opaque; + RING_IDX rc, rp; + + if (netdev->xendev.be_state != XenbusStateConnected) + return 0; + + rc = netdev->rx_ring.req_cons; + rp = netdev->rx_ring.sring->req_prod; + xen_rmb(); + + if (rc == rp || RING_REQUEST_CONS_OVERFLOW(&netdev->rx_ring, rc)) { + xen_be_printf(&netdev->xendev, 2, "%s: no rx buffers (%d/%d)\n", + __FUNCTION__, rc, rp); + return 0; + } + return 1; +} + +static void net_rx_packet(void *opaque, const uint8_t *buf, int size) +{ + struct XenNetDev *netdev = opaque; + netif_rx_request_t rxreq; + RING_IDX rc, rp; + void *page; + + if (netdev->xendev.be_state != XenbusStateConnected) + return; + + rc = netdev->rx_ring.req_cons; + rp = netdev->rx_ring.sring->req_prod; + xen_rmb(); /* Ensure we see queued requests up to ''rp''. */ + + if (rc == rp || RING_REQUEST_CONS_OVERFLOW(&netdev->rx_ring, rc)) { + xen_be_printf(&netdev->xendev, 2, "no buffer, drop packet\n"); + return; + } + if (size > XC_PAGE_SIZE - NET_IP_ALIGN) { + xen_be_printf(&netdev->xendev, 0, "packet too big (%d > %ld)", + size, XC_PAGE_SIZE - NET_IP_ALIGN); + return; + } + + memcpy(&rxreq, RING_GET_REQUEST(&netdev->rx_ring, rc), sizeof(rxreq)); + netdev->rx_ring.req_cons = ++rc; + + page = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev, + netdev->xendev.dom, + rxreq.gref, PROT_WRITE); + if (NULL == page) { + xen_be_printf(&netdev->xendev, 0, "error: rx gref dereference failed (%d)\n", + rxreq.gref); + net_rx_response(netdev, &rxreq, NETIF_RSP_ERROR, 0, 0, 0); + return; + } + memcpy(page + NET_IP_ALIGN, buf, size); + xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1); + net_rx_response(netdev, &rxreq, NETIF_RSP_OKAY, NET_IP_ALIGN, size, 0); +} + +/* ------------------------------------------------------------- */ + +static int net_init(struct XenDevice *xendev) +{ + struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev); + VLANState *vlan; + + /* read xenstore entries */ + if (NULL == netdev->mac) + netdev->mac = xenstore_read_be_str(&netdev->xendev, "mac"); + + /* do we have all we need? */ + if (NULL == netdev->mac) + return -1; + + vlan = qemu_find_vlan(netdev->xendev.dev); + netdev->vs = qemu_new_vlan_client(vlan, net_rx_packet, net_rx_ok, netdev); + snprintf(netdev->vs->info_str, sizeof(netdev->vs->info_str), + "nic: xenbus vif macaddr=%s", netdev->mac); + + /* fill info */ + xenstore_write_be_int(&netdev->xendev, "feature-rx-copy", 1); + xenstore_write_be_int(&netdev->xendev, "feature-rx-flip", 0); + + return 0; +} + +static int net_connect(struct XenDevice *xendev) +{ + struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev); + int rx_copy; + + if (-1 == xenstore_read_fe_int(&netdev->xendev, "tx-ring-ref", + &netdev->tx_ring_ref)) + return -1; + if (-1 == xenstore_read_fe_int(&netdev->xendev, "rx-ring-ref", + &netdev->rx_ring_ref)) + return 1; + if (-1 == xenstore_read_fe_int(&netdev->xendev, "event-channel", + &netdev->xendev.remote_port)) + return -1; + + if (-1 == xenstore_read_fe_int(&netdev->xendev, "request-rx-copy", &rx_copy)) + rx_copy = 0; + if (0 == rx_copy) { + xen_be_printf(&netdev->xendev, 0, "frontend doesn''t support rx-copy.\n"); + return -1; + } + + netdev->txs = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev, + netdev->xendev.dom, + netdev->tx_ring_ref, + PROT_READ | PROT_WRITE); + netdev->rxs = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev, + netdev->xendev.dom, + netdev->rx_ring_ref, + PROT_READ | PROT_WRITE); + if (!netdev->txs || !netdev->rxs) + return -1; + BACK_RING_INIT(&netdev->tx_ring, netdev->txs, XC_PAGE_SIZE); + BACK_RING_INIT(&netdev->rx_ring, netdev->rxs, XC_PAGE_SIZE); + + xen_be_bind_evtchn(&netdev->xendev); + + xen_be_printf(&netdev->xendev, 1, "ok: tx-ring-ref %d, rx-ring-ref %d, " + "remote port %d, local port %d\n", + netdev->tx_ring_ref, netdev->rx_ring_ref, + netdev->xendev.remote_port, netdev->xendev.local_port); + return 0; +} + +static void net_disconnect(struct XenDevice *xendev) +{ + struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev); + + xen_be_unbind_evtchn(&netdev->xendev); + + if (netdev->txs) { + xc_gnttab_munmap(netdev->xendev.gnttabdev, netdev->txs, 1); + netdev->txs = NULL; + } + if (netdev->rxs) { + xc_gnttab_munmap(netdev->xendev.gnttabdev, netdev->rxs, 1); + netdev->rxs = NULL; + } +} + +static void net_event(struct XenDevice *xendev) +{ + struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev); + net_tx_packets(netdev); +} + +/* ------------------------------------------------------------- */ + +struct XenDevOps xen_netdev_ops = { + .size = sizeof(struct XenNetDev), + .flags = DEVOPS_FLAG_NEED_GNTDEV, + .init = net_init, + .connect = net_connect, + .event = net_event, + .disconnect = net_disconnect, +}; -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 11/13] xen: blk & nic configuration via cmd line.
This patch makes qemu create backend and frontend device entries in xenstore for devices configured on the command line. It will use qdisk and qnic backend names, so the qemu internal backends will be used. Disks can be created using -drive if=xen,file=... Nics can be created using -net nic,macaddr=... Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 2 +- hw/xen_backend.h | 7 +++ hw/xen_devconfig.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++ hw/xen_machine_pv.c | 19 ++++++- 4 files changed, 170 insertions(+), 2 deletions(-) create mode 100644 hw/xen_devconfig.c diff --git a/Makefile.target b/Makefile.target index 94b5342..7e507e4 100644 --- a/Makefile.target +++ b/Makefile.target @@ -516,7 +516,7 @@ LIBS += $(CONFIG_VNC_TLS_LIBS) endif # xen backend driver support -XEN_OBJS := xen_machine_pv.o xen_backend.o +XEN_OBJS := xen_machine_pv.o xen_backend.o xen_devconfig.o XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o xen_nic.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) diff --git a/hw/xen_backend.h b/hw/xen_backend.h index 9ea4421..f591743 100644 --- a/hw/xen_backend.h +++ b/hw/xen_backend.h @@ -3,6 +3,8 @@ #include "xen_common.h" #include "sysemu.h" +#include "net.h" +#include "block_int.h" /* ------------------------------------------------------------- */ @@ -93,4 +95,9 @@ struct XenDevOps xen_netdev_ops; /* xen_nic.c */ void xen_set_display(int domid, DisplayState *ds); +/* configuration (aka xenbus setup) */ +void xen_config_cleanup(void); +int xen_config_dev_blk(DriveInfo *disk); +int xen_config_dev_nic(NICInfo *nic); + #endif /* QEMU_XEN_BACKEND_H */ diff --git a/hw/xen_devconfig.c b/hw/xen_devconfig.c new file mode 100644 index 0000000..25d28b4 --- /dev/null +++ b/hw/xen_devconfig.c @@ -0,0 +1,144 @@ +#include "xen_backend.h" + +/* ------------------------------------------------------------- */ + +struct xs_dirs { + char *xs_dir; + TAILQ_ENTRY(xs_dirs) list; +}; +static TAILQ_HEAD(xs_dirs_head, xs_dirs) xs_cleanup = TAILQ_HEAD_INITIALIZER(xs_cleanup); + +static void xen_config_cleanup_dir(char *dir) +{ + struct xs_dirs *d; + + d = qemu_malloc(sizeof(*d)); + if (!d) + return; + d->xs_dir = dir; + TAILQ_INSERT_TAIL(&xs_cleanup, d, list); +} + +void xen_config_cleanup(void) +{ + struct xs_dirs *d; + + fprintf(stderr, "xen be: %s\n", __FUNCTION__); + TAILQ_FOREACH(d, &xs_cleanup, list) { + xs_rm(xenstore, 0, d->xs_dir); + } +} + +/* ------------------------------------------------------------- */ + +static int xen_config_dev_mkdir(char *dev, int p) +{ + struct xs_permissions perms = { + .id = xen_domid, + .perms = p, + }; + + if (!xs_mkdir(xenstore, 0, dev)) { + fprintf(stderr, "xs_mkdir %s: failed\n", dev); + return -1; + } + xen_config_cleanup_dir(qemu_strdup(dev)); + + if (!xs_set_permissions(xenstore, 0, dev, &perms, 1)) { + fprintf(stderr, "%s: xs_set_permissions failed\n", __FUNCTION__); + return -1; + } + return 0; +} + +static int xen_config_dev_dirs(char *ftype, char *btype, int vdev, + char *fe, char *be, int len) +{ + char *dom; + + dom = xs_get_domain_path(xenstore, xen_domid); + snprintf(fe, len, "%s/device/%s/%d", dom, ftype, vdev); + free(dom); + + dom = xs_get_domain_path(xenstore, 0); + snprintf(be, len, "%s/backend/%s/%d/%d", dom, btype, xen_domid, vdev); + free(dom); + + xen_config_dev_mkdir(fe, XS_PERM_WRITE); + xen_config_dev_mkdir(be, XS_PERM_READ); + return 0; +} + +static int xen_config_dev_all(char *fe, char *be) +{ + /* frontend */ +#if 0 + xenstore_write_str(fe, "protocol", + xen_config_dev_protocol(xen)); +#endif + xenstore_write_int(fe, "state", XenbusStateInitialising); + xenstore_write_int(fe, "backend-id", 0); + xenstore_write_str(fe, "backend", be); + + /* backend */ + xenstore_write_str(be, "domain", qemu_name ? qemu_name : "no-name"); + xenstore_write_int(be, "online", 1); + xenstore_write_int(be, "state", XenbusStateInitialising); + xenstore_write_int(be, "frontend-id", xen_domid); + xenstore_write_str(be, "frontend", fe); + + return 0; +} + +/* ------------------------------------------------------------- */ + +int xen_config_dev_blk(DriveInfo *disk) +{ + char fe[256], be[256]; + int vdev = 202 * 256 + 16 * disk->unit; + int cdrom = disk->bdrv->type == BDRV_TYPE_CDROM; + char *devtype = cdrom ? "cdrom" : "disk"; + char *mode = cdrom ? "r" : "w"; + + snprintf(disk->bdrv->device_name, sizeof(disk->bdrv->device_name), + "xvd%c", ''a'' + disk->unit); + fprintf(stderr, "xen be: config disk %d [%s]: %s\n", + disk->unit, disk->bdrv->device_name, disk->bdrv->filename); + xen_config_dev_dirs("vbd", "qdisk", vdev, fe, be, sizeof(fe)); + + /* frontend */ + xenstore_write_int(fe, "virtual-device", vdev); + xenstore_write_str(fe, "device-type", devtype); + + /* backend */ + xenstore_write_str(be, "dev", disk->bdrv->device_name); + xenstore_write_str(be, "type", "file"); + xenstore_write_str(be, "params", disk->bdrv->filename); + xenstore_write_str(be, "mode", mode); + + /* common stuff */ + return xen_config_dev_all(fe, be); +} + +int xen_config_dev_nic(NICInfo *nic) +{ + char fe[256], be[256]; + char mac[20]; + + snprintf(mac, sizeof(mac), "%02x:%02x:%02x:%02x:%02x:%02x", + nic->macaddr[0], nic->macaddr[1], nic->macaddr[2], + nic->macaddr[3], nic->macaddr[4], nic->macaddr[5]); + fprintf(stderr, "xen be: config nic %d: mac=\"%s\"\n", nic->vlan->id, mac); + xen_config_dev_dirs("vif", "qnic", nic->vlan->id, fe, be, sizeof(fe)); + + /* frontend */ + xenstore_write_int(fe, "handle", nic->vlan->id); + xenstore_write_str(fe, "mac", mac); + + /* backend */ + xenstore_write_int(be, "handle", nic->vlan->id); + xenstore_write_str(be, "mac", mac); + + /* common stuff */ + return xen_config_dev_all(fe, be); +} diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index e6d2bd6..851837c 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -121,7 +121,7 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size, const char *cpu_model) { CPUState *env; - int rc; + int index,i,rc; rc = xen_init_pv(ds); if (-1 == rc) @@ -147,6 +147,23 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size, env = cpu_init(cpu_model); env->halted = 1; + /* configure disks */ + for (i = 0; i < 16; i++) { + index = drive_get_index(IF_XEN, 0, i); + if (index == -1) + continue; + xen_config_dev_blk(drives_table + index); + } + + /* configure nics */ + for (i = 0; i < nb_nics; i++) { + if (nd_table[i].model && 0 != strcmp(nd_table[i].model, "xen")) + continue; + xen_config_dev_nic(nd_table + i); + } + + /* config cleanup hook */ + atexit(xen_config_cleanup); return; err: -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 16:27 UTC
[Xen-devel] [PATCH 12/13] set vnc password from xenstore.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- hw/xen_machine_pv.c | 26 ++++++++++++++++++++++++++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 851837c..9f18742 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -24,6 +24,7 @@ #include "hw.h" #include "boards.h" +#include "console.h" #include "xen_backend.h" @@ -110,6 +111,29 @@ static int xen_init_pv(DisplayState *ds) return 0; } +static void xen_init_vnc(void) +{ + char xspath[256], *dom = NULL, *vm = NULL, *pw = NULL; + int len; + + dom = xs_get_domain_path(xenstore, xen_domid); + snprintf(xspath, sizeof(xspath), "%s/vm", dom); + vm = xs_read(xenstore, 0, xspath, &len); + if (!vm) + goto out; + snprintf(xspath, sizeof(xspath), "%s/vncpassword", vm); + pw = xs_read(xenstore, 0, xspath, &len); + if (!pw || !strlen(pw)) + goto out; + vnc_display_password(NULL, pw); + fprintf(stderr, "vnc password set from xenstore\n"); + +out: + free(dom); + free(vm); + free(pw); +} + /* -------------------------------------------------------------------- */ /* paravirtualized xen machine */ @@ -136,6 +160,8 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size, goto err; } + xen_init_vnc(); + /* create dummy cpu, halted */ if (cpu_model == NULL) { #ifdef TARGET_X86_64 -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
This adds domain building support for paravirtual domains to qemu. This allows booting xen guests directly with qemu, without Xend and the management stack. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> --- Makefile.target | 2 +- configure | 2 +- hw/xen_backend.h | 3 + hw/xen_devconfig.c | 29 +++++ hw/xen_domainbuild.c | 281 ++++++++++++++++++++++++++++++++++++++++++++++++++ hw/xen_domainbuild.h | 9 ++ hw/xen_machine_pv.c | 34 ++++--- 7 files changed, 344 insertions(+), 16 deletions(-) create mode 100644 hw/xen_domainbuild.c create mode 100644 hw/xen_domainbuild.h diff --git a/Makefile.target b/Makefile.target index 7e507e4..47bfc77 100644 --- a/Makefile.target +++ b/Makefile.target @@ -516,7 +516,7 @@ LIBS += $(CONFIG_VNC_TLS_LIBS) endif # xen backend driver support -XEN_OBJS := xen_machine_pv.o xen_backend.o xen_devconfig.o +XEN_OBJS := xen_machine_pv.o xen_backend.o xen_devconfig.o xen_domainbuild.o XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o xen_nic.o ifeq ($(CONFIG_XEN), yes) OBJS += $(XEN_OBJS) diff --git a/configure b/configure index 14689d2..cec0b48 100755 --- a/configure +++ b/configure @@ -1228,7 +1228,7 @@ fi if test "$xen" = "yes" ; then echo "CONFIG_XEN=yes" >> $config_mak echo "#define CONFIG_XEN 1" >> $config_h - echo "XEN_LIBS=-lxenstore -lxenctrl" >> $config_mak + echo "XEN_LIBS=-lxenstore -lxenctrl -lxenguest" >> $config_mak fi if test "$aio" = "yes" ; then echo "#define CONFIG_AIO 1" >> $config_h diff --git a/hw/xen_backend.h b/hw/xen_backend.h index f591743..70ad016 100644 --- a/hw/xen_backend.h +++ b/hw/xen_backend.h @@ -99,5 +99,8 @@ void xen_set_display(int domid, DisplayState *ds); void xen_config_cleanup(void); int xen_config_dev_blk(DriveInfo *disk); int xen_config_dev_nic(NICInfo *nic); +int xen_config_dev_vfb(int vdev, char *type); +int xen_config_dev_vkbd(int vdev); +int xen_config_dev_console(int vdev); #endif /* QEMU_XEN_BACKEND_H */ diff --git a/hw/xen_devconfig.c b/hw/xen_devconfig.c index 25d28b4..0d506ca 100644 --- a/hw/xen_devconfig.c +++ b/hw/xen_devconfig.c @@ -142,3 +142,32 @@ int xen_config_dev_nic(NICInfo *nic) /* common stuff */ return xen_config_dev_all(fe, be); } + +int xen_config_dev_vfb(int vdev, char *type) +{ + char fe[256], be[256]; + + xen_config_dev_dirs("vfb", "vfb", vdev, fe, be, sizeof(fe)); + + /* backend */ + xenstore_write_str(be, "type", type); + + /* common stuff */ + return xen_config_dev_all(fe, be); +} + +int xen_config_dev_vkbd(int vdev) +{ + char fe[256], be[256]; + + xen_config_dev_dirs("vkbd", "vkbd", vdev, fe, be, sizeof(fe)); + return xen_config_dev_all(fe, be); +} + +int xen_config_dev_console(int vdev) +{ + char fe[256], be[256]; + + xen_config_dev_dirs("console", "console", vdev, fe, be, sizeof(fe)); + return xen_config_dev_all(fe, be); +} diff --git a/hw/xen_domainbuild.c b/hw/xen_domainbuild.c new file mode 100644 index 0000000..deba98d --- /dev/null +++ b/hw/xen_domainbuild.c @@ -0,0 +1,281 @@ +#include "xen_backend.h" +#include "xen_domainbuild.h" +#include "sysemu.h" +#include "qemu-timer.h" + +#include <xenguest.h> + +/* temporary */ +static const char qemu_uuid[] = "f0c9d298-3dd9-4c20-8c90-0205ac932241"; + +static int xenstore_domain_mkdir(char *path) +{ + struct xs_permissions perms = { + .id = xen_domid, + .perms = XS_PERM_WRITE, + }; + + if (!xs_mkdir(xenstore, 0, path)) { + fprintf(stderr, "%s: xs_mkdir %s: failed\n", __FUNCTION__, path); + return -1; + } + if (!xs_set_permissions(xenstore, 0, path, &perms, 1)) { + fprintf(stderr, "%s: xs_set_permissions failed\n", __FUNCTION__); + return -1; + } + return 0; +} + +static int xenstore_domain_init1(const char *kernel, const char *ramdisk, + const char *cmdline) +{ + char *dom, vm[256], path[256]; + int i; + + dom = xs_get_domain_path(xenstore, xen_domid); + snprintf(vm, sizeof(vm), "/vm/%s", qemu_uuid); + + xenstore_domain_mkdir(dom); + + xenstore_write_str(vm, "image/ostype", "linux"); + if (kernel) + xenstore_write_str(vm, "image/kernel", kernel); + if (ramdisk) + xenstore_write_str(vm, "image/ramdisk", ramdisk); + if (cmdline) + xenstore_write_str(vm, "image/cmdline", cmdline); + + /* name + id */ + xenstore_write_str(vm, "name", qemu_name ? qemu_name : "no-name"); + xenstore_write_str(vm, "uuid", qemu_uuid); + xenstore_write_str(dom, "name", qemu_name ? qemu_name : "no-name"); + xenstore_write_int(dom, "domid", xen_domid); + xenstore_write_str(dom, "vm", vm); + + /* memory */ + xenstore_write_int(dom, "memory/target", ram_size >> 10); // kB + xenstore_write_int(vm, "memory", ram_size >> 20); // MB + xenstore_write_int(vm, "maxmem", ram_size >> 20); // MB + + /* cpus */ + for (i = 0; i < smp_cpus; i++) { + snprintf(path, sizeof(path), "cpu/%d/availability",i); + xenstore_write_str(dom, path, "online"); + } + xenstore_write_int(vm, "vcpu_avail", smp_cpus); + xenstore_write_int(vm, "vcpus", smp_cpus); + + /* vnc password */ + xenstore_write_str(vm, "vncpassword", "" /* FIXME */); + + free(dom); + return 0; +} + +static int xenstore_domain_init2(int xenstore_port, int xenstore_mfn, + int console_port, int console_mfn) +{ + char *dom; + + dom = xs_get_domain_path(xenstore, xen_domid); + + /* signal new domain */ + xs_introduce_domain(xenstore, + xen_domid, + xenstore_mfn, + xenstore_port); + + /* xenstore */ + xenstore_write_int(dom, "store/ring-ref", xenstore_mfn); + xenstore_write_int(dom, "store/port", xenstore_port); + + /* console */ + xenstore_write_str(dom, "console/type", "ioemu"); + xenstore_write_int(dom, "console/limit", 128 * 1024); + xenstore_write_int(dom, "console/ring-ref", console_mfn); + xenstore_write_int(dom, "console/port", console_port); + xen_config_dev_console(0); + + /* devices */ + if (1 /* FIXME */) { + xen_config_dev_vfb(0, "vnc"); + xen_config_dev_vkbd(0); + } + + free(dom); + return 0; +} + +/* ------------------------------------------------------------- */ + +static QEMUTimer *xen_poll; + +/* check domain state once per second */ +static void xen_domain_poll(void *opaque) +{ + struct xc_dominfo info; + int rc; + + rc = xc_domain_getinfo(xen_xc, xen_domid, 1, &info); + if ((1 != rc) || (info.domid != xen_domid)) { + fprintf(stderr, "xen: domain %d is gone\n", xen_domid); + goto quit; + } + if (info.dying) { + fprintf(stderr, "xen: domain %d is dying (%s%s)\n", xen_domid, + info.crashed ? "crashed" : "", + info.shutdown ? "shutdown" : ""); + goto quit; + } + + qemu_mod_timer(xen_poll, qemu_get_clock(rt_clock) + 1000); + return; + +quit: + qemu_system_shutdown_request(); + return; +} + +static void xen_domain_watcher(void) +{ + int qemu_running = 1; + int fd[2], i, rc; + char byte; + + pipe(fd); + if (0 != fork()) + return; /* not child */ + + /* close all file handles, except stdio/out/err, + * our watch pipe and the xen interface handle */ + for (i = 3; i < 256; i++) { + if (i == fd[0]) + continue; + if (i == xen_xc) + continue; + close(i); + } + + /* wait for qemu exiting */ + while (qemu_running) { + rc = read(fd[0], &byte, 1); + switch (rc) { + case -1: + if (EINTR == errno) + continue; + fprintf(stderr, "%s: Huh? read error: %s\n", __FUNCTION__, strerror(errno)); + qemu_running = 0; + break; + case 0: + /* EOF -> qemu exited */ + qemu_running = 0; + break; + default: + fprintf(stderr, "%s: Huh? data on the watch pipe?\n", __FUNCTION__); + break; + } + } + + /* cleanup */ + fprintf(stderr, "%s: destroy domain %d\n", __FUNCTION__, xen_domid); + xc_domain_destroy(xen_xc, xen_domid); + _exit(0); +} + +/* normal cleanup */ +static void xen_domain_cleanup(void) +{ + char *dom; + + dom = xs_get_domain_path(xenstore, xen_domid); + if (dom) { + xs_rm(xenstore, 0, dom); + free(dom); + } + xs_release_domain(xenstore, xen_domid); +} + +int xen_domain_build_pv(const char *kernel, const char *ramdisk, + const char *cmdline) +{ + uint32_t ssidref = 0; + uint32_t flags = 0; + xen_domain_handle_t uuid; + unsigned int xenstore_port = 0, console_port = 0; + unsigned long xenstore_mfn = 0, console_mfn = 0; + int rc, i, pos, val; + + for (i = 0, pos = 0; i < 16; i++, pos += 2) { + if (i == 4 || i == 6 || i == 8 || i == 10) { + /* skip dashes */ + if (''-'' != qemu_uuid[pos]) { + fprintf(stderr, "xen: uuid parse error"); + goto err; + } + pos++; + } + if (1 != sscanf(qemu_uuid+pos, "%02x", &val)) { + fprintf(stderr, "xen: uuid parse error"); + goto err; + } + uuid[i] = val; + } + + rc = xc_domain_create(xen_xc, ssidref, uuid, flags, &xen_domid); + if (rc < 0) { + fprintf(stderr, "xen: xc_domain_create() failed\n"); + goto err; + } + fprintf(stderr, "xen: created domain %d\n", xen_domid); + atexit(xen_domain_cleanup); + xen_domain_watcher(); + + xenstore_domain_init1(kernel, ramdisk, cmdline); + + rc = xc_domain_max_vcpus(xen_xc, xen_domid, smp_cpus); + if (rc < 0) { + fprintf(stderr, "xen: xc_domain_max_vcpus() failed\n"); + goto err; + } + + rc = xc_domain_setcpuweight(xen_xc, xen_domid, 256); + if (rc < 0) { + fprintf(stderr, "xen: xc_domain_setcpuweight() failed\n"); + goto err; + } + + rc = xc_domain_setmaxmem(xen_xc, xen_domid, ram_size >> 10); + if (rc < 0) { + fprintf(stderr, "xen: xc_domain_setmaxmem() failed\n"); + goto err; + } + + xenstore_port = xc_evtchn_alloc_unbound(xen_xc, xen_domid, 0); + console_port = xc_evtchn_alloc_unbound(xen_xc, xen_domid, 0); + + rc = xc_linux_build(xen_xc, xen_domid, ram_size >> 20, + kernel, ramdisk, cmdline, + 0, flags, + xenstore_port, &xenstore_mfn, + console_port, &console_mfn); + if (rc < 0) { + fprintf(stderr, "xen: xc_linux_build() failed\n"); + goto err; + } + + xenstore_domain_init2(xenstore_port, xenstore_mfn, + console_port, console_mfn); + + rc = xc_domain_unpause(xen_xc, xen_domid); + if (rc < 0) { + fprintf(stderr, "xen: xc_domain_unpause() failed\n"); + goto err; + } + + xen_poll = qemu_new_timer(rt_clock, xen_domain_poll, NULL); + qemu_mod_timer(xen_poll, qemu_get_clock(rt_clock) + 1000); + return 0; + +err: + return -1; +} diff --git a/hw/xen_domainbuild.h b/hw/xen_domainbuild.h new file mode 100644 index 0000000..21ee180 --- /dev/null +++ b/hw/xen_domainbuild.h @@ -0,0 +1,9 @@ +#ifndef QEMU_XEN_DOMAINBUILD_H +#define QEMU_XEN_DOMAINBUILD_H 1 + +#include "xen_common.h" + +int xen_domain_build_pv(const char *kernel, const char *ramdisk, + const char *cmdline); + +#endif /* QEMU_XEN_DOMAINBUILD_H */ diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c index 9f18742..8c9f3c1 100644 --- a/hw/xen_machine_pv.c +++ b/hw/xen_machine_pv.c @@ -27,6 +27,7 @@ #include "console.h" #include "xen_backend.h" +#include "xen_domainbuild.h" /* -------------------------------------------------------------------- */ /* variables */ @@ -90,7 +91,9 @@ static int xen_init(void) return 0; } -static int xen_init_pv(DisplayState *ds) +static int xen_init_pv(const char *kernel_filename, + const char *kernel_cmdline, + const char *initrd_filename) { int rc; @@ -98,6 +101,19 @@ static int xen_init_pv(DisplayState *ds) if (rc < 0) return rc; + if (xen_emulate) { + fprintf(stderr, "xen pv emulation not implemented yet\n"); + return -1; + } + + if (xen_domainbuild) { + if (xen_domain_build_pv(kernel_filename, initrd_filename, + kernel_cmdline) < 0) { + fprintf(stderr, "xen pv domain creation failed\n"); + return -1; + } + } + /* xenbus backend drivers */ xen_be_register("console", &xen_console_ops); xen_be_register("vkbd", &xen_kbdmouse_ops); @@ -105,9 +121,6 @@ static int xen_init_pv(DisplayState *ds) xen_be_register("qdisk", &xen_blkdev_ops); xen_be_register("qnic", &xen_netdev_ops); - /* setup framebuffer */ - xen_set_display(xen_domid, ds); - return 0; } @@ -147,19 +160,12 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size, CPUState *env; int index,i,rc; - rc = xen_init_pv(ds); + rc = xen_init_pv(kernel_filename, kernel_cmdline, initrd_filename); if (-1 == rc) goto err; - if (xen_emulate) { - fprintf(stderr, "xen pv emulation not implemented yet\n"); - goto err; - } - if (xen_domainbuild) { - fprintf(stderr, "xen pv domain creation not implemented yet\n"); - goto err; - } - + /* setup framebuffer */ + xen_set_display(xen_domid, ds); xen_init_vnc(); /* create dummy cpu, halted */ -- 1.5.5.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Samuel Thibault
2008-Aug-21 16:33 UTC
[Xen-devel] Re: [PATCH 02/13] add DisplayState->idle
Gerd Hoffmann, le Thu 21 Aug 2008 18:27:23 +0200, a écrit :> From: Samuel Thibault <samuel.thibault@eu.citrix.com> > > From: Samuel Thibault <samuel.thibault@eu.citrix.com> > > Add idle field to DisplayState struct, so drivers can figure > the display is idle and take advantage of that. > > The xen framebuffer driver will use this to communicate the > idle state to the guest, so it knows it can stop doing updates > to a virtual display which is invisible anyway.Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:11 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Gerd Hoffmann wrote:> This patch makes qemu handle signals better. It sets the request_shutdown > flag, making the main_loop exit and qemu taking the usual exit route, with > atexit handlers being called and so on, instead of qemu just being killed > by the signal. > > To avoid calling vm_start() from the signal handler main_loop() got an > additional check so qemu_system_shutdown_request() works even when the > vm is in stopped state. > > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>Applied. Thanks. Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:12 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 02/13] add DisplayState->idle
Gerd Hoffmann wrote:> From: Samuel Thibault <samuel.thibault@eu.citrix.com> > > From: Samuel Thibault <samuel.thibault@eu.citrix.com> > > Add idle field to DisplayState struct, so drivers can figure > the display is idle and take advantage of that. > > The xen framebuffer driver will use this to communicate the > idle state to the guest, so it knows it can stop doing updates > to a virtual display which is invisible anyway. > > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > >Applied. Thanks. Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:12 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 03/13] add container_of() macro to osdep.h
Gerd Hoffmann wrote:> From linux kernel sources, xen bits will use it, put it > into a place where others can see and use it too ;) > > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > ---Applied. Thanks. Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:12 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 04/13] move GUI_REFRESH_INTERVAL define from vl.c to console.h
Gerd Hoffmann wrote:> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > >Applied. Thanks. Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:14 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
I don''t really like this approach. I don''t want to have a totally separate management interface just for Xend to use (via xenstore). Can''t Xend use the monitor like every other management tool? Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-21 20:17 UTC
Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore.
What is the justification for this change. We already have a means to set the VNC password from the monitor. If this isn''t sufficient we should introduce an additional method which isn''t Xen specific. I know XenD currently sends the passwords to QEMU via xenstore, but that''s not a good reason to preserve this approach. XenD does lots of stuff via XenStore which it should be doing via command line arguments - such as setting up disks instead of using the -drive param. IMHO, XenD should just set the VNC password by connecting to the monitor. That''s what its there for after all. Daniel On Thu, Aug 21, 2008 at 06:27:33PM +0200, Gerd Hoffmann wrote:> > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > --- > hw/xen_machine_pv.c | 26 ++++++++++++++++++++++++++ > 1 files changed, 26 insertions(+), 0 deletions(-) > > diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c > index 851837c..9f18742 100644 > --- a/hw/xen_machine_pv.c > +++ b/hw/xen_machine_pv.c > @@ -24,6 +24,7 @@ > > #include "hw.h" > #include "boards.h" > +#include "console.h" > > #include "xen_backend.h" > > @@ -110,6 +111,29 @@ static int xen_init_pv(DisplayState *ds) > return 0; > } > > +static void xen_init_vnc(void) > +{ > + char xspath[256], *dom = NULL, *vm = NULL, *pw = NULL; > + int len; > + > + dom = xs_get_domain_path(xenstore, xen_domid); > + snprintf(xspath, sizeof(xspath), "%s/vm", dom); > + vm = xs_read(xenstore, 0, xspath, &len); > + if (!vm) > + goto out; > + snprintf(xspath, sizeof(xspath), "%s/vncpassword", vm); > + pw = xs_read(xenstore, 0, xspath, &len); > + if (!pw || !strlen(pw)) > + goto out; > + vnc_display_password(NULL, pw); > + fprintf(stderr, "vnc password set from xenstore\n"); > + > +out: > + free(dom); > + free(vm); > + free(pw); > +} > + > /* -------------------------------------------------------------------- */ > /* paravirtualized xen machine */ > > @@ -136,6 +160,8 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size, > goto err; > } > > + xen_init_vnc(); > + > /* create dummy cpu, halted */ > if (cpu_model == NULL) { > #ifdef TARGET_X86_64 > -- > 1.5.5.1 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel-- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:18 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Gerd Hoffmann wrote:> - configure script and build system changes. > - wind up new machine type. > - add -domid command line option. > - allow xenpv machines run without disk and kernel specified > by adding a nodisk_ok field to QEMUMachine. > - detect xen being present. > > Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> > >I''m pretty happy with up to 11 in the series but I''ll hold off applying them until they show up in qemu-dm or at least are Acked-by Ian or Samuel. Ian/Samuel: is there any subset of these patches that you are comfortable with or do they all need more discussion? Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-21 20:19 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
On Thu, Aug 21, 2008 at 03:14:26PM -0500, Anthony Liguori wrote:> I don''t really like this approach. I don''t want to have a totally > separate management interface just for Xend to use (via xenstore). > > Can''t Xend use the monitor like every other management tool?If the monitor really isn''t sufficient, then I''d suggest allowing an open file descriptor to be specified on the -vnc arg, eg -vnc localhost:3,passwdfd=6 And then let QEMU read the password from this file descriptor. I''d rather XenD just used the monitor though - it''ll ultimately need to use it anyway to perform disk/network hotplug amongst other things. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 20:22 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Daniel P. Berrange wrote:> On Thu, Aug 21, 2008 at 03:14:26PM -0500, Anthony Liguori wrote: > >> I don''t really like this approach. I don''t want to have a totally >> separate management interface just for Xend to use (via xenstore). >> >> Can''t Xend use the monitor like every other management tool? >> > > If the monitor really isn''t sufficient, then I''d suggest allowing an > open file descriptor to be specified on the -vnc arg, eg > > -vnc localhost:3,passwdfd=6 >In general, I strongly dislike passing file descriptors like this.> And then let QEMU read the password from this file descriptor. I''d rather > XenD just used the monitor though - it''ll ultimately need to use it anyway > to perform disk/network hotplug amongst other things. >If there are reasons that Xend doesn''t want to interact with the monitor (for instance, if you want to support monitor redirection), we can certainly discuss how to enhance the monitor (like having multiple monitor instances). Ultimately, we need to have everyone use the same interface to interact with QEMU. Regards, Anthony Liguori> Daniel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-21 20:26 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
On Thu, Aug 21, 2008 at 03:22:41PM -0500, Anthony Liguori wrote:> Daniel P. Berrange wrote: > >On Thu, Aug 21, 2008 at 03:14:26PM -0500, Anthony Liguori wrote: > > > >>I don''t really like this approach. I don''t want to have a totally > >>separate management interface just for Xend to use (via xenstore). > >> > >>Can''t Xend use the monitor like every other management tool? > >> > > > >If the monitor really isn''t sufficient, then I''d suggest allowing an > >open file descriptor to be specified on the -vnc arg, eg > > > > -vnc localhost:3,passwdfd=6 > > > > In general, I strongly dislike passing file descriptors like this. > > >And then let QEMU read the password from this file descriptor. I''d rather > >XenD just used the monitor though - it''ll ultimately need to use it anyway > >to perform disk/network hotplug amongst other things. > > > > If there are reasons that Xend doesn''t want to interact with the monitor > (for instance, if you want to support monitor redirection), we can > certainly discuss how to enhance the monitor (like having multiple > monitor instances). Ultimately, we need to have everyone use the same > interface to interact with QEMU.Based on my understanding of XenD code & requirements I think it is mostly just a historical accident that it doesn''t use the monitor - quite simply when XenD first picked up QEMU there wasn''t much of use the monitor could provide. There were no disk/network/vnc commands for it to use, so it did the easiest thing and read the data out of xenstore, since it was there already. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 21:25 UTC
Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore.
Daniel P. Berrange wrote:> What is the justification for this change. We already have a means > to set the VNC password from the monitor. If this isn''t sufficient > we should introduce an additional method which isn''t Xen specific.Justification is that I''ve just implemented what qemu-dm does today, so upstream qemu can easily replace qemu-dm. I''m perfectly fine with dropping the patch altogether.> IMHO, XenD should just set the VNC password by connecting to the > monitor. That''s what its there for after all.Agreed. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 21:29 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Daniel P. Berrange wrote:> If the monitor really isn''t sufficient, then I''d suggest allowing an > open file descriptor to be specified on the -vnc arg, eg > > -vnc localhost:3,passwdfd=6 > > And then let QEMU read the password from this file descriptor. I''d rather > XenD just used the monitor though - it''ll ultimately need to use it anyway > to perform disk/network hotplug amongst other things.I''d say using the monitor is the way to go for xend. I''d expect setting the vnc port in xenstore (for -vncunused functionality) will be vetoed too. Thus xend will have to use the monitor anyway to figure the vnc port. No point in implementing a passwdfd= option then. cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-21 21:34 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Hi,> If there are reasons that Xend doesn''t want to interact with the monitor > (for instance, if you want to support monitor redirection), we can > certainly discuss how to enhance the monitor (like having multiple > monitor instances).Multiple monitor instances would be very useful anyway. Right now there is no way to use the monitor for libvirt-managed qemu instances because libvirt uses the monitor. Being able to both use libvirt *and* have a monitor prompt to type commands would be great. cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-21 21:48 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
On Thu, Aug 21, 2008 at 11:34:41PM +0200, Gerd Hoffmann wrote:> Hi, > > > If there are reasons that Xend doesn''t want to interact with the monitor > > (for instance, if you want to support monitor redirection), we can > > certainly discuss how to enhance the monitor (like having multiple > > monitor instances). > > Multiple monitor instances would be very useful anyway. > > Right now there is no way to use the monitor for libvirt-managed qemu > instances because libvirt uses the monitor. Being able to both use > libvirt *and* have a monitor prompt to type commands would be great.I disagree - that means you''ll no longer be able to trust what libvirt tells you about the VM, and libvirt won''t have a guarenteed consistent view of the VM''s state because things will be changed behind its back. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 22:34 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Gerd Hoffmann wrote:> Hi, > > >> If there are reasons that Xend doesn''t want to interact with the monitor >> (for instance, if you want to support monitor redirection), we can >> certainly discuss how to enhance the monitor (like having multiple >> monitor instances). >> > > Multiple monitor instances would be very useful anyway. >The trick here is to add a MonitorState context to every callback and have term_printf() take a MonitorState as the first argument. It''s a mostly mechanical change.> Right now there is no way to use the monitor for libvirt-managed qemu > instances because libvirt uses the monitor. Being able to both use > libvirt *and* have a monitor prompt to type commands would be great. >It would make the current monitor multiplex hackery a lot cleaner too. Regards, Anthony Liguori> cheers, > Gerd > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-21 22:36 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Daniel P. Berrange wrote:> On Thu, Aug 21, 2008 at 11:34:41PM +0200, Gerd Hoffmann wrote: > >> Hi, >> >> Multiple monitor instances would be very useful anyway. >> >> Right now there is no way to use the monitor for libvirt-managed qemu >> instances because libvirt uses the monitor. Being able to both use >> libvirt *and* have a monitor prompt to type commands would be great. >> > > I disagree - that means you''ll no longer be able to trust what libvirt > tells you about the VM, and libvirt won''t have a guarenteed consistent > view of the VM''s state because things will be changed behind its back. >A nice side effect of having multiple monitors is that you can implement a proper ''select'' command. This command would block and tell you about asynchronous events (like guest CD-ROM ejecting). You could use this mechanism to notify libvirt when things changed because of other monitors too. It''s more work though from a libvirt perspective and you probably don''t want users screwing around in the monitor anyway. Regards, Anthony Liguori> Daniel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-22 07:04 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Daniel P. Berrange wrote:>> Multiple monitor instances would be very useful anyway. >> >> Right now there is no way to use the monitor for libvirt-managed qemu >> instances because libvirt uses the monitor. Being able to both use >> libvirt *and* have a monitor prompt to type commands would be great. > > I disagree - that means you''ll no longer be able to trust what libvirt > tells you about the VM, and libvirt won''t have a guarenteed consistent > view of the VM''s state because things will be changed behind its back.It is *not* the only purpose of the monitor to confuse libvirt. The whole ''info <foo>'' command family is very useful for development and debugging purposes and it will not interfere at all with libvirt. I can''t stand your attitude to refuse any feature just because you fear users might abuse it. That applies to almost any feature which makes life easier for developers. Which in turn means using libvirt for development is either a bit complicated or impossible. cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-22 08:27 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Anthony Liguori wrote:> Gerd Hoffmann wrote: >> Multiple monitor instances would be very useful anyway. > > The trick here is to add a MonitorState context to every callback and > have term_printf() take a MonitorState as the first argument. It''s a > mostly mechanical change.The devil is in the details ... How do we want support multiple monitors? One way would be using multiple -monitor switches, so you can have -- say -- one pty for libvirt and one vc for the user / developer. Another way would be allowing multiple connects to unix/tcp sockets, which is probably a bit more tricky due to qemu not having infrastructure for that (yet). We also could do both ... And when touching all monitor command functions anyway, we could also cleanup the handler mess a bit ;) How about this: We define *two* different callback functions: typedef struct term_cmd_t { [ ... ] void (*handler_one)(MonitorState *s, const char *arg); void (*handler_many)(MonitorState *s, int argv, char *argv[]) [ ... ] }; The first one can be used by the monitor functions taking at most one argument, with minimal source code changes, which is probably more than 90% of them. The remaining ones can be converted to use the second form. cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-24 08:57 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Anthony Liguori wrote:>> >> -vnc localhost:3,passwdfd=6 >> > > In general, I strongly dislike passing file descriptors like this.I find this very useful, and actually think we should encourage it, and also allow passing file descriptors over the monitor (using SCM_RIGHTS). This can help de-privilege qemu. In fact, you can run a guest where qemu doesn''t even have permissions to open the backing file. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-24 10:07 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Jamie Lokier wrote:> Avi Kivity wrote: > >> Anthony Liguori wrote: >> >>>> -vnc localhost:3,passwdfd=6 >>>> >>> In general, I strongly dislike passing file descriptors like this. >>> >> I find this very useful, and actually think we should encourage it, and >> also allow passing file descriptors over the monitor (using >> SCM_RIGHTS). This can help de-privilege qemu. In fact, you can run a >> guest where qemu doesn''t even have permissions to open the backing file. >> > > You can already do this: > > qemu -hdc /proc/self/fd/3 3<>MS-DOS-6.22.img > >Not over the monitor. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 10:13 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore."):> I don''t really like this approach. I don''t want to have a totally > separate management interface just for Xend to use (via xenstore).I agree that the xenstore management approach is not right for qemu. It would be better to go through a more stable interface.> Can''t Xend use the monitor like every other management tool?I think it probably could. When I was doing my initial work to start merging our tree with the qemu one, I didn''t do this because I wanted a drop-in replacement for the older code, and because I ran out of time. But as others have noted there are also two things which the monitor doesn''t do which we would need: multiple monitors, and the ability for a monitor client to wait for events. We need multiple monitors so that a monitor remains available to those people whose Xen installations are already making use of it. In the new scheme of things I envisage a xenstore <-> qemu-monitor conversion layer. This would work much like the way xenstore.c works in our Xen qemu tree does. It probably ought not to be part of xend, and I think I would still like it to be compiled as part of our qemu build. If upstream qemu don''t want a bevy of alternative management veneers around the main qemu binary in their tree then I don''t think there would be a problem with us maintaining that indefinitely in our version. The important thing is not which tree the code is in, but that we should go through a stable interface intended by qemu upstream to be used for this kind of thing. So I would suggest that the starting point would be to provide for multiple monitors. Then, we can gradually change the code in our tree''s xenstore.c to go via the monitor (adding functionality to the monitor as necessary). Eventually we''ll have a xenstore.c which doesn''t do anything other than go via the monitor at which point we can either leave things, or include that upstream if desired. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 10:31 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Gerd Hoffmann writes ("[Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> +/* > + * Figure the environment we are running in. > + * Returns true when xen is present, false otherwise. > + * Also checks whenever the domain specified via -domid > + * exists (so we can attach) or whenever it must be created. > + */I don''t think this is the right approach. The intent appears to be that if you run a particular qemu rune, it will do a completely different thing when running under Xen. One bad effect of this is that it is not possible to do a software emulation of the Xen environment if the host happens to actually be running under Xen. The command line, or chosen qemu executable, should specify what qemu is supposed to do: - supporting tasks (hardware emulation etc.) for a real Xen domain managed by Xen tools via xenstore etc. - software emulation of the Xen environment for the benefit of a Xen domU but without using the Xen hypervisor (and this should obviously work even if running under Xen0 - whatever other use cases are important; apparently Gerd seems keen to run some code with parts of the xend toolstack replaced Note that of course we have also ported qemu to our embedded minios so that it can run as a service domain.> + "-domid specify xen guest domain id\n"Now would be a good time to rename this option to have the word `xen'' in it. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 10:42 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Gerd Hoffmann writes ("[Qemu-devel] [PATCH 13/13] xen: pv domain builder."):> This adds domain building support for paravirtual domains to qemu. > This allows booting xen guests directly with qemu, without Xend > and the management stack.Gerd Hoffmann writes ("[Qemu-devel] [PATCH 06/13] xen: backend driver core"):> This patch adds infrastructure for xen backend drivers living in qemu, > so drivers don''t need to implement common stuff on their own. It''s > mostly xenbus management stuff: some functions to access xentore, > setting up xenstore watches, callbacks on device discovery and state > changes, handle event channel, ...Gerd Hoffmann writes ("[Qemu-devel] [PATCH 11/13] xen: blk & nic configuration via cmd line."):> This patch makes qemu create backend and frontend device entries in > xenstore for devices configured on the command line. It will use > qdisk and qnic backend names, so the qemu internal backends will > be used.These kind of patches are rather troublesome I''m afraid. On one level they are irrelevant to Xen, or at least to Xen upstream. They exist to support modes of operation which differ from the way upstream Xen works. Having other ways to use the Xen hypervisor is fine by us. So in theory we shouldn''t care. However in practice these changes are going to make it harder to merge the existing Xen code, because they do some but not all of the same things in a different way. If these patches are merged into qemu upstream then I''ll have to do a lot of untangling. I might have to put off bringing our tree up to date again. (We had ours frozen for a couple of months for our release.) What I would prefer is if Gerd would submit a patch or patches to xen-devel, against our qemu, to factor out of the functionality needed by his code. This should be done in a way that is both suitable for his needs and structurally sensible for the xen upstream tree. When that patch is accepted into qemu-xen, there will be no trouble hopefully porting very similar if not entirely identical code into qemu upstream. And then Gerd can submit patches to qemu-upstream for his new backend drivers, and those patches do not need to conflict with anything in qemu-xen. Although we ought to review them to make sure that the command line options, etc., are all clear and nonconflicting. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 10:44 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Anthony Liguori writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Gerd Hoffmann wrote: > > This patch makes qemu handle signals better. It sets the request_shutdown > > flag, making the main_loop exit and qemu taking the usual exit route, with > > atexit handlers being called and so on, instead of qemu just being killed > > by the signal. > > Applied. Thanks.This seems sensible to me. I have a patch in our tree which changes the signal delivery mechanism for aio completion to use the well known pipe-to-self trick. I had to do this because there were some circumstances where IO completion signals were being lost due to races. I was planning to send this upstream. Should I enhance my change to do the same to the termination signals too ? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 10:47 UTC
Re: [Qemu-devel] Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore.
Gerd Hoffmann writes ("[Qemu-devel] Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore."):> Justification is that I''ve just implemented what qemu-dm does today, so > upstream qemu can easily replace qemu-dm.This seems to be predicated on the idea that xen upstream are just going to drop our tree in favour of Gerd''s. That seems unlikely to me. Rather, we would like to evolve the various branches so that they are closer together. In particular, where changes are though to be needed to bring qemu-xen closer to upstream, we (Xen upstream) would like to consider each such change on its own merits. We don''t intend to import a whole pile of externally-written Xen support code to replace what we have now. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Samuel Thibault
2008-Aug-26 10:50 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Ian Jackson, le Tue 26 Aug 2008 11:42:11 +0100, a écrit :> Gerd Hoffmann writes ("[Qemu-devel] [PATCH 06/13] xen: backend driver core"): > > This patch adds infrastructure for xen backend drivers living in qemu, > > so drivers don''t need to implement common stuff on their own. It''s > > mostly xenbus management stuff: some functions to access xentore, > > setting up xenstore watches, callbacks on device discovery and state > > changes, handle event channel, ... > > These kind of patches are rather troublesome I''m afraid.The one above should be fine I think (and the relevant console/pvfb refactoring, which Mark seemed happy to see). Samuel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 12:16 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> I have a patch in our tree which changes the signal delivery mechanism > for aio completion to use the well known pipe-to-self trick. I had to > do this because there were some circumstances where IO completion > signals were being lost due to races. > > I was planning to send this upstream. Should I enhance my change to > do the same to the termination signals too ?What this would be useful for? The termination signals usualy one-shot anyway, so I don''t see any advantage in doing the pipe-to-self trick. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 12:34 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Ian Jackson wrote:> Gerd Hoffmann writes ("[Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): >> +/* >> + * Figure the environment we are running in. >> + * Returns true when xen is present, false otherwise. >> + * Also checks whenever the domain specified via -domid >> + * exists (so we can attach) or whenever it must be created. >> + */ > > I don''t think this is the right approach. The intent appears to be > that if you run a particular qemu rune, it will do a completely > different thing when running under Xen.That is the intended default behaviour, yes. If you run on Xen, you most likely want to use it, right? That matches behaviour of kvm and kqemu: If support is available, it is used by default (there are options to turn it off).> One bad effect of this is that it is not possible to do a software > emulation of the Xen environment if the host happens to actually be > running under Xen.That isn''t a fundamental issue. It can trivially be fixed by adding a command line option, which just has to set xen_present = 0.> The command line, or chosen qemu executable, should specify what qemu > is supposed to do: > - supporting tasks (hardware emulation etc.) > for a real Xen domain managed by Xen tools via xenstore etc. > - software emulation of the Xen environment for the benefit > of a Xen domU but without using the Xen hypervisor (and > this should obviously work even if running under Xen0 > - whatever other use cases are important; apparently Gerd > seems keen to run some code with parts of the xend toolstack > replacedWe can certainly add command line options to force a specific operation mode, see above. I don''t think such an option should be required though. IMHO qemu can and should pick sensible defaults.>> + "-domid specify xen guest domain id\n" > > Now would be a good time to rename this option to have the word `xen'' > in it.How about -xen-domid? And prefix any other xen specific options with -xen too of course? cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 12:36 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> What this would be useful for? The termination signals usualy one-shot > anyway, so I don''t see any advantage in doing the pipe-to-self trick.The point is that then they can interrupt select(). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 12:55 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Ian Jackson wrote:> These kind of patches are rather troublesome I''m afraid. > > However in practice these changes are going to make it harder to merge > the existing Xen code, because they do some but not all of the same > things in a different way. If these patches are merged into qemu > upstream then I''ll have to do a lot of untangling.If the patches break something, report it and I''ll fix it up. "I''m afraid something might break" doesn''t help us going forward.> What I would prefer is if Gerd would submit a patch or patches to > xen-devel, against our qemu, to factor out of the functionality needed > by his code. This should be done in a way that is both suitable for > his needs and structurally sensible for the xen upstream tree.http://kraxel.fedorapeople.org/patches/qemu-xen/ http://lists.xensource.com/archives/html/xen-devel/2008-08/msg00887.html cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 12:57 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 11:31:47AM +0100, Ian Jackson wrote:> Gerd Hoffmann writes ("[Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > > +/* > > + * Figure the environment we are running in. > > + * Returns true when xen is present, false otherwise. > > + * Also checks whenever the domain specified via -domid > > + * exists (so we can attach) or whenever it must be created. > > + */ > > I don''t think this is the right approach. The intent appears to be > that if you run a particular qemu rune, it will do a completely > different thing when running under Xen.This particular check is doing two different jobs in one go. I think they need to be dealt with separately. - Check to see if running on Xen or not. This is a similar idea to to pv_ops where we have one kernel binary, and probe at runtime to decide whether to run in KVM, Xen, VMWare or bare metal mode. This is important to QEMU too. Merging the source trees for all projects using QEMU is just the first step. A single binary for all is the holy grail, though may not be practical for Mini-OS usage - Check to see whether the guest domain is pre-created (ie attaching to an existing VM created by XenD), or whether QEMU is constructing the domain from scratch. Probing for whether a domain ID exists in the hypervisor, and if not, then creating it, has a nasty race condition, where XenD could launch QEMU and then (for whatever reason) decide to kill off the domain, but QEMU then re-creates it during its startup procedure. If we want QEMU to explicitly created Xen domains from scratch, rather than attaching to an existing one, then there needs to be a way to reliably specify this behaviour via the CLI. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 12:58 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): >> What this would be useful for? The termination signals usualy one-shot >> anyway, so I don''t see any advantage in doing the pipe-to-self trick. > > The point is that then they can interrupt select().Sure, basically any signal can interrupt any system call. I still don''t see the point you are trying to make. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 13:14 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder."):> Ian Jackson wrote: > > These kind of patches are rather troublesome I''m afraid. > > > > However in practice these changes are going to make it harder to merge > > the existing Xen code, because they do some but not all of the same > > things in a different way. If these patches are merged into qemu > > upstream then I''ll have to do a lot of untangling. > > If the patches break something, report it and I''ll fix it up. > "I''m afraid something might break" doesn''t help us going forward.What I meant is that if your changes are committed to qemu upstream, I will get an enormous pile of merge conflicts when I try to merge from upstream into our tree. The correct route, for changes which touch code currently only in qemu-xen, is via qemu-xen. If what you meant is that we should replace our tree with your tree and then complain to you about anything that breaks: sorry, that''s not going to happen.> > What I would prefer is if Gerd would submit a patch or patches to > > xen-devel, against our qemu, to factor out of the functionality needed > > by his code. This should be done in a way that is both suitable for > > his needs and structurally sensible for the xen upstream tree. > > http://kraxel.fedorapeople.org/patches/qemu-xen/Those patches are not suitable for inclusion in qemu-xen. This is because - They rename files pointlessly - They contain many changes which consist of the addition of new code which is not relevant for qemu-xen. It is fine for us to change qemu-xen to make it easier for your tree and our tree to converge (such as factoring the backend driver core out of the qemu-xen xenfb). But we do not want (for example) your complete backend driver set. This is why this needs to be done in the stages I outlined above - http://kraxel.fedorapeople.org/patches/qemu-xen/0001-kraxel-s-stuff.patch "Kraxel''s Stuff" ? This is a joke, right ? etc. This has been explained before, I think ? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 13:17 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > >> What this would be useful for? The termination signals usualy one-shot >> anyway, so I don''t see any advantage in doing the pipe-to-self trick. >> > > The point is that then they can interrupt select(). >The race I know of is that you may get an aio signal completion before select but after you''ve already qemu_aio_poll()''d. In practice, we only sleep for 10ms at a time in select() so the race is handled by that. If we wanted to increase the amount of time we slept, we would have to handle this race. In KVM, we sleep for 1s in select() and use signalfd() to receive the aio notifications. For older hosts, we emulate signalfd using a thread and the pipe-to-self trick. Regards, Anthony Liguori> Ian. > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 13:20 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange wrote:> Probing for whether a domain ID exists in the hypervisor, and if not, then > creating it, has a nasty race condition, where XenD could launch QEMU > and then (for whatever reason) decide to kill off the domain, but QEMU > then re-creates it during its startup procedure.Oops. Yes, that needs to be addressed.> If we want QEMU to explicitly created Xen domains from scratch, rather than > attaching to an existing one, then there needs to be a way to reliably > specify this behaviour via the CLI.So we are scoring at three options for xen now: -xen-domid <nr> specify xen domain id. -xen-create ask qemu to create the domain. -xen-emulate force xen emulation mode. I''ll go add these for the next round. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 13:23 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 03:20:06PM +0200, Gerd Hoffmann wrote:> Daniel P. Berrange wrote: > > Probing for whether a domain ID exists in the hypervisor, and if not, then > > creating it, has a nasty race condition, where XenD could launch QEMU > > and then (for whatever reason) decide to kill off the domain, but QEMU > > then re-creates it during its startup procedure. > > Oops. Yes, that needs to be addressed. > > > If we want QEMU to explicitly created Xen domains from scratch, rather than > > attaching to an existing one, then there needs to be a way to reliably > > specify this behaviour via the CLI. > > So we are scoring at three options for xen now: > > -xen-domid <nr> specify xen domain id. > -xen-create ask qemu to create the domain.Actually I''d invert that, eg -xen-attach since common QEMU semantics are that it is completely in charge of creation. Attaching to an externally created VM is the xen additional semantics. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 13:24 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> This particular check is doing two different jobs in one go. I think > they need to be dealt with separately. > > - Check to see if running on Xen or not. This is a similar idea to > to pv_ops where we have one kernel binary, and probe at runtime > to decide whether to run in KVM, Xen, VMWare or bare metal mode.I don''t think this is correct for the Xen case. Certainly it is not correct for a qemu invoked by the Xen management tools, and it is also not correct for a qemu invoked by hand on a system which already has xend etc. If you''re trying to run qemu under Xen but standalone without xend then you''re going to have to tell qemu that that''s what you want. Otherwise if you try to run the same qemu under Xen but _with_ xend, qemu in dom0 will create a domain bypassing xend and cause all sorts of trouble. Autodetection of this kind of thing is fragile, and in the Xen toolstack case also unnecessary (since the Xen toolstack can invoke qemu with appropriate options to tell it only to do the emulation parts). So I think the default mode of operation for qemu instructed to provide a Xen domU environment should be 100% software, or via KVM if KVM is available. KVM is designed as a facility to be used by user processes without negotiation with the system administrator so it is fine to autodetect that and use it. The Xen hypercall and domain management interfaces are not like that: the interfaces provided by the hypervisor et al should be used only by cooperation with the Xen management stack.> This is important to QEMU too. Merging the source trees for all > projects using QEMU is just the first step. A single binary for > all is the holy grail, though may not be practical for Mini-OS > usageI''m certainly in favour of closer integration. If qemu upstream are willing to take all of our code, including the parts which serve modes of operation and features which qemu upstream don''t agree are needed, then excellent. But I think that for now it would be easier to concentrate on providing interfaces to allow the qemu-xen tree to mainly be additional code which goes through well-defined public interfaces to the rest of qemu. (From this point of view it''s a shame that qemu itself isn''t more like a library than an executable.) Certainly the minios port can''t be the same executable. The qemu code is statically linked with minios into what amounts to a monolithic PV guest kernel.> If we want QEMU to explicitly created Xen domains from scratch, rather than > attaching to an existing one, then there needs to be a way to reliably > specify this behaviour via the CLI.Yes. And this should not be done by default because on a system with xend, you don''t want to be messing with xend''s idea of the running domains. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 13:33 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> So we are scoring at three options for xen now: > > -xen-domid <nr> specify xen domain id. > -xen-create ask qemu to create the domain. > -xen-emulate force xen emulation mode.No, the correct set of options are: -xen-bare no xend on this system, qemu should create a Xen domain without this, Xen machine types use emulated Xen functionality -xen-domid <nr> has slightly different meanings in the various use cases And if you really think you must have the same executable for the xend case: -xen-attach used by xend; qemu is to support a Xen HVM domain not to be used from the command line Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 13:38 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> The race I know of is that you may get an aio signal completion before > select but after you''ve already qemu_aio_poll()''d. In practice, we only > sleep for 10ms at a time in select() so the race is handled by that. If > we wanted to increase the amount of time we slept, we would have to > handle this race.Yes. And, 10ms is too long anyway for reasonable performance. During my merge with upstream I found that the qemu aio functionality (which was done quite differently to the old xen ioemu) caused a severe performance regression under some conditions because of this race.> In KVM, we sleep for 1s in select() and use signalfd() to receive the > aio notifications. For older hosts, we emulate signalfd using a thread > and the pipe-to-self trick.Why does it need a thread ? You can just write to the pipe in the signal handler. I''ll post my code. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 13:43 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Samuel Thibault writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder."):> Ian Jackson, le Tue 26 Aug 2008 11:42:11 +0100, a écrit : > > Gerd Hoffmann writes ("[Qemu-devel] [PATCH 06/13] xen: backend driver core"): > > > This patch adds infrastructure for xen backend drivers living in qemu, > > > so drivers don''t need to implement common stuff on their own. It''s > > > mostly xenbus management stuff: some functions to access xentore, > > > setting up xenstore watches, callbacks on device discovery and state > > > changes, handle event channel, ... > > > > These kind of patches are rather troublesome I''m afraid. > > The one above should be fine I think (and the relevant console/pvfb > refactoring, which Mark seemed happy to see).I think that one is good in principle. But I would like to see that patch on its own against qemu-xen. Then the resulting files in Gerd''s tree and my tree will be identical and there is no problem with Gerd''s other new files which depend on it going upstream. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 13:50 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 02:24:10PM +0100, Ian Jackson wrote:> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > > This particular check is doing two different jobs in one go. I think > > they need to be dealt with separately. > > > > - Check to see if running on Xen or not. This is a similar idea to > > to pv_ops where we have one kernel binary, and probe at runtime > > to decide whether to run in KVM, Xen, VMWare or bare metal mode. > > I don''t think this is correct for the Xen case. Certainly it is not > correct for a qemu invoked by the Xen management tools, and it is also > not correct for a qemu invoked by hand on a system which already has > xend etc. > > If you''re trying to run qemu under Xen but standalone without xend > then you''re going to have to tell qemu that that''s what you want. > Otherwise if you try to run the same qemu under Xen but _with_ xend, > qemu in dom0 will create a domain bypassing xend and cause all sorts > of trouble.Yes, mixing manually created VMs & XenD managed VMs is not a use case I was thinking we should try to address. The core use case for creating VMs on the Xen hypervisor, is that you want to avoid the python stack, and thus you aren''t likely to have XenD around anyway. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 13:53 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Hi,>> http://kraxel.fedorapeople.org/patches/qemu-xen/ > > Those patches are not suitable for inclusion in qemu-xen. This is > because > - They rename files pointlesslyYou havn''t looked at them in detail for a while, right? The pointless xen_* -> xen-* renames are all gone. Only one is left: Rename xenfb.c to xen_framebuffer.c for consistency with the other xen files.> - They contain many changes which consist of the addition of new > code which is not relevant for qemu-xen.Feel free to just ignore them. In theory they should be a no-op anyway, but I though you''d prefer to be able to test that. If you don''t want that, fine, I''ll happily drop the patches then. Less work for me, which is always welcome.> It is fine for us to > change qemu-xen to make it easier for your tree and our tree to > converge (such as factoring the backend driver core out of the > qemu-xen xenfb).Fine.> But we do not want (for example) your complete > backend driver set.You''ll get them from upstream at some point anyway. If you don''t want them beforehand for testing just tell me, no problem.> - http://kraxel.fedorapeople.org/patches/qemu-xen/0001-kraxel-s-stuff.patch > "Kraxel''s Stuff" ? This is a joke, right ?Yea, the very first is just some script-age for me, ignore that one. Patches 02 -> 04 pull depending patches, once you''ve merged with upstream they are not needed any more and will be removed from the patch set. Patches 05 -> 09 is the backend core / console / framebuffer stuff you are willing to take. Patch 10 are the xen_machine_pv.c changes (upstream changeset has this earlier, moved here for better bisectability). Patch 11+ are disk + nic backends. Ignore them if you don''t want them.> This has been explained before, I think ?Please read the patches and also the introduction text for the patch sets posted to xen-devel and qemu-devel. I *do* address your review comments. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 13:56 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
>> If the patches break something, report it and I''ll fix it up. >> "I''m afraid something might break" doesn''t help us going forward. > > What I meant is that if your changes are committed to qemu upstream, I > will get an enormous pile of merge conflicts when I try to merge from > upstream into our tree.What do you think about the merge plan outlined in http://lists.xensource.com/archives/html/xen-devel/2008-08/msg00887.html That should avoid the enormous pile of merge conflicts, right? cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 14:14 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 02:33:09PM +0100, Ian Jackson wrote:> Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > > So we are scoring at three options for xen now: > > > > -xen-domid <nr> specify xen domain id. > > -xen-create ask qemu to create the domain. > > -xen-emulate force xen emulation mode. > > No, the correct set of options are: > > -xen-bare no xend on this system, qemu should create a Xen domain > without this, Xen machine types use emulated > Xen functionalityI''m not sure I understand what you mean here ? Are you suggesting this for the ''Xen hypervisor present, standalone mode - ie no XenD'' ? If so I think that''s not right - for standalone mode we shouldn''t require any special args for the admin. We should only require additional args for cases where management tools like XenD or libvirt are invoking QEMU.> -xen-domid <nr> has slightly different meanings in the various use casesPerhaps best not to overload this then - pass domid to the -attach arg directly, and have a -xen-create <nr> for the scenario where you want to specify the domid for a newly created VM> And if you really think you must have the same executable for the xend > case: > > -xen-attach used by xend; qemu is to support a Xen HVM domain > not to be used from the command lineI''ll try to enumerate the scenarios here... On KVM, using Xenner + shell : No special args On KVM, using Xenner + libvirtd : -xen-create <domid> On Xen HV + shell : No special args On Xen HV + libvirtd : -xen-create <domid> On Xen HV + XenD : -xen-attach <domid> NB. by ''shell'' I mean the admin manually launching QEMU from the shell If neither -xen-create or -xen-attach is given, then it will create a new domain from scratch & auto-allocate a domid. Passing both the -xen-create and -xen-attach args is forbidden / nonsensical. In all cases, the actual hypervisor domain ID should be exposed via the monitor console. Ian, did I miss any of the use cases you want addressed ? I''m leaving mini-os out of this now, since as you say, its impossible to have one binary deal with it, but I''ll assume its similar to -xen-attach <domid> use case in its working. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 14:19 UTC
[Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder."):> > Those patches are not suitable for inclusion in qemu-xen. This is > > because > > - They rename files pointlessly > > You havn''t looked at them in detail for a while, right?I just visited the URL and I still saw ...> The pointless xen_* -> xen-* renames are all gone. Only one is left: > Rename xenfb.c to xen_framebuffer.c for consistency with the other xen > files.... this. I agree that it would be nice to have the files have nice names. But for the moment when all of our trees are so heavily diverged, and development in many of the trees is proceeding apace I think that renaming files is very harmful. But thanks a lot for these explanations:> Patches 02 -> 04 pull depending patches, once you''ve merged with > upstream they are not needed any more and will be removed from the patch > set. > > Patches 05 -> 09 is the backend core / console / framebuffer stuff you > are willing to take. > > Patch 10 are the xen_machine_pv.c changes (upstream changeset has this > earlier, moved here for better bisectability). > > Patch 11+ are disk + nic backends. Ignore them if you don''t want them.Obviously when faced with a 13-patch series the first of which is a mistake, you can understand my scepticism and my reluctance to give each individual patch the full half hour ? I''ll take a more thorough look at 02-06. Do they in your opinion make sense on their own in qemu-xen-unstable ? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 14:31 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange wrote:> I''ll try to enumerate the scenarios here... > > On KVM, using Xenner + shell : No special args > On KVM, using Xenner + libvirtd : -xen-create <domid>This is "xen emulation" and (according to long term plan plan) doesn''t require kvm but should work with qemu emulating things too. This means it doesn''t depend on kvm, you can do that with xen underneath too. This is what the -xen-emulate switch is good for.> On Xen HV + shell : No special args > On Xen HV + libvirtd : -xen-create <domid> > On Xen HV + XenD : -xen-attach <domid>I like the "create" and "attach" namings as they make pretty clear what qemu should do. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 14:40 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 04:31:49PM +0200, Gerd Hoffmann wrote:> Daniel P. Berrange wrote: > > I''ll try to enumerate the scenarios here... > > > > On KVM, using Xenner + shell : No special args > > On KVM, using Xenner + libvirtd : -xen-create <domid> > > This is "xen emulation" and (according to long term plan plan) doesn''t > require kvm but should work with qemu emulating things too. This means > it doesn''t depend on kvm, you can do that with xen underneath too. This > is what the -xen-emulate switch is good for.Wouldn''t the non-KVM case already be dealt with by the -no-kvm command line arg to QEMU ?> > On Xen HV + shell : No special args > > On Xen HV + libvirtd : -xen-create <domid> > > On Xen HV + XenD : -xen-attach <domid> > > I like the "create" and "attach" namings as they make pretty clear what > qemu should do.Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 14:49 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange wrote:> On Tue, Aug 26, 2008 at 04:31:49PM +0200, Gerd Hoffmann wrote: >> Daniel P. Berrange wrote: >>> I''ll try to enumerate the scenarios here... >>> >>> On KVM, using Xenner + shell : No special args >>> On KVM, using Xenner + libvirtd : -xen-create <domid> >> This is "xen emulation" and (according to long term plan plan) doesn''t >> require kvm but should work with qemu emulating things too. This means >> it doesn''t depend on kvm, you can do that with xen underneath too. This >> is what the -xen-emulate switch is good for. > > Wouldn''t the non-KVM case already be dealt with by the -no-kvm command > line arg to QEMU ?But when running on xen we want being able to qemu tell it should *not* use the xen hypervisor but emulate things, thus -xen-emulate. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-26 14:52 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:>> In KVM, we sleep for 1s in select() and use signalfd() to receive the >> aio notifications. For older hosts, we emulate signalfd using a thread >> and the pipe-to-self trick. >> > > Why does it need a thread ? You can just write to the pipe in the > signal handler. I''ll post my code. >Not sure that it matters, but the semantics are slightly different: with a thread you don''t get EINTR in random syscalls as the signal thread is the only one that has the signals unblocked. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 14:53 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 04:49:05PM +0200, Gerd Hoffmann wrote:> Daniel P. Berrange wrote: > > On Tue, Aug 26, 2008 at 04:31:49PM +0200, Gerd Hoffmann wrote: > >> Daniel P. Berrange wrote: > >>> I''ll try to enumerate the scenarios here... > >>> > >>> On KVM, using Xenner + shell : No special args > >>> On KVM, using Xenner + libvirtd : -xen-create <domid> > >> This is "xen emulation" and (according to long term plan plan) doesn''t > >> require kvm but should work with qemu emulating things too. This means > >> it doesn''t depend on kvm, you can do that with xen underneath too. This > >> is what the -xen-emulate switch is good for. > > > > Wouldn''t the non-KVM case already be dealt with by the -no-kvm command > > line arg to QEMU ? > > But when running on xen we want being able to qemu tell it should *not* > use the xen hypervisor but emulate things, thus -xen-emulate.Oh, i see what you mean. Existing naming practice would suggest naming it -no-xenhv or just -no-xen All these ''-no-XXX'' args are getting a little silly though. I''d rather tell QEMU what I *do* want, than what I don''t want. A generic arg ''-accelerator none|xen|kvm|kqemu'' might be worth considering in the future. That can be a separate discussion from these patches though - we should just follow the -no-XXX naming for now to minimize diffs Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 14:53 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> But when running on xen we want being able to qemu tell it should *not* > use the xen hypervisor but emulate things, thus -xen-emulate.That should be the default. It is not safe to use the Xen hypervisor directly if there is a management toolstack which expects to mediate domain creation. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 14:55 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Why does it need a thread ? You can just write to the pipe in the > signal handler. I''ll post my code.Just done. As you see it''s quite straightforward. We have to save/restore errno in the signal handler as usual. So should I provide a patch to do the same for the termination signals ? They are much less time-critical than aio completion so it''s perhaps less important. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 15:04 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > >> The race I know of is that you may get an aio signal completion before >> select but after you''ve already qemu_aio_poll()''d. In practice, we only >> sleep for 10ms at a time in select() so the race is handled by that. If >> we wanted to increase the amount of time we slept, we would have to >> handle this race. >> > > Yes. And, 10ms is too long anyway for reasonable performance. During > my merge with upstream I found that the qemu aio functionality (which > was done quite differently to the old xen ioemu) caused a severe > performance regression under some conditions because of this race. >Yeah, noticed that too, especially with qcow2.>> In KVM, we sleep for 1s in select() and use signalfd() to receive the >> aio notifications. For older hosts, we emulate signalfd using a thread >> and the pipe-to-self trick. >> > > Why does it need a thread ? You can just write to the pipe in the > signal handler. I''ll post my code. >It''s a little less perfect. Your signal handler can write() to the pipe but what happens if you get EAGAIN? So what we do in KVM is use sigwait() within a separate thread. We don''t set O_NONBLOCK so the thread blocks if the pipe fills up which is exactly the semantics you would want. Below is our implementation. I''ll queue up to push this change into QEMU after I finish with the migration patches. Regards, Anthony Liguori #include <sys/syscall.h> #include <pthread.h> struct sigfd_compat_info { sigset_t mask; int fd; }; static void *sigwait_compat(void *opaque) { struct sigfd_compat_info *info = opaque; int err; sigset_t all; sigfillset(&all); sigprocmask(SIG_BLOCK, &all, NULL); do { siginfo_t siginfo; err = sigwaitinfo(&info->mask, &siginfo); if (err == -1 && errno == EINTR) { err = 0; continue; } if (err > 0) { char buffer[128]; size_t offset = 0; memcpy(buffer, &err, sizeof(err)); while (offset < sizeof(buffer)) { ssize_t len; len = write(info->fd, buffer + offset, sizeof(buffer) - offset); if (len == -1 && errno == EINTR) continue; if (len <= 0) { err = -1; break; } offset += len; } } } while (err >= 0); return NULL; } static int kvm_signalfd_compat(const sigset_t *mask) { pthread_attr_t attr; pthread_t tid; struct sigfd_compat_info *info; int fds[2]; info = malloc(sizeof(*info)); if (info == NULL) { errno = ENOMEM; return -1; } if (pipe(fds) == -1) { free(info); return -1; } memcpy(&info->mask, mask, sizeof(*mask)); info->fd = fds[1]; pthread_attr_init(&attr); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); pthread_create(&tid, &attr, sigwait_compat, info); pthread_attr_destroy(&attr); return fds[0]; } int kvm_signalfd(const sigset_t *mask) { #if defined(SYS_signalfd) int ret; ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8); if (!(ret == -1 && errno == ENOSYS)) return ret; #endif return kvm_signalfd_compat(mask); }> Ian. >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 15:05 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> On Tue, Aug 26, 2008 at 02:33:09PM +0100, Ian Jackson wrote: > > -xen-bare no xend on this system, qemu should create a Xen domain > > without this, Xen machine types use emulated > > Xen functionality > > I''m not sure I understand what you mean here ? Are you suggesting this > for the ''Xen hypervisor present, standalone mode - ie no XenD'' ?Yes.> If so I think that''s not right - for standalone mode we shouldn''t > require any special args for the admin. We should only require > additional args for cases where management tools like XenD or > libvirt are invoking QEMU.The usual case for a Xen installation is that xend is running. In that situation, it is wrong for qemu to create a domain directly. That will interfere with xend''s management of the whole system. So that means that it is not right to detect that we are running under Xen and if so create a domain by default. That should only be done if the person invoking qemu knows that there is no xend and therefore that creating the domain directly is sensible.> I''ll try to enumerate the scenarios here... > > On KVM, using Xenner + shell : No special args > On KVM, using Xenner + libvirtd : -xen-create <domid> > On Xen HV + shell : No special args > On Xen HV + libvirtd : -xen-create <domid> > On Xen HV + XenD : -xen-attach <domid>You''ve missed out entirely software emulation of a Xen environment. AIUI this is pretty mucbh Xenner without KVM. And see my comments above about running under Xen. The Xen hypervisor is not a system facility which user processes are expected to make use of without negotiating with the prevailing Xen management stack. This is unlike KVM, where all KVM''s callers can be oblivious to each other. If you run qemu to invoke a Xen domU image it should not disturb any existing actual Xen installation, and the behaviour and outcome should not depend on whether the whole lot is running under a real Xen (with xend et al). It is fine to use KVM if available because that''s just a performance improvement. This is because the way to start a Xen domU, as stated in the Xen documentation etc., is via the Xen management tools. If a user tries to do it via qemu then they are either making a mistake, or intending to run an emulated Xen environment. (Note that they may be running qemu on a Xen domU in which case creating a domain isn''t even possible.) If we make qemu actually create a guest by direct hypercalls, the Xen management stack will be disrupted ...> If neither -xen-create or -xen-attach is given, then it will create a > new domain from scratch & auto-allocate a domid. Passing both the > -xen-create and -xen-attach args is forbidden / nonsensical.... so this behaviour should not be the default.> Ian, did I miss any of the use cases you want addressed ? I''m leaving > mini-os out of this now, since as you say, its impossible to have one > binary deal with it, but I''ll assume its similar to -xen-attach <domid> > use case in its working.Fairly similar. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 15:08 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> It''s a little less perfect. Your signal handler can write() to the pipe > but what happens if you get EAGAIN?Then the pipe already has something in it which is fine. The purpose of writing to it is to make sure that it is not empty. We don''t count the bytes.> Below is our implementation. I''ll queue up to push this change into > QEMU after I finish with the migration patches.Please consider my patch instead, which doesn''t depend on threads and is a lot smaller. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-26 15:09 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Tue, Aug 26, 2008 at 04:05:01PM +0100, Ian Jackson wrote:> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > > I''ll try to enumerate the scenarios here... > > > > On KVM, using Xenner + shell : No special args > > On KVM, using Xenner + libvirtd : -xen-create <domid> > > On Xen HV + shell : No special args > > On Xen HV + libvirtd : -xen-create <domid> > > On Xen HV + XenD : -xen-attach <domid> > > You''ve missed out entirely software emulation of a Xen environment. > AIUI this is pretty mucbh Xenner without KVM. > > And see my comments above about running under Xen. The Xen hypervisor > is not a system facility which user processes are expected to make use > of without negotiating with the prevailing Xen management stack. This > is unlike KVM, where all KVM''s callers can be oblivious to each other. > > If you run qemu to invoke a Xen domU image it should not disturb any > existing actual Xen installation, and the behaviour and outcome should > not depend on whether the whole lot is running under a real Xen (with > xend et al). It is fine to use KVM if available because that''s just a > performance improvement. > > This is because the way to start a Xen domU, as stated in the Xen > documentation etc., is via the Xen management tools. If a user tries > to do it via qemu then they are either making a mistake, or intending > to run an emulated Xen environment. (Note that they may be running > qemu on a Xen domU in which case creating a domain isn''t even > possible.) > > If we make qemu actually create a guest by direct hypercalls, the Xen > management stack will be disrupted ...Ok, in that case dis-regard my sugestion of ''-no-xen'' in the other mail I just sent. Clearly the negative style args won''t work in the xen case, and we have to go for a positive switch - maybe it really is worth doing an explicit ''-accelerator none|kvm|xen|kqemu'' right away for this patchset... Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 15:20 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > > Then the pipe already has something in it which is fine. The purpose > of writing to it is to make sure that it is not empty. We don''t count > the bytes. >Then you need a pipe per-signal and you can''t accurately simulate threadfd() (since you can''t communicate siginfo).> Please consider my patch instead, which doesn''t depend on threads and > is a lot smaller. >I don''t see threads as a problem. Are you concerned about mini-OS? Regards, Anthony Liguori> Ian. >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 15:21 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > > Then the pipe already has something in it which is fine. The purpose > of writing to it is to make sure that it is not empty. We don''t count > the bytes. >BTW, I don''t see your patch yet, but my mail is pretty funky right now. Then you need a pipe per-signal and you can''t accurately simulate threadfd() (since you can''t communicate siginfo).> Please consider my patch instead, which doesn''t depend on threads and > is a lot smaller. >I don''t see threads as a problem. Are you concerned about mini-OS? Regards, Anthony Liguori> Ian. >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 15:23 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Not sure that it matters, but the semantics are slightly different: > with a thread you don''t get EINTR in random syscalls as the signal > thread is the only one that has the signals unblocked.That''s true but I don''t think it''s really relevant because we already have other signal handlers (and we might grow further ones) so the rest of the code has to be robust against those. My experience suggests very strongly that we should avoid doing multithreaded things if at all possible. Threads are less portable; even when they are provided many of the implementations are buggy (although less so nowadays). There is also the way that once you have a multithreaded program, it is much more difficult to discourage the expansion of the concurrent functionality until the whole program is a mass of race bugs. I would rather not open this stable door. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 15:28 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Then you need a pipe per-signal and you can''t accurately simulate > threadfd() (since you can''t communicate siginfo).No, you don''t need a pipe per signal. You only need multiple pipes if you have multiple different event loops which are each capable of handling only a subset of the signals, _and_ you''re unwilling to use an atomic flag variable (eg, cpu_interrupt) or its equivalent. But we already have cpu_interrupt, and of course the aio completion system has its own recording of what''s going on. So that does not apply to qemu. All that''s needed is a reliable, race-free, way of avoiding spuriously blocking in a syscall when an event occurs between checking the state variables (aio_error, cpu interrupt check) and the call to select(). One fd is sufficient for that. Just to make that concrete, even if we extended my patch to use its mechanism for SIGINT et al I don''t think a second pipe would be needed.> I don''t see threads as a problem. Are you concerned about mini-OS?Minios certainly doesn''t currently have any threads and it would probably be a severe pain to introduce them. That''s just one example of a portability problem. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-26 15:29 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > >> Not sure that it matters, but the semantics are slightly different: >> with a thread you don''t get EINTR in random syscalls as the signal >> thread is the only one that has the signals unblocked. >> > > That''s true but I don''t think it''s really relevant because we already > have other signal handlers (and we might grow further ones) so the > rest of the code has to be robust against those. > > My experience suggests very strongly that we should avoid doing > multithreaded things if at all possible. Threads are less portable; > even when they are provided many of the implementations are buggy > (although less so nowadays). > > There is also the way that once you have a multithreaded program, it > is much more difficult to discourage the expansion of the concurrent > functionality until the whole program is a mass of race bugs. I would > rather not open this stable door. > >In this case the thread is only used to emulate the signalfd() system call; it is not part of qemu proper. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 15:36 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > >> Then you need a pipe per-signal and you can''t accurately simulate >> threadfd() (since you can''t communicate siginfo). >> > > No, you don''t need a pipe per signal. You only need multiple pipes if > you have multiple different event loops which are each capable of > handling only a subset of the signals, _and_ you''re unwilling to use > an atomic flag variable (eg, cpu_interrupt) or its equivalent. > > But we already have cpu_interrupt, and of course the aio completion > system has its own recording of what''s going on. So that does not > apply to qemu. > > All that''s needed is a reliable, race-free, way of avoiding spuriously > blocking in a syscall when an event occurs between checking the state > variables (aio_error, cpu interrupt check) and the call to select(). > One fd is sufficient for that. > > Just to make that concrete, even if we extended my patch to use its > mechanism for SIGINT et al I don''t think a second pipe would be > needed. >In KVM, we do use the signal number to determine action. We could use globals but since we''re multi-threaded, that gets pretty nasty. The same would apply to a threaded QEMU.>> I don''t see threads as a problem. Are you concerned about mini-OS? >> > > Minios certainly doesn''t currently have any threads and it would > probably be a severe pain to introduce them. That''s just one example > of a portability problem. >We''re definitely not going to avoid threads forever in QEMU. KVM requires threads to support multiple VCPUs. Threads are also needed to support true SMP with TCG. And right now, implementing a thread pool is the only sane way to get reasonable disk IO in userspace. I share your concerns about threading, which is why we have to use them in a very careful way. My signalfd() patch uses them in a very isolated way that is pretty easily verified. You could always add proper signalfd() support to minios. You can also certainly implement a signalfd() emulation that uses pipe(). signalfd() is really the right solution to this problem (and it doesn''t require threads by default). Regards, Anthony Liguori> Ian. >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Samuel Thibault
2008-Aug-26 15:36 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson, le Tue 26 Aug 2008 16:28:54 +0100, a écrit :> > I don''t see threads as a problem. Are you concerned about mini-OS? > > Minios certainly doesn''t currently have any threads and it would > probably be a severe pain to introduce them.It has threads for its own internal uses, but it doesn''t provide an interface for them (makes everything a lot simpler and efficient of course) Samuel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-26 15:38 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Ian Jackson wrote: > > My experience suggests very strongly that we should avoid doing > > multithreaded things if at all possible. Threads are less portable; > > even when they are provided many of the implementations are buggy > > (although less so nowadays). > > > > There is also the way that once you have a multithreaded program, it > > is much more difficult to discourage the expansion of the concurrent > > functionality until the whole program is a mass of race bugs. I would > > rather not open this stable door. > > In this case the thread is only used to emulate the signalfd() system > call; it is not part of qemu proper.That doesn''t make us any less exposed to bugs in the threading implementation, and is only of marginal use if anything for explaining to people why we shouldn''t have a Windows-style multithreaded bug pile. Also, I think messing about with Linux-specific syscalls and then emulating them is hardly a sensible way to carry on, when the alternative is portable and simple. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-26 16:12 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > >> Ian Jackson wrote: >> >>> My experience suggests very strongly that we should avoid doing >>> multithreaded things if at all possible. Threads are less portable; >>> even when they are provided many of the implementations are buggy >>> (although less so nowadays). >>> >>> There is also the way that once you have a multithreaded program, it >>> is much more difficult to discourage the expansion of the concurrent >>> functionality until the whole program is a mass of race bugs. I would >>> rather not open this stable door. >>> >> In this case the thread is only used to emulate the signalfd() system >> call; it is not part of qemu proper. >> > > That doesn''t make us any less exposed to bugs in the threading > implementation, and is only of marginal use if anything for explaining > to people why we shouldn''t have a Windows-style multithreaded bug > pile. > >Threading bugs in the implementation? In 2008? Practically all serious software is multithreaded nowadays.> Also, I think messing about with Linux-specific syscalls and then > emulating them is hardly a sensible way to carry on, when the > alternative is portable and simple.This has merit; though signalfd() will be a lot faster than catching a signal and writing to a pipe. In particular, delivering a signal to userspace has to save the floating point context, while signalfd() avoids it. With the signals used to signal I/O completion, this matters. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 16:39 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Julian Seward wrote:> On Tuesday 26 August 2008, Avi Kivity wrote: > > >> Threading bugs in the implementation? In 2008? >> >> Practically all serious software is multithreaded nowadays. >> > > That''s not in itself an argument in support of writing threaded code > if you don''t have to. >I know arguing about threading is almost as fun as arguing about XML, but in this very specific, very particular case, it''s pretty obvious by simple inspection that there aren''t thread safety issues. Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-26 18:50 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Jamie Lokier wrote:> Leading to: why would (real) signals being used to collect AIO events > anyway, if you don''t have signalfd()? If you''ve got a helper thread, > just call aio_suspend() in the helper thread.There is no helper thread when signalfd() is available. The helper thread thing is only for backwards compatibility or for unices that don''t support signalfd. Plus, there are a lot of places that we use signals currently.> Then you can just > deliver the AIO completion result to the relevant data structure or > even move it off the waiting queue (pthread_mutex_lock is your > friend), then wake the main waiting thread with byte written to its > pipe. >It''s the queuing with threads where things start getting complicated and error prone. Regards, Anthony Liguori> -- Jamie > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 19:24 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange wrote:> On Tue, Aug 26, 2008 at 04:49:05PM +0200, Gerd Hoffmann wrote: >> But when running on xen we want being able to qemu tell it should *not* >> use the xen hypervisor but emulate things, thus -xen-emulate. > > Oh, i see what you mean. Existing naming practice would suggest naming > it -no-xenhv or just -no-xenWell, as you are going to run a xen guest -no-xen wouldn''t be very intuitive IMHO. -no-xenhv is only slightly better.> All these ''-no-XXX'' args are getting a little silly though. I''d rather > tell QEMU what I *do* want, than what I don''t want. A generic arg > ''-accelerator none|xen|kvm|kqemu'' might be worth considering in the future.Agreed. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-26 20:00 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
>> Patches 02 -> 04 pull depending patches, once you''ve merged with >> upstream they are not needed any more and will be removed from the patch >> set. >> >> Patches 05 -> 09 is the backend core / console / framebuffer stuff you >> are willing to take. >> >> Patch 10 are the xen_machine_pv.c changes (upstream changeset has this >> earlier, moved here for better bisectability). >> >> Patch 11+ are disk + nic backends. Ignore them if you don''t want them.> I''ll take a more thorough look at 02-06. Do they in your opinion make > sense on their own in qemu-xen-unstable ?02 -> 04 Make sense if you want to *test* the patches without merging with upstream qemu before. For *merging* the patches I''d suggest to merge from upstream qemu instead, then wait for me rebasing the patchset (and dropping these patches). 05 -> 06 backend core. 07 switches the console to the new backend core. 08 xenfb.c -> xen_framebuffer.c rename. 09 switch framebuffer to new backend core. 05 + 06 alone are not useful as nobody uses the new backend core then. Merging 07 too gives at least one user. 08+09 going in gives a second user. They are a nice cleanup and certainly make sense for qemu-xen-unstable. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Markus Armbruster
2008-Aug-26 21:00 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Samuel Thibault <samuel.thibault@eu.citrix.com> writes:> Ian Jackson, le Tue 26 Aug 2008 11:42:11 +0100, a écrit : >> Gerd Hoffmann writes ("[Qemu-devel] [PATCH 06/13] xen: backend driver core"): >> > This patch adds infrastructure for xen backend drivers living in qemu, >> > so drivers don''t need to implement common stuff on their own. It''s >> > mostly xenbus management stuff: some functions to access xentore, >> > setting up xenstore watches, callbacks on device discovery and state >> > changes, handle event channel, ... >> >> These kind of patches are rather troublesome I''m afraid. > > The one above should be fine I think (and the relevant console/pvfb > refactoring, which Mark seemed happy to see). > > SamuelYes, I''m endorsing it. Factoring out the generic backend core makes lots of sense, and the resulting PVFB code is cleaner and easier to understand. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-27 08:05 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Julian Seward wrote:> On Tuesday 26 August 2008, Avi Kivity wrote: > > >> Threading bugs in the implementation? In 2008? >> >> Practically all serious software is multithreaded nowadays. >> > > That''s not in itself an argument in support of writing threaded code > if you don''t have to. > > I spent much of the last year developing thread-checking tools > in the Valgrind framework (Helgrind), and testing them on serious > software. From that I would say that practically all serious > multithreaded software is riddled with threading bugs, mostly > data races and inconsistent lock orderings (potential deadlocks). >Most serious software has to be threaded. The absence of nonblocking libraries, the complexities of compositing nonblocking components when they do exist, and the requirement for utilizing multiple cores, all push towards threading. I agree that it''s incredibly easy to write badly threaded code. But I don''t see a way to avoid it. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Alan Cox
2008-Aug-27 08:49 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
> I agree that it''s incredibly easy to write badly threaded code. But I > don''t see a way to avoid it.Message passing, using the right language tools (in C that unfortunately means writing your own or going back to 60s textbooks) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-27 08:51 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Ian Jackson wrote:> Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): >> But when running on xen we want being able to qemu tell it should *not* >> use the xen hypervisor but emulate things, thus -xen-emulate. > > That should be the default. It is not safe to use the Xen hypervisor > directly if there is a management toolstack which expects to mediate > domain creation.Ok. I think we have three cases then: (1) xen emulation using xenner (default, with or without kvm). (2) running on xen, with xend creating the domain, and (3) running on xen, with qemu creating the domain. For (2) and (3) we need command line switches to enable these modes. Specifying a domain ID makes sense in all three cases though. So I''d keep the -xen-domid switch, like this: -xen-domid <nr> specify xen domain id. -xen-attach attach to specified domain (created by xend). -xen-create create the domain directly. Comments? cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Avi Kivity
2008-Aug-27 09:10 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Alan Cox wrote:>> I agree that it''s incredibly easy to write badly threaded code. But I >> don''t see a way to avoid it. >> > > Message passing, using the right language tools (in C that unfortunately > means writing your own or going back to 60s textbooks) >Oh, if we''re talking languages, then I''m all for pure functional languages. If you are side effects free, you can let the runtime do all the threading for you automatically. I don''t see that catching on though. -- error compiling committee.c: too many arguments to function _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Aug-27 09:16 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
On 27/8/08 10:10, "Avi Kivity" <avi@qumranet.com> wrote:>> Message passing, using the right language tools (in C that unfortunately >> means writing your own or going back to 60s textbooks) >> > > Oh, if we''re talking languages, then I''m all for pure functional > languages. If you are side effects free, you can let the runtime do all > the threading for you automatically.Practical efficient implementation of this is still an open research problem, I believe. At least as far as auto-parallelisation is concerned. It''s one of those pure-functional-language myths. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-27 09:46 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Leading to: why would (real) signals being used to collect AIO events > anyway, if you don''t have signalfd()? If you''ve got a helper thread,If you use aio, glibc has a tendency to emulate it with thread(s). The helper thread I was referring to in my commit message is the glibc-created helper thread. glibc appears (in strace) to do IO in this thread and then raises the signal you asked for (SIGUSR2 in qemu''s case). In some of my tests with qemu that SIGUSR2 was then immediately delivered on the helper thread even though the main program had it blocked. I saw this in strace in a real qemu instance but I wasn''t able to reproduce it in a trivial test program. The result is that we lose the race that sigwait() is supposed to avoid. Hence my replacement of sigwait with the self-pipe trick. This is not to be confused with Anthony Ligouri''s helper thread, which is doing the same job as my SIGUSR2 handler. In Anthony''s setup there will be two helper threads (assuming no signalfd and no kernel aio): one invented by glibc to emulate aio, and one invented by Anthony to emulate signalfd. glibc will do the aio in its helper thread, raise SIGUSR2, which will wake up Anthony''s thread, which will write to the pipe and thus wake up the main program. Ah, plumbing. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-27 09:48 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 13/13] xen: pv domain builder."):> 02 -> 04 Make sense if you want to *test* the patches without merging > with upstream qemu before. For *merging* the patches I''d > suggest to merge from upstream qemu instead, then wait for > me rebasing the patchset (and dropping these patches).Yes.> 05 -> 06 backend core. > 07 switches the console to the new backend core. > 08 xenfb.c -> xen_framebuffer.c rename. > 09 switch framebuffer to new backend core.Thanks for the explanation.> 05 + 06 alone are not useful as nobody uses the new backend core then. > Merging 07 too gives at least one user. 08+09 going in gives a second > user. They are a nice cleanup and certainly make sense for > qemu-xen-unstable.That looks like I''ll want to take 05-07 and 09. As previously discussed, I don''t intend to take 08, at least at this stage, and I would very strongly prefer that 08 didn''t go upstream. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-27 09:53 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> Ok. I think we have three cases then: > > (1) xen emulation using xenner (default, with or without kvm). > (2) running on xen, with xend creating the domain, and > (3) running on xen, with qemu creating the domain. > > For (2) and (3) we need command line switches to enable these modes. > Specifying a domain ID makes sense in all three cases though. So I''d > keep the -xen-domid switch, like this: > > -xen-domid <nr> specify xen domain id. > -xen-attach attach to specified domain (created by xend). > -xen-create create the domain directly. > > Comments?That sounds reasonable. Can we please make sure that the usage message for -xen-create and -xen-attach makes it clear that a user should not supply these options on a Xen system using the Xen management toolstack ? (Which I think includes a system using xm and also a system using libvirt, although perhaps Daniel Berrange will correct me.) Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-27 09:56 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Wed, Aug 27, 2008 at 10:53:37AM +0100, Ian Jackson wrote:> Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > > Ok. I think we have three cases then: > > > > (1) xen emulation using xenner (default, with or without kvm). > > (2) running on xen, with xend creating the domain, and > > (3) running on xen, with qemu creating the domain. > > > > For (2) and (3) we need command line switches to enable these modes. > > Specifying a domain ID makes sense in all three cases though. So I''d > > keep the -xen-domid switch, like this: > > > > -xen-domid <nr> specify xen domain id. > > -xen-attach attach to specified domain (created by xend). > > -xen-create create the domain directly. > > > > Comments? > > That sounds reasonable. Can we please make sure that the usage > message for -xen-create and -xen-attach makes it clear that a user > should not supply these options on a Xen system using the Xen > management toolstack ? (Which I think includes a system using xm and > also a system using libvirt, although perhaps Daniel Berrange will > correct me.)There''s no requirement from libvirt itself - just whatever infrastructure libvirt is using. So just a message about XenD/xm would be sufficient. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-27 10:00 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):> There''s no requirement from libvirt itself - just whatever infrastructure > libvirt is using. So just a message about XenD/xm would be sufficient.Right, but I just wanted to avoid the situation where a naive user sees `not for xm/xend systems'' and thinks `that''s not me because I''m using libvirt''. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-27 10:23 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Wed, Aug 27, 2008 at 11:00:16AM +0100, Ian Jackson wrote:> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > > There''s no requirement from libvirt itself - just whatever infrastructure > > libvirt is using. So just a message about XenD/xm would be sufficient. > > Right, but I just wanted to avoid the situation where a naive user > sees `not for xm/xend systems'' and thinks `that''s not me because I''m > using libvirt''.They could be right in that thinking though - if using libvirt''s QEMU backend, instead of XenD backend, then it''d be fine to launch VMs manually. Only the presence of XenD places constraints on usage. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-27 10:37 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange wrote:> On Wed, Aug 27, 2008 at 11:00:16AM +0100, Ian Jackson wrote: >> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): >>> There''s no requirement from libvirt itself - just whatever infrastructure >>> libvirt is using. So just a message about XenD/xm would be sufficient. >> Right, but I just wanted to avoid the situation where a naive user >> sees `not for xm/xend systems'' and thinks `that''s not me because I''m >> using libvirt''. > > They could be right in that thinking though - if using libvirt''s QEMU > backend, instead of XenD backend, then it''d be fine to launch VMs > manually. Only the presence of XenD places constraints on usage.What constrains btw? I''d expect with libvirt managing domains via qemu -xen-create and xend managing domains via -xen-attach basically run into the same class of problems if the user starts booting domains manually. The most obvious issue would be is that domain IDs are in use by the user-started domains which the management stack thinks are unused and thus might be allocated for new domains. Likewise, both management stacks will find domains in xenstore they don''t know anything about. So I don''t see a fundamental difference between libvirt and xend here. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Daniel P. Berrange
2008-Aug-27 10:44 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
On Wed, Aug 27, 2008 at 12:37:42PM +0200, Gerd Hoffmann wrote:> Daniel P. Berrange wrote: > > On Wed, Aug 27, 2008 at 11:00:16AM +0100, Ian Jackson wrote: > >> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"): > >>> There''s no requirement from libvirt itself - just whatever infrastructure > >>> libvirt is using. So just a message about XenD/xm would be sufficient. > >> Right, but I just wanted to avoid the situation where a naive user > >> sees `not for xm/xend systems'' and thinks `that''s not me because I''m > >> using libvirt''. > > > > They could be right in that thinking though - if using libvirt''s QEMU > > backend, instead of XenD backend, then it''d be fine to launch VMs > > manually. Only the presence of XenD places constraints on usage. > > What constrains btw?If you start an Xen guest directly with QEMU, that''ll be seen by XenD and bad things will start to happen. At best, XenD will tear down your manually created domain if it sees it before you write enough info into xenstore.> I''d expect with libvirt managing domains via qemu -xen-create and xend > managing domains via -xen-attach basically run into the same class of > problems if the user starts booting domains manually. > > The most obvious issue would be is that domain IDs are in use by the > user-started domains which the management stack thinks are unused and > thus might be allocated for new domains. Likewise, both management > stacks will find domains in xenstore they don''t know anything about. > > So I don''t see a fundamental difference between libvirt and xend here.In the context of the QEMU driver in libvirt, we don''t currently care what the actual Xen hypervisor domain ID is - we make up our own domain ID space, and have no need for it to map to hypervisor IDs. Thus, externally created QEMU instances can use the HV too and won''t clash. In fact libvirt won''t even care if the VM is using the HV or not - it is just another QEMU process - all the xenstore/xen HV interaction is self contained inside QEMU. In the context of the Xen driver in libvirt, we''re going via XenD and don''t create VMs manually. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :| _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann
2008-Aug-27 11:01 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
Daniel P. Berrange wrote:>>> They could be right in that thinking though - if using libvirt''s QEMU >>> backend, instead of XenD backend, then it''d be fine to launch VMs >>> manually. Only the presence of XenD places constraints on usage. >> What constrains btw? > > If you start an Xen guest directly with QEMU, that''ll be seen by XenD > and bad things will start to happen. At best, XenD will tear down your > manually created domain if it sees it before you write enough info > into xenstore.Didn''t saw that happen.>> I''d expect with libvirt managing domains via qemu -xen-create and xend >> managing domains via -xen-attach basically run into the same class of >> problems if the user starts booting domains manually. >> >> The most obvious issue would be is that domain IDs are in use by the >> user-started domains which the management stack thinks are unused and >> thus might be allocated for new domains. Likewise, both management >> stacks will find domains in xenstore they don''t know anything about. >> >> So I don''t see a fundamental difference between libvirt and xend here. > > In the context of the QEMU driver in libvirt, we don''t currently care what > the actual Xen hypervisor domain ID is - we make up our own domain ID > space, and have no need for it to map to hypervisor IDs.They should be identical as that would be less confusing to the user. Having two ID spaces, both called domain ID, is a very bad idea IMHO.> In fact libvirt > won''t even care if the VM is using the HV or not - it is just another QEMU > process - all the xenstore/xen HV interaction is self contained inside > QEMU.Yep. But even in that case it would be good if libvirt and the guest os have the same idea about what the domain ID is. cheers, Gerd _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-27 16:35 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> Ian Jackson wrote: > > If you use aio, glibc has a tendency to emulate it with thread(s). > > The helper thread I was referring to in my commit message is the > > glibc-created helper thread. glibc appears (in strace) to do IO in > > this thread and then raises the signal you asked for (SIGUSR2 in > > qemu''s case). > > That observation implies Anthony''s helper thread will not always > receive the SIGUSR2 signals - Glibc''s thread may receive them instead.Yes. Err, I''m not sure what effect that will have in Anthony''s patch but it probably wouldn''t be good. As I say I haven''t managed to reproduce this other than in a full qemu-dm tree (from qemu-xen). So it may be that it is because of something I did or something the qemu-xen code does - although it seems hard to imagine how it could break the signal mask in the helper thread given that that thread runs entirely glibc code, except when a signal is being delivered (and of course the signal mask is saved and restored for signal delivery). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Aug-27 17:55 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > > Yes. Err, I''m not sure what effect that will have in Anthony''s patch > but it probably wouldn''t be good. > > As I say I haven''t managed to reproduce this other than in a full > qemu-dm tree (from qemu-xen). > > So it may be that it is because of something I did or something the > qemu-xen code does - although it seems hard to imagine how it could > break the signal mask in the helper thread given that that thread runs > entirely glibc code, except when a signal is being delivered (and of > course the signal mask is saved and restored for signal delivery). >Perhaps the signal is being consumed in one of qemu-dm''s other threads mistakenly? What you''re describing would seem like a serious bug. For a while, I was testing a version of QEMU/KVM that slept forever in select() and I never saw any instances of the signal not getting delivered. Regards, Anthony Liguori> Ian. > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2008-Aug-28 09:34 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):> A quick read of Glibc 2.3.1 source says its AIO code does not change > the signal mask when it creates an AIO handling thread.Thanks for investigating that.> (This might be a Glibc bug - threads you don''t know about shouldn''t be > receiving your signals, should they?)Yes, it''s a bug. (I wonder why my test program didn''t expose it.)> That means if it wasn''t masked when you issued the first AIO request, > it won''t be masked in that thread.In practice since we use aio this probably means we can''t rely on signal blocking except for signals which qemu keeps blocked `nearly all'' the time. So pipe-to-self in the signal handler is the way to go.> It''s quite annoying that POSIX threads doesn''t provide any way to say > "mask this signal in all other threads", which you often want to do > from a library.That too. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Anthony Liguori
2008-Sep-07 02:41 UTC
Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
Ian Jackson wrote:> Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."): > >> A quick read of Glibc 2.3.1 source says its AIO code does not change >> the signal mask when it creates an AIO handling thread. >> > > Thanks for investigating that. > > >> (This might be a Glibc bug - threads you don''t know about shouldn''t be >> receiving your signals, should they?) >> > > Yes, it''s a bug. (I wonder why my test program didn''t expose it.) > > >> That means if it wasn''t masked when you issued the first AIO request, >> it won''t be masked in that thread. >> > > In practice since we use aio this probably means we can''t rely on > signal blocking except for signals which qemu keeps blocked `nearly > all'' the time. So pipe-to-self in the signal handler is the way to > go. >In my signalfd patch, we block SIGUSR2 all the time so it shouldn''t be affected by this "bug". We use sigwait or signalfd to receive the signal. Regards, Anthony Liguori _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel