Kuba
2014-Jan-30 13:50 UTC
Strange failures of Xen 4.3.1, PVHVM storage VM, iSCSI and Windows+GPLPV VM combination
Dear List,

I am trying to set up the following configuration:

1. a very simple Linux-based dom0 (Debian 7.3) with Xen 4.3.1 compiled from sources,
2. one storage VM (FreeBSD 10, HVM+PV) with a SATA controller attached using VT-d, exporting block devices via iSCSI to other VMs and physical machines,
3. one Windows 7 SP1 64-bit VM (HVM+GPLPV) with GPU passthrough (Quadro 4000), installed on a block device exported from the storage VM (target on the storage VM, initiator on dom0).

Everything works perfectly (including PCI and GPU passthrough) until I install the GPLPV drivers on the Windows VM. After driver installation, Windows needs to reboot, boots fine, displays a message that PV SCSI drivers were installed and that it needs to reboot again, and then cannot boot. Sometimes it gets stuck at "booting from harddrive" in SeaBIOS, sometimes it BSODs with an "unmountable boot volume" message.

I tried all of the following without GPU passthrough to narrow down the problem. The intriguing part is this:

1. If the storage VM's OS is Linux, it fails with the above symptoms.
2. If the block devices for the storage VM come directly from dom0 (not via PCI passthrough), it fails.
3. If the storage VM is an HVM without PV drivers (e.g. FreeBSD 9.2-GENERIC), it all works.
4. If the storage VM's OS is Linux with a kernel compiled without Xen guest support, it works, but is unstable (see below).
5. If the iSCSI target is on a different physical machine, it all works.
6. If the iSCSI target is on dom0 itself, it works.
7. If I attach the AHCI controller to the Windows VM and install directly on the hard drive, it works.
8. If the block device for the Windows VM is a disk, partition, file, LVM volume or even a ZoL zvol (and it comes from dom0 itself, without iSCSI), it works.

If I install Windows and the GPLPV drivers on a hard drive attached to dom0, Windows + GPLPV work perfectly.
If I then give the same hard drive as a block device to the storage VM and re-export it through iSCSI, Windows usually boots fine, but is unstable. By unstable I mean random read/write errors, programs that sometimes won't start, ntdll.dll crashes, and after a couple of reboots Windows won't boot (just as described above).

The configuration I would like to achieve makes sense only with PV drivers on both the storage VM and the Windows VM. All of the "components" seem to work perfectly until they are all put together, so I am not really sure where the problem is.

I would be very grateful for any suggestions or ideas that could help to narrow down the problem. Maybe I am just doing something wrong (I hope so). Or maybe there is a bug that shows itself only in such a particular configuration (hope not)?

Yours faithfully,
Kuba

Additional info:

I tried using a couple of kernels for dom0 (3.2.51, 3.10.x, 3.12.x, packaged and self-compiled).

I tried using different versions of open-iscsi and of the iSCSI target, no luck.

I tried different GPLPV drivers:

  gplpv_Vista2008x64_0.11.0.372.msi from univention.de
  gplpv_Vista2008x64_0.11.0.372.msi from meadowcourt.org
  gplpv_Vista2008x64_0.11.0.418.msi from meadowcourt.org
  gplpv_Vista2008x64_1.0.1089.msi from www.ejbdigital.com.au
  gplpv_Vista2008x64_1.0.1092.9.msi from www.ejbdigital.com.au

I am working mainly on Xen 4.3.1, but also tried Xen 4.2.3 and 4.4-RC2 with no noticeable change in behaviour.

I tried using qemu-traditional for both the storage VM and the Windows VM, no luck.
I reinstalled Windows more times than during the XP and 98SE days combined :)

Typical Storage VM config file:

  name='fbsd'
  builder='hvm'
  vcpus=8
  memory=8192
  disk=[ 'phy:/dev/sdc,xvda,w', 'file:/mnt/fbsd.iso,hdc,r,devtype=cdrom' ]
  vif=[ 'bridge=xenbr0,mac=00:16:3e:14:b1:1a' ]
  boot='c'
  pae=1
  nx=1
  videoram=16
  stdvga=1
  sdl=0
  vnc=1
  vnclisten="0.0.0.0"
  usb=1
  usbdevice="tablet"
  localtime=1
  xen_platform_pci=1

Typical Windows VM config file:

  name='win'
  builder='hvm'
  vcpus=8
  memory=2048
  disk=[ 'phy:/dev/sdd,hda,w', 'file:/mnt/win.iso,hdc,r,devtype=cdrom' ]
  vif=[ 'bridge=xenbr0,mac=00:16:3e:14:b1:fc' ]
  boot='c'
  pae=1
  nx=1
  videoram=16
  stdvga=1
  sdl=0
  vnc=1
  vnclisten="0.0.0.0"
  usb=1
  usbdevice="tablet"
  localtime=1
  xen_platform_pci=1
  viridian=1

Hardware: Intel S1200BTS, Xeon E3-1230, 16GB ECC RAM.

xl info with dom0 only:

  host                   : ws
  release                : 3.10.28custom1
  version                : #1 SMP Wed Jan 29 20:53:38 CET 2014
  machine                : x86_64
  nr_cpus                : 8
  max_cpu_id             : 127
  nr_nodes               : 1
  cores_per_socket       : 4
  threads_per_core       : 2
  cpu_mhz                : 3192
  hw_caps                : bfebfbff:28100800:00000000:00003f00:17bae3ff:00000000:00000001:00000000
  virt_caps              : hvm hvm_directio
  total_memory           : 16108
  free_memory            : 13843
  sharing_freed_memory   : 0
  sharing_used_memory    : 0
  outstanding_claims     : 0
  free_cpus              : 0
  xen_major              : 4
  xen_minor              : 3
  xen_extra              : .1
  xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
  xen_scheduler          : credit
  xen_pagesize           : 4096
  platform_params        : virt_start=0xffff800000000000
  xen_changeset          :
  xen_commandline        : placeholder dom0_mem=2G
  cc_compiler            : gcc (Debian 4.7.2-5) 4.7.2
  cc_compile_by          : root
  cc_compile_domain      : localdomain
  cc_compile_date        : Wed Jan 29 21:53:06 CET 2014
  xend_config_format     : 4
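
The random read/write errors described earlier could perhaps be quantified with a write-then-verify test run on the iSCSI-backed disk. What follows is only a minimal sketch in Python and was not part of the tests listed above; the function names, the block size and any file path passed in are hypothetical choices. It writes a deterministic pattern to a file and later reports which blocks no longer match. A read-back done right after the write may be served from the guest's cache, so the sketch assumes the VM is rebooted (or caches are otherwise dropped) between the two steps.

```python
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice for this sketch)


def make_block(index):
    """Return a deterministic block of BLOCK_SIZE bytes derived from its index."""
    seed = hashlib.sha256(str(index).encode()).digest()  # 32 bytes
    return seed * (BLOCK_SIZE // len(seed))


def write_pattern(path, blocks):
    """Write `blocks` deterministic blocks to `path` and force them to disk."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(make_block(i))
        f.flush()
        os.fsync(f.fileno())


def find_bad_blocks(path, blocks):
    """Re-read `path` and return the indices of blocks that do not match."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(BLOCK_SIZE) != make_block(i):
                bad.append(i)
    return bad
```

For example, calling write_pattern("testfile.bin", 1024) before a reboot and find_bad_blocks("testfile.bin", 1024) after it would return an empty list on a healthy disk; a non-empty list would show which 1 MiB regions were corrupted, and running the same test in the working configurations would give a baseline for comparison.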