Alexander Bienzeisler
2013-May-14 08:29 UTC
4.2.2 pci-passthrough crashes Dell Poweredge R710
Hello everyone,

I just updated from 4.2.1 to 4.2.2. If I try to fire up my win2k8 domU with a PCI device attached, the dom0 machine hard-crashes.

My system log (iDRAC) shows the following:

CPU 2 has an internal error (IERR).
A bus fatal error was detected on a component at bus 0 device 0 function 0.
CPU 1 machine check detected.

and plenty of other entries. The machine then hard-resets. If I leave the faulty machine down after a reboot, nothing like this happens.

xl info:
> host : susi-0
> release : 3.8.2-ipmi
> version : #4 SMP Mon Mar 11 12:54:31 CET 2013
> machine : x86_64
> nr_cpus : 12
> max_cpu_id : 31
> nr_nodes : 2
> cores_per_socket : 6
> threads_per_core : 1
> cpu_mhz : 3325
> hw_caps : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
> virt_caps : hvm
> total_memory : 98291
> free_memory : 62390
> sharing_freed_memory : 0
> sharing_used_memory : 0
> free_cpus : 0
> xen_major : 4
> xen_minor : 2
> xen_extra : .2
> xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler : credit
> xen_pagesize : 4096
> platform_params : virt_start=0xffff800000000000
> xen_changeset : unavailable
> xen_commandline : placeholder loglvl=all dom0_mem=2048M dom0_max_vcpus=2 com2=115200 console=com2,vga
> cc_compiler : gcc (Debian 4.4.5-8) 4.4.5
> cc_compile_by : root
> cc_compile_domain : wsk.tu-chemnitz.de
> cc_compile_date : Tue May 14 09:16:43 CEST 2013
> xend_config_format : 4
>>> On 14.05.13 at 10:29, Alexander Bienzeisler <chosi@amd.co.at> wrote:
> i just updated from 4.2.1 to 4.2.2. If i try to fire up my win2k8 domU
> with a pci device attached, the dom0 machine hardcrashes.
>
> my system log (idrac) shows the following:
>
> CPU 2 has an internal error (IERR).
> A bus fatal error was detected on a component at bus 0 device 0 function 0.
> CPU 1 machine check detected.

Machine checks and CPU internal errors aren't normally caused by software, but in most cases point at faulty hardware. In any case, if you're suspecting Xen, you'd need to provide full logs (hypervisor and kernel) covering the crash, with suitable debugging options enabled.

Jan
Konrad Rzeszutek Wilk
2013-May-15 00:31 UTC
Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
On Tue, May 14, 2013 at 10:29:27AM +0200, Alexander Bienzeisler wrote:
> Hello everyone,
>
> i just updated from 4.2.1 to 4.2.2. If i try to fire up my win2k8
> domU with a pci device attached, the dom0 machine hardcrashes.

How do you pass in the PCI device? I don't see the capability to do VT-d listed in your xl info. Does your machine do VT-d?
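To make Konrad's question concrete, here is a minimal Python sketch of how one might check posted `xl info` output for the passthrough capability. It assumes that Xen lists the token `hvm_directio` in `virt_caps` when an IOMMU (VT-d) is active; the helper names `parse_xl_info` and `has_directio` are hypothetical, and the input is only an excerpt of the output posted above.

```python
def parse_xl_info(text):
    """Parse 'key : value' lines from `xl info` style output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def has_directio(info):
    """True if virt_caps lists hvm_directio (assumed marker for an active IOMMU)."""
    return "hvm_directio" in info.get("virt_caps", "").split()


# Excerpt of the xl info output posted in this thread.
posted = """\
virt_caps              : hvm
xen_major              : 4
xen_minor              : 2
"""

info = parse_xl_info(posted)
print("passthrough capability listed:", has_directio(info))
```

With the posted output only `hvm` is listed, which is what prompted the question.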
Alexander Bienzeisler
2013-May-15 01:58 UTC
Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
Hello Konrad,

I pass it exactly the same way I did with 4.2.1. I have changed nothing. Right now I have reverted to 4.2.1 and I'm not seeing this anymore. The problem right now is that the hard crashes seem to have corrupted my HVM LV filesystems, and stuff is broken now. However, this is not related. I don't see this with 4.2.1.

I'm passing it through like this:

module options: xen-pciback.hide=(00:1a.0), which is a USB controller
win2k8.cfg (xl): pci = [ '00:1a.0' ]
xl pci-assignable-list output: 0000:00:1a.0

It worked flawlessly before and started breaking with 4.2.2. I'm actually pretty sure this is not a hardware issue, since I can't reproduce it with 4.2.1. I think I might lack the skills to provide proper logs and crash dumps of whatever sort. That's all I can tell you right now.

Cheers,
Alex
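The three addresses in Alex's message are written in two forms (`00:1a.0` and `0000:00:1a.0`). A small Python sketch, under the assumption that a missing PCI domain means `0000`, shows how the short form from the guest config can be compared against text shaped like the `xl pci-assignable-list` output. The function names are hypothetical and nothing here calls `xl` itself.

```python
import re

# [domain:]bus:device.function, all hexadecimal; the domain part is optional.
_BDF = re.compile(
    r"^(?:([0-9a-fA-F]{1,4}):)?([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])$"
)


def normalize_bdf(bdf):
    """Return a PCI address as dddd:bb:dd.f, defaulting the domain to 0000."""
    m = _BDF.match(bdf.strip())
    if m is None:
        raise ValueError(f"not a PCI BDF: {bdf!r}")
    domain, bus, dev, func = m.groups()
    return f"{int(domain or '0', 16):04x}:{int(bus, 16):02x}:{int(dev, 16):02x}.{func}"


def is_assignable(cfg_bdf, assignable_list_output):
    """Check whether cfg_bdf appears in text shaped like `xl pci-assignable-list` output."""
    wanted = normalize_bdf(cfg_bdf)
    listed = {
        normalize_bdf(line)
        for line in assignable_list_output.splitlines()
        if line.strip()
    }
    return wanted in listed


# Values taken from the message above.
print(is_assignable("00:1a.0", "0000:00:1a.0\n"))
```

For the values in this thread the two forms refer to the same device, so the configuration itself looks consistent.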
Pasi Kärkkäinen
2013-May-15 06:32 UTC
Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
On Wed, May 15, 2013 at 03:58:19AM +0200, Alexander Bienzeisler wrote:
> it worked flawlessly before and started breaking with 4.2.2 - i'm
> actually pretty sure this is not a hardware issue, since i can't
> reproduce it with 4.2.1. I think i might lack the skills to provide
> proper logs and crashdumps of whatever sort. That's all i can tell
> you right now.

The subject says you're using a Dell R710 server, so you should enable SOL (Serial Over LAN) on the iDRAC management processor. Xen will see the SOL device as a serial port. So configure the Xen hypervisor to log everything to the serial port, and configure the dom0 Linux to log to the Xen hypervisor.

Then ssh to the iDRAC and connect to the SOL console (iDRAC6 command: "console com2"). Make sure your ssh client has a big enough scrollback buffer (or redirect the ssh output to a file), so you can capture a full log with all the boot-time and crash-time output from both Xen and dom0 Linux. Power on (or restart) your R710 server and you're ready to reproduce the crash.

See: http://wiki.xen.org/wiki/Xen_Serial_Console

So you probably need to add these options to grub (modify to match your iDRAC/BIOS settings):

for xen.gz: loglvl=all guest_loglvl=all com2=115200,8n1 console=com2 sync_console lapic=debug apic_verbosity=debug apic=debug iommu=verbose
for vmlinuz: earlyprintk=xen console=hvc0 initcall_debug debug loglevel=10

Hopefully that helps.

-- Pasi
>>> On 15.05.13 at 08:32, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> for xen.gz: loglvl=all guest_loglvl=all com2=115200,8n1 console=com2
> sync_console lapic=debug apic_verbosity=debug apic=debug iommu=verbose

That last element should be "iommu=debug", which on 4.2.2 and -unstable implies verbose.

Jan
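Putting Pasi's suggested options together with Jan's correction, the following Python sketch assembles the two boot lines. It is only an illustration: the option values are copied from the messages above, the `set_option` helper is hypothetical, and the com2/baud settings still have to match the iDRAC/BIOS configuration of the machine in question.

```python
# Options as suggested by Pasi in this thread (before Jan's correction).
XEN_OPTS = [
    "loglvl=all", "guest_loglvl=all", "com2=115200,8n1", "console=com2",
    "sync_console", "lapic=debug", "apic_verbosity=debug", "apic=debug",
    "iommu=verbose",
]
DOM0_OPTS = ["earlyprintk=xen", "console=hvc0", "initcall_debug", "debug", "loglevel=10"]


def set_option(opts, key, value):
    """Return a copy of opts with key=value set, replacing any existing key=... entry."""
    result = []
    replaced = False
    for opt in opts:
        name = opt.split("=", 1)[0]
        if name == key:
            result.append(f"{key}={value}")
            replaced = True
        else:
            result.append(opt)
    if not replaced:
        result.append(f"{key}={value}")
    return result


# Apply Jan's correction: iommu=debug instead of iommu=verbose.
xen_line = " ".join(set_option(XEN_OPTS, "iommu", "debug"))
dom0_line = " ".join(DOM0_OPTS)

print("xen.gz: ", xen_line)
print("vmlinuz:", dom0_line)
```

The printed strings are meant to be appended to the existing xen.gz and vmlinuz lines of the grub entry, not to replace options such as dom0_mem that are already there.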