Sebastian Reitenbach
2007-Dec-20 15:35 UTC
[Xen-users] xen dom0 server freezes every one or two hours
Hi,

I switched to a Xen kernel on an HP DL 365 running openSUSE 10.3, x86_64. The HP DL 365 is an amd64-based machine.

I had to add a swiotlb=16 kernel parameter to grub.conf to get the kernel running at all.

I have 4 virtual hosts running on the machine, also openSUSE 10.3, x86_64.

I have two physical interfaces bonded together, with 10 VLAN interfaces on top of the bond0 interface. Each VLAN interface is connected to a bridge, one bridge per VLAN. Each of the four virtual machines has 10 eth interfaces, one connected to each of the bridges.

In addition, I have several phy: disks from the SAN handed over to the virtual machines.

Setup and testing of the hosts went fine, but under load the server just freezes after an hour or two.

From time to time I see messages like this in the dom0 /var/log/messages:

    blkback: ring-ref 4882, event-channel 15, protocol 1 (x86_64-abi)
    blkback: ring-ref 4883, event-channel 16, protocol 1 (x86_64-abi)

I don't know what they mean; as far as I could find out via googling, they must have something to do with networking.

Also, on startup of the dom0, I see the following in /var/log/boot.msg; I don't know whether this is a problem:

    Starting udevd done
    Loading required kernel modules
    doneActivating swap-devices in /etc/fstab...
    donemount: according to mtab, /dev/cciss/c0d0p1 is already mounted on /

    NOTE: scsidev is obsolete and the udev generated persistent device names
    under /dev/scsi/by-id/ (od by-path) should be used instead.
    Scanning SCSI devices and filling /dev/scsi/ done
    Activating device mapper...
    done
    Creating multipath targets:device-mapper: create ioctl failed: Device or resource busy
    device-mapper: create ioctl failed: Device or resource busy
    device-mapper: create ioctl failed: Device or resource busy
    device-mapper: create ioctl failed: Device or resource busy

The dom0 is started this way from GRUB:

    # Modified by YaST2. Last modification on Fri Dec 14 15:03:43 CET 2007
    default 0
    timeout 8
    ##YaST - activate

    ###Don't change this comment - YaST2 identifier: Original name: xen###
    title XEN -- openSUSE 10.3 - 2.6.22.13-0.3
        root (hd0,0)
        kernel /boot/xen.gz dom0_mem=390M
        module /boot/vmlinuz-2.6.22.13-0.3-xen root=/dev/disk/by-id/cciss-3600508b1001030343620202020200001-part1 vga=0x317 resume=/dev/cciss/c0d0p5 splash=silent showopts swiotlb=16
        module /boot/initrd-2.6.22.13-0.3-xen

uname -a

    Linux srv4 2.6.22.13-0.3-xen #1 SMP 2007/11/19 15:02:58 UTC x86_64 x86_64 x86_64 GNU/Linux

rpm -qa | grep xen

    xen-3.1.0_15042-51
    kernel-xen-2.6.22.13-0.3
    xen-doc-html-3.1.0_15042-51
    xen-libs-3.1.0_15042-51
    xen-tools-3.1.0_15042-51
    xen-doc-pdf-3.1.0_15042-51
    xen-tools-ioemu-3.1.0_15042-51

I also stopped powersaved, acpi and ntp (I found a thread where someone had a problem with clocks running backward and a dying Xen server), but so far the box still freezes without any notice in the logs.

Any idea what the problem could be, or where I should look further to figure out what causes the server to freeze?

kind regards
Sebastian

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
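As far as I know, the blkback lines quoted above are informational messages from the Xen block backend driver, logged when a guest's virtual disk connects, rather than network messages. A minimal sketch for counting them per guest ABI, so they can be compared against domU start times, might look like this. The function name `count_blkback` is made up for illustration; the log path is the one from the post.

```shell
#!/bin/sh
# Hypothetical helper: count blkback connect messages per guest ABI.
# Reads syslog-style text on stdin and prints "<count> <abi>" lines.
count_blkback() {
    # Keep only the text inside the trailing parentheses of blkback lines,
    # then count identical values and strip uniq's leading padding.
    sed -n 's/.*blkback: ring-ref .*protocol [0-9]* (\(.*\))$/\1/p' \
        | sort | uniq -c | sed 's/^ *//'
}

# Usage on the dom0 (path taken from the post above):
#   count_blkback < /var/log/messages
```

If the counts simply track how often guests were started, the messages are probably unrelated to the freezes.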
Igor Chubin
2007-Dec-21 11:04 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
On Thu, Dec 20, 2007 at 04:35:50 +0100, Sebastian Reitenbach wrote:
> from time to time I see messages like this in the dom0 /var/log/messages:
> blkback: ring-ref 4882, event-channel 15, protocol 1 (x86_64-abi)
> blkback: ring-ref 4883, event-channel 16, protocol 1 (x86_64-abi)
>
> Don't know, what the meaning of them is, as far as I could find out via
> googling, it must have sth. to do with networking.
> [...]

Hello,

one of my associates says he has a similar problem. His network hangs from time to time, and only when he uses a virtual network configuration with VLANs. He has tried to reproduce the error, but without success.

The error appears at random. So far we can't find the cause of the problem.

He also reported messages in the xend log file at about the same time as the network was hanging (approximately, as he said; he didn't manage to find the exact time):

    blkback: ring-ref 9, event-channel 5, protocol 1 (x86_32-abi)
    blkback: ring-ref 8, event-channel 4, protocol 1 (unspecified, assuming native)

--
WBR, i.m.chubin
Sebastian Reitenbach
2007-Dec-21 12:13 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Hi,

Igor Chubin <igor@chub.in> wrote:
> one of my associates have said that he has similar problem.
> Network hangs from time to time,
> and only in case when he uses virtual network configuration
> with VLANs.
> [...]
> Yes, he also reported that there were messages in the Xend log file
> in the same time as network was hanging (approximately same as he
> said; he didn't manage to find exact time):
>
> blkback: ring-ref 9, event-channel 5, protocol 1 (x86_32-abi)
> blkback: ring-ref 8, event-channel 4, protocol 1 (unspecified, assuming native)

At least I haven't noticed a hanging network yet. I just saw these messages and thought they might be related to the more serious freezes that I encounter every few hours. But I'll take a look in the Xen log files when these messages appear again. Thanks for pointing that out.

Sebastian
Dirk Hilmer
2007-Dec-21 14:21 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Sebastian Reitenbach wrote:
> Setup and test of the hosts went fine, but when under load, then after an
> hour or two, the server just freezes.
> [...]

Hi,

same problem here, and it can be reproduced. I use Gentoo 2007.0 with Xen 3.1.2 and kernel 2.6.22 (xen-sources) in 64-bit mode. The server is a dual Opteron 275 running in PV mode.

The dom0 freezes every time you generate high system load, for example by starting a boinc-client or doing big filesystem transfers:
-> Network hangs, SATA devices time out

Normally the system freezes every 2 hours.

I tried playing with the Xen version compatibility setting in the kernel, but that doesn't make a difference.

Because of the HDD timeouts I can't find anything in the logs...

kind regards

Dirk
Igor Chubin
2007-Dec-22 22:26 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
> same problem here and it can be reproduced. I use Gentoo 2007.0 with Xen
> 3.1.2 and kernel 2.6.22 (xen-sources) in 64bit mode.
> The Server is a Dual Opteron 275 running in PV mode.
> The Dom0 freezes every time if you generate system high-load, for example
> starting a boinc-client or doing big filesystem transfers.
> -> Network hangs, SATA Devices time out

Hello,

the problem I mentioned earlier is, as far as I remember, on a Gentoo system too. But there are no problems with the disks there, only with the network.

Maybe if they generate a heavy load on the system, the disk drives will hang too.

> Normally the system freezes every 2 hours.

In that case it happens much less often. The guys told me that it hangs every several days (but when it wants to, it can hang several times a day).

> I tried to play with the Xen version compatibility in the kernel, but that
> doesn't make a difference.
>
> Due to the HDD timeout I can't find anything in the logs...

Just a guess: could it be related to the Xen balloon driver?

Do you use dom0_mem as a parameter for the hypervisor?

--
WBR, i.m.chubin
Sebastian Reitenbach
2008-Jan-02 12:04 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Hi,

Igor Chubin <igor@chub.in> wrote:
> The problem I have mentioned earlier
> as far as I remember is on a Gentoo system too.
> But there are no problems with the disk.
> Only network.

I think this is the problem here too. Over Christmas I downloaded the openSUSE KOTD (Kernel of the Day), hoping that it might fix the problem. The dom0 was disconnected from the network, I had two domUs running, and I copied a 650 MB file between the two via scp a thousand times.

Two days ago, I connected the dom0 to the network again and started using the domUs as file/print/... servers again. It took about an hour, and the server was frozen again, without any notice in /var/log/messages.

I created a bug report; maybe you can add your observations there too.

http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1131

> Just a guess:
>
> it may not be related to Xen baloon driver?
>
> Do you use dom0_mem as a parameter for the hypervisor?

I use dom0_mem, yes, but the dom0 froze both with and without this parameter.

kind regards
Sebastian
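The repeated-copy test described above (a 650 MB file copied between two domUs via scp a thousand times) could be scripted roughly as follows. This is only a sketch: the function name `repeat_copy`, the `COPY` override, and the host and paths in the usage comment are placeholders, not details from the original test.

```shell
#!/bin/sh
# Sketch of a repeated-copy load test. All names below are placeholders.
# COPY can be overridden (e.g. COPY=echo) for a dry run; it defaults to scp.
repeat_copy() {
    src=$1
    dest=$2
    count=${3:-1000}
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop at the first failed copy and report how far we got.
        ${COPY:-scp} "$src" "$dest" || { echo "copy $i failed" >&2; return 1; }
        i=$((i + 1))
    done
    echo "completed $count copies"
}

# Example (hypothetical host and paths):
#   repeat_copy /tmp/testfile-650M root@domu2:/tmp/ 1000
```

Running it from a console that survives network loss makes it easier to see at which iteration a freeze hits.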
Stephan Seitz
2008-Jan-02 14:41 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
I had similar problems on one machine, which were solved by adding pci=routeirq to the kernel parameters at boot time.

I'm fairly sure that your problem is caused by other issues, but ... maybe it helps ;)

Regards,

Stephan

--
Stephan Seitz
Senior System Administrator

*netz-haut* e.K.
multimediale kommunikation

zweierweg 22
97074 würzburg

fon: +49 931 2876247
fax: +49 931 2876248

web: www.netz-haut.de <http://www.netz-haut.de/>

registriergericht: amtsgericht würzburg, hra 5054
Sebastian Reitenbach
2008-Jan-03 07:48 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Hi,

Stephan Seitz <s.seitz@netz-haut.de> wrote:
> I had similar problems on one machine which has been solved by adding
> pci=routeirq to the kernel parameters at boot time.
>
> I somewhat sure that your problem is caused by other issues, but ...
> maybe it helps ;)

Thanks for the tip. I tried it, but unfortunately it made things worse. I started the copy job over NFS again, and it only took 10 minutes until the server was frozen again.

Now I'll try to NFS-export something from the dom0, not from a domU as before, and start the copy job again, just to see what happens.

kind regards,
Sebastian
Igor Chubin
2008-Jan-03 22:17 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Hello Sebastian,

any news about your problem?

I recommended that my friend try pci=routeirq as Stephan advised, but without success :(
Nevertheless, thank you for the idea.

On Thu, Jan 03, 2008 at 08:48:45 +0100, Sebastian Reitenbach wrote:
> thanks for this tip, I tried, but unfortunately it made the things worse.
> I started the copyjob over NFS again, it only took 10 minutes, and the
> server was frozen again.
> Now I'll try to NFS export sth. from dom0, not from a domU as before, and
> start the copy job again, just to see what happens.
> [...]

--
WBR, i.m.chubin
Sebastian Reitenbach
2008-Jan-04 07:39 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Igor Chubin <igor@chub.in> wrote:
> Hello Sebastian,
>
> any news about your problem?
>
> I've recommended my friend to try pci=routeirq
> like Stephan advised, but without success :(
> Nevertheless, thank you for your idea

Yes, there is news. We ran a lot of tests, which took a bit of time.
During testing the server also froze without the Xen kernel, so Xen is
not the problem. It turned out that the server is very stable when I
disable network interface bonding.
I found a bug report in the Novell Bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=278475

Do you also have bonding enabled on your dom0's interfaces?

kind regards
Sebastian

> On Thu, Jan 03, 2008 at 08:48:45 +0100, Sebastian Reitenbach wrote:
> > Hi,
> >
> > Stephan Seitz <s.seitz@netz-haut.de> wrote:
> > > I had similar problems on one machine which has been solved by adding
> > > pci=routeirq to the kernel parameters at boot time.
> > >
> > > I somewhat sure that your problem is caused by other issues, but ...
> > > maybe it helps ;)
> > thanks for this tip, I tried, but unfortunately it made the things worse.
> > I started the copyjob over NFS again, it only took 10 minutes, and the
> > server was frozen again.
> > Now I'll try to NFS export sth. from dom0, not from a domU as before, and
> > start the copy job again, just to see what happens.
> >
> > kind regards,
> > Sebastian
> >
> > > Regards,
> > >
> > > Stephan
> > >
> > > Sebastian Reitenbach wrote:
> > > > Hi,
> > > > Igor Chubin <igor@chub.in> wrote:
> > > >>> same problem here and it can be reproduced. I use Gentoo 2007.0 with Xen
> > > >>> 3.1.2 and kernel 2.6.22 (xen-sources) in 64bit mode.
> > > >>> The Server is a Dual Opteron 275 running in PV mode.
> > > >>> The Dom0 freezes every time if you generate system high-load, for example
> > > >>> starting a boinc-client or doing big filesystem transfers.
> > > >>> -> Network hangs, SATA Devices time out
> > > >> The problem I have mentioned earlier
> > > >> as far as I remember is on a Gentoo system too.
> > > >> But there are no problems with the disk.
> > > >> Only network.
> > > >
> > > > I think this is the problem here too. Over Christmas I downloaded the
> > > > opensuse KOTD, hoping that it maybe fixes the problem. the dom0 was
> > > > disconnected from network, and I had two domU's running, and I copied a 650
> > > > MB file between these two via scp, for thousand times.
> > > >
> > > > Two days ago, I connected the dom0 to the network again, and started using
> > > > the domU's as file/print/... servers again.
> > > > It took about an hour, and the server was frozen again, without any notice
> > > > in /var/log/messages.
> > > >
> > > > I created a bugreport, maybe you can add your observations there too.
> > > >
> > > > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1131
> > > >
> > > >> May be if they try to generate big load on the system,
> > > >> disk drives will hang too.
> > > >>
> > > >>> Normally the system freezes every 2 hours.
> > > >> At that case much more seldom.
> > > >> Guys have said me that it hangs every several days
> > > >> (but if it wants to it can hang several times a day).
> > > >>
> > > >>> I tried to play with the Xen version compatibility in the kernel, but that
> > > >>> doesn't make a difference.
> > > >>>
> > > >>> Due to the HDD timeout I can't find anything in the logs...
> > > >>>
> > > >> Just a guess:
> > > >>
> > > >> it may not be related to Xen balloon driver?
> > > >>
> > > >> Do you use dom0_mem as a parameter for the hypervisor?
> > > > I use dom0_mem, yes, but with and without this parameter, in both cases the
> > > > dom0 froze.
> > > >
> > > > kind regards
> > > > Sebastian
>
> --
> WBR, i.m.chubin

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
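For anyone who wants to answer the bonding question on their own dom0, the check can be sketched as follows. This is a minimal sketch, assuming the Linux bonding driver is loaded and exposes its usual status files under /proc/net/bonding/; the function names and the `proc_dir` parameter are made up for this illustration and are not taken from the thread.

```python
from pathlib import Path


def parse_bonding_status(text):
    """Pull the bonding mode and the MII status of each slave out of the
    text of a /proc/net/bonding/<bond> file."""
    mode = None
    slaves = {}
    current = None  # slave section we are currently inside, if any
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            current = value
            slaves[current] = None
        elif key == "MII Status" and current is not None:
            # The first "MII Status" line belongs to the bond itself and is
            # skipped because no slave section has started yet.
            slaves[current] = value
    return {"mode": mode, "slaves": slaves}


def bonded_interfaces(proc_dir="/proc/net/bonding"):
    """Return {bond_name: status} for every bond found, or {} if the
    directory does not exist (bonding driver not loaded)."""
    root = Path(proc_dir)
    if not root.is_dir():
        return {}
    return {p.name: parse_bonding_status(p.read_text())
            for p in sorted(root.iterdir())}


if __name__ == "__main__":
    bonds = bonded_interfaces()
    if not bonds:
        print("no bonding devices found")
    for name, status in bonds.items():
        print(name, status["mode"], status["slaves"])
```

An empty result only says that no bond device is configured at the moment the script runs; it says nothing about what caused a freeze.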
Igor Chubin
2008-Jan-04 10:37 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
On Fri, Jan 04, 2008 at 08:39:47 +0100, Sebastian Reitenbach wrote:
> Igor Chubin <igor@chub.in> wrote:
> > Hello Sebastian,
> >
> > any news about your problem?
> >
> > I've recommended my friend to try pci=routeirq
> > like Stephan advised, but without success :(
> > Nevertheless, thank you for your idea
>
> Yes, I have, we have done a lot of tests, it took a bit of time.
> While testing, we got the server freeze without xen kernel at all, so xen is
> not the problem. It turned out, that the server is very stable, when I
> disable network interface bonding.
> I found a bug report in the novell bugzilla:
> https://bugzilla.novell.com/show_bug.cgi?id=278475

Congratulations!

> do you also have bonding enabled on your dom0's interfaces?

No, there is no bonding in that installation,
so something else must be causing the hangs there.

The hang appears only when using 8021q tagging, nothing else.

> kind regards
> Sebastian

--
WBR, i.m.chubin
Sebastian Reitenbach
2008-Jan-04 10:54 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Igor Chubin <igor@chub.in> wrote:
> On Fri, Jan 04, 2008 at 08:39:47 +0100, Sebastian Reitenbach wrote:
> > Yes, I have, we have done a lot of tests, it took a bit of time.
> > While testing, we got the server freeze without xen kernel at all, so xen is
> > not the problem. It turned out, that the server is very stable, when I
> > disable network interface bonding.
> > I found a bug report in the novell bugzilla:
> > https://bugzilla.novell.com/show_bug.cgi?id=278475
>
> Congratulations!
>
> > do you also have bonding enabled on your dom0's interfaces?
>
> No, there is no bonding in that installation.
> So there is something else that cause hanging.
>
> It appears only when using 8021q tagging, but nothing else.

When I tested without the Xen kernel but with bonded devices, I had VLANs on
top of the bonding device. For that test the bonding device itself carried the
IP address, so the VLANs on top of it were unused, and I assumed they could
not be the problem. The tests without the bond were also without the VLAN
interfaces on top. Maybe I should try again with bonding enabled but without
the VLAN interfaces.

Sebastian
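Since the question here is whether VLAN interfaces are stacked on the bond, a quick way to list them may help. This is a minimal sketch, assuming the 8021q module is loaded and publishes its table in /proc/net/vlan/config in the usual "device | id | parent" layout; the function names and the interface names in the usage note are hypothetical, as the thread does not give them.

```python
def parse_vlan_config(text):
    """Return (vlan_device, vlan_id, parent_device) tuples from the text of
    /proc/net/vlan/config. Header lines are skipped because they do not
    have three '|'-separated fields with a numeric VLAN ID."""
    entries = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and parts[1].isdigit():
            entries.append((parts[0], int(parts[1]), parts[2]))
    return entries


def vlans_on(parent, entries):
    """Names of the VLAN devices whose underlying device is `parent`."""
    return [name for name, _vid, dev in entries if dev == parent]


if __name__ == "__main__":
    try:
        with open("/proc/net/vlan/config") as fh:
            table = parse_vlan_config(fh.read())
    except FileNotFoundError:
        table = []  # 8021q module not loaded, so no VLAN devices exist
    print(vlans_on("bond0", table))
```

On a setup like the one described at the start of the thread, `vlans_on("bond0", table)` would be expected to list the VLAN devices stacked on bond0, and an empty list would confirm a test run without them.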
Sebastian Reitenbach
2008-Jan-04 13:14 UTC
Re: [Xen-users] xen dom0 server freezes every one or two hours
Sebastian Reitenbach <sebastia@l00-bugdead-prods.de> wrote:
> Igor Chubin <igor@chub.in> wrote:
> > > do you also have bonding enabled on your dom0's interfaces?
> >
> > No, there is no bonding in that installation.
> > So there is something else that cause hanging.
> >
> > It appears only when using 8021q tagging, but nothing else.
> when testing without xen kernel, but bonded devices, I had vlans on top of
> the bonding device. for testing, the bonding device had the IP address, so
> the vlan's on top of it, were unused, so I thought they could not be the
> problem. The tests without bond, were without the vlan interfaces on top.
> Maybe I should try again with bonding enabled but without the vlan
> interface.

Just for the record, as this is no longer Xen related: I did just that.
With two bonded interfaces and none of the VLAN interfaces up, I started
transferring files back and forth. It took 10 minutes and the server froze
again.

Sebastian
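The elimination reasoning in this thread can be written down as a small table of test runs. This is a minimal sketch under stated assumptions: the four runs are the ones Sebastian reports for his server, the kernel used in the last two runs is not stated in the thread and is therefore recorded as unknown (`None`), and the field names and the `consistent_factors` helper are invented for this illustration. Igor's friend's machine (no bonding, hangs only with 8021q tagging) is a different installation and is deliberately left out.

```python
# Each run records which factors were present and whether the server froze.
# None means the thread does not say.
RUNS = [
    {"label": "original dom0 setup",
     "xen": True, "bonding": True, "vlans": True, "froze": True},
    {"label": "non-Xen kernel, bond with unused VLANs on top",
     "xen": False, "bonding": True, "vlans": True, "froze": True},
    {"label": "bonding disabled, no VLAN interfaces",
     "xen": None, "bonding": False, "vlans": False, "froze": False},
    {"label": "bond only, VLAN interfaces down",
     "xen": None, "bonding": True, "vlans": False, "froze": True},
]


def consistent_factors(runs, factors=("xen", "bonding", "vlans")):
    """Return the factors that were present in every freezing run and
    absent in every stable run, ignoring runs where the factor is unknown."""
    result = []
    for factor in factors:
        known = [r for r in runs if r[factor] is not None]
        if known and all(r[factor] == r["froze"] for r in known):
            result.append(factor)
    return result


if __name__ == "__main__":
    print(consistent_factors(RUNS))
```

With these four runs only bonding tracks the freezes, which matches Sebastian's conclusion for his machine. It is a consistency check over a handful of observations, not proof of the root cause.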