I totally agree with Luc Boudreau about this problem.

Modern branded servers ALL use Broadcom chipsets... I must conclude that either:

a) a lot of people are experiencing the problem but can tolerate it

or

b) someone has already experienced this problem and solved it

Let's approach the problem pragmatically... just start with a survey on the list. ... hope we solve this issue soon.

Ivan

HW            Driver  Version  Adapter   TX-DRP  Kernel
HP DL380 G3   tg3     3.49     BCM5703X  No      2.6.16-xen
DELL PE1950   bnx2    1.5.10c  BCM5708   Yes     2.6.18-xen
HP DL380 G5   bnx2    any      BCM5708   Yes

P.S. For people who may be interested in how to collect the data:

Driver:          lsmod | grep tg3
                 lsmod | grep bnx2

Note: the driver you are using is the one that produces some output :) (assuming you have not compiled it statically into your kernel)

Driver version:  modinfo <type> | grep version
Adapter:         lspci | grep BCM
Packet drop?     netstat -i (TX-DRP column)
Kernel:          uname -r

so this post is not totally useless :)

--
http://www.bio.dist.unige.it
voice: +39 010 353 2789
fax: +39 010 353 2948

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
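The "Packet drop?" check in the survey above can be sketched as a small shell helper. This is only an illustration, not something posted to the list: the function name tx_drp is made up here, and it assumes the net-tools `netstat -i` layout in which a header line starting with "Iface" names the columns. It looks the column up by name rather than by position, since the column count differs between net-tools versions.

```shell
#!/bin/sh
# Hypothetical helper: print the TX-DRP counter of one interface.
# Reads `netstat -i` output on stdin; $1 is the interface name.
tx_drp() {
    awk -v ifc="$1" '
        # Header line: remember which column is labelled TX-DRP.
        $1 == "Iface" {
            for (i = 1; i <= NF; i++) if ($i == "TX-DRP") col = i
            next
        }
        # Data line for the requested interface: print that column.
        col && $1 == ifc { print $col }
    '
}
```

Typical use would be `netstat -i | tx_drp peth0`. A value that keeps growing between two runs is what counts as "Yes" in the TX-DRP column of the survey table.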
Ivan Porro wrote:
> I totally agree with Luc Boudreau about this problem.
>
> On modern branded server they ALL use Broadcom chipset... I must admit
> that:
>
> a) lot of people are experiencing the problem but can tolerate it
>
> or
>
> b) someone already experienced this problem and solved it

Too bad you started a new topic without even trying to describe what your problem with Broadcom chipsets is.

I'm running Xen 3.1 (2.6.18 kernel), compiled from sources, on an HP ProLiant DL320 G3. It uses tg3 for its network cards.

Works fine.

06:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
06:01.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)

--
Tomasz Chmielewski
http://lists.wpkg.org
Here is my part:

HW            Driver  Version  Adapter   TX-DRP  Kernel
--            ------  -------  -------   ------  ------
HP ML350 G5   bnx2    1.5.10c  BCM5708   Yes     2.6.18-8.1.14.el5xen

______________________________________________________

Luc Boudreau
Registrariat, Université de Montréal
Tomasz Chmielewski wrote:
> Too bad you started a new topic without even trying to describe what
> your problem with broadcom chipsets is?
>
> I'm running Xen 3.1 (2.6.18 kernel), compiled from sources, on HP
> ProLiant DL320 G3. It uses tg3 for its network cards.
>
> Works fine.

Tomasz,

my post was based on earlier posts by me and others. I changed the subject for readability, and maybe I left out an introductory statement such as:

I'm one of the users who is in trouble with Xen on a server with Broadcom-based networking. Please, if you are experiencing heavy packet loss, interrupted ssh or terminal server sessions, or intermittent networking, participate in this survey.

Otherwise, if you have a Broadcom and it works fine (are you sure? :) ), please participate too, so we may isolate the problem.

Thank you,

Ivan

--
http://www.bio.dist.unige.it
voice: +39 010 353 2789
fax: +39 010 353 2948
On Tue, Nov 13, 2007 at 04:33:03PM +0100, Ivan Porro wrote:
> I totally agree with Luc Boudreau about this problem.
>
> On modern branded server they ALL use Broadcom chipset... I must admit that:
>
> a) lot of people are experiencing the problem but can tolerate it
>
> or
>
> b) someone already experienced this problem and solved it
>
> Lets' approach the problem pragmatically... just start with a survey on
> the list. ... hope we solve this issue soon.

What problem are you talking about?

Are you running the latest firmware? The only problem I'm aware of is that with old firmware the network will die after a while..

-- Pasi
For my part, the problem is packet drops when transferring large files (approx. 2 GB). Which firmware are you asking about? The NIC's, or the "server hardware" in general?

______________________________________________________

Luc Boudreau
Registrariat, Université de Montréal
Pezza
2007-Nov-13 18:56 UTC
[Xen-users] Additional details (was Re: XEN - Broadcom issue: survey)
Hey there,

I'm happy to see that something is moving on this issue... What follows is a summary of where the problem stands.

The problem
-----------
DomU packets get dropped when transferring large quantities of data over high-speed links (gigabit for me; I don't know if it's the same on 100 Mbit). This causes various problems: interrupted FTP sessions, interrupted SSH sessions, corrupted Samba shares, etc.

The configuration
-----------------
Mine is:
* server: HP ProLiant DL380 G5 (6 GB RAM, 2 dual-core CPUs)
* Xen: Xen 3.1 compiled from stable source on x86, PAE enabled
* Dom0: CentOS 5
* DomU: Win2k3
* ethernet card: Broadcom NetXtreme II 5708
* Xen networking configuration: tried all the standard ones (basically, bridge and route); also tried to assign an IP to the bridge instead of using peth0

Actions taken (with no success...)
----------------------------------
* recompiled the kernel with the latest (1.5.10c) drivers from Broadcom
* disabled the managed mode via uxdiag (as described somewhere on this list)
* played a bit with the "txqueuelen" parameter on the VIFxxx, xenbr and tap0 interfaces (usually you should have a value of around 1000 for this on "normal" ethernet interfaces, while the VIFs are showing "strange" low values)
* played a bit with ethtool to disable SG and TSO

Oddities
--------
I also have another server (a Dell machine) equipped with the same network interface: on this machine I run XenExpress (the one running Xen 3.0.4). I tried to run the exact same DomU I use for testing on the HP (HVM, no PV drivers), and... it works fine!

Differences I could find:
* some sysctl differences in ipv4/conf/all --> tried to migrate them to the HP with no luck
* different kernel version (Dell is running 2.6.16.38-xs3.2.0.531.3960xen, while HP is running 2.6.18 manually compiled during the Xen rebuild)
* different Broadcom drivers: Dell is running 1.5.1c, HP 1.5.10c (latest)
* different Broadcom firmware: Dell is running 2.9.1, HP 1.9.3

Now, leaving aside the hard-to-determine differences (changes in the kernel added by the XenExpress package? other "mysterious" configuration differences?), the only serious thing I'd like to try is updating the HP firmware version. Does anybody know how to do this?

The only thing I was able to find out during my testing is that raising the value of txqueuelen on the VIFxxx interface and/or disabling the SG and TSO flags on the card somehow "lowers" the problem (I can see fewer packets dropped); it's hard to quantify this, but maybe it's important...

Any other ideas? I have other HP servers I'd like to use in our lab here to virtualize workstations, and everything is blocked by this bug... you can imagine how badly I need to fix this ;-)

Thanks,

M.

PS: Some references for similar issues I've found around:
* http://www.nabble.com/Data-broken-during-FTP-test-tf3432035.html#a9567619
* http://www.nabble.com/Data-broken-during-FTP-test-tf3598590.html#a10051073

--
View this message in context: http://www.nabble.com/XEN---Broadcom-issue%3A-survey-tf4798603.html#a13732226
Sent from the Xen - User mailing list archive at Nabble.com.
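For anyone wanting to repeat the two mitigations described in the message above, here is a minimal sketch. Everything in it is an assumption rather than a record of what was actually run: the function name tuning_cmds is hypothetical, the offload flags are taken to mean scatter-gather and TCP segmentation offload (`sg` and `tso` in ethtool's terms), and the value 1000 is the "normal" figure quoted in the message. The function only prints the commands so they can be reviewed before anything touches a live interface.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print, but do not execute, the tuning commands.
# $1 = physical NIC (e.g. peth0); remaining args = vif/tap/bridge interfaces.
tuning_cmds() {
    nic=$1
    shift
    # Turn off scatter-gather and TCP segmentation offload on the physical NIC.
    echo "ethtool -K $nic sg off tso off"
    for ifc in "$@"; do
        # Raise the transmit queue length to the value usual for ethernet.
        echo "ifconfig $ifc txqueuelen 1000"
    done
}
```

For example, `tuning_cmds peth0 vif1.0 tap0` prints three commands; piping that output to `sh` as root would apply them. The interface names are placeholders and will differ per host. As noted in the message, this only reduced the drops and did not eliminate them.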
On Tue, Nov 13, 2007 at 11:40:09AM -0500, Boudreau Luc wrote:
> For my part, the problem is packet drops when transferring large files (approx 2 gb ). Which firmware are you inquiring about ? NIC or 'server hardware' in general ??

NIC firmware. Broadcom NICs are known to ship with bad firmware in many servers.. IBM and Dell at least, maybe HP too.. upgrading to new (bugfixed) firmware solved these problems.

Then again, I don't know if this is your problem.. it might very well be..

-- Pasi
Hey,

please note that if you're using the tg3 driver, it means you're using NetXtreme, *not* NetXtreme II (the latter uses the 5708/5706 chipsets, the former some earlier ones). The problem has been observed (but I'd like some confirmation here ;-)) with the bnx2 driver, which is specific to NetXtreme II.

To check which driver/version/firmware you're using, you can run 'ethtool -i eth0' (replace eth0 with the ethernet device you want to show).

In my case, the "bad" machine shows:

[root@sofia ~]# ethtool -i peth0
driver: bnx2
version: 1.5.10c
firmware-version: 1.9.3
bus-info: 0000:03:00.0

while the "good" machine shows:

[root@camogli scripts]# ethtool -i eth0
driver: bnx2
version: 1.5.1c
firmware-version: 2.9.1
bus-info: 0000:08:00.0

As I said before, the only relevant difference I can find between the 2 machines is the eth0 firmware version, but I don't know if and how I can update it on the other one (see my previous message for the complete details of the good and bad machine configurations).

M.

--
View this message in context: http://www.nabble.com/XEN---Broadcom-issue%3A-survey-tf4798603.html#a13735207
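To compare machines without reading the `ethtool -i` output by eye, a single field can be pulled out with awk. This is a sketch only: the function name ethtool_field is made up for illustration, and it assumes the "key: value" layout shown in the two listings above.

```shell
#!/bin/sh
# Hypothetical helper: print one field from `ethtool -i` output on stdin.
# $1 is the field name, e.g. "driver", "version" or "firmware-version".
ethtool_field() {
    # Split on ": " so the colons inside bus-info values stay intact.
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}
```

Running `ethtool -i peth0 | ethtool_field firmware-version` on the "bad" machine above should print 1.9.3, and the same call against eth0 on the "good" one 2.9.1. That makes it easy to collect the driver and firmware pair from every host taking part in the survey.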
On Tue, Nov 13, 2007 at 01:39:27PM -0800, Pezza wrote:
> As I said before, the only relevant difference I can find between the 2
> machines is the eth0 firmware version, but I don't now if and how I can
> update it on the other one (ref to my previous message for the complete
> details on the good and bad machine configuration).

My IBM blades have tg3 NICs, and out of the box they had broken firmware.. the network would die about 10 seconds after starting Xen network bridging..

IBM has a firmware upgrade CD (.iso) available, which you can use to upgrade the NIC firmware.

I'm sure other vendors have their own updates available too.

-- Pasi
Sorry to provide the counter-example, but I have:

~# ethtool -i peth0
driver: bnx2
version: 1.4.51b
firmware-version: 2.9.1
bus-info: 0000:04:00.0

This driver is the one released by DELL on Nov. 11, and I have packet drops. The same happens with driver 1.5.1c from Broadcom.

The strange thing is that TX-DRP is 0 until I start a domain. As soon as I start an HVM domain, TX-DRP for its vif starts increasing.

If I start a PV machine, it drops only 20 packets and then stops. No more TX-DRP at all...

So, to isolate the problem: it is related to HVM machines only. At least in my case.

Ivan

--
http://www.bio.dist.unige.it
voice: +39 010 353 2789
fax: +39 010 353 2948
> this driver is the one released by DELL on Nov. 11, and I have packet drop.
> The same with driver 1.5.1c from broadcom
>
> The strange thing is that TX-DRP is 0 until I start a domain. Since I start
> an HVM, TX-DRP for its vif start increasing.
>
> If I start an PV machine, it drop only 20 packets and then stop. No more
> TX-DRP at all...

Yes, that's normal and expected behaviour. If you try to send a packet over a PV network interface to a domain which hasn't loaded the PV network driver, the packet will be dropped and the TX-DRP statistic will be incremented. PV domains will therefore see a small number of packets dropped at boot time, while HVM domains will see ongoing packet drops for as long as the domain remains alive. For an HVM domain, you probably want to look at the tap device rather than the vif.

Steven.
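Following the suggestion above to look at the tap device rather than the vif, the two counters can be listed side by side. This is a minimal sketch with assumptions: the function name guest_if_drops is invented for illustration, interfaces are assumed to be named vif* and tap* as elsewhere in this thread, and the input is assumed to be net-tools `netstat -i` output with its "Iface" header line.

```shell
#!/bin/sh
# Hypothetical helper: list TX-DRP for every vif*/tap* interface.
# Reads `netstat -i` output on stdin, prints "<interface> <TX-DRP>" per line.
guest_if_drops() {
    awk '
        # Header line: locate the TX-DRP column by name.
        $1 == "Iface" {
            for (i = 1; i <= NF; i++) if ($i == "TX-DRP") col = i
            next
        }
        # Guest-facing interfaces only: PV backends (vif) and HVM taps (tap).
        col && $1 ~ /^(vif|tap)/ { print $1, $col }
    '
}
```

Running `netstat -i | guest_if_drops` twice, a minute apart, during a large transfer shows which counter is moving. For an HVM guest without PV drivers, a growing vif counter alongside a steady tap counter would match the expected behaviour described above and would not by itself point at the NIC.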
On Wed, Nov 14, 2007 at 03:17:07PM +0000, Steven Smith wrote:
> Yes, that's normal and expected behaviour. If you try and send a
> packet over a PV network interface to a domain which hasn't loaded the
> PV network driver, the packet will be dropped and the TX-DRP statistic
> will be incremented. PV domains will therefore see a small number of
> packets dropped at boot time, while HVM domains will see ongoing
> packet drop for as long as the domain remains alive. For an HVM
> domain, you probably want to look at the tap device rather than the
> vif.

Steps to go through to debug:

1. Make sure networking is OK on the host without running Xen at all (just running the normal non-Xen kernel).

2. Make sure networking is OK in dom0 when running the Xen kernel with NO domU running.

3. Make sure networking is OK in a paravirt domU.

4. Make sure networking is OK in an HVM domU, and make sure you're running paravirt drivers in the HVM domU (this is important!!). Without PV drivers, the performance of an HVM domU will be poor and you WILL get packet drops.

With those tests done, you should be able to isolate the problem quite easily.. and see whether it's related to the NIC/driver, to Xen, or to the domU driver.

-- Pasi
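The four steps above do not say how to decide that networking "is OK". One simple option, offered here purely as an assumption and not as part of the original advice, is to run the same ping at each stage and record the loss percentage. The function name ping_loss is hypothetical; it parses the summary line printed by the common Linux ping, which contains the text "% packet loss".

```shell
#!/bin/sh
# Hypothetical helper: extract the loss percentage from ping output on stdin.
# Prints the bare number found in front of "% packet loss".
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

For instance, `ping -c 100 192.0.2.1 | ping_loss` (the address is a placeholder for a host on the same gigabit segment) could be run on the bare host, in dom0, in a PV domU and in an HVM domU, noting the figure each time. Plain ping is a light load, so a large file transfer like the ones reported in this thread remains the more telling test.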
Ivan,

I don't know how to replicate the test on my machine (I use the open-source Xen 3.1, so I don't think I can run a Windows PV guest on it, and I don't know if it also applies to Linux VMs), but I wouldn't be surprised if this issue were related to non-PV guests.

I'm surprised to see that you're running the same firmware version as my "good" machine and still have the problem: at this point, I really don't know what else we could try.

The only other thing is that I didn't try with Xen 3.0.4 (the version running on my XenExpress machine): maybe 3.1 has a flaw with these network cards? Or maybe there's a "hidden fix" in the kernel supplied with XenExpress?

M.

--
View this message in context: http://www.nabble.com/XEN---Broadcom-issue%3A-survey-tf4798603.html#a13749943
Just to keep you informed, I'm downloading the HP firmware update CD to test the firmware update on my server. Maybe this will fix my "Broadcom issue". The problem is, 650 MB at 9k/s takes a while... I'll post to the list once done.

______________________________________________________
Luc Boudreau
Registrariat, Université de Montréal

-----Original Message-----
From: xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] On behalf of Pezza
Sent: 14 November 2007 11:03
To: xen-users@lists.xensource.com
Subject: Re: [Xen-users] XEN - Broadcom issue: survey

Ivan,

I don't know how to replicate the test on my machine (I use open-source Xen 3.1, so I don't think I can run a Windows PV guest on it, and I don't know whether it also applies to Linux VMs), but I wouldn't be surprised if this issue were related to non-PV guests.

I'm surprised to see that you're running the same firmware version as my "good" machine and still have the problem: at this point, I really don't know what else we could try.

The only other thing is that I didn't try with Xen 3.0.4 (the version running on my XenExpress machine): maybe 3.1 has a flaw with these network cards? Or maybe there's a "hidden fix" in the XenExpress-supplied kernel?

M.

Ivan Porro wrote:
> Sorry to make the counter-example but I have:
>
> ~# ethtool -i peth0
> driver: bnx2
> version: 1.4.51b
> firmware-version: 2.9.1
> bus-info: 0000:04:00.0
>
> This driver is the one released by DELL on Nov. 11, and I have packet
> drops. The same with driver 1.5.1c from Broadcom.
>
> The strange thing is that TX-DRP is 0 until I start a domain. As soon as I
> start an HVM, TX-DRP for its vif starts increasing.
>
> If I start a PV machine, it drops only 20 packets and then stops. No more
> TX-DRP at all...
>
> So, to isolate the problem, it is related to HVM machines only. At least
> in my case.
>
> Ivan
>
> Pezza wrote:
>> Hey,
>>
>> please note that if you're using the tg3 driver, it means you're using
>> NetXtreme, *not* NetXtreme II (the latter uses the 5708/5706 chipsets,
>> the former some previous releases).
>> The problem has been observed (but I'd like some confirmations here ;-))
>> using the bnx2 driver, which is specific to NetXtreme II.
>>
>> To check which driver/version/firmware you're using you can run
>> 'ethtool -i eth0' (replace eth0 with the ethernet device you want to show).
>>
>> In my case, the "bad" machine shows:
>>
>> [root@sofia ~]# ethtool -i peth0
>> driver: bnx2
>> version: 1.5.10c
>> firmware-version: 1.9.3
>> bus-info: 0000:03:00.0
>>
>> while the "good" machine shows:
>>
>> [root@camogli scripts]# ethtool -i eth0
>> driver: bnx2
>> version: 1.5.1c
>> firmware-version: 2.9.1
>> bus-info: 0000:08:00.0
>>
>> As I said before, the only relevant difference I can find between the 2
>> machines is the eth0 firmware version, but I don't know if and how I can
>> update it on the other one (refer to my previous message for the complete
>> details on the good and bad machine configuration).
>>
>> M.
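For anyone contributing to the survey, the `ethtool -i` fields compared above can be condensed into one line per NIC. The following is only a minimal sketch: the function name `nic_summary` and the output layout are made up for illustration and are not part of ethtool.

```shell
# Condense `ethtool -i` output (read from stdin) into one survey line.
# nic_summary is a hypothetical helper name, not an ethtool feature.
nic_summary() {
    awk '
        {
            key = $0; sub(/:.*/, "", key)          # text before the first colon
            val = $0; sub(/^[^:]*: */, "", val)    # text after "key: "
        }
        key == "driver"           { d = val }
        key == "version"          { v = val }
        key == "firmware-version" { f = val }
        key == "bus-info"         { b = val }
        END { printf "driver=%s version=%s firmware=%s bus=%s\n", d, v, f, b }
    '
}

# Sample input: the "bad" machine's values quoted in the thread.
printf '%s\n' 'driver: bnx2' 'version: 1.5.10c' \
    'firmware-version: 1.9.3' 'bus-info: 0000:03:00.0' | nic_summary
# prints: driver=bnx2 version=1.5.10c firmware=1.9.3 bus=0000:03:00.0
```

On a real host the input would come from the tool itself, for example `ethtool -i peth0 | nic_summary`, with the interface name adjusted to the machine.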
Hi Steven,

Steven Smith-9 wrote:
> For an HVM domain, you probably want to look at the tap device rather than
> the vif.

What do you mean exactly?
I'm having the exact same behaviour described by Ivan in his email (dropped packets on the vifxx interface), but my tap device hasn't got anything strange.

Another strange thing I just observed is that in the host domain the dropped packets are on the VIFxx interface on the TX side, while in xm top I can see the same number of dropped packets on the "Net0" interface of my guest, *but* on the RX side.
I assumed that this is normal, but maybe it's important.

M.
Sure thing.

http://h18004.www1.hp.com/support/files/server/us/download/28059.html
ftp://ftp.compaq.com/pub/softlib2/software1/cd/p308282826/v43092/firmware-7.91-0.zip

Good luck downloading it! It's so slow, and it crashes. If you succeed, please share it elsewhere so I can get it. I'll do the same.

______________________________________________________
Luc Boudreau
Registrariat, Université de Montréal

-----Original Message-----
From: mario.beccia@tiscali.it [mailto:mario.beccia@tiscali.it]
Sent: 14 November 2007 11:30
To: Boudreau Luc
Subject: RE: XEN - Broadcom issue: survey

Luc,

could you please send me the URL from where you're getting the CD? I tried to search the HP site but it seems I can't find it...

Thanks,
M.
Hi Pasi,

Pasi Kärkkäinen wrote:
> 4. Make sure networking is OK on HVM domU, make sure you're running
> paravirt drivers on HVM domU (this is important!!). Without PV drivers
> performance of HVM domU will be sucky and you WILL get packet drops.

Just to be sure I'm not misunderstanding: are you saying that running an HVM machine without PV drivers makes it impossible to avoid packet drops? So this means there's no way to have a fully functional (= perfectly usable, regardless of its performance) HVM machine without PV drivers?

I just ran some tests following your hints: my server runs fine without Xen, and it runs fine with the Xen kernel and no VMs. The problem appears only in a non-PV HVM guest.

M.
Pasi Kärkkäinen wrote:
> On Wed, Nov 14, 2007 at 03:17:07PM +0000, Steven Smith wrote:
>>> This driver is the one released by DELL on Nov. 11, and I have packet drops.
>>> The same with driver 1.5.1c from Broadcom.
>>>
>>> The strange thing is that TX-DRP is 0 until I start a domain. As soon as I
>>> start an HVM, TX-DRP for its vif starts increasing.
>>>
>>> If I start a PV machine, it drops only 20 packets and then stops. No more
>>> TX-DRP at all...
>> Yes, that's normal and expected behaviour. If you try to send a
>> packet over a PV network interface to a domain which hasn't loaded the
>> PV network driver, the packet will be dropped and the TX-DRP statistic
>> will be incremented. PV domains will therefore see a small number of
>> packets dropped at boot time, while HVM domains will see ongoing
>> packet drops for as long as the domain remains alive. For an HVM
>> domain, you probably want to look at the tap device rather than the
>> vif.
>
> Steps to go through to debug:
>
> 1. Make sure networking is OK in the host, without running Xen at all
> (just running the normal non-Xen kernel).
>
> 2. Make sure networking is OK on dom0, when running the Xen kernel and NO domU
> running.
>
> 3. Make sure networking is OK on paravirt domU.
>
> 4. Make sure networking is OK on HVM domU, make sure you're running
> paravirt drivers on HVM domU (this is important!!). Without PV drivers
> performance of HVM domU will be sucky and you WILL get packet drops.
>
> Now, with those tests done, you should be able to isolate the problem quite
> easily, and see if it's NIC/driver or Xen or domU driver related.
>
> -- Pasi

0. RTFM... I think that the user has raised an error and will be terminated (myself :))

Thank you for the help and for pointing this out (you and Steven). I'll investigate further for my "RDP intermittent session" problem.

ivan
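Steven's point above is that drops counted on a vif are expected for a pure HVM guest, and that the tap device is the one to watch. The same counters that `netstat -i` shows as RX-DRP/TX-DRP can be read from `/proc/net/dev`. A minimal sketch, assuming the standard two-line header and 16-counter layout of that file; `drop_counters` is a hypothetical helper name:

```shell
# Print "<iface> rx_drop=<n> tx_drop=<n>" for each interface found in
# /proc/net/dev-formatted text on stdin.
# Counter order after the interface name: 8 receive fields, then 8
# transmit fields; "drop" is the 4th field of each group.
drop_counters() {
    awk 'NR > 2 {
        sub(/:/, " ")    # "eth0:1234" and "eth0: 1234" both split cleanly
        printf "%s rx_drop=%s tx_drop=%s\n", $1, $5, $13
    }'
}

# On dom0, to compare the vif and tap devices of a running guest:
#   drop_counters < /proc/net/dev | grep -E '^(vif|tap)'
```

If the tap line stays at zero while only the vif line grows, that matches the "normal and expected" case described above rather than a NIC fault.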
On Wed, Nov 14, 2007 at 08:41:43AM -0800, Pezza wrote:
> Hi Pasi,
>
> Pasi Kärkkäinen wrote:
> > 4. Make sure networking is OK on HVM domU, make sure you're running
> > paravirt drivers on HVM domU (this is important!!). Without PV drivers
> > performance of HVM domU will be sucky and you WILL get packet drops.
>
> Just to be sure I'm not misunderstanding, are you saying that running an HVM
> machine without PV drivers makes it impossible to avoid packet drops? So
> this means there's no way to have a fully functional (=a machine that's
> perfectly usable, regardless of its performance) HVM machine without PV
> drivers?
>
> I just ran some tests following your hints, and my server without Xen runs
> fine, with the Xen kernel and no VMs it runs fine, the problem comes in a
> non-PV HVM guest only.

An HVM guest/domU uses an _emulated_ NIC, which is really slow.

If you're running a Linux HVM domU, please install PV drivers in the domU. This will boost your performance, because then there's no need to emulate a NIC. If your distribution does not ship Linux HVM PV drivers, you can get them from the Xen source tree.

HVM domU PV drivers talk directly to Xen and bypass the whole emulation.

At the moment there are no usable PV drivers for Windows HVM domUs running on open-source Xen. Some people are working on this, but it will take some time before those drivers are ready/stable.

XenEnterprise, Virtual Iron and Novell SLES have Windows PV drivers available. Red Hat is also planning to release their Windows drivers at some point.

-- Pasi
Pasi,

thanks for your reply.

I understood from this and other mailing lists that using an HVM machine with no PV drivers would result in poor performance, but that it would work anyway.
My problem is that, due to this packet loss, HVM machines are not usable, because they get some "strange" errors from time to time (session breaks, corrupt files, etc.).
So you're saying that the lack of PV drivers is the cause, and thus that HVM machines are not stable if we don't use PV drivers?

M.
On Wed, Nov 14, 2007 at 11:27:44AM -0800, Pezza wrote:
> Pasi,
>
> thanks for your reply.
>
> I understood from this and other mailing lists that using an HVM machine
> with no PV drivers would result in poor performance, but that it would work
> anyway.
> My problem is that, due to this packet loss, HVM machines are not usable,
> because they get some "strange" errors from time to time (session breaks,
> corrupt files, etc.).
> So you're saying that the lack of PV drivers is the cause, and thus that HVM
> machines are not stable if we don't use PV drivers?

Basically, yes.

HVM domU hardware emulation (NIC, disk controller, etc.) is done by QEMU in Xen.

The QEMU people can possibly tell you more about expected performance and problems.

And I bet you can find many comparisons of performance with and without PV drivers in an HVM domU with some googling.

-- Pasi
On Wed, Nov 14, 2007 at 10:29:09PM +0200, Pasi Kärkkäinen wrote:
> HVM domU hardware emulation (NIC, disk controller, etc.) is done by QEMU in
> Xen.
>
> The QEMU people can possibly tell you more about expected performance and problems.
>
> And I bet you can find many comparisons of performance with and without PV
> drivers in an HVM domU with some googling.

Btw, the same happens with VMware: if you don't install "vmware tools" (= optimized drivers) you're limited to 10 Mbit/sec networking etc.

-- Pasi
Pasi,

yes, definitely.

But, as I said, I'm not interested in performance here, just stability.
VMware is slow without PV, but it is stable (I can download gigs of data from machines on the same network with no problems; I can't do the same with a Xen VM at the moment), while Xen, according to what you say, is unstable without PV.

Right?

M.
On Wed, Nov 14, 2007 at 01:27:58PM -0800, Pezza wrote:
> Pasi,
>
> yes, definitely.
>
> But, as I said, I'm not interested in performance here, just stability.
> VMware is slow without PV, but it is stable (I can download gigs of data from
> machines on the same network with no problems; I can't do the same with a
> Xen VM at the moment), while Xen, according to what you say, is unstable
> without PV.
>
> Right?

Hmm... it shouldn't be _unstable_ without PV drivers.

Which version of Xen? Which dom0 distribution and kernel?

Which guest OS?

-- Pasi
Hey Pasi,

it's Xen 3.1 compiled from stable source (downloaded last Thursday).
It's running on an HP ProLiant DL380 G5 with 6 GB RAM. Host OS is CentOS 5 (kernel 2.6.18 built inside Xen 3.1), guest is Win2K3 Service Pack 2.

The funny thing is that if I take the same disk image and the same conf file and run them on a XenExpress machine, everything runs fine and I see no packet drops. Note that I said the same conf and image, so I'm *not* using PV, just the plain XenExpress server and a manual "xm create" command. My XenExpress machine, though, is running 3.0.4, not 3.1, and is on different hardware (you can check earlier messages to this list for the details).

So, what I guess from this is that it's either a hardware problem (network card drivers?) or a difference in the kernel/dom0 configuration...

M.
Pezza wrote:
> Hey Pasi,
>
> it's Xen 3.1 compiled from stable source (downloaded last Thursday).
> It's running on an HP ProLiant DL380 G5 with 6 GB RAM. Host OS is CentOS 5
> (kernel 2.6.18 built inside Xen 3.1), guest is Win2K3 Service Pack 2.

Hi Pezza,

Since you're running CentOS, you should use CentOS 5.1's Xen packages (when it's out). They are based on Xen 3.1, but the rpm version stays at 3. You could also rebuild RHEL 5.1's srpm (available now). I tried building from source, but it didn't work quite right. RHEL's packages work great for me.

> The funny thing is that if I take the same disk image and the same conf file
> and run them on a XenExpress machine, everything runs fine and I see no
> packet drops. Note that I said the same conf and image, so I'm *not* using
> PV, just the plain XenExpress server and a manual "xm create" command.
> My XenExpress machine, though, is running 3.0.4, not 3.1, and is on
> different hardware (you can check earlier messages to this list for the
> details).
>
> So, what I guess from this is that it's either a hardware problem (network
> card drivers?) or a difference in the kernel/dom0 configuration...

Different hg changeset, uncommitted patches, etc.

--
Fajar
On Wed, Nov 14, 2007 at 10:54:08PM -0800, Pezza wrote:
> Hey Pasi,
>
> it's Xen 3.1 compiled from stable source (downloaded last Thursday).
> It's running on an HP ProLiant DL380 G5 with 6 GB RAM. Host OS is CentOS 5
> (kernel 2.6.18 built inside Xen 3.1), guest is Win2K3 Service Pack 2.

Did you download from mercurial (hg), or are you using a tarball?

Xen 3.1.1 contains around 400 patches/bugfixes over 3.1.0, so you should be running at least 3.1.1. Xen 3.1.2 was released yesterday, so that would be the best option if you want to compile your own version.

Have you tried running the default xen and kernel-xen rpms from CentOS 5?

CentOS 5.1 is coming in two weeks or so, and it will contain patched/tested Xen 3.1.0 including all the vendor enhancements.

> The funny thing is that if I take the same disk image and the same conf file
> and run them on a XenExpress machine, everything runs fine and I see no
> packet drops. Note that I said the same conf and image, so I'm *not* using
> PV, just the plain XenExpress server and a manual "xm create" command.
> My XenExpress machine, though, is running 3.0.4, not 3.1, and is on
> different hardware (you can check earlier messages to this list for the
> details).
>
> So, what I guess from this is that it's either a hardware problem (network
> card drivers?) or a difference in the kernel/dom0 configuration...

XenExpress also contains patches over "vanilla"/standard Xen 3.1.0.

-- Pasi
> > For an HVM domain, you probably want to look at the tap device rather
> > than the vif.
> What do you mean exactly?
> I'm having the exact same behaviour described by Ivan in his email (dropped
> packets on the vifxx interface), but my tap device hasn't got anything
> strange.

vifX.Y interfaces are only used to send packets to PV network devices in the
guest. Pure HVM domains (those without any PV drivers) send all packets over
the relevant tapX interface instead. Errors observed on the vif interface
are therefore completely irrelevant in this case. If the tap device has
nothing strange then you'll have to look somewhere else.

A couple of things which might be worth investigating:

-- Do you see the same problems with dom0<->domU networking? If so, it
   would be a good idea to fix that before worrying about problems with the
   NIC. Packets which don't need to leave the host don't touch the physical
   hardware.

-- I understand you're seeing connections stall for significant periods of
   time, and that this happens across a wide variety of services, yes? It
   would be interesting to know if other connections to the same VM continue
   working when this happens.

-- Is there a firewall enabled in the guest? Turning it off might help.
   The dom0 firewall might also be relevant, although that's less likely.

-- If you discover that only off-box traffic is affected, you could try
   playing with the offload settings on the physical NIC using ethtool.

-- You could try checking whether the problem is related to packet size,
   using the -s option to ping. If it is packet size related, reducing the
   MTU using ifconfig may help.

> Another strange thing I just observed is that in the host domain the dropped
> packets are on the VIFxx interface on the TX side, while in xm top I can see
> the same number of dropped packets on the "Net0" interface of my guest,
> *but* on the RX side.
> I assumed that this is normal, but maybe it's important.

I don't use xm top's vif statistics myself, so I'm not sure, but I'd guess
it's trying to provide statistics from the guest's point of view, whereas
ifconfig does it from dom0's. I doubt this is relevant.

Steven.
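The packet-size check suggested above (ping with -s) could be scripted roughly as follows. This is only a sketch: the function names, the default list of sizes, and the assumption of a Linux ping that accepts -c and -s are mine, not something from the thread. The one fixed fact it relies on is that an ICMP echo payload must be at most MTU minus 28 bytes (20 bytes of IPv4 header without options, 8 bytes of ICMP header) to fit in a single unfragmented packet.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header, no options) minus 8 bytes (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Ping a host with a series of payload sizes and report which ones get replies.
# Usage: ping_size_sweep HOST [SIZE ...]
# (hypothetical helper; the default sizes straddle the limit for a 1500-byte MTU)
ping_size_sweep() {
    host=$1
    shift
    [ $# -gt 0 ] || set -- 64 512 1024 1400 "$(icmp_payload_for_mtu 1500)" 2000 4000
    for size in "$@"; do
        if ping -c 3 -s "$size" "$host" >/dev/null 2>&1; then
            echo "size $size: ok"
        else
            echo "size $size: FAILED"
        fi
    done
}
```

Running something like `ping_size_sweep 192.168.1.10` from dom0 against the guest (the address is a placeholder) and seeing failures only above a certain size would point towards an MTU or fragmentation problem rather than random loss.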
> I understood from this and other mailing lists that using an HVM machine
> with no PV drivers would result in a poor performance, but it would work
> anyway.

That's certainly supposed to be true.

> My problem is that, due to this packet loss, HVM machines are not usable,
> because they get some "strange" errors from time to time (session breaks,
> corrupt files, etc...).
> So you're saying that lack of PV drivers is the cause and thus that HVM
> machines are not stable if we don't use PV drivers?

The emulated NIC is expected to have poor performance, and the response when
overloaded is to drop packets. However, this is not expected to lead to
user-visible dropped connections, since it looks to the guest exactly the
same as an overloaded router dropping packets. Protocols which run over the
internet are expected to be able to recover seamlessly from this condition.
The fact that you're seeing these problems suggests that something is going
wrong with the retransmission.

Steven.
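One way to look at the retransmission side from a Linux endpoint (dom0, or a Linux machine talking to the guest) is to compare the kernel's TCP counters before and after a stalled transfer. The sketch below is an assumption on my part rather than anything proposed in the thread: it reads text in the /proc/net/snmp format and looks up the OutSegs and RetransSegs columns by name. The function name is made up for illustration. A Windows guest would need a different tool (`netstat -s` reports retransmitted segments there).

```shell
#!/bin/sh
# Print the OutSegs and RetransSegs counters from /proc/net/snmp-style
# text on stdin. The first "Tcp:" line carries the column names, the
# second the values, so columns are located by name rather than position.
tcp_retrans_stats() {
    awk '
        $1 == "Tcp:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        $1 == "Tcp:" && have_header {
            printf "OutSegs=%d RetransSegs=%d\n", $(col["OutSegs"]), $(col["RetransSegs"])
            exit
        }
    '
}
```

Typical use would be `tcp_retrans_stats < /proc/net/snmp` once before and once after a large transfer. A RetransSegs counter that barely moves while the connection is stalled would fit the suspicion that retransmission itself is not happening properly.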
Steven,

thank you for your suggestions.

Steven Smith-9 wrote:
>
> vifX.Y interfaces are only used to send packets to PV network devices
> in the guest. Pure HVM domains (those without any PV drivers) send
> all packets over the relevant tapX interface instead. Errors observed
> on the vif interface are therefore completely irrelevant in this case.
> If the tap device has nothing strange then you'll have to look
> somewhere else.
>

OK, that's a very good hint.

So far, this is the status:

Steven Smith-9 wrote:
>
> -- Do you see the same problems with dom0<->domU networking? If so,
> it would be a good idea to fix that before worrying about problems
> with the NIC. Packets which don't need to leave the host don't touch
> the physical hardware.
>

Dom0<->DomU is showing the same problem and, yes, you're right: it's
probably not a network-card-related issue at this point...

Steven Smith-9 wrote:
>
> -- I understand you're seeing connections stall for significant
> periods of time, and that this happens across a wide variety of
> services, yes? It would be interesting to know if other connections
> to the same VM continue working when this happens.
>

Yes they do.

Steven Smith-9 wrote:
>
> -- Is there a firewall enabled in the guest? Turning it off might
> help. The dom0 firewall might also be relevant, although that's less
> likely.
>

I disabled firewalling in Dom0 and in DomU to take it out of the loop.

I tried again with another machine (which is running Xen 3.0.4) and, on the
same network (which is a gigabit network), it works fine. It's slow of
course (no PV), but there's no corruption and it's stable.

I'm willing to uninstall Xen 3.1 and try with 3.0.3 (the current Xen
release for CentOS 5); maybe there is something else hidden somewhere in the
background.

M.
A bit more information on this issue. We decided to buy another NIC (other
than Broadcom). The part number is NC110T from HP. It's an Intel gigabit
server NIC. The problem still happens, which rules out the NIC. The card has
the latest firmware and the latest drivers (e1000 ver. 7.6.9.1-1).

The problem still happens when we transfer large files domU->External. It
doesn't happen when transferring dom0->External. It is not a simple
tcp_timewait issue, since the problem doesn't resolve itself after the TCP
timeout.

Is there anything I can test from my new setup that would help investigate?

______________________________________________________

Luc Boudreau
Registrariat, Université de Montréal

-----Original Message-----
From: xen-users-bounces@lists.xensource.com
[mailto:xen-users-bounces@lists.xensource.com] On behalf of Pezza
Sent: 15 November 2007 21:44
To: xen-users@lists.xensource.com
Subject: Re: [Xen-users] XEN - Broadcom issue: survey

[...]
We have seen similar problems on Xen 2 (based on NetBSD 3.01) and Xen 3.0.3
(based on Fedora 7). It does not appear on Xen 3.0.3 on Debian Etch.

A workaround appears to be to transfer files in smaller fragments or with
smaller block sizes. For example, we see this repeatedly when NFS-mounting
from a central NAS server to domUs. With UDP rather than TCP, the problem
occurs much less frequently.

It appears to be a protocol buffer problem between the bridge and TCP layers
on the emulated network. It does not appear on native NetBSD, Fedora 7 or
Debian systems.

-Steve Senator
sts+xen@senator.colospgs.co.us

Quoting Boudreau Luc <luc.boudreau@umontreal.ca>:

> [...]
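For anyone wanting to try the NFS workaround described above, the two knobs are the transport protocol and the read/write block size. The sketch below only builds the option string. The helper name, the 8192-byte block size, the server name and the paths are illustrative assumptions, while `proto`, `rsize` and `wsize` are standard Linux NFS mount options.

```shell
#!/bin/sh
# Build an NFS mount option string from a transport protocol (udp or tcp)
# and a block size in bytes, used for both reads and writes.
nfs_opts() {
    proto=$1
    bsize=$2
    echo "proto=$proto,rsize=$bsize,wsize=$bsize"
}

# Example (hypothetical server and paths, run as root in the domU):
#   mount -t nfs -o "$(nfs_opts udp 8192)" nas.example.org:/export /mnt/nas
```

Whether a smaller block size or UDP actually helps on a given setup would have to be checked by repeating the large transfer that normally triggers the stall.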
> A bit more information on this issue. We decided to buy another NIC
> (other than Broadcom). The problem still happens, thus
> eliminating the NIC problem.

Okay, so it's not the NIC. That suggests that the problem is somewhere
between dom0 and domU. It'd be worthwhile trying to eliminate the bridge
before going further, so try something like this:

    # brctl delif xenbr0 tap0
    # ifconfig tap0 up 192.168.97.1

as root in dom0, and then bring the interface up on a static IP like
192.168.97.2 inside the guest. You need to choose IP addresses so that they
don't collide with anything else you're using, and it needs to be a
different network to anything else you want to talk to. Test whether domU
experiences problems talking to dom0 on 192.168.97.1; if it doesn't, that
suggests that the problem is with the bridge.

Having said that, this all sounds like an MTU issue. It would be worth
checking what MTUs you're using on the relevant interfaces. ifconfig can
tell you this on Linux, and there's a description of the Windows equivalent
at http://www.pctools.com/guides/registry/detail/280/ . You probably want to
look at eth0, xenbr0, and tap0 in dom0, and the relevant interface in the
guest. All of the MTUs should match up.

Steven.
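The MTU comparison on the dom0 side can be done in one go rather than by reading ifconfig output by eye. The following is a sketch under my own assumptions: the function names are invented, and it reads the MTU from /sys/class/net/IFACE/mtu, which 2.6 kernels expose, instead of parsing ifconfig. The guest's MTU still has to be checked separately inside the guest.

```shell
#!/bin/sh
# Print "interface mtu" for each named interface, read from sysfs.
list_mtus() {
    for ifc in "$@"; do
        if [ -r "/sys/class/net/$ifc/mtu" ]; then
            echo "$ifc $(cat "/sys/class/net/$ifc/mtu")"
        else
            echo "$ifc: no such interface" >&2
        fi
    done
}

# Read "interface mtu" pairs on stdin and report every interface whose
# MTU differs from that of the first one listed.
mtu_mismatches() {
    awk '
        NR == 1 { ref = $2; next }
        $2 != ref { printf "%s has MTU %s, expected %s\n", $1, $2, ref }
    '
}
```

In dom0 this would be used as `list_mtus eth0 xenbr0 tap0 | mtu_mismatches`. No output means the three interfaces agree.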
Hmm, it's a production server, so I wouldn't want to change the IPs...

Also, I fear it might not be a network-related issue at all. We were able to
get messages about IRQ conflicts between a domU and dom0, because one of the
domUs has a PCI NIC assigned to it. The device gets IRQ 16, but dom0 also
uses this IRQ for a NIC. It might be the source of this instability.

I've created another discussion thread on this issue: "IRQ conflict between
dom0 and domU". I'm investigating this now. It looks more probable than just
a plain bridge config issue.

Any insights on how to make sure the same IRQ doesn't get assigned twice?

-----Original Message-----
From: Steven Smith [mailto:sos22@hermes.cam.ac.uk] On behalf of Steven Smith
Sent: 28 November 2007 15:57
To: Boudreau Luc
Cc: xen-users@lists.xensource.com; sos22-xen@srcf.ucam.org
Subject: Re: [Xen-users] XEN - Broadcom issue: survey

[...]
It turns out it was an IRQ conflict on my side. I know some other people
still have problems with Broadcom cards, so don't consider this issue solved
yet. I just want to let you know that I can't help with finding the solution,
since:

1. We don't have a Broadcom card in the server anymore, and
2. It was an IRQ issue after all.

Good luck, cheers!

-----Original Message-----
From: xen-users-bounces@lists.xensource.com
[mailto:xen-users-bounces@lists.xensource.com] On behalf of Boudreau Luc
Sent: 29 November 2007 10:52
To: xen-users@lists.xensource.com
Cc: Steven Smith
Subject: RE: [Xen-users] XEN - Broadcom issue: survey

[...]