thr3ads.net - Xen users - [Xen-users] network problems [Jul 2009]

Mike Lovell

2009-Jul-22 22:49 UTC

[Xen-users] network problems

So, I have been running into network problems for a while on 4 boxes
that I installed Xen on so that some engineers have places to test code.
This particular problem is happening on all 4 of these boxes (although
it isn't happening on an older box running Xen from Debian Etch).

What appears to be the problem is that traffic is getting dropped
between the vif#.0 interface in dom0 and the eth0 interface in the
guest. To find this out, I started a ping flood from one domU to
another domU. About every 10 minutes, there is a burst of ping requests
going out with no replies coming back. I think it is really weird that
it happens almost exactly every 10 minutes plus about 2 seconds. While
the ping was going, I ran tcpdump on the domU starting the ping, on
the vif#.0 of the pinging machine, on the virtual bridge, on the
vif#.0 for the receiving guest, and then on the receiving domU. The
packets are making it all the way to the dom0 vif for the receiving
guest but not making it to the eth0 in the guest. I have no clue why
this is happening, and it happens at rather regular intervals. The same
thing happens when pinging a different guest, at about the same
interval but at different times. Also, during the ping flood, there is
never a pause in the sending of packets out of the guest, only a pause
in the packets going from the host to the guest.
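
[Editor's note] The "every 10 minutes plus about 2 seconds" observation
can be checked against a saved capture rather than by watching the
ping output. The following is a minimal sketch, not part of the
original post. It assumes the capture was taken with `tcpdump -tt`, so
that each packet line starts with an epoch timestamp; the function
names and the 1-second threshold are made up for illustration.

```python
from typing import Iterable, List, Tuple


def parse_epoch_timestamps(lines: Iterable[str]) -> List[float]:
    """Read the leading epoch timestamp that `tcpdump -tt` puts on each line."""
    stamps = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            stamps.append(float(fields[0]))
        except ValueError:
            continue  # not a packet line (e.g. tcpdump's startup banner)
    return stamps


def find_gaps(stamps: List[float], min_gap: float = 1.0) -> List[Tuple[float, float]]:
    """Return (last_seen, next_seen) pairs that are more than min_gap seconds apart."""
    ordered = sorted(stamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > min_gap]


def gap_periods(gaps: List[Tuple[float, float]]) -> List[float]:
    """Seconds from the start of one gap to the start of the next."""
    starts = [start for start, _ in gaps]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]
```

Run against a capture of the receiving guest's eth0 (for example, the
saved output of `tcpdump -tt -n -i eth0 icmp`), `gap_periods` should
return values near 602 if the stalls really are that regular.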

I am running this on 64-bit Debian Lenny using the distribution's
packages: xen-hypervisor-3.2-1-amd64 version 3.2.1-2 and
linux-image-2.6.26-2-xen-amd64 version 2.6.26-17. Here are the
networking configs.

---------
dom0# cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
    address 10.135.7.34
    netmask 255.255.255.224
    network 10.135.7.32
    broadcast 10.135.7.63
    gateway 10.135.7.33
    # dns-* options are implemented by the resolvconf package, if installed
    dns-nameservers 10.135.7.34
    dns-search qa1.mozyops.com

auto vmnet
iface vmnet inet static
        address 10.135.2.71
        netmask 255.255.255.224
        bridge_ports eth1
#        bridge_stp off
#        bridge_fd 9
#        bridge_hello 2
#        bridge_maxage 12

---------
DomU# cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp
   post-up ethtool -K eth0 tx off

---------
Dom0# brctl show vmnet
bridge name     bridge id               STP enabled     interfaces
vmnet           8000.003048c8166d       no              eth1
                                                        vif1.0
                                                        vif10.0
                                                        <other interfaces>
---------

Does anyone have any ideas as to what is going on here? Or, more
importantly, any ideas on how to solve this? I have tried building a
newer domU kernel from scratch but I haven't been able to make any
progress there. The guest fails to boot without showing anything on the
console. It then goes into a loop of trying to reboot the guest but
failing. I would really like to stay with the Debian kernels.

I have been banging my head against a wall for a week or so on this and 
desperately need some help to get this working. I have engineers that 
are getting held up by this bug.

Thanks for any insight you guys can give.

mike

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users

Mike Lovell

2009-Jul-22 23:52 UTC

Re: [Xen-users] network problems

I did 'echo 1 > /proc/sys/xen/independent_wallclock' on the host and
several of the guests. Then I started the ping flood and the same
problem showed up. The warnings about time were not happening every 10
minutes like the networking problem is, but I haven't seen the clock
errors since making the change.

mike

Nathan Eisenberg wrote:
> Yea, I believe they are.  Take a look at the independent wallclock
> setting, it should help.
> Best Regards,
> Nathan Eisenberg
> Sr. Systems Administrator
> Atlas Networks, LLC
>
> Sent from my BlackBerry
>
> -----Original Message-----
> From: Mike Lovell <mike@dev-zero.net>
>
> Date: Wed, 22 Jul 2009 17:07:03 
> To: Nathan Eisenberg<nathan@atlasnetworks.us>
> Subject: Re: [Xen-users] network problems
>
>
> All of the MAC addresses are unique. During my ping floods, I do see
> what look like clock errors. They say something like:
>
> Warning: time of day goes back (-27109us), taking countermeasures.
>
> I wasn't sure if these were related so I didn't mention them before.
> Is it part of the problem?
>
> mike
>
> Nathan Eisenberg wrote:
>   
>> Have you checked to make sure all of the MAC addresses are unique?
>> Are there any time/clock went backwards messages on the console/dmesg
>> output?
>>
>> Best Regards
>> Nathan Eisenberg
>> Sr. Systems Administrator
>> Atlas Networks, LLC
>> support@atlasnetworks.us
>> http://support.atlasnetworks.us/portal



Mike Lovell

2009-Jul-24 01:37 UTC

Re: [Xen-users] network problems

This problem still exists.

I tried setting an independent wallclock on all of the virtual machines.
I also managed to miss that I had the wrong netmask configured for the
vmnet bridge. It should have been 255.255.255.128. The VMs were able to
talk to each other before changing the netmask and I saw traffic flowing
past the switch.
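
[Editor's note] To see what the netmask correction changes, the two
masks can be compared with Python's standard `ipaddress` module. This
is only an illustrative sketch: the bridge address 10.135.2.71 and
both masks come from the thread, while the helper names and the peer
address 10.135.2.20 are hypothetical.

```python
import ipaddress


def bridge_network(address: str, netmask: str) -> ipaddress.IPv4Network:
    """Network that an interface with this address and netmask belongs to."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


def is_on_link(address: str, netmask: str, peer: str) -> bool:
    """True if dom0 would treat the peer as directly reachable on the bridge."""
    return ipaddress.ip_address(peer) in bridge_network(address, netmask)


old_net = bridge_network("10.135.2.71", "255.255.255.224")
new_net = bridge_network("10.135.2.71", "255.255.255.128")

print(old_net, old_net.broadcast_address)  # 10.135.2.64/27 10.135.2.95
print(new_net, new_net.broadcast_address)  # 10.135.2.0/25 10.135.2.127

# 10.135.2.20 is a hypothetical guest address, not taken from the thread.
print(is_on_link("10.135.2.71", "255.255.255.224", "10.135.2.20"))  # False
print(is_on_link("10.135.2.71", "255.255.255.128", "10.135.2.20"))  # True
```

With the old /27 mask, dom0 would have treated any address outside
10.135.2.64 to 10.135.2.95 as off-link. That affects only dom0's own
IP traffic: the bridge forwards frames between guests at layer 2, which
is consistent with the VMs having been able to talk to each other
before the change.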

Does anyone have any clue as to what might be going on? I am in great
need of some help here.

Thanks

mike


Fajar A. Nugraha

2009-Jul-24 01:55 UTC

Re: [Xen-users] network problems

On Fri, Jul 24, 2009 at 8:37 AM, Mike Lovell<mike@dev-zero.net> wrote:
> I tried setting an independent wallclock on all of the virtual machines. I
> also managed to miss that I had the wrong netmask configured for the vmnet
> bridge. It should have been 255.255.255.128. The VMs were able to talk to
> each other before changing the netmask and I saw traffic flowing past the
> switch.
>
> Does anyone have any clue as to what might be going on? I am in great need
> of some help here.

How well maintained is Debian's 2.6.26 xen kernel? xen.org's 2.6.18
kernel might work better for you. Or go the other way around, using
latest pv_ops kernel.

-- 
Fajar


Mike Lovell

2009-Jul-24 06:55 UTC

Re: [Xen-users] network problems

Fajar A. Nugraha wrote:
> On Fri, Jul 24, 2009 at 8:37 AM, Mike Lovell<mike@dev-zero.net> wrote:
>> I tried setting an independent wallclock on all of the virtual
>> machines. I also managed to miss that I had the wrong netmask
>> configured for the vmnet bridge. It should have been 255.255.255.128.
>> The VMs were able to talk to each other before changing the netmask
>> and I saw traffic flowing past the switch.
>>
>> Does anyone have any clue as to what might be going on? I am in great
>> need of some help here.
>
> How well maintained is Debian's 2.6.26 xen kernel? xen.org's 2.6.18
> kernel might work better for you. Or go the other way around, using
> latest pv_ops kernel.

The Debian 2.6.26 xen kernel is one where they used the OpenSuse
patches. I checked the changelog and it looks like the bulk of it was
added in October '08 with a few minor changes since. I don't know how
recent the patches that were applied at that time were.

I would like to avoid using anything less than 2.6.26 because the boxes 
have hard drive controllers that use the sata_mv driver. This driver was 
experimental until about 2.6.26 and had issues with things like hot-swap 
and error handling before .26. Although, I might need to make that 
sacrifice if I am going to get this to work.

I have also tried getting a pv_ops kernel to work before. If I remember
correctly, I got a domU to boot with a pv_ops kernel but, so far, my
efforts to do it with dom0 have been an epic fail. It will either die as
soon as Xen passes control to the dom0 kernel or soon thereafter. I
usually just get frustrated and leave it for a while, so I don't really
have particulars on the pv_ops dom0 failures. Any pointers? Will a
pv_ops dom0 work on Xen 3.2, or do I need something higher like 3.4 or
unstable? I think I would rather try this route than go back to 2.6.18.

My biggest question though is why would traffic not be getting passed
from the dom0 vif interface to the domU eth0 interface? This is the
problem I am seeing and it seems to happen at somewhat regular
intervals. Is this something you have seen or heard of before? I don't
have specifics other than that I know the traffic isn't getting passed,
because I see the packets on the vif interface but not on the guest
network interface when running tcpdump on both.
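
[Editor's note] The comparison described above (packets visible on the
vif but missing on the guest's eth0) can be made precise by diffing
the two captures on ICMP id and sequence number. This is a rough
sketch, not something from the thread. It assumes tcpdump's usual text
output for pings ("ICMP echo request, id N, seq N"); other tcpdump
versions may print this differently, and the function names are made
up.

```python
import re
from typing import Iterable, List, Set, Tuple

# Assumed line format: "... ICMP echo request, id 4242, seq 17, length 64"
ECHO_RE = re.compile(r"ICMP echo request, id (\d+), seq (\d+)")


def echo_requests(lines: Iterable[str]) -> Set[Tuple[int, int]]:
    """Collect the (id, seq) pair of every echo request in a text capture."""
    found = set()
    for line in lines:
        match = ECHO_RE.search(line)
        if match:
            found.add((int(match.group(1)), int(match.group(2))))
    return found


def lost_between(vif_lines: Iterable[str],
                 guest_lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Echo requests captured on the dom0 vif but never seen on the guest's eth0."""
    return sorted(echo_requests(vif_lines) - echo_requests(guest_lines))
```

One caveat: ICMP sequence numbers are 16 bits wide, so during a long
ping flood they wrap around. The captures need to be cut into short
windows around each stall for the set difference to be meaningful.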

Thanks for taking some time to help me.

mike


Fajar A. Nugraha

2009-Jul-24 08:13 UTC

Re: [Xen-users] network problems

On Fri, Jul 24, 2009 at 1:55 PM, Mike Lovell<mike@dev-zero.net> wrote:
>> How well maintained is Debian's 2.6.26 xen kernel? xen.org's 2.6.18
>> kernel might work better for you. Or go the other way around, using
>> latest pv_ops kernel.
>
> The Debian 2.6.26 xen kernel is one where they used the OpenSuse patches. I
> checked the changelog and it looks like the bulk of it was added in October
> '08 with a few minor changes since. I don't know how recent the patches
> that were applied at that time were.

By "how well maintained" I meant "does someone actually fix them if
there's a known bug"?
I know that Redhat fixes them (or at least tries to backport fixes),
which is why I use RHEL for production servers :D

> Any pointers? Will a pv_ops dom0 work on Xen 3.2, or do I need something
> higher like 3.4 or unstable? I think I would rather try this route than
> go back to 2.6.18.

Try this
http://wiki.xensource.com/xenwiki/XenDom0Kernels

> My biggest question though is why would traffic not be getting passed from
> the dom0 vif interface to the domU eth0 interface? This is the problem I am
> seeing and it seems to happen at somewhat regular intervals. Is this
> something you have seen or heard of before? I don't have specifics other
> than that I know the traffic isn't getting passed, because I see the
> packets on the vif interface but not on the guest network interface when
> running tcpdump on both.

I haven't had this problem with 2.6.18. It's either "all works well"
or traffic not going through at all (which was the case when there was
a bug in the NIC driver). There were also "eth0: received packet with
own address as source address" messages, but nothing similar to the
problem that you mentioned.

Something to check though: are there any MAC-related messages in
syslog? Is it possible that perhaps the MAC you're using is not
unique?
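
[Editor's note] The MAC uniqueness check can be automated when there
are many guests. The sketch below is an illustration only. It assumes
each guest's vif entries are available as strings in the usual xm
config style ("mac=..., bridge=..."); the guest names and MAC
addresses in the example are invented.

```python
import re
from collections import defaultdict
from typing import Dict, List

MAC_RE = re.compile(r"mac\s*=\s*([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})")


def duplicate_macs(vifs_by_guest: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each MAC that appears more than once to the guests that use it."""
    owners = defaultdict(list)
    for guest, vifs in vifs_by_guest.items():
        for vif in vifs:
            for mac in MAC_RE.findall(vif):
                owners[mac.lower()].append(guest)  # compare case-insensitively
    return {mac: sorted(guests) for mac, guests in owners.items() if len(guests) > 1}


# Hypothetical guests and addresses, for illustration only.
example = {
    "guest-a": ["mac=00:16:3e:00:00:01, bridge=vmnet"],
    "guest-b": ["mac=00:16:3E:00:00:01, bridge=vmnet"],
    "guest-c": ["mac=00:16:3e:00:00:03, bridge=vmnet"],
}
print(duplicate_macs(example))  # {'00:16:3e:00:00:01': ['guest-a', 'guest-b']}
```

A vif entry with no mac= setting is skipped here, so guests that rely
on automatically generated addresses would still need to be checked on
the running system.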

-- 
Fajar
